Re: fw_update lock_db should exit when parent exits

2023-08-24 Thread Andrew Hewus Fresh
On Thu, Aug 24, 2023 at 06:53:27AM -0600, Todd C. Miller wrote:
> On Wed, 23 Aug 2023 18:23:40 -0700, Andrew Hewus Fresh wrote:
> 
> > I would have to see an example of doing that between ksh and perl.
> 
> Standard output should already be a pipe in the perl process by
> virtue of running as a co-process.  In theory you should be able
> to poll on it checking for POLLHUP.  Since our pipes are actually
> bidiretional we can cheat and use select.  Something like this:

I hadn't considered waiting for STDOUT to be readable, that makes total
sense.  Thanks.

I'll probably wait to commit these fw_update fixes until I'm back on
Wednesday for a long weekend away from computers.



Re: dt(4), hardclock(9): move interval, profile providers to dedicated callback

2023-08-24 Thread Martin Pieuchot
On 23/08/23(Wed) 18:52, Scott Cheloha wrote:
> This is the next patch in the clock interrupt reorganization series.

Thanks for your diff.  I'm sorry but it is really hard for me to help
review this diff because there is still no man page for this API+subsystem.

Can we start with that please?

> This patch moves the entry points for the interval and profile dt(4)
> providers from the hardclock(9) to a dedicated clock interrupt
> callback, dt_prov_profile_intr(), in dev/dt/dt_prov_profile.c.
> 
> - To preserve current behavior, (1) both provider entrypoints have
>   been moved into a single callback function, (2) the interrupt runs at
>   the same frequency as the hardclock, and (3) the interrupt is
>   staggered to co-occur with the hardclock on a given CPU.

The only behavior that needs to be preserved is the output of dumping
stacks.  That means DT_FA_PROFILE and DT_FA_STATIC certainly needs to
be adapted with this change.  You can figure that out by looking at the
output of /usr/src/share/btrace/kprofile.bt without and with this diff.

Please generate a FlameGraph to make sure they're still the same.

Apart from that I'd prefer if we could skip the mechanical change and
go straight to what dt(4) needs.  Otherwise we will have to re-design
everything.   If you don't want to do this work, then leave it and tell
me what you need and what is your plan so I can help you and do it
myself.

dt(4) needs a way to schedule two different kind of periodic timeouts
with the higher precision possible.  It is currently plugged to hardclock
because there is nothing better.

The current code assumes the periodic entry points are external to dt(4).
This diff moves them in the middle of dt(4) but keeps the existing flow
which makes the code very convoluted. 
A starting point to understand the added complexity it so see that the
DT_ENTER() macro are no longer needed if we move the entry points inside
dt(4).

The first periodic timeout is dt_prov_interval_enter().  It could be
implemented as a per-PCB timeout_add_nsec(9).  The drawback of this
approach is that it uses too much code in the kernel which is a problem
when instrumenting the kernel itself.  Every subsystem used by dt(4) is
impossible to instrument with btrace(8).

The second periodic timeout it dt_prov_profile_enter().  It is similar
to the previous one and has to run on every CPU.

Both are currently bound to tick, but we want per-PCB time resolution.
We can get rid of `dp_nticks' and `dp_maxtick' if we control when the
timeouts fires.

> - Each dt_pcb gets a provider-specific "dp_clockintr" pointer.  If the
>   PCB's implementing provider needs a clock interrupt to do its work, it
>   stores the handle in dp_clockintr.  The handle, if any, is established
>   during dtpv_alloc() and disestablished during dtpv_dealloc().

Sorry, but as I said I don't understand what is a clockintr.  How does it
fit in the kernel and how is it supposed to be used?

Why have it per PCB and not per provider or for the whole driver instead?
Per-PCB implies that if I run 3 different profiling on a 32 CPU machines
I now have 96 different clockintr.  Is it what we want?  If so, why not
simply use timeout(9)?  What's the difference?

>   One alternative is to start running the clock interrupts when they
>   are allocated in dtpv_alloc() and stop them when they are freed in
>   dtpv_dealloc().  This is wasteful, though: the PCBs are not recording
>   yet, so the interrupts won't perform any useful work until the
>   controlling process enables recording.
> 
>   An additional pair of provider hooks, e.g. "dtpv_record_start" and
>   "dtpv_record_stop", might resolve this.

Another alternative would be to have a single hardclock-like handler for
dt(4).  Sorry, I don't understand the big picture behind clockintr, so I
can't help.  

> - We haven't needed to destroy clock interrupts yet, so the
>   clockintr_disestablish() function used in this patch is new.

So maybe that's fishy.  Are they supposed to be destroyed?  What is the
goal of this API?



vmd: fix i8259 race condition, vioblk hang

2023-08-24 Thread Dave Voutila
mbuhl@ found an issue where the emulated virtio block device can
hang. The tl;dr: the emulated pic never injects an interrupt and the
vioblk(4) driver in the guest starves, waiting to be told to check the
virtqueue.

Flipping the bit in the i8259 using gdb causes the spice to flow once
again.

This diff fixes two related things (so could be committed in 2 parts):

1. the irq deassert on a virtio pci interrupt status register read
   should occur synchronously by the vcpu thread in the vm and not
   async.

2. always raise the irr bit in the pic, regardless of mask. The mask is
   used when finding an irq to ack and shouldn't be used to determine if
   the irr bit is raised. AFAICT from the old i8259 data sheets, masking
   should have no effect on if the irr bit is set.

This is holding up another diff I want to share that reduces latency in
interrupts, but it's shown to make this i8259 race condition worse, so
I'd like to give folks a few days to check if the below diff causes
regressions. Once this is committed, I'll feel safe sharing the latency
diff with tech@.

Any ok's pending a few days for folks to test?

-dv


diff refs/heads/master cdba90a89ee6e77b52c2b6955d36a260a23a3b85
commit - ad1cd1152fddbf55189657a2df9f2468409698ab
commit + cdba90a89ee6e77b52c2b6955d36a260a23a3b85
blob - f7862f5e9d17f96f5260358271fab8f253b26505
blob + b6891c52c824d7b8c69e67e5323772152b1ed844
--- usr.sbin/vmd/i8259.c
+++ usr.sbin/vmd/i8259.c
@@ -222,20 +222,16 @@ i8259_assert_irq(uint8_t irq)
 {
mutex_lock(_mtx);
if (irq <= 7) {
-   if (!ISSET(pics[MASTER].imr, 1 << irq)) {
-   SET(pics[MASTER].irr, 1 << irq);
-   pics[MASTER].asserted = 1;
-   }
+   SET(pics[MASTER].irr, 1 << irq);
+   pics[MASTER].asserted = 1;
} else {
irq -= 8;
-   if (!ISSET(pics[SLAVE].imr, 1 << irq)) {
-   SET(pics[SLAVE].irr, 1 << irq);
-   pics[SLAVE].asserted = 1;
+   SET(pics[SLAVE].irr, 1 << irq);
+   pics[SLAVE].asserted = 1;

-   /* Assert cascade IRQ on master PIC */
-   SET(pics[MASTER].irr, 1 << 2);
-   pics[MASTER].asserted = 1;
-   }
+   /* Assert cascade IRQ on master PIC */
+   SET(pics[MASTER].irr, 1 << 2);
+   pics[MASTER].asserted = 1;
}
mutex_unlock(_mtx);
 }
blob - 7bc76c4daed427dae82688452ec21985be679bc4
blob + 4d321fd4ff23514ffa7a4bee0eeb7a9324020d80
--- usr.sbin/vmd/vioblk.c
+++ usr.sbin/vmd/vioblk.c
@@ -39,7 +39,8 @@ static uint32_t handle_io_read(struct viodev_msg *, st
 extern struct vmd_vm *current_vm;

 static const char *disk_type(int);
-static uint32_t handle_io_read(struct viodev_msg *, struct virtio_dev *);
+static uint32_t handle_io_read(struct viodev_msg *, struct virtio_dev *,
+int8_t *);
 static int handle_io_write(struct viodev_msg *, struct virtio_dev *);
 void vioblk_notify_rx(struct vioblk_dev *);
 int vioblk_notifyq(struct vioblk_dev *);
@@ -723,6 +724,7 @@ handle_sync_io(int fd, short event, void *arg)
struct viodev_msg msg;
struct imsg imsg;
ssize_t n;
+   int8_t intr = INTR_STATE_NOOP;

if (event & EV_READ) {
if ((n = imsg_read(ibuf)) == -1 && errno != EAGAIN)
@@ -770,8 +772,9 @@ handle_sync_io(int fd, short event, void *arg)
}
case VIODEV_MSG_IO_READ:
/* Read IO: make sure to send a reply */
-   msg.data = handle_io_read(, dev);
+   msg.data = handle_io_read(, dev, );
msg.data_valid = 1;
+   msg.state = intr;
imsg_compose_event(iev, IMSG_DEVOP_MSG, 0, 0, -1, ,
sizeof(msg));
break;
@@ -844,7 +847,7 @@ handle_io_read(struct viodev_msg *msg, struct virtio_d
 }

 static uint32_t
-handle_io_read(struct viodev_msg *msg, struct virtio_dev *dev)
+handle_io_read(struct viodev_msg *msg, struct virtio_dev *dev, int8_t *intr)
 {
struct vioblk_dev *vioblk = >vioblk;
uint8_t sz = msg->io_sz;
@@ -1001,7 +1004,8 @@ handle_io_read(struct viodev_msg *msg, struct virtio_d
case VIRTIO_CONFIG_ISR_STATUS:
data = vioblk->cfg.isr_status;
vioblk->cfg.isr_status = 0;
-   virtio_deassert_pic_irq(dev, 0);
+   if (intr != NULL)
+   *intr = INTR_STATE_DEASSERT;
break;
default:
return (0x);
blob - c16ad2635ea622fd7fce48b5145e2199dd451346
blob + 5c2a943ac7ad02dfbc4793f5caab3200a3ced803
--- usr.sbin/vmd/vionet.c
+++ usr.sbin/vmd/vionet.c
@@ -52,7 +52,8 @@ static uint32_t handle_io_read(struct viodev_msg *, st

 static int vionet_rx(struct vionet_dev *);
 static void vionet_rx_event(int, short, void 

Re: ping.c modifications proof of concept

2023-08-24 Thread Peter J. Philipp
On Thu, Aug 24, 2023 at 09:22:07AM -0400, A Tammy wrote:
> I don't think having a daemon for ping (or other trivial network
> operations) might be the best design. There's nothing about the service
> that demands a continuously running process in the background.
> 
> Aisha

Ok Aisha, thanks.  Well if there is want for this you guys have a skeleton
anyhow to work with.

Good day! :-)

-peter

-- 
Over thirty years experience on Unix-like Operating Systems starting with QNX.



Re: ping.c modifications proof of concept

2023-08-24 Thread A Tammy
On 8/24/23 05:59, Peter J. Philipp wrote:
> Hi,
>
> I have modified ping(8) to grab a raw descriptor from a daemon over AF_UNIX
> sockets.  This seems to work.  While what I call the sun daemon needs to be
> tightened a lot more it should work to make people understand my concept.
>
> benefits:
> we lose inet pledge
> we lose the setuid to root bit
> root can bypass this entirely so it works in single user mode
> it can be broadened to other programs such as traceroute
>
> drawbacks:
> not fully tested
> sund needs more tightening or there is a security problem
> if sund dies ping doesn't work for regular users


I don't think having a daemon for ping (or other trivial network
operations) might be the best design. There's nothing about the service
that demands a continuously running process in the background.

Aisha


> Here is a demonstration:
> pjp@polarstern$ ls -l /tmp/ping
> -rwxr-xr-x  2 root  wheel  1442864 Aug 24 11:38 /tmp/ping
> pjp@polarstern$ /tmp/ping -D -c 1 127.0.0.1
> PING 127.0.0.1 (127.0.0.1): 56 data bytes
> 64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=0.073 ms
>
> --- 127.0.0.1 ping statistics ---
> 1 packets transmitted, 1 packets received, 0.0% packet loss
> round-trip min/avg/max/std-dev = 0.073/0.073/0.073/0.000 ms
> pjp@polarstern$ /tmp/ping6 -D -c 1 centroid.eu
> PING centroid.eu (2a03:6000:6f68:631::170): 56 data bytes
> 64 bytes from 2a03:6000:6f68:631::170: icmp_seq=0 hlim=54 time=31.059 ms
>
> --- centroid.eu ping statistics ---
> 1 packets transmitted, 1 packets received, 0.0% packet loss
> round-trip min/avg/max/std-dev = 31.059/31.059/31.059/0.000 ms
>
> Here is the sund.c code (needs improving):
>
>
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
>
> #include 
> #include 
>
> #include 
> #include 
> #include 
> #include 
>
>
> #define EFFECTIVE_PINGUSER1000
> #define EFFECTIVE_PINGGROUP   1000
>
> #define SUND_PATH "/var/run/sund.sock"
>
> void desc_write(int, int);
>
> int
> main(void)
> {
>   int so, new, sel;
>   int raw, error;
>
>   struct addrinfo hints, *res;
>   struct sockaddr_un sun;
>   fd_set rset;
>
>   uid_t euid;
>   gid_t egid;
>
>   char buf[INET6_ADDRSTRLEN + 1];
>   socklen_t sunsz = sizeof(struct sockaddr_un);
>   size_t l;
>
>   unveil(SUND_PATH, "rwc");
>   unveil(NULL, NULL);
>
>   unlink(SUND_PATH);
>
>
>   so = socket(AF_UNIX, SOCK_STREAM, 0);
>   if (so < 0) {
>   perror("socket");
>   exit(1);
>   }
>
>   memset(, 0, sizeof(sun));
>   sun.sun_family = AF_UNIX;   
>   sun.sun_len = sizeof(struct sockaddr_un);
>   
>   strlcpy(sun.sun_path, "/var/run/sund.sock", sizeof(sun.sun_path));
>
>   if (bind(so, (struct sockaddr *), sizeof(sun)) == -1) {
>   perror("bind");
>   exit(1);
>   }
>   chmod(SUND_PATH, 0666);
>
>   listen(so, 5);
>   daemon(0, 0);
>
>   if (pledge("stdio inet sendfd", NULL) == -1) {
>   perror("pledge");
>   exit(1);
>   }
>
>   for (;;) {
>   FD_ZERO();
>   FD_SET(so, );
>   
>   sel = select(so + 1, , NULL, NULL, NULL);
>   switch (sel) {
>   case -1:
>   perror("select");
>   continue;
>   default:
>   break;
>   }
>
>   if (FD_ISSET(so, )) {
>   new = accept(so, (struct sockaddr *), );
>   if (new == -1) {
>   perror("accept");
>   continue;
>   }
>   
>   if (getpeereid(new, , ) == -1) {
>   perror("getpeereid");
>   close(new);
>   continue;
>   }
>
>
>   if ((euid != EFFECTIVE_PINGUSER) &&
>   (egid != EFFECTIVE_PINGGROUP)) {
>   close(new);
>   continue;
>   }
>
>   if ((l = recv(new, buf, sizeof(buf), 0)) == -1) {
>   close(new);
>   continue;
>   }
>   
>   buf[l] = '\0';
>
>   memset(, 0, sizeof(hints));
>   if (strchr(buf, '.') != NULL) {
>   hints.ai_family = AF_INET;
>   } else {
>   hints.ai_family = AF_INET6;
>   }
>
>   hints.ai_flags = AI_NUMERICHOST;
>
>   if ((error = getaddrinfo(buf,"53",,)) != 0) {
>   fprintf(stderr, "getaddrinfo: %s\n",
>   gai_strerror(error));
> 

Re: fw_update lock_db should exit when parent exits

2023-08-24 Thread Todd C . Miller
On Wed, 23 Aug 2023 18:23:40 -0700, Andrew Hewus Fresh wrote:

> I would have to see an example of doing that between ksh and perl.

Standard output should already be a pipe in the perl process by
virtue of running as a co-process.  In theory you should be able
to poll on it checking for POLLHUP.  Since our pipes are actually
bidiretional we can cheat and use select.  Something like this:

 - todd

lock_db() {
[ "${LOCKPID:-}" ] && return 0

# The installer doesn't have perl, so we can't lock there
[ -e /usr/bin/perl ] || return 0

set -o monitor
perl <<'EOL' |&
use v5.16;
use warnings;
no lib ('/usr/local/libdata/perl5/site_perl');
use OpenBSD::PackageInfo qw< lock_db >;

$|=1;

lock_db(0);

say $$;
my $rin = my $win = my $ein = '';
vec($ein, fileno(STDOUT), 1) = 1;
vec($rin, fileno(STDOUT), 1) = 1;
my $nfound = select(my $rout = $rin, my $wout = $win, my $eout 
= $ein, undef);
EOL
set +o monitor

read -rp LOCKPID

return 0
}



ping.c modifications proof of concept

2023-08-24 Thread Peter J. Philipp
Hi,

I have modified ping(8) to grab a raw descriptor from a daemon over AF_UNIX
sockets.  This seems to work.  While what I call the sun daemon needs to be
tightened a lot more it should work to make people understand my concept.

benefits:
we lose inet pledge
we lose the setuid to root bit
root can bypass this entirely so it works in single user mode
it can be broadened to other programs such as traceroute

drawbacks:
not fully tested
sund needs more tightening or there is a security problem
if sund dies ping doesn't work for regular users

Here is a demonstration:
pjp@polarstern$ ls -l /tmp/ping
-rwxr-xr-x  2 root  wheel  1442864 Aug 24 11:38 /tmp/ping
pjp@polarstern$ /tmp/ping -D -c 1 127.0.0.1
PING 127.0.0.1 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=0.073 ms

--- 127.0.0.1 ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.073/0.073/0.073/0.000 ms
pjp@polarstern$ /tmp/ping6 -D -c 1 centroid.eu
PING centroid.eu (2a03:6000:6f68:631::170): 56 data bytes
64 bytes from 2a03:6000:6f68:631::170: icmp_seq=0 hlim=54 time=31.059 ms

--- centroid.eu ping statistics ---
1 packets transmitted, 1 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 31.059/31.059/31.059/0.000 ms

Here is the sund.c code (needs improving):


#include 
#include 
#include 
#include 
#include 
#include 
#include 

#include 
#include 

#include 
#include 
#include 
#include 


#define EFFECTIVE_PINGUSER  1000
#define EFFECTIVE_PINGGROUP 1000

#define SUND_PATH   "/var/run/sund.sock"

void desc_write(int, int);

int
main(void)
{
int so, new, sel;
int raw, error;

struct addrinfo hints, *res;
struct sockaddr_un sun;
fd_set rset;

uid_t euid;
gid_t egid;

char buf[INET6_ADDRSTRLEN + 1];
socklen_t sunsz = sizeof(struct sockaddr_un);
size_t l;

unveil(SUND_PATH, "rwc");
unveil(NULL, NULL);

unlink(SUND_PATH);


so = socket(AF_UNIX, SOCK_STREAM, 0);
if (so < 0) {
perror("socket");
exit(1);
}

memset(, 0, sizeof(sun));
sun.sun_family = AF_UNIX;   
sun.sun_len = sizeof(struct sockaddr_un);

strlcpy(sun.sun_path, "/var/run/sund.sock", sizeof(sun.sun_path));

if (bind(so, (struct sockaddr *), sizeof(sun)) == -1) {
perror("bind");
exit(1);
}
chmod(SUND_PATH, 0666);

listen(so, 5);
daemon(0, 0);

if (pledge("stdio inet sendfd", NULL) == -1) {
perror("pledge");
exit(1);
}

for (;;) {
FD_ZERO();
FD_SET(so, );

sel = select(so + 1, , NULL, NULL, NULL);
switch (sel) {
case -1:
perror("select");
continue;
default:
break;
}

if (FD_ISSET(so, )) {
new = accept(so, (struct sockaddr *), );
if (new == -1) {
perror("accept");
continue;
}

if (getpeereid(new, , ) == -1) {
perror("getpeereid");
close(new);
continue;
}


if ((euid != EFFECTIVE_PINGUSER) &&
(egid != EFFECTIVE_PINGGROUP)) {
close(new);
continue;
}

if ((l = recv(new, buf, sizeof(buf), 0)) == -1) {
close(new);
continue;
}

buf[l] = '\0';

memset(, 0, sizeof(hints));
if (strchr(buf, '.') != NULL) {
hints.ai_family = AF_INET;
} else {
hints.ai_family = AF_INET6;
}

hints.ai_flags = AI_NUMERICHOST;

if ((error = getaddrinfo(buf,"53",,)) != 0) {
fprintf(stderr, "getaddrinfo: %s\n",
gai_strerror(error));
close(new);
continue;
}

if ((raw = socket(res->ai_family, SOCK_RAW, 
res->ai_family == AF_INET ? IPPROTO_ICMP :
IPPROTO_ICMPV6)) == -1) {

Re: [patch] netcat: support --crlf

2023-08-24 Thread Pietro Cerutti

On Aug 24 2023, 01:02 UTC, Damien Miller  wrote:

On Wed, 23 Aug 2023, Pietro Cerutti wrote:


Hi,

here at FreeBSD, we vendor in your netcat with a few local modifications.

I'm working on adding support to --crlf. I have a diff against the FreeBSD
version here: https://reviews.freebsd.org/D41489

I'd like this to be upstreamed. If there's consensus, I'll prepare a patch
against OpenBSD's version and send it over.


What is the motivation for this option beyond "Linux has it"?


The motivation is that several network protocols are line oriented with 
CRLF as line terminators. SMTP and HTTP are among the most popular.

I don't come from Linux so "Linux has it" was not among my motivations.

Correct me if I'm wrong but it seems trivial to do this conversion 
without writing more code by sticking tr in a pipe with nc.


Can you please provide an example? I can't see how to make tr convert 
LFs not preceeded by CRs to CRLFs.



OpenBSD's nc doesn't use getopt_long at present and I'm not sure there
would be appetite to do it for a single new flag. I note that nc on the
Debian machine I have at hand does -C but doesn't recognise --crlf.
IMO the long option therefore just adds incompatibility.


That's a fair point - I should have noticed that. If you want the 
feature, we can pick a new short option and I'm happy to make the 
FreeBSD version compatible with that.




[djm@dvm ~]$ nc --crlf
nc: invalid option -- '-'
usage: nc [-46CDdFhklNnrStUuvZz] [-I length] [-i interval] [-M ttl]
 [-m minttl] [-O length] [-P proxy_username] [-p source_port]
 [-q seconds] [-s sourceaddr] [-T keyword] [-V rtable] [-W recvlimit]
 [-w timeout] [-X proxy_protocol] [-x proxy_address[:port]]
 [destination] [port]
[djm@dvm ~]$ uname -a
Linux dvm 6.1.0-10-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.38-1 (2023-07-14) 
x86_64 GNU/Linux


FWIW, it works on RHEL 7.9

--
Pietro Cerutti
I have pledged to give 10% of income to effective charities
and invite you to join me - https://givingwhatwecan.org