ftp(1) double free

2020-06-20 Thread Jeremie Courreges-Anglas


Small bug, but still...

naddy@ reported a double free when interrupting (^C) ftp(1) at the end
or url_get().  If that happens, the SIGINT handler longjmps and the
cleanup path is taken a second time.  To avoid this, restore the
previous SIGINT handler in each possible error path.  I chose to restore
it earlier in the successful path to avoid too much duplication.
file_get() gets the same treatment.

Note that naddy hit this because of another bug causing ftp_close()
to stall, and this depends on the server you're talking to.  You can
insert a sleep(10); call before "return (rval);" to help reproduce the
crash.

ok?


Index: fetch.c
===
RCS file: /d/cvs/src/usr.bin/ftp/fetch.c,v
retrieving revision 1.195
diff -u -p -p -u -r1.195 fetch.c
--- fetch.c 20 Jun 2020 09:59:48 -  1.195
+++ fetch.c 21 Jun 2020 00:12:10 -
@@ -261,6 +261,7 @@ file_get(const char *path, const char *o
for (cp = buf; len > 0; len -= wlen, cp += wlen) {
if ((wlen = write(out, cp, len)) == -1) {
warn("Writing %s", savefile);
+   signal(SIGINT, oldintr);
signal(SIGINFO, oldinti);
goto cleanup_copy;
}
@@ -274,6 +275,7 @@ file_get(const char *path, const char *o
}
}
save_errno = errno;
+   signal(SIGINT, oldintr);
signal(SIGINFO, oldinti);
if (hash && !progress && bytes > 0) {
if (bytes < mark)
@@ -288,7 +290,6 @@ file_get(const char *path, const char *o
progressmeter(1, NULL);
if (verbose)
ptransfer(0);
-   (void)signal(SIGINT, oldintr);
 
rval = 0;
 
@@ -1032,6 +1033,7 @@ noslash:
oldinti = signal(SIGINFO, psummary);
if (chunked) {
error = save_chunked(fin, tls, out, buf, buflen);
+   signal(SIGINT, oldintr);
signal(SIGINFO, oldinti);
if (error == -1)
goto cleanup_url_get;
@@ -1041,6 +1043,7 @@ noslash:
for (cp = buf; len > 0; len -= wlen, cp += wlen) {
if ((wlen = write(out, cp, len)) == -1) {
warn("Writing %s", savefile);
+   signal(SIGINT, oldintr);
signal(SIGINFO, oldinti);
goto cleanup_url_get;
}
@@ -1054,6 +1057,7 @@ noslash:
}
}
save_errno = errno;
+   signal(SIGINT, oldintr);
signal(SIGINFO, oldinti);
if (hash && !progress && bytes > 0) {
if (bytes < mark)
@@ -1080,7 +1084,6 @@ noslash:
 
if (verbose)
ptransfer(0);
-   (void)signal(SIGINT, oldintr);
 
rval = 0;
goto cleanup_url_get;


-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF  DDCC 0DFA 74AE 1524 E7EE



Re: userland clock_gettime proof of concept

2020-06-20 Thread Christian Weisgerber
On 2020-06-20, Christian Weisgerber  wrote:

> I can't get this revision of the diff to work on amd64:
> * patch source
> * build and install kernel, reboot
> * make build
> * reboot -> "Process (pid 1) got signal 11"
>
> I'm at a loss.  As part of the "make build", the new libc is installed
> and dynamically linked programs should already be using the userland
> gettime calls.  Clearly this works.  So why does init fail on the
> next reboot?

I can recover by extracting ./sbin/init from a snapshot in the
installer.  After that, the system comes up fine in multiuser mode.
Nothing else appears to be affected, apart from init.

For a while, I had a reproducible situation.

When you call init(8) as a normal user in multiuser mode, it will
just exit with "init: Operation not permitted".  Instead it would
segfault!  I kept tweaking lib/libc/dlfcn/init.c, rebuilding and
reinstalling libc.a, rebuilding init, and watching it segfault.
None of the debug write(2)s I inserted would produce any output,
it seemed to die before ever reaching _libc_preinit().  I finally
ktraced it:

 12420 ktrace   RET   ktrace 0
 12420 ktrace   CALL  execve(0x7f7ec412,0x7f7ec298,0x7f7ec2a8)
 12420 ktrace   NAMI  "./obj/init"
 12420 ktrace   ARGS  
[0] = "./obj/init"
 12420 init RET   execve 0
 12420 init PSIG  SIGSEGV SIG_DFL code SEGV_MAPERR<1> addr=0x0 trapno=6
 12420 init NAMI  "init.core"

There's not even a kbind(2) there.

Then I removed the clearly useless debug write()s... and since then
I have a hard time reproducing the problem.

It doesn't make any sense.

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: userland clock_gettime proof of concept

2020-06-20 Thread Mark Kettenis
> From: Christian Weisgerber 
> Date: Sat, 20 Jun 2020 19:57:06 - (UTC)
> 
> On 2020-06-19, Paul Irofti  wrote:
> 
> > I have addressed your comments bellow, except for the CPU skew one. That
> > code disables TSC for all CPUs, not just for PRIMARY. Would you like to
> > walk and add code for every CPU to check the drift and then disable the
> > TSC? It seems a little too much...
> >
> > [diff]
> 
> I can't get this revision of the diff to work on amd64:
> * patch source
> * build and install kernel, reboot
> * make build
> * reboot -> "Process (pid 1) got signal 11"
> 
> I'm at a loss.  As part of the "make build", the new libc is installed
> and dynamically linked programs should already be using the userland
> gettime calls.  Clearly this works.  So why does init fail on the
> next reboot?

Do other static binaries work after your make build finishes?

Maybe this is because the _timekeep = NULL -> _timekeep change I
suggested that Paul added?



lfence for rdtsc

2020-06-20 Thread Mark Kettenis
RDTSC is not a serializing instruction; to make sure we get the TSC
value corresponding to the position of RDTSC in te instruction stream
we need a barrier.  Linux uses LFENCE on machines where it is
available.  FreeBSD seems to prefer MFENCE for AMD CPUs but uses
LFENCE for Intel CPUs.  For now my thinjing is that what's good enough
for Linux should be good enough for us.  And on amd64 LFENCE is always
available.

This diff reduces the scatter in the skew values.  Before I had
occasional outliers of more than 200 cycles.  Now the maximem values I see are 
around 60 cycles.

I din't changes the rdtsc() call that reads the timecounter.  But
maybe that one should change as well?  Bit of a tradeof between
performance and accoracy I think.

This also changes the skew print message (stolen from what Theo put in
snaps).  Printing the CPU number makes it easier to get statistics for
a specific CPU.  Diff also enabled the debug message.  Maybe it should
be committed this way and then disabled again later such that we can
get some statistics?

comments?  ok?


Index: arch/amd64/amd64/tsc.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
retrieving revision 1.16
diff -u -p -r1.16 tsc.c
--- arch/amd64/amd64/tsc.c  6 Apr 2020 00:01:08 -   1.16
+++ arch/amd64/amd64/tsc.c  20 Jun 2020 20:01:46 -
@@ -100,9 +100,9 @@ get_tsc_and_timecount(struct timecounter
int i;
 
for (i = 0; i < RECALIBRATE_MAX_RETRIES; i++) {
-   tsc1 = rdtsc();
+   tsc1 = rdtsc_lfence();
n = (tc->tc_get_timecount(tc) & tc->tc_counter_mask);
-   tsc2 = rdtsc();
+   tsc2 = rdtsc_lfence();
 
if ((tsc2 - tsc1) < RECALIBRATE_SMI_THRESHOLD) {
*count = n;
@@ -216,8 +216,9 @@ tsc_get_timecount(struct timecounter *tc
 void
 tsc_timecounter_init(struct cpu_info *ci, uint64_t cpufreq)
 {
+#define TSC_DEBUG
 #ifdef TSC_DEBUG
-   printf("%s: TSC skew=%lld observed drift=%lld\n", __func__,
+   printf("%s: TSC skew=%lld observed drift=%lld\n", ci->ci_dev->dv_xname,
(long long)ci->ci_tsc_skew, (long long)tsc_drift_observed);
 #endif
 
@@ -276,12 +277,12 @@ tsc_read_bp(struct cpu_info *ci, uint64_
 
/* Flag it and read our TSC. */
atomic_setbits_int(>ci_flags, CPUF_SYNCTSC);
-   bptsc = (rdtsc() >> 1);
+   bptsc = (rdtsc_lfence() >> 1);
 
/* Wait for remote to complete, and read ours again. */
while ((ci->ci_flags & CPUF_SYNCTSC) != 0)
membar_consumer();
-   bptsc += (rdtsc() >> 1);
+   bptsc += (rdtsc_lfence() >> 1);
 
/* Wait for the results to come in. */
while (tsc_sync_cpu == ci)
@@ -317,11 +318,11 @@ tsc_post_ap(struct cpu_info *ci)
/* Wait for go-ahead from primary. */
while ((ci->ci_flags & CPUF_SYNCTSC) == 0)
membar_consumer();
-   tsc = (rdtsc() >> 1);
+   tsc = (rdtsc_lfence() >> 1);
 
/* Instruct primary to read its counter. */
atomic_clearbits_int(>ci_flags, CPUF_SYNCTSC);
-   tsc += (rdtsc() >> 1);
+   tsc += (rdtsc_lfence() >> 1);
 
/* Post result.  Ensure the whole value goes out atomically. */
(void)atomic_swap_64(_sync_val, tsc);
Index: arch/amd64/include/cpufunc.h
===
RCS file: /cvs/src/sys/arch/amd64/include/cpufunc.h,v
retrieving revision 1.34
diff -u -p -r1.34 cpufunc.h
--- arch/amd64/include/cpufunc.h28 Jun 2019 21:54:05 -  1.34
+++ arch/amd64/include/cpufunc.h20 Jun 2020 20:01:46 -
@@ -292,6 +292,15 @@ rdtsc(void)
 }
 
 static __inline u_int64_t
+rdtsc_lfence(void)
+{
+   uint32_t hi, lo;
+
+   __asm volatile("lfence; rdtsc" : "=d" (hi), "=a" (lo));
+   return (((uint64_t)hi << 32) | (uint64_t) lo);
+}
+
+static __inline u_int64_t
 rdpmc(u_int pmc)
 {
uint32_t hi, lo;



Re: userland clock_gettime proof of concept

2020-06-20 Thread Christian Weisgerber
On 2020-06-19, Paul Irofti  wrote:

> I have addressed your comments bellow, except for the CPU skew one. That
> code disables TSC for all CPUs, not just for PRIMARY. Would you like to
> walk and add code for every CPU to check the drift and then disable the
> TSC? It seems a little too much...
>
> [diff]

I can't get this revision of the diff to work on amd64:
* patch source
* build and install kernel, reboot
* make build
* reboot -> "Process (pid 1) got signal 11"

I'm at a loss.  As part of the "make build", the new libc is installed
and dynamically linked programs should already be using the userland
gettime calls.  Clearly this works.  So why does init fail on the
next reboot?

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: bcmtmon - Broadcom BCM2711 temperature monitor (RPI4)

2020-06-20 Thread Mark Kettenis
> From: Al Poole 
> Date: Sat, 13 Jun 2020 20:33:11 +0100
> 
> Hi,
> 
> This introduces temperature driver for the Raspberry PI 4 named bcmtmon.
> 
> bcmtmon because of bcmtemp(4) and (!strncmp("bcmt", snsrdev.xname, 4))
> 
> @kettenis has been advising me and answering questions.

Thanks,

I made some small changes such that the driver also works with the
device tree that comes with the Raspberry Pi firmware packages that we
distribute.  I also renamed the file to match the driver name.

Cheers,

Mark



Re: pppx(4): remove panics

2020-06-20 Thread Vitaliy Makkoveev
yasuoka@ privately pointed I was wrong about KASSERT(). I fixed my
diff. These checks are useless and I guess they make understanding
harder so I removed panic() from pppx_if_destoy() too.

Index: sys/net/if_pppx.c
===
RCS file: /cvs/src/sys/net/if_pppx.c,v
retrieving revision 1.88
diff -u -p -r1.88 if_pppx.c
--- sys/net/if_pppx.c   18 Jun 2020 14:20:12 -  1.88
+++ sys/net/if_pppx.c   20 Jun 2020 11:32:58 -
@@ -696,14 +696,10 @@ pppx_add_session(struct pppx_dev *pxd, s
pxi->pxi_key.pxik_protocol = req->pr_protocol;
pxi->pxi_dev = pxd;
 
-   /* this is safe without splnet since we're not modifying it */
-   if (RBT_FIND(pppx_ifs, _ifs, pxi) != NULL) {
+   if (RBT_INSERT(pppx_ifs, _ifs, pxi) != NULL) {
error = EADDRINUSE;
goto out;
}
-
-   if (RBT_INSERT(pppx_ifs, _ifs, pxi) != NULL)
-   panic("%s: pppx_ifs modified while lock was held", __func__);
LIST_INSERT_HEAD(>pxd_pxis, pxi, pxi_list);
 
snprintf(ifp->if_xname, sizeof(ifp->if_xname), "%s%d", "pppx", unit);
@@ -776,8 +772,7 @@ pppx_add_session(struct pppx_dev *pxd, s
return (error);
 
 remove:
-   if (RBT_REMOVE(pppx_ifs, _ifs, pxi) == NULL)
-   panic("%s: pppx_ifs modified while lock was held", __func__);
+   RBT_REMOVE(pppx_ifs, _ifs, pxi);
LIST_REMOVE(pxi, pxi_list);
 out:
pool_put(pppx_if_pl, pxi);
@@ -870,8 +865,7 @@ pppx_if_destroy(struct pppx_dev *pxd, st
NET_LOCK();
 
pipex_rele_session(session);
-   if (RBT_REMOVE(pppx_ifs, _ifs, pxi) == NULL)
-   panic("%s: pppx_ifs modified while lock was held", __func__);
+   RBT_REMOVE(pppx_ifs, _ifs, pxi);
LIST_REMOVE(pxi, pxi_list);
 
pool_put(pppx_if_pl, pxi);



Re: Retire

2020-06-20 Thread Christian Weisgerber
On 2020-06-19, "Theo de Raadt"  wrote:

> Well... they something in ports might still look at them in 
>
> Can someone from ports speak about this?

I have started an amd64 bulk build without .

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



Re: ldpd engine process exits with pledge "cpath"

2020-06-20 Thread wlund
Hi!

Ricardo Mestre  wrote:
> this is getting in my nerves, I made a wrong assumption here and I
> prefer to commit this than backout my previous commit so I'm asking for
> commits for the below.

attached. I hope I understood your intention.

> 
> On 14:43 Fri 19 Jun , Ricardo Mestre wrote:
> > mea culpa, but I'd rather just remove the unlink of the socket.
> > 
> > OK?
> > 
> > Index: control.c
> > ===
> > RCS file: /cvs/src/usr.sbin/ldpd/control.c,v
> > retrieving revision 1.29
> > diff -u -p -u -r1.29 control.c
> > --- control.c   3 Mar 2017 23:30:57 -   1.29
> > +++ control.c   19 Jun 2020 13:40:46 -
> > @@ -98,11 +98,10 @@ control_listen(void)
> >  }
> >  
> >  void
> > -control_cleanup(char *path)
> > +control_cleanup(void)
> >  {
> > accept_del(control_fd);
> > close(control_fd);
> > -   unlink(path);
> >  }
> >  
> >  /* ARGSUSED */
> > Index: control.h
> > ===
> > RCS file: /cvs/src/usr.sbin/ldpd/control.h,v
> > retrieving revision 1.9
> > diff -u -p -u -r1.9 control.h
> > --- control.h   3 Mar 2017 23:30:57 -   1.9
> > +++ control.h   19 Jun 2020 13:40:46 -
> > @@ -32,7 +32,7 @@ extern struct ctl_conns ctl_conns;
> >  
> >  intcontrol_init(char *);
> >  intcontrol_listen(void);
> > -void   control_cleanup(char *);
> > +void   control_cleanup(void);
> >  intcontrol_imsg_relay(struct imsg *);
> >  
> >  #endif /* _CONTROL_H_ */
> > Index: ldpe.c
> > ===
> > RCS file: /cvs/src/usr.sbin/ldpd/ldpe.c,v
> > retrieving revision 1.76
> > diff -u -p -u -r1.76 ldpe.c
> > --- ldpe.c  10 Aug 2019 01:30:53 -  1.76
> > +++ ldpe.c  19 Jun 2020 13:40:46 -
> > @@ -171,7 +171,7 @@ ldpe_shutdown(void)
> > msgbuf_clear(_main->ibuf.w);
> > close(iev_main->ibuf.fd);
> >  
> > -   control_cleanup(global.csock);
> > +   control_cleanup();
> > config_clear(leconf);
> >  
> > if (sysdep.no_pfkey == 0) {
> > 


diff refs/heads/master 1544d4b6d260a07402505aca7822cc1399dacf31
blob - fe0eebdbea6bc8a5090782a5cd91f4c803651caf
blob + 071953e26373fd816177e39876fa1065f7c32835
--- usr.sbin/ldpd/control.c
+++ usr.sbin/ldpd/control.c
@@ -98,11 +98,10 @@ control_listen(void)
 }
 
 void
-control_cleanup(char *path)
+control_cleanup(void)
 {
accept_del(control_fd);
close(control_fd);
-   unlink(path);
 }
 
 /* ARGSUSED */
blob - f4f5525707d23ffb059e229177b0840fc1e31230
blob + 7e74ac093d7e8b16e7b9ea19362d3116d3edf2e9
--- usr.sbin/ldpd/control.h
+++ usr.sbin/ldpd/control.h
@@ -32,7 +32,7 @@ extern struct ctl_conns ctl_conns;
 
 intcontrol_init(char *);
 intcontrol_listen(void);
-void   control_cleanup(char *);
+void   control_cleanup(void);
 intcontrol_imsg_relay(struct imsg *);
 
 #endif /* _CONTROL_H_ */
blob - 6200d552882ca678e391c59922ccf358439ddac4
blob + a8d2a8a9c842d045588defa9a53dcb1caeda765a
--- usr.sbin/ldpd/ldpe.c
+++ usr.sbin/ldpd/ldpe.c
@@ -171,7 +171,7 @@ ldpe_shutdown(void)
msgbuf_clear(_main->ibuf.w);
close(iev_main->ibuf.fd);
 
-   control_cleanup(global.csock);
+   control_cleanup();
config_clear(leconf);
 
if (sysdep.no_pfkey == 0) {



Re: userland clock_gettime proof of concept

2020-06-20 Thread Mark Kettenis
> From: Christian Weisgerber 
> Date: Sat, 20 Jun 2020 01:20:33 - (UTC)
> 
> On 2020-06-19, Mark Kettenis  wrote:
> 
> > I'm talking about *skew*, not drift.  If there is a significant drift
> > you already knock out the TSC.
> >
> > What's needed is:
> >
> > 1. A bit of research of what an acceptable skew is.  My hypothesis is
> >that on many machines with a single socket the TSCs are actually in
> >synch.  But the way we measure the skew isn't 100% accurate so we
> >still get a small skew.  If we sample these values on a couple of
> >machines across a couple of reboots we can probably tell what the
> >uncertainty in the measurement of the skew is and define a cutoff
> >based on that.
> 
> So we need amd64 snapshots to enable TSC_DEBUG, maybe a bit prettier
> like below, and then reports:
> 
> cpu0: Intel(R) Xeon(R) CPU E3-1225 v3 @ 3.20GHz, 3392.69 MHz, 06-3c-03
> 
> cpu0: TSC skew=0 observed drift=0
> cpu1: TSC skew=-24 observed drift=0
> cpu2: TSC skew=-27 observed drift=0
> cpu3: TSC skew=-25 observed drift=0
> 
> cpu0: TSC skew=0 observed drift=0
> cpu1: TSC skew=-1 observed drift=0
> cpu2: TSC skew=0 observed drift=0
> cpu3: TSC skew=25 observed drift=0
> 
> cpu0: TSC skew=0 observed drift=0
> cpu1: TSC skew=-30 observed drift=0
> cpu2: TSC skew=-39 observed drift=0
> cpu3: TSC skew=-41 observed drift=0
> 
> cpu0: TSC skew=0 observed drift=0
> cpu1: TSC skew=-31 observed drift=0
> cpu2: TSC skew=-37 observed drift=0
> cpu3: TSC skew=-39 observed drift=0
> 
> cpu0: TSC skew=0 observed drift=0
> cpu1: TSC skew=-31 observed drift=0
> cpu2: TSC skew=-34 observed drift=0
> cpu3: TSC skew=-23 observed drift=0

Yes, enabling that for a bit would help us get an idea.  We probably
should have it print the TSC frequency as well though.

> Index: sys/arch/amd64/amd64/tsc.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/amd64/tsc.c,v
> retrieving revision 1.16
> diff -u -p -r1.16 tsc.c
> --- sys/arch/amd64/amd64/tsc.c6 Apr 2020 00:01:08 -   1.16
> +++ sys/arch/amd64/amd64/tsc.c19 Jun 2020 23:49:06 -
> @@ -217,7 +217,7 @@ void
>  tsc_timecounter_init(struct cpu_info *ci, uint64_t cpufreq)
>  {
>  #ifdef TSC_DEBUG
> - printf("%s: TSC skew=%lld observed drift=%lld\n", __func__,
> + printf("%s: TSC skew=%lld observed drift=%lld\n", ci->ci_dev->dv_xname,
>   (long long)ci->ci_tsc_skew, (long long)tsc_drift_observed);
>  #endif
>  
> -- 
> Christian "naddy" Weisgerber  na...@mips.inka.de
> 
> 



Re: ldpd engine process exits with pledge "cpath"

2020-06-20 Thread Remi Locherer
On Fri, Jun 19, 2020 at 02:43:00PM +0100, Ricardo Mestre wrote:
> mea culpa, but I'd rather just remove the unlink of the socket.
> 
> OK?

Diff reads OK to me.

We had the same discussion in 2018 for ripd:
https://marc.info/?l=openbsd-tech=154101413029926=2

Note to self: ospfd should get the same treatment ...

> 
> Index: control.c
> ===
> RCS file: /cvs/src/usr.sbin/ldpd/control.c,v
> retrieving revision 1.29
> diff -u -p -u -r1.29 control.c
> --- control.c 3 Mar 2017 23:30:57 -   1.29
> +++ control.c 19 Jun 2020 13:40:46 -
> @@ -98,11 +98,10 @@ control_listen(void)
>  }
>  
>  void
> -control_cleanup(char *path)
> +control_cleanup(void)
>  {
>   accept_del(control_fd);
>   close(control_fd);
> - unlink(path);
>  }
>  
>  /* ARGSUSED */
> Index: control.h
> ===
> RCS file: /cvs/src/usr.sbin/ldpd/control.h,v
> retrieving revision 1.9
> diff -u -p -u -r1.9 control.h
> --- control.h 3 Mar 2017 23:30:57 -   1.9
> +++ control.h 19 Jun 2020 13:40:46 -
> @@ -32,7 +32,7 @@ extern struct ctl_conns ctl_conns;
>  
>  int  control_init(char *);
>  int  control_listen(void);
> -void control_cleanup(char *);
> +void control_cleanup(void);
>  int  control_imsg_relay(struct imsg *);
>  
>  #endif   /* _CONTROL_H_ */
> Index: ldpe.c
> ===
> RCS file: /cvs/src/usr.sbin/ldpd/ldpe.c,v
> retrieving revision 1.76
> diff -u -p -u -r1.76 ldpe.c
> --- ldpe.c10 Aug 2019 01:30:53 -  1.76
> +++ ldpe.c19 Jun 2020 13:40:46 -
> @@ -171,7 +171,7 @@ ldpe_shutdown(void)
>   msgbuf_clear(_main->ibuf.w);
>   close(iev_main->ibuf.fd);
>  
> - control_cleanup(global.csock);
> + control_cleanup();
>   config_clear(leconf);
>  
>   if (sysdep.no_pfkey == 0) {
>