Behaviour of fsync() in case of write-back errors

2018-04-12 Thread Thomas Munro
Hello,

I work on PostgreSQL.  I don't use OpenBSD, but recently I've been
investigating how fsync() reports write-back errors on all operating
systems that people like to run PostgreSQL on:

https://wiki.postgresql.org/wiki/Fsync_Errors

It seems to me that on OpenBSD, asynchronous write-back errors might
not be reported to userspace in a subsequent call to fsync(), and
synchronous write-back errors that are reported to userspace might not
be reported in a follow-up call to fsync() (that is, retrying will
appear to be successful but in fact your data is gone).

Am I wrong?  Perhaps some other code elsewhere will record the error
at device, filesystem or inode level?  I didn't try to test this: I
simply compared the brelse() error handling code with that of FreeBSD
and NetBSD whose behaviours are known to be correct and incorrect
respectively, according to our assessment of what fsync() *should* do.
(Or at least the behaviour that PostgreSQL relies on, when it reports
that your data exists on disk as part of its checkpoint protocol).
OpenBSD certainly appears to be like NetBSD in this respect, so I
thought it was worth pinging your list and asking for an expert
opinion.

Do you think that my suspicion is correct?  Would you consider that to
be a bug?

Thanks!

Thomas Munro



Move OpenBSD/armv7 to softfp float ABI

2018-04-12 Thread Mark Kettenis
The diff below moves armv7 to the so-called softfp float ABI.  This is
the same ABI as we have now, except that it allows the compiler to
generate hardware floating-point instructions.  Floating-point
function arguments and return values are still passed in the integer
registers though.

This is unlikely to have any significant performance impact.  Some
code may even run slower since floating-point arguments have to be
moved into floating-point registers before hardware floating-point
instructions can be used.  But it is a useful step towards switching
to the hardfp float ABI.

This probably should wait a little bit until the dust on the clang6
switch has settled.  But don't hesitate to test this or give me ok's.


Index: gnu/llvm/tools/clang/lib/Driver/ToolChains/Arch/ARM.cpp
===
RCS file: /cvs/src/gnu/llvm/tools/clang/lib/Driver/ToolChains/Arch/ARM.cpp,v
retrieving revision 1.1.1.2
diff -u -p -r1.1.1.2 ARM.cpp
--- gnu/llvm/tools/clang/lib/Driver/ToolChains/Arch/ARM.cpp 6 Apr 2018 
14:26:32 -   1.1.1.2
+++ gnu/llvm/tools/clang/lib/Driver/ToolChains/Arch/ARM.cpp 12 Apr 2018 
20:56:44 -
@@ -232,7 +232,7 @@ arm::FloatABI arm::getARMFloatABI(const 
   break;
 
 case llvm::Triple::OpenBSD:
-  ABI = FloatABI::Soft;
+  ABI = FloatABI::SoftFP;
   break;
 
 default:



Re: Stop ping telling world its pid

2018-04-12 Thread Miod Vallat
> what we need is an eventually consistent 16 bit number microservice message
> bus that all ping processes can subscribe to.

And it should be available as early as possible during boot, so I think
its place is in init(8).



Re: vput(9) for NFS

2018-04-12 Thread Alexander Bluhm
On Wed, Apr 11, 2018 at 12:34:03PM +0200, Martin Pieuchot wrote:
> @@ -769,6 +769,7 @@ nfs_lookup(void *v)
>   flags = cnp->cn_flags;
>  
>   *vpp = NULLVP;
> + newvp = NULLVP;
>   if ((flags & ISLASTCN) && (dvp->v_mount->mnt_flag & MNT_RDONLY) &&
>   (cnp->cn_nameiop == DELETE || cnp->cn_nameiop == RENAME))
>   return (EROFS);

I dont see why this is necessary, but it does not hurt either.

> @@ -1545,9 +1545,9 @@ nfs_remove(void *v)
>  int
>  nfs_removeit(struct sillyrename *sp)
>  {
> + KASSERT(VOP_ISLOCKED(sp->s_dvp));
>   /*
>* Make sure that the directory vnode is still valid.
> -  * XXX we should lock sp->s_dvp here.

With the assert removed it passes regress, otherwise it panics.

bluhm



Re: diff for usr.bin/mg/fileio.c

2018-04-12 Thread Han Boetes
Yes this also works for me.


On Thu, Apr 12, 2018 at 4:52 PM, Florian Obser  wrote:

> On Tue, Apr 10, 2018 at 06:49:10PM +0200, Han Boetes wrote:
> > I got a problem report from Mark Willson:
> >
> > "I recently installed mg (via the Debian package) under WSL on
> Windows
> > 10.
> > I found that the 'backup-to-home-directory' option didn't work.
> >
> > The cause of this appears to be that getlogin under WSL returns NULL,
> > probably due to my use of wsltty to invoke bash.  The patch below
> fixes
> > the issue for me:"
> >
> > [snip]
> > -   if ((un = getlogin()) != NULL)
> > +   if ((un = getenv("LOGNAME")) != NULL)
> > [snip]
> >
> > Which put me onto the track of what was going on. I found the
> > following in the Linux manpage:
> >
> > BUGS
> >Unfortunately, it is often rather easy to fool  getlogin().
> >  Sometimes
> >it  does not work at all, because some program messed up the utmp
> > file.
> >Often, it gives only the first 8 characters of  the  login  name.
> >  The
> >user  currently  logged  in  on the controlling terminal of our
> > program
> >need not be the user who started it.  Avoid  getlogin()  for
> > security-
> >related purposes.
> >
> >Note  that glibc does not follow the POSIX specification and uses
> > stdin
> >instead of /dev/tty.  A bug.  (Other recent systems, like SunOS
> 5.8
> > and
> >HP-UX  11.11  and FreeBSD 4.8 all return the login name also when
> > stdin
> >is redirected.)
> >
> >Nobody knows precisely what cuserid() does; avoid it in  portable
> > pro‐
> >grams.   Or  avoid  it  altogether: use getpwuid(geteuid())
> instead,
> > if
> >that is what you meant.  Do not use cuserid().
> >
> > So I started looking at the code and rewrote it a bit, which I think
> > makes it more portable and removes a syscall in the process. I do
> > suspect this can be written even more elegantly, but didn't want to
> > rework the code too much.
> >
> > I also took the liberty to remove some whitespace.
> >
> >
> > Index: fileio.c
> > ===
> > RCS file: /cvs/src/usr.bin/mg/fileio.c,v
> > retrieving revision 1.104
> > @@ -703,7 +706,7 @@ expandtilde(const char *fn)
> > struct stat  statbuf;
> > const char  *cp;
> > char user[LOGIN_NAME_MAX], path[NFILEN];
> > -   char*un, *ret;
> > +   char*ret;
> > size_t   ulen, plen;
> >
> > path[0] = '\0';
> > @@ -722,21 +725,21 @@ expandtilde(const char *fn)
> > return (NULL);
> > return(ret);
> > }
> > +   pw = getpwuid(geteuid());
> > if (ulen == 0) { /* ~/ or ~ */
> > -   if ((un = getlogin()) != NULL)
> > -   (void)strlcpy(user, un, sizeof(user));
> > +   if (pw != NULL)
> > +   (void)strlcpy(user, pw->pw_name, sizeof(user));
> > else
> > user[0] = '\0';
> > } else { /* ~user/ or ~user */
> > memcpy(user, &fn[1], ulen);
> > user[ulen] = '\0';
> > }
> > -   pw = getpwnam(user);
> > if (pw != NULL) {
> > plen = strlcpy(path, pw->pw_dir, sizeof(path));
> > if (plen == 0 || path[plen - 1] != '/') {
> > if (strlcat(path, "/", sizeof(path)) >=
> > sizeof(path)) {
> > -   dobeep();
> > +   dobeep();
> > ewprintf("Path too long");
> > return (NULL);
> > }
>
> not quite, you leave pw unitialized in the else part.
>
> Lucas Gabriel Vuotto came up with a similar patch (to a different problem)
> back in 2017:
> https://marc.info/?l=openbsd-tech&m=149521389822841&w=2
>
> Which I subsequently slacked on, sorry!
>
> Here is a slightly tweaked version of Lucas' diff:
> - removed now unused un variable
> - geteuid() instead of getuid()
>
> Han, Lucas, does this work for you?
>
> diff --git fileio.c fileio.c
> index 0987f6f30de..339088f5e2d 100644
> --- fileio.c
> +++ fileio.c
> @@ -703,7 +703,7 @@ expandtilde(const char *fn)
> struct stat  statbuf;
> const char  *cp;
> char user[LOGIN_NAME_MAX], path[NFILEN];
> -   char*un, *ret;
> +   char*ret;
> size_t   ulen, plen;
>
> path[0] = '\0';
> @@ -722,16 +722,13 @@ expandtilde(const char *fn)
> return (NULL);
> return(ret);
> }
> -   if (ulen == 0) { /* ~/ or ~ */
> -   if ((un = getlogin()) != NULL)
> -   (void)strlcpy(user, un, sizeof(user));
> -   else
> -   user[0] = '\0';
> -

Re: ksh: fix buffer overflow in u64ton

2018-04-12 Thread Scott Cheloha
On Mon, Apr 09, 2018 at 08:56:28PM +0200, Tobias Stoeckmann wrote:
> As tb@ pointed out, u64ton can overflow on ULONG_MAX. It could also
> happen on systems with 64 bit int and INT_MIN, although we don't have
> such a system supported by our code base.
> 
> You can reach the u64ton function by printing the length of a string
> within a variable like this:
> 
> $ a=string
> $ echo ${#a}
> 6
> $ _
> 
> Technically there is no need to use BITS(uint64_t) because 20+2 would
> be enough. But it seems that there is a strong preference towards
> going "long long" instead of int64_t due to return types of time()
> and strtonum() as well as a highly theoretical support of larger types
> than 64 bit in long long. Although that probably will never happen.

"Never" is an awful long time...

> So, if we go with a data type like long long, I would prefer to have
> the code prepared to scale its buffer with the data type, therefore
> BITS(). And let's be honest -- nobody will notice this waste of extra
> 44 bytes.

Depending on the compiler, it might have been 64-byte aligned anyway,
so we might just be making use of space we "had" anyway.

Ditto what tb@ said about dropping the hex numerals from the loop,
otherwise this is ok with me.

... though I would like to circle back and modify this function again
after you're done with your conversions.  Given the callers remaining
for u64ton() (nee ulton()), the second argument seems like a hack.

Could it eventually look like this?

long long
llton(long long n)
{
static char buf [BITS(uint64_t) + 2];
char *p;
int negative;

negative = (n < 0) ? 1 : 0;
p = &buf[sizeof(buf)];
*--p = '\0';
do {
*--p = "0123456789"[n % 10];
n /= 10;
} while (n != 0);
if (negative)
*--p = '-';
return p;
}

I'm even tempted to just snprintf(3) the buffer and be done with it,
but I'm reluctant to advocate deoptimizing this function, given that
it has worked fine since import.

-Scott

> Tobias
> 
> Index: eval.c
> ===
> RCS file: /cvs/src/bin/ksh/eval.c,v
> retrieving revision 1.60
> diff -u -p -u -p -r1.60 eval.c
> --- eval.c9 Apr 2018 17:53:36 -   1.60
> +++ eval.c9 Apr 2018 18:47:45 -
> @@ -732,7 +732,7 @@ varsub(Expand *xp, char *sp, char *word,
>   if (Flag(FNOUNSET) && c == 0 && !zero_ok)
>   errorf("%s: parameter not set", sp);
>   *stypep = 0; /* unqualified variable/string substitution */
> - xp->str = str_save(u64ton((uint64_t)c, 10), ATEMP);
> + xp->str = str_save(u64ton((uint64_t)c, '\0'), ATEMP);
>   return XSUB;
>   }
>  
> Index: misc.c
> ===
> RCS file: /cvs/src/bin/ksh/misc.c,v
> retrieving revision 1.70
> diff -u -p -u -p -r1.70 misc.c
> --- misc.c9 Apr 2018 17:53:36 -   1.70
> +++ misc.c9 Apr 2018 18:47:45 -
> @@ -56,20 +56,22 @@ initctypes(void)
>   setctypes(" \n\t\"#$&'()*;<>?[\\`|", C_QUOTE);
>  }
>  
> -/* convert uint64_t to base N string */
> +/* convert uint64_t N to a base 10 number as string with optional PREFIX */
>  
>  char *
> -u64ton(uint64_t n, int base)
> +u64ton(uint64_t n, char prefix)
>  {
>   char *p;
> - static char buf [20];
> + static char buf [BITS(uint64_t) + 2];
>  
>   p = &buf[sizeof(buf)];
>   *--p = '\0';
>   do {
> - *--p = "0123456789ABCDEF"[n%base];
> - n /= base;
> + *--p = "0123456789ABCDEF"[n % 10];
> + n /= 10;
>   } while (n != 0);
> + if (prefix != '\0')
> + *--p = prefix;
>   return p;
>  }
>  
> Index: sh.h
> ===
> RCS file: /cvs/src/bin/ksh/sh.h,v
> retrieving revision 1.72
> diff -u -p -u -p -r1.72 sh.h
> --- sh.h  9 Apr 2018 17:53:36 -   1.72
> +++ sh.h  9 Apr 2018 18:47:45 -
> @@ -527,7 +527,7 @@ void  cleanup_proc_env(void);
>  /* misc.c */
>  void setctypes(const char *, int);
>  void initctypes(void);
> -char *   u64ton(uint64_t, int);
> +char *   u64ton(uint64_t, char);
>  char *   str_save(const char *, Area *);
>  char *   str_nsave(const char *, int, Area *);
>  int  option(const char *);
> Index: tree.c
> ===
> RCS file: /cvs/src/bin/ksh/tree.c,v
> retrieving revision 1.34
> diff -u -p -u -p -r1.34 tree.c
> --- tree.c9 Apr 2018 17:53:36 -   1.34
> +++ tree.c9 Apr 2018 18:47:45 -
> @@ -376,9 +376,7 @@ vfptreef(struct shf *shf, int indent, co
>   case 'd': /* decimal */
>   n = va_arg(va, int);
>   neg = n < 0;
> - p = u64ton(neg ? -n : n, 10);
> -   

Re: Stop ping telling world its pid

2018-04-12 Thread Ted Unangst
Job Snijders wrote:
> I’m not great at math, with a 16 bit random value, wouldn’t we start
> running into ID collisions around 256 concurrent ping processes? Perhaps
> that already is the case today and this patch does nothing to improve that,
> but also doesn’t make it worse.

what we need is an eventually consistent 16 bit number microservice message
bus that all ping processes can subscribe to.



Re: diff for usr.bin/mg/fileio.c

2018-04-12 Thread Florian Obser
On Tue, Apr 10, 2018 at 06:49:10PM +0200, Han Boetes wrote:
> I got a problem report from Mark Willson:
> 
> "I recently installed mg (via the Debian package) under WSL on Windows
> 10.
> I found that the 'backup-to-home-directory' option didn't work.
> 
> The cause of this appears to be that getlogin under WSL returns NULL,
> probably due to my use of wsltty to invoke bash.  The patch below fixes
> the issue for me:"
> 
> [snip]
> -   if ((un = getlogin()) != NULL)
> +   if ((un = getenv("LOGNAME")) != NULL)
> [snip]
> 
> Which put me onto the track of what was going on. I found the
> following in the Linux manpage:
> 
> BUGS
>Unfortunately, it is often rather easy to fool  getlogin().
>  Sometimes
>it  does not work at all, because some program messed up the utmp
> file.
>Often, it gives only the first 8 characters of  the  login  name.
>  The
>user  currently  logged  in  on the controlling terminal of our
> program
>need not be the user who started it.  Avoid  getlogin()  for
> security-
>related purposes.
> 
>Note  that glibc does not follow the POSIX specification and uses
> stdin
>instead of /dev/tty.  A bug.  (Other recent systems, like SunOS 5.8
> and
>HP-UX  11.11  and FreeBSD 4.8 all return the login name also when
> stdin
>is redirected.)
> 
>Nobody knows precisely what cuserid() does; avoid it in  portable
> pro‐
>grams.   Or  avoid  it  altogether: use getpwuid(geteuid()) instead,
> if
>that is what you meant.  Do not use cuserid().
> 
> So I started looking at the code and rewrote it a bit, which I think
> makes it more portable and removes a syscall in the process. I do
> suspect this can be written even more elegantly, but didn't want to
> rework the code too much.
> 
> I also took the liberty to remove some whitespace.
> 
> 
> Index: fileio.c
> ===
> RCS file: /cvs/src/usr.bin/mg/fileio.c,v
> retrieving revision 1.104
> @@ -703,7 +706,7 @@ expandtilde(const char *fn)
> struct stat  statbuf;
> const char  *cp;
> char user[LOGIN_NAME_MAX], path[NFILEN];
> -   char*un, *ret;
> +   char*ret;
> size_t   ulen, plen;
> 
> path[0] = '\0';
> @@ -722,21 +725,21 @@ expandtilde(const char *fn)
> return (NULL);
> return(ret);
> }
> +   pw = getpwuid(geteuid());
> if (ulen == 0) { /* ~/ or ~ */
> -   if ((un = getlogin()) != NULL)
> -   (void)strlcpy(user, un, sizeof(user));
> +   if (pw != NULL)
> +   (void)strlcpy(user, pw->pw_name, sizeof(user));
> else
> user[0] = '\0';
> } else { /* ~user/ or ~user */
> memcpy(user, &fn[1], ulen);
> user[ulen] = '\0';
> }
> -   pw = getpwnam(user);
> if (pw != NULL) {
> plen = strlcpy(path, pw->pw_dir, sizeof(path));
> if (plen == 0 || path[plen - 1] != '/') {
> if (strlcat(path, "/", sizeof(path)) >=
> sizeof(path)) {
> -   dobeep();
> +   dobeep();
> ewprintf("Path too long");
> return (NULL);
> }

not quite, you leave pw unitialized in the else part.

Lucas Gabriel Vuotto came up with a similar patch (to a different problem) back 
in 2017:
https://marc.info/?l=openbsd-tech&m=149521389822841&w=2

Which I subsequently slacked on, sorry!

Here is a slightly tweaked version of Lucas' diff:
- removed now unused un variable
- geteuid() instead of getuid()

Han, Lucas, does this work for you?

diff --git fileio.c fileio.c
index 0987f6f30de..339088f5e2d 100644
--- fileio.c
+++ fileio.c
@@ -703,7 +703,7 @@ expandtilde(const char *fn)
struct stat  statbuf;
const char  *cp;
char user[LOGIN_NAME_MAX], path[NFILEN];
-   char*un, *ret;
+   char*ret;
size_t   ulen, plen;
 
path[0] = '\0';
@@ -722,16 +722,13 @@ expandtilde(const char *fn)
return (NULL);
return(ret);
}
-   if (ulen == 0) { /* ~/ or ~ */
-   if ((un = getlogin()) != NULL)
-   (void)strlcpy(user, un, sizeof(user));
-   else
-   user[0] = '\0';
-   } else { /* ~user/ or ~user */
+   if (ulen == 0) /* ~/ or ~ */
+   pw = getpwuid(geteuid());
+   else { /* ~user/ or ~user */
memcpy(user, &fn[1], ulen);
user[ulen] = '\0';
+   pw = getpwnam(user);
}
-   pw = getpwnam(user);
if (pw != NULL) {
  

[PATCH] www/63.html - correct OpenBSD 6.3 release date

2018-04-12 Thread Raf Czlonka
Hi all,

The main page[0] reads:

[...], released Apr 2, 2018.

while the 6.3 release page, which the above links to[1] says:

Released Apr 15, 2018

[0] https://www.openbsd.org/
[1] https://www.openbsd.org/63.html

Regards,

Raf

Index: 63.html
===
RCS file: /cvs/www/63.html,v
retrieving revision 1.74
diff -u -p -r1.74 63.html
--- 63.html 8 Apr 2018 11:45:49 -   1.74
+++ 63.html 12 Apr 2018 14:16:46 -
@@ -20,7 +20,7 @@
 
 
 
-Released Apr 15, 2018
+Released Apr 2, 2018
 Copyright 1997-2018, Theo de Raadt.
 
 



Re: ipmi(4): don't panic if ipmi_sendcmd() fails

2018-04-12 Thread YASUOKA Masahiko
On Tue, 06 Mar 2018 12:37:28 +0900 (JST)
Naoki Fukaumi  wrote:
> sending IPMI command can fail for some reason (e.g. during BMC reset,
> and possibly BMC is just busy?), then it should be better to return
> error than panic().

Verified.

  # ipmitool mc reset warm
  # ipmitool fru

caused the panic without the diff, but after applying the diff, kernel
keeps working.

ok?

don't panic if ipmi_sendcmd() fails
diff from fukaumi at soum.co.jp.

Index: sys/dev/ipmi.c
===
RCS file: /var/cvs/openbsd/src/sys/dev/ipmi.c,v
retrieving revision 1.100
diff -u -p -r1.100 ipmi.c
--- sys/dev/ipmi.c  1 Jan 2018 16:16:23 -   1.100
+++ sys/dev/ipmi.c  23 Mar 2018 04:16:02 -
@@ -1039,10 +1039,10 @@ ipmi_cmd_poll(struct ipmi_cmd *c)
 {
mtx_enter(&c->c_sc->sc_cmd_mtx);
 
-   if (ipmi_sendcmd(c)) {
-   panic("%s: sendcmd fails", DEVNAME(c->c_sc));
-   }
-   c->c_ccode = ipmi_recvcmd(c);
+   if ((c->c_ccode = ipmi_sendcmd(c)))
+   printf("%s: sendcmd fails\n", DEVNAME(c->c_sc));
+   else
+   c->c_ccode = ipmi_recvcmd(c);
 
mtx_leave(&c->c_sc->sc_cmd_mtx);
 }
@@ -1819,8 +1819,6 @@ ipmiioctl(dev_t dev, u_long cmd, caddr_t
c->c_txlen = req->msg.data_len;
c->c_rxlen = 0;
ipmi_cmd(c);
-
-   KASSERT(c->c_ccode != -1);
break;
case IPMICTL_RECEIVE_MSG_TRUNC:
case IPMICTL_RECEIVE_MSG:



Re: ifconfig,route,netstat: s/tableid/rtable/ for consistency

2018-04-12 Thread Alexander Bluhm
On Thu, Apr 12, 2018 at 10:07:58AM +0200, Peter Hessler wrote:
> :Or is there any real difference between `tableid' and `rtable' I'm not
> :aware of?
> :
> 
> rtables are layer 3.
> 
> rdomains are layer 2 (aka, arp and ndp lookups).

The question was not about rtables vs rdomains.

It is about tableid vs rtable.

bluhm



Re: ifconfig,route,netstat: s/tableid/rtable/ for consistency

2018-04-12 Thread Peter Hessler
On 2018 Apr 11 (Wed) at 23:01:45 +0200 (+0200), Klemens Nanni wrote:
:On Wed, Apr 11, 2018 at 09:28:03AM +0200, Peter Hessler wrote:
:> No, all of these uses are correct as-is.
:`tableid' surely isn't wrong, but using the argument name across manuals
:seems nicer to me.
:

No, they are different things.  Different names help with the concept.


:Or is there any real difference between `tableid' and `rtable' I'm not
:aware of?
:

rtables are layer 3.

rdomains are layer 2 (aka, arp and ndp lookups).

You can have multiple rtables within an rdomain.  An interface can only
be a member of a single rdomain at a time.

-- 
Just about every computer on the market today runs Unix, except the Mac
(and nobody cares about it).
-- Bill Joy 6/21/85