Re: svn commit: r296109 - head/libexec/rlogind

2016-02-27 Thread Bruce Evans

On Sat, 27 Feb 2016, Jilles Tjoelker wrote:


On Sat, Feb 27, 2016 at 09:48:05AM -0500, Pedro Giffuni wrote:

In the case of rlogind, note that the above limitation [FD_SETSIZE]
has disappeared by using poll(2).



I will add that FreeBSD has a native poll(2) implementation, it is
not a wrapper around select as some old postings would suggest.



I don't have any plans to do a massive shakeup against select(2), this
was some lower hanging fruit that was easy to change. For new code
kqueue(2) should be preferred.


The FD_SETSIZE can be a more important issue in library code which may
be called from applications that have many descriptors open already.

I don't agree with always using kqueue(2) for new code. Provided poll(2)
has the necessary functionality and the number of file descriptors is
low, using poll(2) tends to result in simpler code and better
performance. Also, poll(2) is more portable and avoids a failure mode
from the kqueues rlimit.


All the conversions to poll() seem to be buggy.  poll() gives better
detection of EOF, but you have to actually check for it or you get
different behaviour from select().  gdb in FreeBSD has been broken
for 10-15 years by misconversion to poll().  E.g.:

ttyv1:bde@besplex:~> echo "p 2+2" | gdb -q
(gdb) Hangup detected on fd 0
error detected on stdin

This is because it fails to check for POLLIN on stdin when it sees
POLLHUP on stdin.  It loses all the buffered input.  I tried more
input.  It loses all input for a buffer size of < 8K, then starts
seeing some.  The details depend on the kernel buffering.  PIPE_BUF
is only 512 but the kernel normally uses larger buffers than that.

kdump output for this:

   987 cat  CALL  write(0x1,0x806,0x1fff)
   987 cat  GIO   fd 1 wrote 4096 bytes
   "
p 2+2
p 2+2
...
p 2"
   987 cat  RET   write 8191/0x1fff
   987 cat  CALL  read(0x3,0x806,0x4000)
   987 cat  GIO   fd 3 read 0 bytes
   ""
   987 cat  RET   read 0
   987 cat  CALL  close(0x3)
   987 cat  RET   close 0
   987 cat  CALL  close(0x1)
   987 cat  RET   close 0
   987 cat  CALL  exit(0)
   986 sh   RET   wait4 987/0x3db
   986 sh   CALL  wait4(0x,0xbfbfe7c8,0x2,0)
   988 gdb  RET   poll 1
   988 gdb  CALL  write(0x1,0x830,0x18)
   988 gdb  GIO   fd 1 wrote 24 bytes
   "Hangup detected on fd 0
   "
   988 gdb  RET   write 24/0x18
   988 gdb  CALL  write(0x1,0x830,0x18)
   988 gdb  GIO   fd 1 wrote 24 bytes
   "error detected on stdin
   "
   988 gdb  RET   write 24/0x18
   988 gdb  CALL  sigprocmask(0x1,0,0xbfbfe33c)
   988 gdb  RET   sigprocmask 0
   988 gdb  CALL  exit(0)
   986 sh   RET   wait4 988/0x3dc
   986 sh   CALL  exit(0)

The changes have the opposite problem.  They check for POLLIN but not
POLLHUP.  This depends on the unportable and broken but possibly
permittied behaviour of poll() always setting POLLIN when it sets
POLLHUP.

select() has no way to report POLLHUP, so it converts POLLHUP to
input-ready.  poll() should do the following:
- for hangup with buffered input, as in the above pipe case, set
  POLLHUP to indicate the hangup and POLLIN to indicate the data
- for hangup with no buffered input, set POLLHUP to indicate the
  hangup and clear POLLIN to indicate no data
but the usual misimplementation does:
- always set POLLIN when setting POLLHUP.  This makes poll() behave
  more like select(), so that buggy conversions to poll() work, but
  it means that POLLIN cannot be trusted for determining if more
  than non-null data can be read.  This is always done for regular
  files IIRC.  For regular files, there is never any buffered input
  POLLHUP with POLLIN really means POLLHUP without POLLIN.
Another possible misimplementation that I have not observed is:
- delay setting POLLHUP until all buffered input has been read or
  discardered.  This would work OK for terminals since terminals
  are specified to discard input on hangup (but there are many races
  at low levels with the timing of the hangup and the input -- many
  drivers lose input much like gdb does -- they deliver the hangup
  to upper layers synchronously, but deliver buffered input later).
  However, for pipes the input is not discarded, so POLLHUP should
  not be delayed, so that applications can decide if they want to
  discard the input on hanhyp.

The poll regression tests check which version of the bugs are implemented
for pipes and sockets.  They only pass for some file types because the
bugs are implemented for others.  The buggy cases are system-dependent.

Due to the bugs, POLLIN together with POLLHUP can never be trusted.
Applications should use similar code to after select() to detect EOF:

switch (events & (POLLHUP | POLLIN)) {
case POLLHUP:
/* Non-buggy case. */
exit(0);
case POLLIN:
/* Normal input. */

Re: svn commit: r295854 - head/bin/dd

2016-02-21 Thread Bruce Evans

On Sun, 21 Feb 2016, Edward Tomasz Napierala wrote:


Log:
 Make the "invalid numeric value" error message actually displayable
 (was a dead code before).

 Submitted by:  bde@ (earlier version)
 Reviewed by:   bde@


Thanks.


Modified:
 head/bin/dd/args.c

Modified: head/bin/dd/args.c
==
--- head/bin/dd/args.c  Sun Feb 21 13:54:22 2016(r295853)
+++ head/bin/dd/args.c  Sun Feb 21 14:36:50 2016(r295854)
@@ -422,11 +422,10 @@ get_num(const char *val)

errno = 0;
num = strtoumax(val, , 0);
-   if (errno != 0) /* Overflow or underflow. */
-   err(1, "%s", oper);
-
if (expr == val)/* No valid digits. */
-   errx(1, "%s: illegal numeric value", oper);
+   errx(1, "%s: invalid numeric value", oper);
+   if (errno != 0)
+   err(1, "%s", oper);

mult = postfix_to_mult(*expr);

...


This is to avoid the POSIX bug that the strto*() family "helpfully" sets
errno for cases not specified by C90, C99 or C11.  POSIX requires setting
errno to EINVAL if the base is unsupported or no conversion could be
performed.  The FreeBSD man page says that the conversion error is
unportable.  The base error is equally unportable, but the man page doesn't
mention it.  This breaks code like the above which depends on the Standard
C bhaviour of only setting errno to ERANGE.  The code happens to do a
generic check on errno first, and that prevents the more specific handling
for a conversion error from being reached.

Standard C in general allows errno to be clobbered on success, but for
functions that are documented to set errno like the strto*() family, it
intends to require only the documented settings.  The above code is an
example showing that settings to non-Standard C values are little different
from gratuitous clobbering.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295864 - head/sys/kern

2016-02-21 Thread Bruce Evans

On Sun, 21 Feb 2016, Ian Lepore wrote:


Log:
 Allow a dynamic env to override a compiled-in static env by passing in the
 override indication in the env data.

 Submitted by:  bde


Thanks.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295833 - head/sys/sys

2016-02-19 Thread Bruce Evans

On Sat, 20 Feb 2016, Justin Hibbits wrote:


Log:
 Fix the definition of RM_MAX_END.

 Even though casting from signed to unsigned is well-defined in C, it's better 
to
 first cast to the larger unsigned type, then negate.


Casting ~0 is well-defined, but ~0 isn't.

The operation is complementation, not negation.


Modified: head/sys/sys/rman.h
==
--- head/sys/sys/rman.h Sat Feb 20 01:32:58 2016(r295832)
+++ head/sys/sys/rman.h Sat Feb 20 01:34:13 2016(r295833)
@@ -61,7 +61,7 @@ enum  rman_type { RMAN_UNINIT = 0, RMAN_G
 */
#define RM_TEXTLEN  32

-#defineRM_MAX_END  ((rman_res_t)~0)
+#defineRM_MAX_END  (~(rman_res_t)0)


If the operation were negation, then its result -0 would actually be well-
defined.  It would be 0 again, even in the 1's complement and signed
magnitude cases where -0 can be represented.  -0 cast to unsigned must
give 0 again, and not be all-bits-1 even if it started as all-bits 1
representing -0 in 2's complement.

~0 was under-specified in C90 as being the bitwise complement.  The value
of that is not specified unless the promotion of the orginal type is
unsigned (~(unsigned char)0 is also not specified except on exotic
machines (no longer permitted by POSIX) with unsigned char the same size
as unsigned int).

~0 is even more under-specified in C99 because C99 is more careful
with representations and has trap representations.  (This is not fixed
in C11.) Complementing a non- trap-representation might give a trap
representation if done bitwise, even starting from a nondescript value
like 0.  But if there are no trap or padding bits, the only reasonable
interpretation of the spec is that ~0 gives all-bits-1 with whatever
value that has.  In the 2's complement case, the value is -1 and casting
this to unsigned works right.  In the 1's complement case, the value
is -0 and casting this to unsigned gives unsigned plain 0.  In the
signed-magnitude case, I think it gives -0x7fff with 32-bit ints
and casting this to unsigned gives 0x8001.  C90 allows other
representations, so the result can be anything.

The clearest way to write the expression is probably ((rman_res_t)-1)
like the buggy ~0 version did in the 2's complement case.

Everyone knows what -1 is.  Conversion of -1 to unsigned is better
known than what ~0 is.  It takes an explicit specification for both.

Conversion of -1 involves "[in extra precision] repeatedly adding or
subtracting one more than the maximim value that can be represented
in the new type until [the result is representable]".  Thus
((rman_res_t)-1) is maxof(rman_res_t) + 1 - 1 == maxof(rman_res_t).

(~(rman_res_t)0) involves the specification of the '~' operator
saying that complementing the bits actually works for the unsigned
case.  The precise wording is "If the promoted type [of E] is an
unsigned type, then the expression ~E is equivalent to the maximum
value representable in that type minus E".  Thus (~(rman_res_t)0)
is maxof(rman_res_t) - 0 = maxof(rman_res_t).  I think the precise
wording is redundant, but this is not clear.  Something is needed
to ensure that complementing bits doesn't give a trap representation
for the unsigned case.  Complementing only value bits wouldn't give
a trap representation, but complementing padding bits could (should)
give a trap representation.  The spec is too fuzzy to say which bits
are complemented, except in the unsigned case padding bits must not
be complemented if this would give a trap representation.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295768 - head/usr.sbin/iostat

2016-02-19 Thread Bruce Evans

On Sat, 20 Feb 2016, Dimitry Andric wrote:


On 19 Feb 2016, at 16:49, Alan Somers  wrote:


On Fri, Feb 19, 2016 at 5:24 AM, Sergey Kandaurov  wrote:

...

-struct nlist namelist[] = {
+static struct nlist namelist[] = {
#define X_TTY_NIN  0
-   { "_tty_nin" },
+   { .n_name = "_tty_nin",
+ .n_type = 0, .n_other = 0, .n_desc = 0, .n_value = 0 },
[...]


You unlikely need this excessive explicit zeroization.
In this case it is implicitly prezeroed.

...

Yeah, it was being implicitly zeroized before.  But Clang complained
about the structures being only partially initialized.  Since the
whole point of my commit was to increase the WARNS level, I explicitly
zeroed the zero fields to silence Clang.


You got this warning, most likely:

usr.sbin/iostat/iostat.c:122:15: error: missing field 'n_type' initializer 
[-Werror,-Wmissing-field-initializers]
   { "_tty_nin" },
^

This warning is only produced when you use -Wall -W, and then initialize
structs partially, i.e. you initialize some fields but not others.  I
think this is a quite reasonable warning for a high warning level.


No, this more than defeats the reason for existence of C99 initializers.
With C90 initializors, you sometimes had to write initializers for all
the fields between the 2 that you care about, and get the order and number
of al the fields right.  At least you didn't have to write initializers
for the trailing fields.  With compiler warnings about missing initializers,
you also have to write initializers for trailing fields.  I forget if
C99 initializers allows mixing named fields with unnamed fields, but style
rules shouldn't allow it.


On the other hand, if this kind of construct is used throughout the
tree, and it is not seen as a big problem, we can simply silence this
particular warning using -Wno-missing-field -initializers.  There is
already quite a list of warnings which are suppressed by default, even
at WARNS=6, namely:

-Wno-empty-body
-Wno-format-y2k
-Wno-pointer-sign
-Wno-string-plus-int
-Wno-unused-const-variable
-Wno-unused-parameter


We started using it for lists of functions for vfs, where some of the
entries should be optional but all had to be supplied, in the correct
order, for C90 initializers.  These lists were reduced from excessively
dynamic lists, and they ended up with not many optional entries, so
they don't use the feature of omitting optional entries much.

Bruce

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295768 - head/usr.sbin/iostat

2016-02-19 Thread Bruce Evans

On Fri, 19 Feb 2016, Benjamin Kaduk wrote:


On Fri, Feb 19, 2016 at 5:06 PM, Gleb Smirnoff  wrote:


On Fri, Feb 19, 2016 at 08:49:43AM -0700, Alan Somers wrote:
A> On Fri, Feb 19, 2016 at 5:24 AM, Sergey Kandaurov 
wrote:
A> Yeah, it was being implicitly zeroized before.  But Clang complained
A> about the structures being only partially initialized.  Since the
A> whole point of my commit was to increase the WARNS level, I explicitly
A> zeroed the zero fields to silence Clang.

Isn't zero filling part of the standard? I don't see why lack of
explicit zeroing is a warning? Looks a false warning to me.


It is not quite as simple as this would make it sound.  The elements or
members of an aggregate (e.g.) structure type are initialized as if it were
an object of static storage duration (i.e., to zero) if the initializer
list does not contain enough initializers for all members of the aggregate
type, per item 21 of section 6.7.8 of n1256.pdf.  However, such
initialization does not necessarily need to zero any padding bytes that are
present, which may take unspecified values.


Perhaps, but then there is even less reason to expect that initializing
all the struct members initialzes the padding between them.


Personally, I think this
particular clang warning can be too aggressive, especially for complex
structs, but on the other hand given the indeterminateness of padding,
bzero/memset are often a better choice anyway.


It is just a bug in clang.

Using auto structs with initialzers, auto structs initialized by memset/
bzero followed by initializing a few fields, are almost equally good
pessimizations.  Some compilers generate similar code for them (like the
code shown by pluknet -- the initializer is optimized to an inlined
memset followed by initializing 1 field.  But doing this this on every
function call is a good pessimization if the struct never changes or
rarely changes or only changes slightly.  Some compilers generate the
worse code of keeping a static copy and generating an inline memcpy to
the auto variable on every function call.

6.7.8p9 says that unnamed members have indeterminate values even after
initialization, except where explicitly stated otherwise.

I couldn't find anywhere "explicitly stating otherwise" that padding
is initialized.  It takes an explicit statements in p21 to get the
unnamed members in an initializer initialized, but padding isn't even.
p21 says that any unnamed struct member is initialized as if it had
static storage duration with no initializer, but I couldn't find
anything requiring initialization for padding in that case.  Apparently,
only quality of implementation prevents it being initialized with
passwords from the previous program :-(.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295557 - head/sys/dev/uart

2016-02-15 Thread Bruce Evans

On Mon, 15 Feb 2016, Ian Lepore wrote:


On Tue, 2016-02-16 at 11:01 +1100, Bruce Evans wrote:

On Mon, 15 Feb 2016, Ian Lepore wrote:


On Tue, 2016-02-16 at 09:28 +1100, Bruce Evans wrote:

On Mon, 15 Feb 2016, Michal Meloun wrote:


[...]
Please note that ARM architecture does not have vectored interrupts,
CPU must read actual interrupt source from external interrupt
controller (GIC) register. This register contain predefined value if
none of interrupts are active.

1 - CPU1: enters ns8250_bus_transmit() and sets IER_ETXRDY.
2 - HW: UART interrupt is asserted, processed by GIC and signaled
   to CPU2.
3 - CPU2: enters interrupt service.


It is blocked by uart_lock(), right?


4 - CPU1: writes character to into REG_DATA register.
5 - HW: UART clear its interrupt request
6 - CPU2: reads interrupt source register. No active interrupt is
   found, spurious interrupt is signaled, and CPU leaves interrupted
   state.
7 - CPU1: executes uart_barrier(). This function is not empty on ARM,
   and can be slow in some cases.


It is not empty even on x86, although it probably should be.

BTW, if arm needs the barrier, then how does it work with
bus_space_barrier() referenced in just 25 files in all of /sys/dev?


With a hack, of course.  In the arm interrupt-controller drivers we
always call bus_space_barrier() right before doing an EOI.  It's not a
100% solution, but in practice it seems to work pretty well.


I thought about the x86 behaviour a bit more and now see that it does
need barriers but not the ones given by bus_space_barrier().  All (?)
interrupt handlers use mutexes (if not driver ones, then higher-level
ones).   These might give stronger or different ordering than given by
bus_space_barrier().  On x86, they use the same memory bus lock as
the bus_space_barrier().  This is needed to give ordering across
CPUs.  But for accessing a single device, you only need program order
for a single CPU.  This is automatic on x86 provided a mutex is used
to prevent other CPUs accessing the same device.  And if you don't use
a mutex, then bus_space_barrier() cannot give the necessary ordering
since if cannot prevent other CPUs interfering.

So how does bus_space_barrier() before EOI make much difference?  It
doesn't affect the order for a bunch of accesses on a single CPU.
It must do more than a mutex to do something good across CPUs.
Arguably, it is a bug in mutexes is they don't gives synchronization
for device memory.

...
The hack code does a drain-write-buffer which doesn't g'tee that the
slow peripheral write has made it all the way to the device, but it
does at least g'tee that the write to the bus the perhiperal is on has
been posted and ack'd by any bus<->bus bridge, and that seems to be
good enough in practice.  (If there were multiple bridged busses
downstream it probably wouldn't be, but so far things aren't that
complicated inside the socs we support.)


Hmm, so there is some automatic strong ordering but mutexes don't
work for device memory?


I guess you keep mentioning mutexes because on x86 their implementation
uses some of the same instructions that are involved in bus_space
barriers on x86?  Otherwise I can't see what they have to do with
anything related to the spurious interrupts that happen on arm.  (You
also mentioned multiple CPUs, which is not a requirement for this
trouble on arm, it'll happen with a single core.)


Partly.  I wasn't worrying about this "spurious" interrupt but locking
in general.  Now I don't see how mutexes can work unless they order
device accesses in exactly the same way as ordinary memory accesses.


The piece of info you're missing might be the fact that memory-mapped
device registers on arm are mapped with the Device attribute which
gives stronger ordering than Normal memory.  In particular, writes are
in order and not combined, but they are buffered.


arm doesn't need the barrier after each output byte then, and in general
bus_space_write_multi_N() shouldn't need a barrier after each write, but
only it can reasonably know if it does, so any barrier(s) for it belong
in it and not in callers.

Drivers for arm could also do a bunch of unrelated writes (and reads?)
followed by a single barrier.  But this is even more MD.


In some designs
there are multiple buffers, so there can be multiple writes that
haven't reached the hardware yet.  A read from the same region will
stall until all writes to that region are done, and there is also an
instruction that specifically forces out the buffers and stalls until
they're empty.


The multiple regions should be in separate bus spaces so that the barriers
can be optimized to 1 at the end.  I now see how the optimization can
almost be done at the bus_space level -- drivers call
bus_space_maybe_barrier() after every access, but this is a no-op on bus
spaces with sufficient ordering iff the bus space hasn't changed;
drivers call bus_space_sync_barriers() at the end.  I think the DMA sync
functions work a bit li

Re: svn commit: r295557 - head/sys/dev/uart

2016-02-15 Thread Bruce Evans

On Mon, 15 Feb 2016, Ian Lepore wrote:


On Tue, 2016-02-16 at 09:28 +1100, Bruce Evans wrote:

On Mon, 15 Feb 2016, Michal Meloun wrote:


[...]
Please note that ARM architecture does not have vectored interrupts,
CPU must read actual interrupt source from external interrupt
controller (GIC) register. This register contain predefined value if
none of interrupts are active.

1 - CPU1: enters ns8250_bus_transmit() and sets IER_ETXRDY.
2 - HW: UART interrupt is asserted, processed by GIC and signaled
   to CPU2.
3 - CPU2: enters interrupt service.


It is blocked by uart_lock(), right?


4 - CPU1: writes character to into REG_DATA register.
5 - HW: UART clear its interrupt request
6 - CPU2: reads interrupt source register. No active interrupt is
   found, spurious interrupt is signaled, and CPU leaves interrupted
   state.
7 - CPU1: executes uart_barrier(). This function is not empty on ARM,
   and can be slow in some cases.


It is not empty even on x86, although it probably should be.

BTW, if arm needs the barrier, then how does it work with
bus_space_barrier() referenced in just 25 files in all of /sys/dev?


With a hack, of course.  In the arm interrupt-controller drivers we
always call bus_space_barrier() right before doing an EOI.  It's not a
100% solution, but in practice it seems to work pretty well.


I thought about the x86 behaviour a bit more and now see that it does
need barriers but not the ones given by bus_space_barrier().  All (?)
interrupt handlers use mutexes (if not driver ones, then higher-level
ones).   These might give stronger or different ordering than given by
bus_space_barrier().  On x86, they use the same memory bus lock as
the bus_space_barrier().  This is needed to give ordering across
CPUs.  But for accessing a single device, you only need program order
for a single CPU.  This is automatic on x86 provided a mutex is used
to prevent other CPUs accessing the same device.  And if you don't use
a mutex, then bus_space_barrier() cannot give the necessary ordering
since if cannot prevent other CPUs interfering.

So how does bus_space_barrier() before EOI make much difference?  It
doesn't affect the order for a bunch of accesses on a single CPU.
It must do more than a mutex to do something good across CPUs.
Arguably, it is a bug in mutexes is they don't gives synchronization
for device memory.


...
The hack code does a drain-write-buffer which doesn't g'tee that the
slow peripheral write has made it all the way to the device, but it
does at least g'tee that the write to the bus the perhiperal is on has
been posted and ack'd by any bus<->bus bridge, and that seems to be
good enough in practice.  (If there were multiple bridged busses
downstream it probably wouldn't be, but so far things aren't that
complicated inside the socs we support.)


Hmm, so there is some automatic strong ordering but mutexes don't
work for device memory?


The g'teed way is to do a readback of the register written-to, or some
nearby register if it's write-only.  That's a hard thing to do in a
bus_space_barrier implementation that has no knowledge of things like
which registers in a device might be write-only.

And of course, then you're still left with the problem of hundreds of
drivers that don't do any barrier calls at all.


Mostly unimportant ones for fast NICs :-).  These could be made even slower
using lots of barriers.  But the fast ones already have to use special
interfaces DMA since single-word bus accesses are too slow.


[...]


Also, at this time, UART driver is last one known to generate spurious
interrupts in ARM world.

So, whats now? I can #ifdef __arm__ change made in r295557 (for maximum
safety), if you want this. Or we can just wait to see if someone reports
a problem ...


Use better methods.


How about a dev.uart..broken_txrdy tunable sysctl that defaults
to the new behavior but can be turned on by anyone with the buggy
hardware?


It's difficult to know that such knobs even exist.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295557 - head/sys/dev/uart

2016-02-15 Thread Bruce Evans

On Mon, 15 Feb 2016, Michal Meloun wrote:


Dne 13.02.2016 v 1:58 Marius Strobl napsal(a):

On Sat, Feb 13, 2016 at 06:53:25AM +1100, Bruce Evans wrote:

On Fri, 12 Feb 2016, Marius Strobl wrote:


On Fri, Feb 12, 2016 at 05:14:58AM +, Michal Meloun wrote:

Author: mmel
Date: Fri Feb 12 05:14:58 2016
New Revision: 295557
URL: https://svnweb.freebsd.org/changeset/base/295557

Log:
  UART: Fix spurious interrupts generated by ns8250 and lpc drivers:
   - don't enable transmitter empty interrupt before filling TX FIFO.


Are you sure this doesn't create a race that leads to lost TX ready
interrupts? For a single character, the TX FIFO very well may be empty
again at the point in time IER_ETXRDY is enabled now.  With the varying
behavior of 8250/16x50 chips - some of which is documented in sio(4) -


That is mostly FUD.  More likely driver bugs than chip bugs.

A non-broken xx50 interrupts after you (re)enable tx interrupts, iff
the fifo is already empty.  This gives a "spurious" interrupt.  But
perhaps depending on this is too fragile.  Normal operation is to keep
...

I'd expect there are many that no longer generate a TX ready at all
with this change in place. In this case, receiving spurious interrupts
(which ones? IIR_NOPEND? IIR_TXRDY?) with some devices appears to be
the lesser evil.


Not many.  Only broken ones.


In my experience many xx50 are broken, especially the integrated
on-board ones you still have in workstations and servers today.


I haven't seen any with this bug.  But I haven't seen many newer ones
or any in workstations or servers since I only use consumer-grade
motherboards -- maybe those are better :-).  Why would a new ASIC
reimplement bugs that were in the original 8250?  (IIRC, there was
an 8250 with lots of bugs and limited production, and an 8250A that
was better.  FreeBSD is much newer than the 8250A so it might never
have been used on an 8250.)


The "spurious" interrupts are just normal
ones from bon-broken chips:

- uart first does a potentially-unbounded busy-wait before the doing
   anything to ensure that the fifo is empty.  This should be unecessary
   since this function should not be called unless sc_txbusy is 0 and
   sc_txbusy should be nonzero if the fifo is not empty.  If it is called
   when the fifo is not emptu, then the worst-case busy-wait is approx.
   640 seconds for a 128-byte fifo at 2 bps. The 'broken_txfifo case'
   busy-waits for a long time in normal operation.
- enabling the tx interrupt causes one immediately on non-broken uarts
- the interrupt handler is normally called immediately.  Then it always
   blocks on uart_lock()
- then the main code fills the fifo and unlocks
- then the interrupt handler runs.  It normally finds that the fifo is
   not empty (since it has just been filled) and does nothing
- another tx interrupt occurs later and the interrupt handler runs again.
   It normally doesn't hit the lock again, and normally finds the fifo
   empty, so it does something.


You correctly describe what happens at r295556 with a non-broken xx50.
That revision causes a spurious interrupt with non-broken xx50 but
also ensures that the relevant TX interrupt isn't missed with broken
xx50 that do not issue an interrupt when enabling IER_ETXRDY. Besides,
as you say, the general approach of dynamically enabling TX interrupts
works around the common brokenness of these interrupts no longer going
away when they should.


But you are probably correct that a 1-byte write to the fifo often
loses the race.  This depends on how fast the hardware moves the byte
from the fifo to the tx register.  Actually, since we didn't wait
for the tx register to become empty, it will often take a full character
time before the move.  After that, I think it might take 1 bit time but
no more.


My concern is that with r295557, when this race is lost no TX interrupt
is seen at all with broken xx50 that do not trigger an interrupt when
enabling IER_ETXRDY.


Certainly that is a concern if there are chips with the bug.


No, I?m not sure, nobody can be sure if we talking about ns8250
compatible device(s). Also, all UARTs known to me, generates an
interrupt on TX unmasking (assuming level sensitive interrupt).
Only IIR can reports bad priority so some very old 8250 (if
memory still serve me).


Nah, compatible means only having bugs that are compatible, and a
noremal xx50 doesn't have this bug :-).  Strict compatibility with
the original 8250 would give lots of bugs but it is unlikely that
anything new is compatible with that.

Driver bugs are still more likely.  This reminds me that the most
likely source of them is edge triggered ISA interrupts and drivers
not doing enough in the interrupt handler to ensure getting a new
interrupt.  I think you remember correctly that bad priority was
one of the bugs in the original 8250.


I only found following scenario on multiple ARM SOCs.

Please note that ARM architecture does not have vectored interrupts,
CPU must r

Re: svn commit: r295631 - head/lib/libc/stdio

2016-02-15 Thread Bruce Evans

On Mon, 15 Feb 2016, Pedro F. Giffuni wrote:


Log:
 fputs: Return the number of bytes written.

 POSIX.1-2008 requires that successful completion simply return a
 non-negative integer. We have regularly returned a constant value.
 Another, equally valid, implementation convention implies returning
 the number of bytes written.

 Adopt this last convention to be in line with what Apple's libc
 does. POSIX also explicitly notes:

 Note that this implementation convention cannot be adhered to for strings
 longer than {INT_MAX} bytes as the value would not be representable in the
 return type of the function. For backwards-compatibility, implementations
 can return the number of bytes for strings of up to {INT_MAX} bytes, and
 return {INT_MAX} for all longer strings.

 Developers shouldn't depend specifically on either convention but
 the change may help port software from Apple.

 Differential Revision:  https://reviews.freebsd.org/D442 (Partial)
 Obtained from:  Apple Inc. (Libc 997.90.3 with changes)
 Relnotes:  yes

Modified:
 head/lib/libc/stdio/fputs.c

Modified: head/lib/libc/stdio/fputs.c
==
--- head/lib/libc/stdio/fputs.c Mon Feb 15 17:14:10 2016(r295630)
+++ head/lib/libc/stdio/fputs.c Mon Feb 15 18:13:33 2016(r295631)
@@ -37,6 +37,7 @@ static char sccsid[] = "@(#)fputs.c  8.1
__FBSDID("$FreeBSD$");

#include "namespace.h"
+#include 
#include 
#include 
#include "un-namespace.h"
@@ -62,5 +63,7 @@ fputs(const char * __restrict s, FILE *
ORIENT(fp, -1);
retval = __sfvwrite(fp, );
FUNLOCKFILE(fp);
+   if (retval == 0)
+   return (iov.iov_len > INT_MAX ? INT_MAX : uio.uio_resid);
return (retval);
}


Testing would have shown that this change has no effect except in the
unusual case where iov.iov_len > INT_MAX.  uio.resid is always reduced
to 0 if all the output actually worked.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295561 - in head: include sys/mips/include sys/powerpc/include sys/sparc64/include sys/sys sys/x86/include

2016-02-13 Thread Bruce Evans

On Sat, 13 Feb 2016, Konstantin Belousov wrote:


On Sat, Feb 13, 2016 at 07:29:48AM +1100, Bruce Evans wrote:

It needs a version number to be correct.  Plain __POSIX_VISIBLE means
all versions.  __POSIX_VISBLE >= 200112 is probably good enough for
the POSIX part.  I don't know the correct number for XSI.  Perhaps
it can be simply omitted.  _XOPEN_SOURCE >= 600 gives
__XSI_VISIBLE >= 600 && _POSIX_C_SOURCE >= 200112.  So __XSI_VISIBLE
in the above is only to support XSI before 2001.  We only support 1
XSI before that -- #500 which corresponds to 1996 POSIX.  Very likely
XSI did have ucontext_t then, so the above ifdef is correct.  It is
just hard to read.  Easier than google hits though :-).  I found a
gnu man page about getcontext() being in SVID but couldn't easily
find the history.


I have SVID document labeled FINAL COPY June 15, 1995 First Edition.
There, it seems that ucontext.h is required.

Below is the updated patch.

diff --git a/include/signal.h b/include/signal.h
index 33be55c..217fadd 100644
--- a/include/signal.h
+++ b/include/signal.h
@@ -36,8 +36,10 @@
#include 
#include 
#include 
+#if __POSIX_VISIBLE >= 200112 || __XSI_VISIBLE
#include 
#include 
+#endif

#if __BSD_VISIBLE
/*


Good.

The old visibility bugs with mc_* should be fixed someday.  I said that
uc_* could be used.  _mc_* should have been used originally.  Maybe we
can just change to either of these, since nothing except libc should
have used the internals of mcontext_t.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295561 - in head: include sys/mips/include sys/powerpc/include sys/sparc64/include sys/sys sys/x86/include

2016-02-13 Thread Bruce Evans

On Sat, 13 Feb 2016, Konstantin Belousov wrote:


On Sun, Feb 14, 2016 at 04:02:05AM +1100, Bruce Evans wrote:

The old visibility bugs with mc_* should be fixed someday.  I said that
uc_* could be used.  _mc_* should have been used originally.  Maybe we
can just change to either of these, since nothing except libc should
have used the internals of mcontext_t.


mcontext_t is needed and used by many language runtimes which handle
exceptions for their functionality. It could be gc barriers, hacks with
tagging, and many more. API cannot be broken there by renaming the
structure members.


But it is not part of the API.  The API is specified to be opaque.
getcontext(3) doesn't even document the visible parts of it.


The only way to hide mc_* is to rename them to __mc_* and provide compat
redefinitions when  is included directly. This is
extremely cumbersome and I do not see a need in such fix for perceived
use case of pure ANSI C code which, to be broken, must #define mc_XXX
and then include .


How likely is third party code to use undocumented internals of opqaue
types when FreeBSD code mostly doesn't used them.

I checked all files in /usr/src outside of /sys/ that contain "ucontext.h".
There are 71 such files with 204 lines matching "mc_".  The actual users
of mcontext_t's internals are:

contrib/compiler-rt/lib/sanitizer_common/sanitizer_linux.cc: 8 instances
contrib/compiler-rt/lib/sanitizer_common/sanitizer_freebsd.h: 30 instances
lib/libc: 102 instances
lib/libthread_db: 60 instances
tools/KSE/ksetest/kse_threads_test.c: 2 instances
tools/KSE/rr/rr.c: 2 instances

So there are 4 instances outside of libraries (counting compiler_rt as
a libraries) and these 4 are to support KSE which went away about 11
years ago IIRC.  tools/KSE doesn't compile now of course.  The first
error is that its asm source file is i386-only, so fails on i386.
The next error is that its primary (?) API file  no longer
exists.

I hoped to find no instances outside of libc.  Bits in contrib are are
problem.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295561 - in head: include sys/mips/include sys/powerpc/include sys/sparc64/include sys/sys sys/x86/include

2016-02-12 Thread Bruce Evans

On Fri, 12 Feb 2016, Konstantin Belousov wrote:


Log:
 POSIX states that #include  shall make both mcontext_t and
 ucontext_t available.  Our code even has XXX comment about this.


Only newer versions of POSIX have this bug.  In the 2001 version, the
bug was limited to the XSI section.   was limited to XSI
too.  Now,  specified to be obsolescent, but it is further
gone that that -- web searches find only old versions and its functions
seem to be only specified without prototyps in the "Removed" section.

The POSIX bug also reserves uc_* in .  This is needed for
exposing ucontext_t.  POSIX has the bug that ucontext_t must be a
struct to work, but is declared as a typedef.  Structs cannot be very
opaque since they expose struct members...


 Add a bit of compliance by moving struct __ucontext definition into
 sys/_ucontext.h and including it into signal.h and sys/ucontext.h.

 Several machine/ucontext.h headers were changed to use namespace-safe
 types (like uint64_t->__uint64_t) to not depend on sys/types.h.
 struct __stack_t from sys/signal.h is made always visible in private
 namespace to satisfy sys/_ucontext.h requirements.

 Apparently mips _types.h pollutes global namespace with f_register_t
 type definition.  This commit does not try to fix the issue.


However, mcontext_t is opaque, and mc_* is not reserved even in
.  In FreeBSD, mcontext_t is implemented as a struct so the
names of its struct members now pollute even .  It is a
reasonable fix to abuse the reserved uc_ prefix in mcontext_t.
mcontext_t is just the MD part of ucontext_t.  Prefixes like uc_mc_
would express its nesting but are too long.


Modified: head/include/signal.h
==
--- head/include/signal.h   Fri Feb 12 07:27:24 2016(r295560)
+++ head/include/signal.h   Fri Feb 12 07:38:19 2016(r295561)
@@ -36,6 +36,8 @@
#include 
#include 
#include 
+#include 
+#include 


This pollutes even the non-POSIX case with POSIX and FreeBSD names.

The correct ifdefs for this are messy.  No names should be added for
non-POSIX and old versions of POSIX.  2001 POSIX needs an XSI ifdef.  Later
versions of POSIX don't need an ifdef.  Normally it is better to hide
the ifdefs in the included headers, but _ucontext.h should always
provide ucontext_t and uc_*.


Modified: head/sys/mips/include/ucontext.h
==
--- head/sys/mips/include/ucontext.hFri Feb 12 07:27:24 2016
(r295560)
+++ head/sys/mips/include/ucontext.hFri Feb 12 07:38:19 2016
(r295561)
@@ -50,13 +50,13 @@ typedef struct  __mcontext {
 * struct sigcontext and ucontext_t at the same time.
 */
int mc_onstack; /* sigstack state to restore */
-   register_t  mc_pc;  /* pc at time of signal */
-   register_t  mc_regs[32];/* processor regs 0 to 31 */
-   register_t  sr; /* status register */
-   register_t  mullo, mulhi;   /* mullo and mulhi registers... */
+   __register_tmc_pc;  /* pc at time of signal */
+   __register_tmc_regs[32];/* processor regs 0 to 31 */
+   __register_tsr; /* status register */
+   __register_tmullo, mulhi;   /* mullo and mulhi registers... */
int mc_fpused;  /* fp has been used */
f_register_tmc_fpregs[33];  /* fp regs 0 to 31 and csr */
-   register_t  mc_fpc_eir; /* fp exception instruction reg */
+   __register_tmc_fpc_eir; /* fp exception instruction reg */
void*mc_tls;/* pointer to TLS area */
int __spare__[8];   /* XXX reserved */
} mcontext_t;



These mc_* names always polluted  but no one cared since it
was an XSI extension.

sr, mullo and mulhi have worse names.  These names don't even use the
same style as the others.  They now pollute  of course.

__spare__ is bogusly named and has a banal comment.  The names of spares
should be in the normal (reserved) namespace for the struct.  No namespace
is reserved here, bug mc_spare would be no worse than the other mc_'s.

i386 was missing all of these bugs except the mc_* pollution.


Copied and modified: head/sys/sys/_ucontext.h (from r295560, 
head/sys/sys/ucontext.h)


sys/ucontext.h (== ) remains polluted.  I point out its
old bugs here since most of it is all quoted.


==
--- head/sys/sys/ucontext.h Fri Feb 12 07:27:24 2016(r295560, copy 
source)
+++ head/sys/sys/_ucontext.hFri Feb 12 07:38:19 2016(r295561)
@@ -28,11 +28,8 @@
 * $FreeBSD$
 */

-#ifndef _SYS_UCONTEXT_H_
-#define_SYS_UCONTEXT_H_
-
-#include 
-#include 
+#ifndef _SYS__UCONTEXT_H_
+#define_SYS__UCONTEXT_H_

typedef struct __ucontext {
/*
@@ -43,64 +40,13 @@ typedef struct 

Re: svn commit: r295561 - in head: include sys/mips/include sys/powerpc/include sys/sparc64/include sys/sys sys/x86/include

2016-02-12 Thread Bruce Evans

On Fri, 12 Feb 2016, Konstantin Belousov wrote:


On Fri, Feb 12, 2016 at 01:22:04PM +, Ruslan Bukin wrote:

On RISC-V it fails with __uint128_t:

struct fpregs {
__uint128_t fp_x[32];

how to fix?

You did not copied the error.

If my guess is correct, the issue is that __uint128_t typedef is not
present in the riscv/include/_types.h.  Either add the type there, or
use e.g. __uint64_t fp_x[32][2]; for the member definition.


__uint128_t is defined by newer compilers.  Declaring it as a typedef
would probably be a syntax error.


BTW, uintmax_t on riscv is defined as uint64.


Same on all arches.

Any normal use of __uintmax_t breaks uintmax_t since uintmax_t is
supposed to be the largest integer type, but it is ony 64 bits on all
supported arches, bu __uint128_t is larger.  printf() doesn't support
printing __uint128_t...

Expansion of uintmax_t to make __uint128_t, __uint256_t, __uint512_t, ...
more useful would mainly give bloat and break ABIs.


P.S. Does it also mean that the tinderbox machines (AKA universe11*)
do not have riscv toolchain installed ?  I did run make tinderbox
before the commit.


Well, I missed it and only saw __uint128_t in the arm header since
riscv is too new for freefall to have it checked out, and I didn't
rememeber that so I didn't look at another checkout.

Looking at another checkout now shows the following namespace errors
in riscv/include/ucontext.h (mostly the same as arm64):

grpregs, gp_*
fp_regs, fp_*
pad (not the usual __spare__ bogusness.  Such fields are closer to
  being padding than spares.  Of course, padding can be used as
  spares.)

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295561 - in head: include sys/mips/include sys/powerpc/include sys/sparc64/include sys/sys sys/x86/include

2016-02-12 Thread Bruce Evans

On Fri, 12 Feb 2016, Ruslan Bukin wrote:


On RISC-V it fails with __uint128_t:

struct fpregs {
   __uint128_t fp_x[32];

how to fix?


This seems to be an old bug.  __uint128_t shouldn't be used unless it
is supported by .  In old versions of FreeBSD, I used
something like __attribute__(( ... __mode__(__XI__) ... )) for related
problems with long long.  Here __uint128_t is a compiler extension that
is not very different from the attribute.

This seems to be in arm64/ucontext.h, but the name there is spelled fp_q.

arm64/ucontext.h has mounds of namespace errors.  It uses the following
names in the application namespace:

gpregs, gp_*
fpregs, fp_*
mc_* (like other arches)

Checking this for other arches:

arm:
Clean.  It uses an excessive number of leading underscores (2) instead
of none for things like gp_*.  It spells __uint32_t verbosely as
'unsigned int.

i386:
Just the bogus name __spare__.  The 2 trailing underscores don't even
serve a technical purpose.

mips:
Mounds of namespace errors:

sr, mullo, mulhi (2 instances)
__spare__ (3 instances)
Likely pollution from nested include of freebsd32_signal.h (doesn't
  seem to be kernel-only)
SZREG
UCTX_REG
UCR_*

powerpc, sparc64:
OK except for mc_* after adding underscores to uint*_t and register_t.
Both use some hackish #defines of struct member names.  powerpc
#define's mc_* so is more fragile; sparc64 #define's _mc_* instead,
but uses the usual mc_* for the struct member names.

x86:
Clean except for mc_*.

[Context lost to top posting]

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295561 - in head: include sys/mips/include sys/powerpc/include sys/sparc64/include sys/sys sys/x86/include

2016-02-12 Thread Bruce Evans

On Fri, 12 Feb 2016, Konstantin Belousov wrote:


On Sat, Feb 13, 2016 at 01:08:55AM +1100, Bruce Evans wrote:

On Fri, 12 Feb 2016, Konstantin Belousov wrote:

...

Modified: head/include/signal.h
==
--- head/include/signal.h   Fri Feb 12 07:27:24 2016(r295560)
+++ head/include/signal.h   Fri Feb 12 07:38:19 2016(r295561)
@@ -36,6 +36,8 @@
#include 
#include 
#include 
+#include 
+#include 


This pollutes even the non-POSIX case with POSIX and FreeBSD names.

The correct ifdefs for this are messy.  No names should be added for
non-POSIX and old versions of POSIX.  2001 POSIX needs an XSI ifdef.  Later
versions of POSIX don't need an ifdef.  Normally it is better to hide
the ifdefs in the included headers, but _ucontext.h should always
provide ucontext_t and uc_*.

Our non-POSIX namespace is strictly superset of the POSIX space.
I would not hide {m,u}context_t from the default visibility, and this
would defeat the goal of the change.


Our Standard C namespace is a subset of the POSIX namespace.  Most
Standard C headers are (were) careful about this.  There aren't many
Standard C headers or versions of Standard C or large variations in the
versions, so this is relatively easy to do.

We also attempt to use POSIX visibility ifdefs.  This is not done so
carefully, and has more bugs due to more variations.  But 
does it fairly carefully.  It has ifdefs for XSI, POSIX >= 2001,
any-POSIX (this is now very broken by unconditional declarations for
pthreads), POSIX >= 1995 (realtime POSIX stuff which I think was
standardized in POSIX.4 (.1b?) a couple of years before 1995, but we
don't have ifdefs for that), POSIX >= 2008 (psignal).  That is just
in the top-level header.  Almost half of that is only under XSI or
BSD, and it is quite likely that some newer or older POSIX or XSI
things are under the wrong ifdefed but are usually visible because
everything is visible by default.  sys/signal.h has ifdefs for
POSIX >= 1993, more-delicate version ifdefs for XSI and POSIX <= 2001.
Most of the POSIX >= 1993 ifdefs are for realtime POSIX stuff.  An
enormous number of definitions for things like trap codes for FP errors
are misplaced under the POSIX >= 1993 || XSI ifdef.  These weren't in
POSIX.1-2006.  This contrasts with the fine-grained ifdefs for signal
numbers.

I find the ifdefs useful for seeing when POSIX introduced a feature
but not for actual use to compile under an old standard.


Modified: head/sys/mips/include/ucontext.h
==
--- head/sys/mips/include/ucontext.hFri Feb 12 07:27:24 2016
(r295560)
+++ head/sys/mips/include/ucontext.hFri Feb 12 07:38:19 2016
(r295561)
@@ -50,13 +50,13 @@ typedef struct  __mcontext {
 * struct sigcontext and ucontext_t at the same time.
 */
int mc_onstack; /* sigstack state to restore */
-   register_t  mc_pc;  /* pc at time of signal */
-   register_t  mc_regs[32];/* processor regs 0 to 31 */
-   register_t  sr; /* status register */
-   register_t  mullo, mulhi;   /* mullo and mulhi registers... */
+   __register_tmc_pc;  /* pc at time of signal */
+   __register_tmc_regs[32];/* processor regs 0 to 31 */
+   __register_tsr; /* status register */
+   __register_tmullo, mulhi;   /* mullo and mulhi registers... */
int mc_fpused;  /* fp has been used */
f_register_tmc_fpregs[33];  /* fp regs 0 to 31 and csr */
-   register_t  mc_fpc_eir; /* fp exception instruction reg */
+   __register_tmc_fpc_eir; /* fp exception instruction reg */
void*mc_tls;/* pointer to TLS area */
int __spare__[8];   /* XXX reserved */
} mcontext_t;



These mc_* names always polluted  but no one cared since it
was an XSI extension.

sr, mullo and mulhi have worse names.  These names don't even use the
same style as the others.  They now pollute  of course.

__spare__ is bogusly named and has a banal comment.  The names of spares
should be in the normal (reserved) namespace for the struct.  No namespace
is reserved here, bug mc_spare would be no worse than the other mc_'s.

i386 was missing all of these bugs except the mc_* pollution.

The members names for the structures are private per the structure
namespace.  The names of the members cannot pollute signal.h.


No, they aren't private.  Applications can #define them to ,
and this is valid since they are not reserved.  Standards have to reserve
names of all struct members to make this invalid.  POSIX does this for
uc_*.


Copied and modified: head/sys/sys/_ucontext.h (from r295560, 
head/sys/sys/ucontext.h)


sys/ucontext.h (== ) remains polluted.  I point out its
old bugs here si

Re: svn commit: r295557 - head/sys/dev/uart

2016-02-12 Thread Bruce Evans

On Fri, 12 Feb 2016, Marius Strobl wrote:


On Fri, Feb 12, 2016 at 05:14:58AM +, Michal Meloun wrote:

Author: mmel
Date: Fri Feb 12 05:14:58 2016
New Revision: 295557
URL: https://svnweb.freebsd.org/changeset/base/295557

Log:
  UART: Fix spurious interrupts generated by ns8250 and lpc drivers:
   - don't enable transmitter empty interrupt before filling TX FIFO.


Are you sure this doesn't create a race that leads to lost TX ready
interrupts? For a single character, the TX FIFO very well may be empty
again at the point in time IER_ETXRDY is enabled now.  With the varying
behavior of 8250/16x50 chips - some of which is documented in sio(4) -


That is mostly FUD.  More likely driver bugs than chip bugs.

A non-broken xx50 interrupts after you (re)enable tx interrupts, iff
the fifo is already empty.  This gives a "spurious" interrupt.  But
perhaps depending on this is too fragile.  Normal operation is to keep
the tx interrupt enabled and depend on writing to the fifo causing a
tx interrupt later.  But it is a more common chip bug for tx interrupts
later to not go away when they should (normally by reading the IIR),
so some drivers toggle the tx interrupt enable dynamically.

An example of a driver bug is only enabling tx interrupts for this.
It takes a transition of the interrupt enable bit from off to on to
get the interrupt.  Other driver bugs may result in a missing transition
because the bit was supposed to be off but is actually on.


I'd expect there are many that no longer generate a TX ready at all
with this change in place. In this case, receiving spurious interrupts
(which ones? IIR_NOPEND? IIR_TXRDY?) with some devices appears to be
the lesser evil.


Not many.  Only broken ones.  The "spurious" interrupts are just normal
ones from bon-broken chips:

- uart first does a potentially-unbounded busy-wait before the doing
  anything to ensure that the fifo is empty.  This should be unecessary
  since this function should not be called unless sc_txbusy is 0 and
  sc_txbusy should be nonzero if the fifo is not empty.  If it is called
  when the fifo is not emptu, then the worst-case busy-wait is approx.
  640 seconds for a 128-byte fifo at 2 bps. The 'broken_txfifo case'
  busy-waits for a long time in normal operation.
- enabling the tx interrupt causes one immediately on non-broken uarts
- the interrupt handler is normally called immediately.  Then it always
  blocks on uart_lock()
- then the main code fills the fifo and unlocks
- then the interrupt handler runs.  It normally finds that the fifo is
  not empty (since it has just been filled) and does nothing
- another tx interrupt occurs later and the interrupt handler runs again.
  It normally doesn't hit the lock again, and normally finds the fifo
  empty, so it does something.

But you are probably correct that a 1-byte write to the fifo often
loses the race.  This depends on how fast the hardware moves the byte
from the fifo to the tx register.  Actually, since we didn't wait
for the tx register to become empty, it will often take a full character
time before the move.  After that, I think it might take 1 bit time but
no more.

sio uses a timeout to recover from lost output interrupts in case they
occur because the driver or hardware has unforseen bugs.  This works
too well.  It hides bugs like interrupts not working at all provided
the fifo size is larger enough for the low timeout frequency to give
enough throughput.  On newer systems, the overhead for timeouts is in
the noise compared with the overhead for bus accesses (especially
reads), so a simple timeout method is almost as good as an
interrupt-driven method.  It can probably be even better for output
by calculating times to avoid slow reads of status registers to see
how far the chip got.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295561 - in head: include sys/mips/include sys/powerpc/include sys/sparc64/include sys/sys sys/x86/include

2016-02-12 Thread Bruce Evans

On Fri, 12 Feb 2016, Konstantin Belousov wrote:


On Sat, Feb 13, 2016 at 03:56:42AM +1100, Bruce Evans wrote:

Our Standard C namespace is a subset of the POSIX namespace.  Most
Standard C headers are (were) careful about this.  There aren't many
Standard C headers or versions of Standard C or large variations in the
versions, so this is relatively easy to do.

We also attempt to use POSIX visibility ifdefs.  This is not done so
carefully, and has more bugs due to more variations.  But 
does it fairly carefully.  It has ifdefs for XSI, POSIX >= 2001,

I find the ifdefs useful for seeing when POSIX introduced a feature
but not for actual use to compile under an old standard.


So you are about the ANSI C & non-POSIX compilation environment ?
Do you mean the following:

diff --git a/include/signal.h b/include/signal.h
index 33be55c..31cded7 100644
--- a/include/signal.h
+++ b/include/signal.h
@@ -36,8 +36,10 @@
#include 
#include 
#include 
+#if __POSIX_VISIBLE || __XSI_VISIBLE
#include 
#include 
+#endif

#if __BSD_VISIBLE
/*

I can change #if __POSIX_VISIBLE to some version cap, if you prefer.


It needs a version number to be correct.  Plain __POSIX_VISIBLE means
all versions.  __POSIX_VISBLE >= 200112 is probably good enough for
the POSIX part.  I don't know the correct number for XSI.  Perhaps
it can be simply omitted.  _XOPEN_SOURCE >= 600 gives
__XSI_VISIBLE >= 600 && _POSIX_C_SOURCE >= 200112.  So __XSI_VISIBLE
in the above is only to support XSI before 2001.  We only support 1
XSI before that -- #500 which corresponds to 1996 POSIX.  Very likely
XSI did have ucontext_t then, so the above ifdef is correct.  It is
just hard to read.  Easier than google hits though :-).  I found a
gnu man page about getcontext() being in SVID but couldn't easily
find the history.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295523 - head/sys/fs/ext2fs

2016-02-11 Thread Bruce Evans

On Thu, 11 Feb 2016, Pedro F. Giffuni wrote:


Log:
 Ext4: Use boolean type instead of '0' and '1'

 There are precedents of uses of bool in the kernel and
 it is incorrect style to use integers as replacement for
 a boolean type.


This is all backwards.

It is correct style to use integers to represent booleans.

There are precedents for breaking this style by using the
bool type, but not many in fs code:

In all of /sys/fs/*, there were just 12 lines of declarations
that use 'bool'.  All of these were in autofs.  Now there are
13 such lines (1 more in ext2fs).  1 use is especially useless
and especially unlikely to be all uses of booleans.

/sys/fs/nfsclient uses boolean_t in 1 whole line.  boolean_t is
an old spelling of 'bool'.  It is a Mach type for representing
booleans in /sys/vm.  I used to like such typedefs 20-30 years ago
(mine was spelled bool_t) and made the mistake of using the Mach
type elsewhere in the kernel.  The use in nfsclient is actually
correct.  It is used to hold the result of a vm function which
returns boolean_t.

/sys/fs/nfsserver uses bool_t in 1 whole line.  bool_t is is
like boolean_t except in is Sun(?) type for reprenting booleans
in rpc.  It use is again correct -- it is used to hold the
result of an rpc function that returns bool_t.

/sys/fs/tmpfs the Mach spelling on 5 lines.  These uses are to
sks for small pessimizations, but inlining and -O usually gives
the same object code as using integers.  The values start as
flags which are naturally represented as integers and are tested
by expressions of the form (de->td_cookie & FLAG).  These simple
flags tests are obfuscated using function calls.  The functions
return boolean_t, so must convert to a logical expression.  They
do this correctly.  Then the compiler should optimize away all
the obfuscations and pessimizations to give the same object code
as if you wrote the tests directly.

ffs and zfs provided perhaps better examples.  /sys/cddl uses
boolean_t on 102 lines.  It isn't clear what this type is.
compat/opensolaris/sys/types.h seems to use the FreeBSD kernel
boolean_t in the kernel, but in userland it uses 2 different
enum types depending on options.  These types are incompatible
so this depends on magic to work.  /sys/cddl also uses bool_t
twice, once each for rpc and nvpair.  /sys/cddl never uses
C99 bool.

ffs used to use a very pure KNF style with integers and no
typedefs, but it now uses C99 bool on 4 lines.  These uses are
correct.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295362 - head/sys/fs/cd9660

2016-02-07 Thread Bruce Evans

On Sun, 7 Feb 2016, Pedro Giffuni wrote:


On 02/07/16 02:13, Bruce Evans wrote:

On Sun, 7 Feb 2016, Pedro F. Giffuni wrote:


Log:
 cd9660: Drop an unnecessary check for NULL.

 This was unnecessary and also confused Coverity.

 Confirmed on:NetBSD
 CID:978558


This leaves many similar bugs unfixed nearby.  One is a null pointer
panic, not just an unnecessary check.


I admittedly oversimplified the commit log here.

Not only the value can't be null, our brelse() also ignores NULL values.

From sys/kern/vfs_bio.c:

/*
 * Many function erroneously call brelse with a NULL bp under rare
 * error conditions. Simply return when called with a NULL bp.
 */
if (bp == NULL)
return;
...



It has only done that for 11 days.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295370 - head/sys/fs/msdosfs

2016-02-07 Thread Bruce Evans

On Sun, 7 Feb 2016, Pedro F. Giffuni wrote:


Log:
 msdosfs_rename: yet another unused value.

 As with r295355, it seems to be left over from a cleanup
 in r33548. The code is not in NetBSD either.

 Thanks to bde for checking out the history.


r33548 was not a cleanup.  It was a major change to add FAT32 support.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295359 - head/sys/fs/fdescfs

2016-02-06 Thread Bruce Evans

On Sun, 7 Feb 2016, Pedro F. Giffuni wrote:


Log:
 fdesc_setattr: unitialized pointer read

 CID:   1018688


Bug in Coverity.


Modified: head/sys/fs/fdescfs/fdesc_vnops.c
==
--- head/sys/fs/fdescfs/fdesc_vnops.c   Sun Feb  7 01:04:47 2016
(r295358)
+++ head/sys/fs/fdescfs/fdesc_vnops.c   Sun Feb  7 01:09:38 2016
(r295359)
@@ -465,7 +465,7 @@ fdesc_setattr(ap)
{
struct vattr *vap = ap->a_vap;
struct vnode *vp;
-   struct mount *mp;
+   struct mount *mp = NULL;
struct file *fp;
struct thread *td = curthread;
cap_rights_t rights;


2 style bugs in the caller to hide the Coverity bug:
- initialization in declaration
- unused initialization

The initialization is done by calling vn_start_write(... , flags).
mp is only an output parameter unless (flags & V_MNTREF), and fdesc
doesn't put V_MNTREF in flags.

This is a common way of returning extra values so it shouldn't cause
warning is the source code doesn't have bogus initializations in the
caller.  Compilers that look at only 1 source file at a time can't see
the full API so they have to assume that such parameters are output-only
if they are uninitialized in callers.  Checkers need to understand the
API if they want to do more.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295362 - head/sys/fs/cd9660

2016-02-06 Thread Bruce Evans

On Sun, 7 Feb 2016, Pedro F. Giffuni wrote:


Log:
 cd9660: Drop an unnecessary check for NULL.

 This was unnecessary and also confused Coverity.

 Confirmed on:  NetBSD
 CID:   978558


This leaves many similar bugs unfixed nearby.  One is a null pointer
panic, not just an unnecessary check.


Modified: head/sys/fs/cd9660/cd9660_vfsops.c
==
--- head/sys/fs/cd9660/cd9660_vfsops.c  Sun Feb  7 01:45:24 2016
(r295361)
+++ head/sys/fs/cd9660/cd9660_vfsops.c  Sun Feb  7 03:48:40 2016
(r295362)
@@ -741,8 +741,7 @@ cd9660_vget_internal(mp, ino, flags, vpp
if (off + isonum_711(isodir->length) >
imp->logical_block_size) {
vput(vp);
-   if (bp != 0)
-   brelse(bp);
+   brelse(bp);
printf("fhtovp: directory crosses block boundary 
%d[off=%d/len=%d]\n",
   off +isonum_711(isodir->length), off,
   isonum_711(isodir->length));



- 11 lines later, there is the same bug in an #if 0 block.
- 9 lines earlier, bp was brelse()ed without a similar check.  bp is always
  null at that point, so checking for this would be just a style bug.  Not
  checking, but doing the wrong cleanup of calling brelse() after bread()
  fails, gives a null pointer panic in brelse() whenever bread() fails.

That seems to be all the similar bugs.  bp is correctly abused as a flag
later.  Other functions depend more critically on bread() setting bp to
NULL when it fails.  Typical code is to "goto out" when bread() fails,
and brelse() bp there if it is not null.

Not-so-similar bugs include spelling NULL as 0 or (empty).

bread(9) is undocumented.  Someone broke its KPI by changing it to a
macro.  This annoys me when I try to set a breakpoint at it in kernels
with the broken KPI.  The unimportant badly designed bread(3) is documented.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r295160 - stable/7/sys/kern

2016-02-03 Thread Bruce Evans

On Tue, 2 Feb 2016, John Baldwin wrote:


Log:
 Return the timestamps from the corresponding namecache entry on a negative
 namecache hit.  This was ommitted due to a merging error in r238913.  The
 effect was to usually break caching of negative name lookups in the NFS
 client.

 Submitted by:  bde


Thanks.  FreeBSD-7 is now almost the fastest version in my makeworld on
nfs benchmark :-).

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294778 - in head: lib/libc/sys sys/kern

2016-01-26 Thread Bruce Evans

On Tue, 26 Jan 2016, Kubilay Kocak wrote:


Log:
  Restore flushing of output for revoke(2) again.  Document revoke()'s
  intended behaviour in its man page.  Simplify tty_drain() to match.
  Don't call ttydevsw methods in tty_flush() if the device is gone
  since we now sometimes call it then.
...
  This was first broken then fixed in 1995.  I changed only the tty
...


Seems like


  This was next broken in 1997 then fixed in 1998.  Importing Lite2 made
...


A fantastic


  This was next broken in 2008 by replacing everything in tty.c and not
...


Regression test candidate :)


  It is now possible to fix this better using the new FREVOKE flag.


Regression tests for devices are difficult to write and more difficult
to run.  Simpler for ttys than for networking or disks, but you still
need at least 2 generic tty ports just to test things that are not
very related to hardware.

Bugs in flushing and draining are sometimes obvious by observing if
echo 123 >/dev/ttyXx works when it should fail or fails when it should
work.

For more arcane bugs, I use the old NIST POSIX test suite.  This is
badly written and hard to use and not very complete, but it finds about
30 regressions between FreeBSD-5 and FreeBSD-9.  30 over-counts for error
cascades but undercounts for blocking and some other timing bugs, and
of course strict POSIX tests don't get near FreeBSD features like
revoke() or bidrectional devices.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294697 - head/sys/net80211

2016-01-24 Thread Bruce Evans

On Sun, 24 Jan 2016, Andriy Voskoboinyk wrote:


Log:
 net80211: reduce stack usage for ieee80211_ioctl*() methods.

 Use malloc(9) for
  - struct ieee80211req_wpaie2 (518 bytes, used in
 ieee80211_ioctl_getwpaie())
  - struct ieee80211_scan_req (128 bytes, used in setmlme_assoc_adhoc()
 and ieee80211_ioctl_scanreq())

 Also, drop __noinline workarounds; stack overflow is not reproducible
 with recent compilers.

 Tested with Clang 3.7.1, GCC 4.2.1 (from 9.3-RELEASE) and 4.9.4
 (with -fstack-usage flag)


Inlining also breaks debugging.  It is best avoided generally using gcc
-fnon-inline-functions-called-once.  This flag is broken (not supported)
in clang.


Modified: head/sys/net80211/ieee80211_ioctl.c
==
--- head/sys/net80211/ieee80211_ioctl.c Sun Jan 24 23:28:14 2016
(r294696)
+++ head/sys/net80211/ieee80211_ioctl.c Sun Jan 24 23:35:20 2016
(r294697)
-/*
- * When building the kernel with -O2 on the i386 architecture, gcc
- * seems to want to inline this function into ieee80211_ioctl()
- * (which is the only routine that calls it). When this happens,
- * ieee80211_ioctl() ends up consuming an additional 2K of stack
- * space. (Exactly why it needs so much is unclear.) The problem
- * is that it's possible for ieee80211_ioctl() to invoke other
- * routines (including driver init functions) which could then find
- * themselves perilously close to exhausting the stack.
- *
- * To avoid this, we deliberately prevent gcc from inlining this
- * routine. Another way to avoid this is to use less agressive
- * optimization when compiling this file (i.e. -O instead of -O2)
- * but special-casing the compilation of this one module in the
- * build system would be awkward.
- */


Even with -O1 -mtune=i386 -fno-inline-functions-called-once, gcc-4.2.1
still breaks debugging of static functions by using a different calling
convention for them.  The first couple of args are passed in registers.
This breaks ddb stack traces on i386 not quite as badly as they have
always been broken on amd64.  (ddb cannot determine the number of args
or where they are on amd64, and used to print 5 words of stack garbage.
On i386, the args list is still printed and is almost as confusing as
garbage, since it is correct for extern functions but for static
functions it starts at about the third arg).

I use __attribute__((__regparm(0))) to unbreak the ABI for a few
functions designed to be called from within ddb as well as the main
code.  Some older functions like inb_() with this desgn still work
accidentally because they are extern.  I haven't figured out the
command-line flag to fix this yet.  Maybe just -mregparm.  I didn't
try hard to fix this since I was working on optimizations more than
debugging when I added the attribute.

Inlining really should reduce stack usage and thus be an optimization
that is actually useful for kernels.

Compilers are clueless about optimizations that are useful for kernels.
-Os should help, but is very broken in gcc-4.2.1 (it fails to compile
some files due to hitting inlining limits, and after working around
this, gives a negative optimization for space of about 30%).  -Os
works OK for clang -- it reduces the space a little and the time by
almost as much as -O2.  But optimizations like clang -O2 -march=native
are less than 10% faster than pessimizations like gcc-old -O1 -mtune=i386
-fno-inline-functions-called-once -fno-unit-at-a-time in kernels, in
micro-benchmarks that are favourable to the optimizations.  More like
1% for normal use.  (-fno-unit-at-a-time should reduce opportunities
for inlining static functions if -fno-inline-functions-called-once
doesn't work, but is also broken (not supported) in clang.)

Optimizations larger than 1% can possibly be obtained by using compiler
builtins, but compiler builtins are turned off by -ffreestanding.  I
no longer bother to turn some like __builtin_memcpy() back on.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294543 - head/usr.sbin/ypldap

2016-01-22 Thread Bruce Evans

On Fri, 22 Jan 2016, Marcelo Araujo wrote:


2016-01-22 15:57 GMT+08:00 Bruce Evans <b...@optusnet.com.au>:

...
Unfortunately, getdtablesize() is still needed for arrays as used in yp*,
and for anything using select().  If getdtablesize() is large then the
arrays should be sparse or large fd's should not be actually used.
closefrom() should probably be used early to kill unknown fd's to make
space for private fd's.  After that, getdtablesize() becomes almost
useless.  You just allocate a table large enough for the fd's that
you allocate, and don't allocate fd's sparsely.
...


I noticed that getdtablesize(2) specifically for ypldap(8) case is a bit
slower than FD_SETSIZE, mostly because of the size of max_fd.


freefall actually takes only 0.11 seconds to close 706977 mostly-non-open
fd's.  But if you watch this using ktrace (with ktrace.out on nfs) it takes
2.79 seconds for 1000 fd's or 33 minutes.  ktrace on nfs is unusably slow.


However, as ypldap(8) was imported from OpenBSD seemed good to keep it as
much as synced with OpenBSD implementation. Although I'm not sure if
FreeBSD and OpenBSD shares the same getdtablesize(2) implementation.


How did it diverge in the first place?

getdtablesize(2)'s implementation can't be much different, but its value
should be.  Allowing 706977 fd's for a single thread should be considered
a security hole (just a denial of service one).


Not an excuse for sure, but I agree with you!!!


Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294543 - head/usr.sbin/ypldap

2016-01-21 Thread Bruce Evans

On Fri, 22 Jan 2016, Marcelo Araujo wrote:


Log:
 Switch from FD_SETSIZE to getdtablesize(2) as it can make the FD to be
 tunable. Also it gets more close with the original implementation from
 OpenBSD.



Modified: head/usr.sbin/ypldap/yp.c
==
--- head/usr.sbin/ypldap/yp.c   Fri Jan 22 02:28:17 2016(r294542)
+++ head/usr.sbin/ypldap/yp.c   Fri Jan 22 03:02:38 2016(r294543)
@@ -83,17 +83,14 @@ void
yp_enable_events(void)
{
int i;
-   extern fd_set svc_fdset;
struct yp_event *ye;

-   for (i = 0; i < FD_SETSIZE; i++) {
-   if (FD_ISSET(i, _fdset)) {
-   if ((ye = calloc(1, sizeof(*ye))) == NULL)
-   fatal(NULL);
-   event_set(>ye_event, i, EV_READ, yp_fd_event, NULL);
-   event_add(>ye_event, NULL);
-   TAILQ_INSERT_TAIL(>sc_yp->yd_events, ye, ye_entry);
-   }
+   for (i = 0; i < getdtablesize(); i++) {
+   if ((ye = calloc(1, sizeof(*ye))) == NULL)
+   fatal(NULL);
+   event_set(>ye_event, i, EV_READ, yp_fd_event, NULL);
+   event_add(>ye_event, NULL);
+   TAILQ_INSERT_TAIL(>sc_yp->yd_events, ye, ye_entry);
}
}


getdtablesize() is a syscall, so evaluating it every time in the loop is
slow.  Its value must be a loop invariant to work right.  Compilers
cannot reasonably tell that syscalls return invariant values.  Maybe they
should for the standard sysconf() values, but kernels have bugs like
extra limits that break the invariants.

Only old code or very unportable code should use getdtablize().  Code
newer than 1990 should use {OPEN_MAX}.  sysconf(_SC_OPEN_MAX) is
almost as easy to use as getdtablesize() provided you don't handle
errors for either.

Old code in /usr/src provides many examples of better use of getdtablesize().
In fact, the second largest subcollection of examples was in old yp code.
In FreeBSD-~5.2:

- login/login.c: this is the simplest good use.  It uses a count-down loop
  so that the top limit is only evaluated once

- rpcgen/rpc_svcount.c: this is careful to generate good code.  It generates
  a loop like the above, but with the loop invariant moved out of the loop.

- window/wwinit.c: this assigns getdtablesize() to a global but then never
  uses the value.  This was even more broken in 4.4BSD-Lite2.
  getdtablesize() was unused but the comment on the variable said that it
  was used.  The variable was the value of the highest fd seen, and was
  unused.

  This looks like some floundering to avoid using very large values
  of the limit.  It is never right to just use the result of the
  syscall, or {OPEN_MAX}.  E.g., blindly closing {OPEN_MAX} fd's in
  daemons is bad since it might make them take hours to start up.
  {OPEN_MAX} is 706977 on freefall.

- ppp/bundle.c, ppp/chap.c, ppp/chat.c, ppp/command.c, ppp/exec.c: top-down
  loop
- ppp/defs.c: used with malloc().  With no error checking of course.

- rpc/yppasswdd/yppasswdd_main.c, rpc/ypupdate/ypupdate_main.c,
  rpc/ypxfrd/ypxfrd_main.c, ypserv/yp_main.c: bottom-up loop with
  invariant evaluated earlier

- slip/sliplogin.c bottom-up loop with invariant evaluated earlier

- sysinstall/install.c, sysinstall/network.c, sysinstall/user.c: top-down
  loop
- sysinstall/system.c: 1 bottom-up-loop with invariant evaluated earlier,
  and 1 top-down loop

- syslogd/syslogd.c: top-down loop

- libexec/ftpd.c: home made popen() that mallocs() an array of size
  getdtablesize().  This actually has some error handling

- rbootd/rbootd.c: bottom-up-loop with invariant evaluated earlier

- rexecd/rexecd.c: top-down loop

- rshd/rshd.c: top-down loop

So all the examples were not too bad, modulo the bug that the API is
broken as designed (it is OK if {OPEN_MAX} is 20 instead of 706977
even on old systems where 20 is large).

{OPEN_MAX} was surprisingly little used in *bin and libexec.  I once
tried to remove all instances of OPEN_MAX and almost succeeded
(OPEN_MAX is the wrong static limit for {OPEN_MAX} and shouldn't
exist).  The only literal matches for OPEN_MAX were 4 in gperf and
4 in compat code in cron.

Now, OPEN_MAX is sed just twice more in  *bin and libexec.  It is
used in hastd as sysconf(_SC_OPEN_MAX) since hastd is too far from
BSD style to use getdtablesie().  The style differences include
too much error handling instead of too little.

getdtablesize() is now used in much the same places as in ~5.2, except:
- rexecd, slip_login, sysinstall and window went away
- I missed uses in atm in ~5.2, and there are none there now
- I missed a use in pkg_install in ~5.2, and it went away
- a more careful grep somehow found an inconsistent set for yp*
- large changes for syslogd and login.  Without looking at the code,
  I think that these are to reduces the slowness of looking at
  thousands or millions of fd's.  

Re: svn commit: r293979 - head/lib/libkvm

2016-01-14 Thread Bruce Evans

On Thu, 14 Jan 2016, John Baldwin wrote:


Log:
 Fix building with GCC since PAGE_MASK is signed on i386.
...



Modified: head/lib/libkvm/kvm_i386.h
==
--- head/lib/libkvm/kvm_i386.h  Thu Jan 14 15:50:13 2016(r293978)
+++ head/lib/libkvm/kvm_i386.h  Thu Jan 14 15:51:13 2016(r293979)
@@ -70,7 +70,7 @@ _Static_assert(NBPDR == I386_NBPDR, "NBP

_Static_assert(PG_V == I386_PG_V, "PG_V mismatch");
_Static_assert(PG_PS == I386_PG_PS, "PG_PS mismatch");
-_Static_assert(PG_FRAME == I386_PG_FRAME, "PG_FRAME mismatch");
+_Static_assert((u_int)PG_FRAME == I386_PG_FRAME, "PG_FRAME mismatch");
_Static_assert(PG_PS_FRAME == I386_PG_PS_FRAME, "PG_PS_FRAME mismatch");
#endif


This breaks the warning.  PG_FRAME still has the fragile type int and
the fragile value -4096.

libkvm has lots of nearby style bugs like using the long long abomination
as a suffix for the literal constants, and bogus parentheses around
single literals.  All of these are copied from the kernel (except their
ames are changed by adding an I386_ prefix).  The assertions aren't very
useful when half of the constants use lexically identically definitions
with equal ugliness.  The 2 PAE constants are the main ones that aren't
asserted to be equal.

I think PAE in the kernel could even use the signedness of PG_FRAME.
(vm_paddr_t)-4096 first sign extends to (int64_t) with value -4096
and then overflows to the correct mask of 0xf000 with the
correct type.  So -4096 could work for both PAE and !PAE.  But PAE
actually uses an ifdef to to define PG_FRAME as (0x000ff000ull).
This has the 2 style bugs mentioned above, and seems to have a nonsense
value.  I think PAE is only 36 bits, but the mask is for 52 bits.

I don't like explicitly typed constants, but this problem shows that
some care is needed for using constants in expressions.  int is correct
for PAGE_SIZE but not masks, except for masks of all except the sign
bit it is OK.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r294011 - head/share/man/man4

2016-01-14 Thread Bruce Evans

On Thu, 14 Jan 2016, Warner Losh wrote:


Log:
 Document how to enter the debugger here. I'm sure there's some better
 canonical place, and the nit-pickers are welcome to move this
 information there with a cross reference.
...
Modified: head/share/man/man4/ddb.4
==
--- head/share/man/man4/ddb.4   Thu Jan 14 16:21:58 2016(r294010)
+++ head/share/man/man4/ddb.4   Thu Jan 14 16:23:07 2016(r294011)
@@ -1435,6 +1435,47 @@ The NMI allows one to break into the deb
...
+.Pp
+For serial consoles, you can break to the debugger by sending a BREAK
+condition on the serial line if
+.Cd options BREAK_TO_DEBUGGER
+is specified in the kernel.
+Most terminal emulation programs can send a break sequence with a
+special key sequence or via a menu item.
+However, in some setups, sending the break can be difficult to arrange
+or happens spuriously, so if the kernel contains


This is all quite broken, though the above description is almost correct.

The BREAK_TO_DEBUGGER option used to mean: enter (not break into) the
debugger using a serial line break.  This is a good method, except it
is too easy to get serial line breaks.  Unplugging the cable usually
gives on, and turning off the other end might give one.  So this method
is only usable on local systems with secure cables that you don't trip
over or otherwise disconnect).

This option was broken to mean: the default for the sysctl that controls
"breaking into" the debugger using either a serial line break or the console
key for "breaking into" the debugger.  For almost perfection confusion,
this key is normally the "Ctrl-Print Screen" key on PC keyboards and
there is a "Break" key nearby.  Ctrl-Alt-Esc also works.  syscons(4)
misdocuments the latter as Alt-Esc.  Under WinDOS, the Break key normally
needs Ctrl to make it do anything, but under FreeBSD (syscons), Ctrl-
Break is an alias for Scroll Lock.

If you actually have a serial console, then you should normally not
configure BREAK_TO_DEBUGGER since it gives the problems with cables,
but this breaks the console keys.  You can reenable them using the
sysctl, but this also enables entry for serial line breaks since there
is no longer any ifdef to keep the code for that out of the kernel,
and no runtime option to control it independently of other console
drivers.

There is a maze of compile-time options and sysctls to give further
control for non-serial consoles.  These are of course spelled differently
so they are hard to find.  For syscons, DISABLE_KDBKEY removes the code
for all debugger entry keys.  Here KDBKEY has a nice confusing name.  It
looks a bit like KBDKEY (keyboard key) but actual meant
"kernel debugger key" and now means "all key sequences that might lead
to the kernel debugger".  The corresponding runtime option is of course
spelled differently, with kbd instead of kdb.  It is hw.syscons.kbd_debug.

The maze is different and more broken in vt.  There is no option like
DISABLE_KDBKEY.  The 2 different types of key sequences for entering the
debugger are misconfigured differently:
- "alt-break" sequence: this is ifdefed with KDB so it is unreachable unless
  KDB is configured
- "break" key: kdb has a stub that can be called even when KDB is not
  configured.  The stub does a wrong thing (doesn't print a warning if
  KDB is not configured; similarly if KDB is configured but no back end
  is attached).  This is gated by the sysctl, unlike "alt-break".
The sysctl has been partly fixed to have a similar name.  It now has
the same leaf name but is under kern instead of hw (kern.vt.kbd_debug).

The sysctls are gated by each other, so none of them works as documented
and the maze must be understood to turn everything on or off.  Actually
there are separate gates for the 2 kdb sysctls.  The gate for the "break"
key normally defaults to off in kdb (since BREAK_TO_DEBUGGER is normally
not configured), and everthing else defaults to on, so to turn on the
"break" key it suffices to turn on the kdb gate for it, but that also
turns on serial breaks support.

vt also makes all the sysctls tunables and even documents this.  It
documents both "break" keys correctly.  The "alt-break" entries are
not documented in syscons(4) or vt(4).


+.Cd options ALT_BREAK_TO_DEBUGGER


ALT_BREAK_TO_DEBUGGER was the start of the misnaming.  It means
"alternative method to a serial line break for entering the debugger"
but is usually read as "alternative method for breaking into the
debugger".

Not configuring this used to omit code for doing it.  Now the
configuration s soft.  The code is always there if KDB is, and this
option just gives the default for the sysctl that controls the
feature.

The sysctl for this is separate from the one for the "break" key,
and both are tunables.  These are the only tunables in kdb.  These
sysctls have bad names matching the options (with "break" that is
unrelated to serial line breaks in them) and similar 

Re: svn commit: r294011 - head/share/man/man4

2016-01-14 Thread Bruce Evans

On Thu, 14 Jan 2016, Slawa Olhovchenkov wrote:


On Fri, Jan 15, 2016 at 06:27:05AM +1100, Bruce Evans wrote:

The BREAK_TO_DEBUGGER option used to mean: enter (not break into) the
debugger using a serial line break.  This is a good method, except it
...

This option was broken to mean: the default for the sysctl that controls
"breaking into" the debugger using either a serial line break or the console
key for "breaking into" the debugger.  For almost perfection confusion,
this key is normally the "Ctrl-Print Screen" key on PC keyboards and
there is a "Break" key nearby.  Ctrl-Alt-Esc also works.  syscons(4)


On may keyboard "Break" subscribe to "Pause", to "Print Screen"
subscribe "SysReq".


This is similar but not quite the same on all 6 of my (US) keyboards
(except it is the same on 2 identical models).  SysReq seems to be
going away and is only on my 2 20+ year old keyboards.  PrintScreen is
is even more useless/unsupported for its original operation.  But
my worst (and most expensive) keyboard axed ScrollLock before
PrintScreen.  I use ScrollLock a lot under FreeBSD but never noticed
it doing anything elsewhere.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r293724 - in head/sys/boot: arm64/libarm64 common efi/boot1 efi/fdt efi/include efi/include/arm64 efi/libefi efi/loader efi/loader/arch/amd64 efi/loader/arch/arm efi/loader/arch/arm64

2016-01-13 Thread Bruce Evans

On Tue, 12 Jan 2016, Conrad Meyer wrote:


On Tue, Jan 12, 2016 at 5:32 PM, Ian Lepore  wrote:

Yep, but then I had to do this because ef->off is 64 bits even on 32
bit arches, so I got a pointer/int size mismatch warning...

Index: common/load_elf.c
===
--- common/load_elf.c   (revision 293796)
+++ common/load_elf.c   (working copy)
@@ -886,7 +886,7 @@ __elfN(parse_modmetadata)(struct preloaded_file *f
error = __elfN(reloc_ptr)(fp, ef, v, , sizeof(md));
if (error == EOPNOTSUPP) {
md.md_cval += ef->off;
-   md.md_data = (void *)((uintptr_t)md.md_data + ef->off);
+   md.md_data = (void *)(uintptr_t)((uintptr_t)md.md_data +
ef->off);
} else if (error != 0)
return (error);
 #endif

That is just some special kind of ugly.


Yes.  You could maybe do:

md.md_data = (c_caddr_t)md.md_data + (ptrdiff_t)ef->off;


This is much worse.

caddr_t is an old mistake.  It was supposed to be an opaque type
representing a core address, whatever that is.  But opaque types
are even harder to work with.  They might be a pointer of any
type, an integer of any size, floating point, or a struct...
In practice, caddr_t only works for virtual address in flat
memory models.  vm_offset_t or uintptr_t would be the easiest
to work with for that, and void * would be most opaque
(intentionally hard to work with).  But it is actually char *,
and most uses of it are misuses which assume this.

c_caddr_t is a newer mistake.  It is the opaque type caddr_t
peered into to see that it is a pointer and modify that pointer
to pointer to const.

md_data isn't const here, so using c_caddr_t is wronger than usual.

We also need to peer into the type of [c_]caddr_t to know that
it can be assigned to the pointer md_data without a cast.  A
cast would be needed if caddr_t were vm_offset_t or uintptr_t.
Of course, it would be foot-shooting to keep using the poorly
chosen type void * for md_data.  All these addresses and offsets 
should be integers not quite like vm_addr_t and vm_offset_t in

vm (vm_addr_t doesn't exist...).

After further peering, we see that this code shouldn't compile,
since the extra const in c_caddr_t is incompatible with the
plain void * for md_data.

md_cval is const nearby, so it needs a const somewhere in casts.
Since it is plain const char *, using c_caddr_t would be unnatural
for it (but would work because c_caddr_t is const char * obfuscated).


Instead.  Yes, the ptrdiff_t will truncate uint64_t on 32-bit pointer
platforms, but the result is truncated regardless when it is stored in
the md_data pointer.  And the result under modulus is equivalent.


This assumes main things that don't need to be assumed.  After casting
md_data to a pointer instead of an int, ev->off can be added to it
without further casts unless the compiler warns about adding 64-bit
offsets to 32-bit pointers.  But the code already depends on no
warning for the addition in the previous line.  So the cast has no
good effect.

The casts asks for bad effects like truncating the offset to 16 bits
on exotic systems.  ptrdiff_t is only required to be 16 bits and is
not required to actually work even for offsets within a single object.
Using it for general memory offsets asks for undefined behaviour but
only gets it on normal machines for offsets larger than half of the
address space (these overflow).


(You could even change the type of md_data to c_caddr_t from 'const
void *' to make it easier to do math on.  Then this could just be:
md.md_data += (ptrdiff_t)ef->off;)


You mean 'const char *' to give the same type mismatch as c_caddr_t,
or 'char *' to work.  Then remove the cast to ptrdiff_t to give
simple-looking code that works, but still uses a bogus type for
ef->off and a magic type for md_data.  The magic mostly occurs later
when we pass the user pointer md_data directly to the kernel and
cast it to other pointers.

The type mistakes are not large enough for c_caddr_t to be used for any
of the 4 struct members misnamed md_data.  The one here is just the
void * one in .  void * is correctly for a general kernel
pointer.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r293792 - head/sys/boot/uboot/lib

2016-01-13 Thread Bruce Evans

On Wed, 13 Jan 2016, Konstantin Belousov wrote:


On Wed, Jan 13, 2016 at 03:03:07PM +1100, Bruce Evans wrote:

Oops.  It is only declared in  outside of the kernel.
Only __uintfptr_t is always declared by sys/types.h.


So what is the intended use of the uintfptr_t type ?  Looking at the
implementation only, I am unable to find a difference with uintptr_t.


uintfptr_t is inded for functions.  uintptr_t is an optional type that
is only required to work for object pointers (only if it is a supported
option of course).  A similar type is needed for functions.

I added uintfptr_t at much the same time that I added uintptr_t in 1998.
Actually I added uintptr_t a little earlier under a different name, then
changed it to be more like the draft C9x name uintptr_t.

Unfortunately, standards haven't caught up with that yet.  POSIX now
uses various hacks near dlsym().  In the 2001 version it specifies
impossible things if the arch is not a vax so that function pointers
are too different from data pointers.  It gives the bad example
of casting dlsym() to a function pointer.  dlsym() returns void *.
dlopen() doesn't have this design error.  It returns a handle.


Even on ia64 it is 64bit, which breaks my hypothesis about f standing
for 'function'.


This might be a bug in ia64, but uintfptr_t works on it AFAIK.  It is
used for profiling.  This is hard to test since profiling support is
broken (profiling libraries not installed) on pluto.

Function pointers have size 64 bits on ia64, so they can be represented
in uint64_t's by memcpy()ing them.  Casting them might give a different
representation.  I think ia64 really has fat function pointers but they
are represented specially in 64-bit objects.  E.g., in a small
program, nm says that putchar is at 0x10640, but when (putchar)
is cast to int (*f)(int), the bits in f are 0x1000acbb0 and casting f
to uint64_t doesn't change these bits.  f is just a pointer to data
containing the actual function address and possibly other data.  (It
points to .opd+16 which contains the same address that nm prints.
There seem to be only 64 bits of address and no extra data there too.)

binutils has to understand this of course.  Profiling too.  I think
gprof doesn't understand this.  It needs raw addresses represented in
uintfptr_t with the same encoding that ELF uses.  So uintfptr_t is not
suitable for representing function pointers after all.  It is for raw
addresses or a simple encoding of them.  Profiling hits record the
"program counter".  It is not obvious what this is when it might be
represented indirectly.  I think its raw register value is recorded,
and this can only work if its value is a raw virtual address (if not,
a table must be looked up to convert to something that gprof
understand).  Other data must use the same representation or convert
to it.  Data with C function pointers in it would need conversion.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r293792 - head/sys/boot/uboot/lib

2016-01-12 Thread Bruce Evans

On Wed, 13 Jan 2016, Ian Lepore wrote:


Log:
 Go back to using uintptr_t, because code that actually compiles is
 infinitely less buggy than code that is theoretically correct in some
 alternate universe.

 The uintfptr_t type is apparently a freebsd invention, and exists only when
 compiling the kernel.  It's a little hard to say for sure, since it doesn't
 seem to be documented anywhere except in email advice to unsuspecting and
 overly-trusting souls, who then get to wear the pointy hat for blindly
 following advice without investigating or testing it first.


Oops.  It is only declared in  outside of the kernel.
Only __uintfptr_t is always declared by sys/types.h.

Grep shows some style bugs (spelling mismatches) for *uintfptr_t in
:

X amd64/include/profile.h:typedef   u_long  uintfptr_t;
X arm/include/_types.h:typedef  __uint32_t  __uintfptr_t;
X arm/include/profile.h:typedef u_int   uintfptr_t;
X arm64/include/_types.h:typedef__uint64_t  __uintfptr_t;
X arm64/include/profile.h:typedef unsigned long longuintfptr_t;
X i386/include/profile.h:typedefu_int   uintfptr_t;
X mips/include/_types.h:typedef __uint64_t  __uintfptr_t;
X mips/include/_types.h:typedef __uint32_t  __uintfptr_t;
X mips/include/profile.h:typedef u_long uintfptr_t;
X mips/include/profile.h:typedef u_int  uintfptr_t;
X powerpc/include/_types.h:typedef  __uint64_t  __uintfptr_t;
X powerpc/include/_types.h:typedef  __uint32_t  __uintfptr_t;
X powerpc/include/profile.h:typedef u_long  uintfptr_t;
X powerpc/include/profile.h:typedef u_int   uintfptr_t;
X sparc64/include/_types.h:typedef  __uint64_t  __uintfptr_t;
X sparc64/include/profile.h:typedef u_long  uintfptr_t;
X x86/include/_types.h:typedef  __uint64_t  __uintfptr_t;
X x86/include/_types.h:typedef  __uint32_t  __uintfptr_t;

All except arm64 are consistently inconsistent in using the correct
basic type in  but a fixed-width type in
.  arm64 uses the long long abomination to get
a type mismatch (same width, higher rank) in .

This was consistent in FreeBSD-4 where __uintfptr_t doesn't exist.
uintptr_t was only declared in  (and 
via intentional pollution) so there was no difference for the kernel
to confuse you.  But  must be included.  
has a reference but no useful documentation in moncontrol(3), and
a surprisingly large amount of documentation in sysctl(3) of all
places.  It seems I didn't enlarge this when I enlarged gmon with
much more than uintfptr_t.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r293724 - in head/sys/boot: arm64/libarm64 common efi/boot1 efi/fdt efi/include efi/include/arm64 efi/libefi efi/loader efi/loader/arch/amd64 efi/loader/arch/arm efi/loader/arch/arm64

2016-01-12 Thread Bruce Evans

On Tue, 12 Jan 2016, Ian Lepore wrote:


On Wed, 2016-01-13 at 01:17 +, Steven Hartland wrote:


On 13/01/2016 00:54, Ian Lepore wrote:

On Wed, 2016-01-13 at 00:43 +, Steven Hartland wrote:
...

Passes a full tinderbox so I assume your forcing gcc for some
reason?

For several reasons.  The fact that gcc isn't the default compiler
doesn't mean that it's okay for code to not compile with gcc; it's
still a supported compiler for arm.


I usually force gcc.


Not disagreeing with that, was just curious that's all ;-)

The warnings you list seem to be detail, typical gcc, specifically:

sys/boot/efi/fdt/../include/efiapi.h:535: warning: function
declaration isn't a prototype

I'm guessing its being picky and wants EFI_RESERVED_SERVICE to have
void in there due to no params.


It is not broken, so it is warning about an unprototyped function as
requested by -Wstrict-prototypes.


Does the following help:

Index: sys/boot/efi/fdt/Makefile
===
--- sys/boot/efi/fdt/Makefile   (revision 293796)
+++ sys/boot/efi/fdt/Makefile   (working copy)
@@ -7,6 +7,8 @@
  LIB=   efi_fdt
  INTERNALLIB=
  WARNS?=6
+CWARNFLAGS.gcc+=   -Wno-strict-prototypes
+CWARNFLAGS.gcc+=   -Wno-redundant-decls

  SRCS=  efi_fdt.c



This just breaks the warning.


@@ -34,4 +36,6 @@ CLEANFILES+=  machine

  .include 

+CFLAGS+=   ${CWARNFLAGS.${COMPILER_TYPE}}
+
  beforedepend ${OBJS}: machine

Could you detail detail how you're switching to gcc so I an run a
full pass on that too?


Sometimes I use CC=gccNN, where gccNN is somwhere in $PATH.  Sometimes
it is a script to select an arch-dependent gcc.  This is unlikely to
work for makeworld but it works for kernels.


Yep, but then I had to do this because ef->off is 64 bits even on 32
bit arches, so I got a pointer/int size mismatch warning...


gcc detects this error too.


Index: common/load_elf.c
===
--- common/load_elf.c   (revision 293796)
+++ common/load_elf.c   (working copy)
@@ -886,7 +886,7 @@ __elfN(parse_modmetadata)(struct preloaded_file *f
error = __elfN(reloc_ptr)(fp, ef, v, , sizeof(md));
if (error == EOPNOTSUPP) {
md.md_cval += ef->off;
-   md.md_data = (void *)((uintptr_t)md.md_data + ef->off);
+   md.md_data = (void *)(uintptr_t)((uintptr_t)md.md_data +
ef->off);
} else if (error != 0)
return (error);
#endif


That is just some special kind of ugly.  Fixing warnings is supposed to
lead to better code, but all this casting isn't better, it's just an
unreadable mess.  Man I miss the days when C was just a really powerful
macro assembler. :)


This is made extra-ugly by misformatting it.  Fixing warnings
unfortunately usually leads to worse code, using extra code to break
the warning.

Here the detected bug is the bogus type for ef->off.  Values >=
UINT32_MAX in it cannot work in expressions like this.  This was only
detected accidentally.

md_data is a very confusing variable name.  It is used nearby in 4 structs
band has a different type in each.  Sometimes it is uint32_t or
uint64_t; sometimes it is void * and sometimes it is char *.  Here it
is void *.

Some of the structs are:
- mod_medadata (md is this); type void *
- mod_metadata64; type uint64_t
- mod_metadata32; type uint32_t
- file_metadata; type char []
The prefix is supposed to be unique in context.  One is better than none.
'off' in ef has none.

The void * type is inconvenient to work with.  The nearby md_cval has
the same bugs, except it isn't in file_metadata and its type is
const char * so you can add an integer to it without casting.  The
previous round of fixes was to fix a warning about using the gnu
extension of adding an integer to a void * without a cast.


+   md.md_data = (void *)(uintptr_t)((uintptr_t)md.md_data +
ef->off);


I don't see any better quick fix than changing 'off' to vm_offset_t
or maybe signed vm_offset_t.  It is in a local struct so this seems
to be possible.  Changing the void * instance of md_data to an integer
is harder.

md_cval should also be an integer.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r293349 - head/sys/kern

2016-01-07 Thread Bruce Evans

On Thu, 7 Jan 2016, Konstantin Belousov wrote:


Log:
 Convert tty common code to use make_dev_s().

 Tty.c was untypical in that it handled the si_drv1 issue consistently
 and correctly, by always checking for si_drv1 being non-NULL and
 sleeping if NULL.  The removed code also illustrated unneeded
 complications in drivers which are eliminated by the use of new KPI.


Actually, the handling was consistently incorrect.  In the case where
si_drv1 is actually NULL, it usually returned EWOULDBLOCK instead of 0
for successful opens.  This is a rare case and I haven't seen it so
I'm not sure how to recover from it.  revoke() of the device should
work.  Even open() followed by a "last" close() would probably work.


 Reviewed by:   hps, jhb
 Discussed with:bde


There seem to further old and unfixed problems.  Sorry I didn't look
at your patch very closely.  I will reply privately the complicated
details of this.


Modified: head/sys/kern/tty.c
==
--- head/sys/kern/tty.c Thu Jan  7 20:15:05 2016(r293348)
+++ head/sys/kern/tty.c Thu Jan  7 20:15:09 2016(r293349)
@@ -237,14 +237,10 @@ static int
ttydev_open(struct cdev *dev, int oflags, int devtype, struct thread *td)
{
struct tty *tp;
-   int error = 0;


This code used to be just 2 style bugs:
- initialization in declaration
- use of the initialization much later.  It is to get the variable
  initialized for code starting about 50 lines later, in case that
  code falls through an if ladder to th return another 40 lines later.


-
-   while ((tp = dev->si_drv1) == NULL) {
-   error = tsleep(>si_drv1, PCATCH, "ttdrv1", 1);
-   if (error != EWOULDBLOCK)
-   return (error);
-   }


The initialization became not just a style bug when the si_drv1 handling
corrupted it.

Urk, the handling was almost completely incorrect:
- the timeout shouldn't be needed.  There is a wakeup when si_drv1 is
  initialized.
- if the wakeup actually works, then its handling is very broken.  Then
  tsleep() returns 0.  This is treated as an error, and the function
  returns 0, which means success, but the devie has not been opened.
- PCATCH shouldn't be needed either, but is not harmful like the buggy
  check for the unnecessary timeout.


+   int error;

+   tp = dev->si_drv1;
+   error = 0;


The initialization is now correct and only 40 lines before it is needed.


tty_lock(tp);
if (tty_gone(tp)) {
/* Device is already gone. */


In my version, 'error' is initialized by more complicated checks on entry
that naturally set it to 0 when they succeed.  After returning when
error != 0 early, it would be unnatural to set it to 0 again later.


@@ -1221,71 +1215,72 @@ tty_makedevf(struct tty *tp, struct ucre
flags |= MAKEDEV_CHECKNAME;

/* Master call-in device. */
-   error = make_dev_p(flags, , _cdevsw, cred, uid, gid, mode,
-   "%s%s", prefix, name);
-   if (error)
+   make_dev_args_init();
+   args.mda_flags = flags;
+   args.mda_devsw = _cdevsw;
+   args.mda_cr = cred;
+   args.mda_uid = uid;
+   args.mda_gid = gid;
+   args.mda_mode = mode;
+   args.mda_si_drv1 = tp;
+   error = make_dev_s(, , "%s%s", prefix, name);
+   if (error != 0)
return (error);
-   dev->si_drv1 = tp;
-   wakeup(>si_drv1);


Wakeups like this should have made the timeout unnecessary.  However, it
would be better to not have them.  This wakeup seems to give an instant
race, while the timeout will often be delayed long enough for the
initialization to complete.


tp->t_dev = dev;


The new version presumably fixes the initialiation order for dev->si_drv1,
but it doesn't change this.  This is delicate.  The details are too large
for this reply.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r293328 - head/sys/dev/nvme

2016-01-07 Thread Bruce Evans

On Thu, 7 Jan 2016, Jim Harris wrote:


On Thu, Jan 7, 2016 at 9:27 AM, Ravi Pokala  wrote:
...

+ * Used for calculating number of CPUs to assign to each core and number

of I/O

+ *  queues to allocate per controller.
+ */
+#define NVME_CEILING(num, div)num) - 1) / (div)) + 1)
+

...


I'm surprised that this isn't in , along with
roundup()/rounddown()/etc. Finding the ceiling like this is probably pretty
common, so shouldn't it be added to the common header so everyone can use
it?


Good catch.  howmany() does exactly this, just expressed differently.  I'll
switch over to that.  Thanks!


howmany() doesn't do exactly this.  It has diferent bugs / range of
applicability,  I see about 10.  Subtracting 1 instead of adding more
than 1 gives overflow in different cases.  Even the simple and not
unreasonable case num = 0 gives overflow in NVME_CEILING() if div happens
to be unsigned and not 1: NVME_CEILING(0, 2U) = UINTMAX / 2 + 1 =
0x8000 with 32-bit ints.

Getting to 10 different bugs requires considering unreasonable cases
with negative div or num, and nonsense cases with div = 0, and
reasonable but unsupported cases with floating point args.  C's broken
division for negative integer values makes the details large.
NVME_CEILING() is closer to supporting negative values than howmany()
-- the magic 1 that it adds is to adjust for division of non-negative
values rounding down.  For division of negative values, the adjustment
should be by -1 or 0, depending on the C bug and on whether you want
to round negative results towards 0 or minus infinity.

Howmany() is undocumented so its bugs are harder to see than in your own
macro.  Another one is that it is an unsafe macro with the name of a safe
function-like API.  This naming convention is more important for undocumented
APIs.  NVME_CEILING() follows it in reverse.  Subtracting 1 instead of
adding (div) - 1 is less robust but avoids multiple evaluation of div.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r293227 - head/etc

2016-01-05 Thread Bruce Evans

On Tue, 5 Jan 2016, Allan Jude wrote:


On 2016-01-05 22:16, Devin Teske wrote:

This e-mail is extremely hard to parse  and I think you are mistaken.

The -f flag is more than just a counter to a possible -i

Try to rm a file that has schg
You will get a prompt without -i
Adding -f will abate the prompt to attempt override of schg flag.


I forgot about the permissions check.  Normally root has permission to
write anything, but some file flags change this.  The schg flag is
handled bogusly: rm prompts without -f, but schg prevents removal if
you answer 'y'.  The uchg flag is handled better.  rm has special
undocumented code to handle it, mainly so that rm -rf blows it away.
Without -f, rm prompts but the the special code removes uchg so that
removal is possible.

The permissions check only applies to the file.  Removal doesn't
depend on the file's permissions.  It only depends on the file's flags
and the directory's permissions and flags.  rm agrees with its man
page and doesn't prompt if the file is writable but the directory is
unwritable.  The directory permissions don't stop root either, but
immutable directories do.  Since there is no prompt, -f has no effect
on anyone in this case.


There are more conditions in rm that lead to a prompt than simply those 
conditions involving -i and adding -f abates them all.


It isn't clear what these are.  POSIX only says that there is a prompt
without -i for files which don't have write permission (to themselves).
FreeBSD's man page should say more about how this extended for file
flags, but actually says nothing for file flags and is less clear than
POSIX for ordinary permissions.


I think this is kind of a poor UI design of rm(1) honestly. It seems
like what we need is a 'never be interactive' flag, that won't surpress
the error message about the schg'd file, or read-only file system, but
won't try to prompt for it.

Although adding a new flag to rm(1) at this point probably doesn't make
sense.


It already has this flag, namely -f.  This is what I started out to say.
-f never suppresses errors except ENOENT for a non-existent file.  What it
suppresses is prompts.  Since the uchg hack involves a prompt, -f also
changes the semantics for removing user-immutable files.

The file flags hack includes uappnd together with uchg, but not the
newfangled uunlnk.  That has only been available for annoying sysadmins
since 1997.  Apparently its name is to confusing for it to be used.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r293227 - head/etc

2016-01-05 Thread Bruce Evans

On Tue, 5 Jan 2016, Ian Lepore wrote:


Log:
  Use the more proper -f. Leave /bin/rm in place since that's what
  other rc scripts have, though it isn't strictly necessary.


"proper -f" is hard to parse.  I think you mean:

Use 'rm -f' to turn off -i in case rm is broken and is an alias which
has -i (and perhaps actually even something resembling rm) in it.  More
precisely, use 'rm -f /usr/bin' to partly defend against the same bug
in /bin/rm (where it would be larger).  Keep using /usr/rm instead of
restoring the use of plain rm since that is what other rc scripts have.
The previous change to use /bin/rm instead of plain rm was neither
necessary nor sufficient for fixing the bug.  Neither is this one, but
it gets closer.  It is a little-known bug in aliases that even absolute
pathnames can be aliased.  So /bin/rm might be aliased to 'rm -ri /'.
Appending -f would accidentally help for that too, by turning it into
a syntax error, instead of accidentally making it more forceful by
turning -ri into -rf.

Hopefully this is all FUD.  Non-interactive scripts shouldn't source any
files that are not mentioned in the script.  /etc/rc depends on a secure
environment being set up by init and probably gets it since init doesn't
set up much.  sh(1) documents closing the security hole of sourcing the
script in $ENV for non-interactive shells, but was never a problem for
/etc/rc since init must be trusted to not put security holes in $ENV.
But users could put security holes in a sourced config file like
/etc/rc.conf.local.


Modified: head/etc/rc
=
=
--- head/etc/rc Tue Jan  5 21:20:46 2016(r293226)
+++ head/etc/rc Tue Jan  5 21:20:47 2016(r293227)
@@ -132,9 +132,9 @@ done
 # Remove the firstboot sentinel, and reboot if it was requested.
 if [ -e ${firstboot_sentinel} ]; then
[ ${root_rw_mount} = "yes" ] || mount -uw /
-   /bin/rm ${firstboot_sentinel}
+   /bin/rm -f ${firstboot_sentinel}
if [ -e ${firstboot_sentinel}-reboot ]; then
-   /bin/rm ${firstboot_sentinel}-reboot
+   /bin/rm -f ${firstboot_sentinel}-reboot
[ ${root_rw_mount} = "yes" ] || mount -ur /
kill -INT 1
fi


Using rm -f to suppress an error message seems like a bad idea here --
if the sentinel file can't be removed that implies it's going to do
firstboot behavior every time it boots, and that's the sort of error
that should be in-your-face.  Especially on the reboot one because
you're going to be stuck in a reboot loop with no error message.


Er, -f on rm only turns off -i and supresses the warning message for
failing to remove nonexistent files.  But we just tested that the file
exists, and in the impossible even of a race making it not exist by
the time that it runs, we have more problems than the failure of rm
since we use the file's existence as a control for other things.

So the only effect of this -f is to turn off -i, which can only be set
if the FUD was justified.

The correct fix seems to be 'unalias -a'.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r292955 - head/lib/libmd

2016-01-02 Thread Bruce Evans

On Sat, 2 Jan 2016, Allan Jude wrote:


On 2015-12-31 13:50, Allan Jude wrote:

On 2015-12-31 13:32, Jonathan T. Looney wrote:

On 12/31/15, 2:15 AM, "Allan Jude"  wrote:


It seems these problems also slow things down, a lot:

# time md5 /media/md5test/bigdata
MD5 (/media/md5test/bigdata) = 6afad0bf5d8318093e943229be05be67
4.310u 3.476s 0:07.79 99.8% 20+167k 0+0io 0pf+0w
# time env LD_PRELOAD=/usr/obj/media/svn/md5/head/tmp/lib/libmd.so
/usr/obj/media/svn/md5/head/sbin/md5/md5 /media/md5test/bigdata
MD5 (/media/md5test/bigdata) = 6afad0bf5d8318093e943229be05be67
4.133u 0.354s 0:04.49 99.7% 20+167k 1+0io 0pf+0w

(file is fully cached in ZFS ARC, dd reads it at 11GB/s)

Will investigate more tomorrow.


md5 will be slower than dd due to the extra processing it needs to do to
generate the hash. I suspect that explains the difference you're seeing
between those utilities.


Sorry, you missed my point here.

I replaced MDXFile() with the implementation included in my earlier
email. Using the newer libmd with that code, cut the time to md5 the
SAME data down a lot. I need to do a more scientific test on a box that
isn't doing other stuff still though.

The comment about dd doing 11GB/s, was just to clarify that I wasn't
reading the file from disk, which would introduce other variables.


I found the cause of my bogus benchmark, the world on my test machine
was just old enough to be missing jmg@'s bufsize patch.

Now the difference is about 1 second on a 2GB file, so ignore my
foolishness.


That patch is surprisingly new.

The main slowness that I complained about was for the other path in md5
that must be used for special files.  That uses stdio so it suffers from
stdio trusting st_blksize.  But st_blksize is rarely as small as the old
size BUFSIZ in MDXFile.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r293064 - head/sys/boot/uboot/lib

2016-01-02 Thread Bruce Evans

On Sat, 2 Jan 2016, Ian Lepore wrote:


Log:
 Cast pointer through uintptr_t on the way to uint64_t to squelch a warning.

Modified: head/sys/boot/uboot/lib/copy.c
==
--- head/sys/boot/uboot/lib/copy.c  Sat Jan  2 22:31:14 2016
(r293063)
+++ head/sys/boot/uboot/lib/copy.c  Sat Jan  2 22:55:59 2016
(r293064)
@@ -100,7 +100,7 @@ uboot_loadaddr(u_int type, void *data, u

biggest_block = 0;
biggest_size = 0;
-   subldr = rounddown2((uint64_t)_start, KERN_ALIGN);
+   subldr = rounddown2((uint64_t)(uintptr_t)_start, KERN_ALIGN);
eubldr = roundup2((uint64_t)uboot_heap_end, KERN_ALIGN);
for (i = 0; i < si->mr_no; i++) {
if (si->mr[i].flags != MR_ATTR_DRAM)


This gives 1 more bogus cast here:
- _start is a function pointer, so it should be cast to uintfptr_t
- rounddown2() acts on any integer and subldr is an integer, and both of the
  integers are unsigned, and KERN_ALIGN is a small signed int, and so there
  is no need for any further cast, except possibily on arches with
  uintfptr_t larger than uint64_t or the type of subldr where the compiler
  warns about downwards cast, but then the downcast may be a bug since it
  just breaks the warning

- if there is a need for a further cast, then it should be of rounddown2(),
  not of one of its args, so that the args and the internals of rounddown2()
  don't have to be analysed for promotions.  I did some of did analysis
  above -- KERN_ALIGN is a small int, so its promotion is int and that is
  presumably smaller than uintfptr_t.

  rounddown2() is of course undocumented, but everyone knows that macros
  like it are supposed to mix the args without any casts and end up with
  a type related to the arg types.

- subldr actually has type uintptr_t, so the old cast to uint64_t was
  just a double type mismatch (mismatched with both the source and target)
  and leaving it makes it just change the correct type to a wrong one
  before converting back to the correct type.  Since you cast to uintptr_t
  and not uintfptr_t and KERN_ALIGN is small int, the rvalue would already
  have the same type as the lvalue unless it is messed up by the uint64_t
  cast.

- not quite similarly on the next line:
  - the lvalue has type uintptr_t
  - casting to uint64_t is wrong, as in the new version of the previous
line.  uboot_heap_end already has type uintptr_t, and casting it to
uint64_t just breaks it or complicates the analysis:
- if uintptr_t is 64 bits, then casting to uint64_t has no effect
- if uintptr_t is 128 bits, then casting to uint64_t just breaks things
- if uintptr_t is 32, bits, then casting to uint64_t doesn't even
  break things.

  There is an overflow problem in theory, but the bogus cast doesn't
  fix the problem.  roundup2() normally overflows if the address is
  at a nonzero offset on the last "page" (of size KERN_ALIGN bytes
  here).  Then roundup2() should overflow to 0.  On 32-bit arches,
  the bogus cast avoids the overflow and gives a result of
  0x1.  But since the rvalue has type uintptr_t, it cannot
  represent this result, and the value overflows to 0 on assignment
  instead of in roundup2().

  roundup2() and its overflow problems are of course undocumented.

  Another of your recent commits reminded me about this overflow
  problem.  I will complain about that separately :-).

  rounddown2() cannot overflow, and is also missing the bug of
  being named in lower case despite being an unsafe macro (one
  which evaluates its args more than once).  It happens to be
  safe because it only needs to evaluate its args more than once,
  but its name is just because it folows the bad example of
  howmany().  Even howmany() is undocumented.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r293053 - head/sys/boot/uboot/lib

2016-01-02 Thread Bruce Evans

On Sat, 2 Jan 2016, Ian Lepore wrote:


Log:
 Use 64-bit math when finding a block of ram to hold the kernel.  This fixes
 a problem on 32-bit systems which have ram occupying the end of the physical
 address space -- for example, a block of ram at 0x8000 with a size of
 0x8000 was overflowing 32 bit math and ending up with a calculated size
 of zero.

 This is a fix for one of the two problems mentioned in the PR.  Something
 similar will need to be done on the kernel side before the PR is closed.


I don't like this.  It gives a type morass, and has no effect if uintptr_t
has 64 bits, and is just broken when uintptr_t has more that 64-bits.  The
corresponding hack for 64-bit arches is to use uint128_t, and for 128-bit
arches to use uint256_t, and that is more obviously excessive and more
likely to expose type errors (for example, __uint128_t compiles, but is
unprintable since printf() hasn't dreamed of its existence yet, and its
existence breaks uintmax_t since it is larger than uintmax_t).

I see that you committed related changes in many places.  Only the
one in arm/physmem.c looks right -- don't change types, but handle the
top page specially.

I once looked at vm to see how it handles this.  It seemed to use
(offset, size) pairs a lot where it would simpler to use offset+size
except for the problem that this would overflow.  This would avoid
most problems with the top page.  Eventually I decided that it didn't
really care about the top page but was just doing this because it
was over-engineered.


Modified: head/sys/boot/uboot/lib/copy.c
==
--- head/sys/boot/uboot/lib/copy.c  Sat Jan  2 18:15:10 2016
(r293052)
+++ head/sys/boot/uboot/lib/copy.c  Sat Jan  2 18:16:24 2016
(r293053)
@@ -69,11 +69,11 @@ uint64_t
uboot_loadaddr(u_int type, void *data, uint64_t addr)
{
struct sys_info *si;
-   uintptr_t sblock, eblock, subldr, eubldr;
-   uintptr_t biggest_block, this_block;
-   size_t biggest_size, this_size;
+   uint64_t sblock, eblock, subldr, eubldr;
+   uint64_t biggest_block, this_block;
+   uint64_t biggest_size, this_size;


This obviously breaks the 128-bit case.

I replied out of order about a fixup for the type morass created by this,
and said that subldr and euldr have type uintptr_t.  They only had this
correct (less incorrect) type before this change.

Making everything 64-bit mostly avoids the type morass but gives a
lot of bloat in the 32-bit case.  It doesn't completely avoid the type
morass since sizeof() still gives type uint32_t in the 32-bit case.


@@ -100,14 +100,15 @@ uboot_loadaddr(u_int type, void *data, u

biggest_block = 0;
biggest_size = 0;
-   subldr = rounddown2((uintptr_t)_start, KERN_ALIGN);


My previous reply was about this.  This was correct except for using
uintptr_t instead of uintfptr_t.


-   eubldr = roundup2(uboot_heap_end, KERN_ALIGN);
+   subldr = rounddown2((uint64_t)_start, KERN_ALIGN);


This has various type errors -- see previous reply.


+   eubldr = roundup2((uint64_t)uboot_heap_end, KERN_ALIGN);


Most of the uintptr_t types were also wrong.  vm_offset_t or vm_addr_t
or vm_paddr_t or vm_size_t would be better (I think the boot code is
low enough level to use these instead of Standard "portable" types).
Except vm_addr_t doesn't exist.  vm has its own type morass, but not
enough complications to express the difference beteen offsets and
addresses in the type (virtual addresses have to abuse vm_offset_t and
physical offsets have to abuse vm_physaddr_t).  OTOH, all these types
except vm_paddr_t are the same, so it is too easy to misuse vm_size_t
for an address of offset.

Using uintptr_t was wrong since it is just a C type large enough to
represent pointers.  It has little relation to virtual addresses and
no relation to physical addresses.  It tends to be large enough to
represent virtual addresses since it has to be large enough to represent
all C pointers uniquely.  But this is not required.  Perhaps a general
pointer has 36 bits but C pointers have only 32 bits, or C pointers
have 36 bits but can be encoded uniquely in 32 bits and casting them
to uint32_t does the encoding.  Boot code and vm should operate at a
lower level than C pointers and use raw offsets (probably represented
as raw bits).  Even casting pointers to uintptr_t is technically wrong.
If this cast does special encoding, then it is probably not what is
wanted.  A recalcitrant implement of C could use the simple encoding
of reversing all the bits.  The resulting uintptr_t's would be useless
for most existing (ab)uses of the result.  Boot and vm code is low
enough level to be honestly unportable and convert the bits using a
type pun, or better memcpy(), or better still, portable using a
conversion function that usually just casts through uintptr_t for
virtual addresses.


Re: svn commit: r292955 - head/lib/libmd

2015-12-30 Thread Bruce Evans

On Wed, 30 Dec 2015, Jonathan T. Looney wrote:


Log:
 Fix a file descriptor leak in mdXhl.c (which is used by numerous hashing
 algorithms.


This code had amazingly low quality and is still especially broken near
the main leak fixed.


Modified: head/lib/libmd/mdXhl.c
==
--- head/lib/libmd/mdXhl.c  Wed Dec 30 17:36:34 2015(r292954)
+++ head/lib/libmd/mdXhl.c  Wed Dec 30 18:04:50 2015(r292955)
@@ -59,14 +59,18 @@ MDXFileChunk(const char *filename, char
f = open(filename, O_RDONLY);
if (f < 0)
return 0;
-   if (fstat(f, ) < 0)
-   return 0;
+   if (fstat(f, ) < 0) {
+   i = -1;
+   goto error;
+   }
if (ofs > stbuf.st_size)
ofs = stbuf.st_size;


st_size is only valid for regular files.  The first bug is using it
here.  Usually it is 0 when it is invalid.  This code sets the offset
to 0 then.  I think this is just to avoid a false negative in the buggy
seek test.


if ((len == 0) || (len > stbuf.st_size - ofs))
len = stbuf.st_size - ofs;


Style bugs (extra parentheses) and second use of the invalid st_size.


-   if (lseek(f, ofs, SEEK_SET) < 0)
-   return 0;
+   if (lseek(f, ofs, SEEK_SET) < 0) {
+   i = -1;
+   goto error;
+   }


md5 is very broken by using this function for non-seekable files.  The
main case of a non-seekable file is a pipe.  This case works for at
least md5(1) by not using this function -- it uses MDFilter() which
uses stdio's fread() which works.
  (stdio knows better than to use st_size, but it naively trusts
  st_blksize and uses it to give a pessimized small block size for read()
  called from fread().  This code handles buffer sizing and allocation
  worse using its low quality methods:
  - its buffer used to have size BUFSIZ.  This size is not usable for
anything except possibly buffering keyboard input and stderr.  It
is mainly part of the broken setbuf() API.  Its value is 1024,
perhaps to avoid changing this API.  1024 large enough for more
uses in 1980.
  - its buffer now has size 16*1024 (spelled with a style bug like that).
This sometimes matches and sometimes exceeds the size used by stdio.
stdio at least attempts to choose the best size, but is defeated by
stat() putting useless sizes in st_blksize.
  - it also tries to pessimize the i/o by asking for a misaligned buffer.
Its buffer is just a local char[] variable.  Compilers usally mostly
foil this attempt by aligning large variables on the stack.

Misaligned usr buffers should only pessimize the i/o, but last time
I looked they cause DMA errors in ahci.  This breaks bsdlabel(8).
bsdlabel ask for a misaligned buffer and gets it at least when
statically linked because its buffer is a global char [] variable
and the neither the compiler nor the linker gives more alignment
than requested.

The DMA error would break md5 (or just cat) similarly if the buffer
were misaligend.  But the main bug here breaks md5 on disks without
trying any i/o.

Back to the main bug.  This seek test detects some cases of non-regular
files.  These files indeed cannot be handled by this code.  But the error
handling is broken -- it is just to set errno and return 0 (EOF).  0
means no error and errno is no set.  So all non-seekable files are
treated as empty regular files.  E.g., md5 /proc/0/* gives:

md5: /proc/0/ctl: Permission denied
MD5 (/proc/0/cmdline) = d41d8cd98f00b204e9800998ecf8427e
MD5 (/proc/0/etype) = d41d8cd98f00b204e9800998ecf8427e
MD5 (/proc/0/rlimit) = d41d8cd98f00b204e9800998ecf8427e
MD5 (/proc/0/status) = d41d8cd98f00b204e9800998ecf8427e

(/proc gives my favourite examples of irregular regular files.  It
doesn't work to use S_ISREG() to determine if st_size can be
trusted, since too many irregular files claim to be regular: e.g.,
file /proc/0/* gives:

/proc/0/cmdline: empty
/proc/0/ctl: empty
/proc/0/etype:   empty
/proc/0/rlimit:  empty
/proc/0/status:  empty

wc /proc/0/* works.  md5 works like wc using a hack to avoid calling this
broken function.  E.g.,

for i in $(ls /proc/0/*); do echo -n "$i: "; md5 <$i; done  # gives

/proc/0/cmdline: 3c5896b1ac441f4998f052e2126e8d20
/proc/0/ctl: d41d8cd98f00b204e9800998ecf8427e
/proc/0/etype: 674441960ca1ba2de08ad4e50c9fde98
/proc/0/rlimit: 67d6ad67b412e1bceb7cb508a3492197
/proc/0/status: 3ccc3067b97c3232ea2dbcb64c458fd4


n = len;
i = 0;
while (n > 0) {


Disk device files are seekable, so the lseek test doesn't result in
mishandling them.  Instead, they are treated as empty files since
their st_size is 0.  E.g., md5 /dev/ad* gives:

MD5 (/dev/ad0) = d41d8cd98f00b204e9800998ecf8427e
MD5 (/dev/ad0s1) = d41d8cd98f00b204e9800998ecf8427e
MD5 (/dev/ad0s2) = d41d8cd98f00b204e9800998ecf8427e
MD5 (/dev/ad0s2a) = 

Re: svn commit: r292955 - head/lib/libmd

2015-12-30 Thread Bruce Evans

On Thu, 31 Dec 2015, Bruce Evans wrote:



wc /proc/0/* works.  md5 works like wc using a hack to avoid calling this
broken function.  E.g.,

   for i in $(ls /proc/0/*); do echo -n "$i: "; md5 <$i; done  # gives

/proc/0/cmdline: 3c5896b1ac441f4998f052e2126e8d20
/proc/0/ctl: d41d8cd98f00b204e9800998ecf8427e
/proc/0/etype: 674441960ca1ba2de08ad4e50c9fde98
/proc/0/rlimit: 67d6ad67b412e1bceb7cb508a3492197
/proc/0/status: 3ccc3067b97c3232ea2dbcb64c458fd4


Further examples:

md5 # on terminal input

works correctly by not using MDXFileChunk().

md5 /dev/stdin # on the same terminal input

produces the d41d8cd98f00b204e9800998ecf8427e garbage using
MDXFileChunk().  truss shows that lseek() is broken too -- MDXFileChunk()
tries it and it succeeds on the unseekable file /dev/stdin.  Bugs in this
area are common.  E.g., lseek() on named pipes was broken so that it
succeeded, by rearranging the plumbing use fileops more or less and not
attaching lseek right.

cat | md5 # on the same terminal input

works correctly by not using MDXFileChunk().

cat | md5 /dev/stdin # on the same terminal input

doesn't work correctly, but it fails better than without the pipe.  Now
a seek error occurs and is reported as "md5: /dev/stdin: Illegal seek"
(I can't see where it is reported).  Then md5 exits.  Then cat waits to
read input.  Then cat fails to write output and is killed by SIGPIPE.
So md5 handled the seek error in a fail-safe though incorrect way.  One
correct way is to fall back to the working code, but it is better to
just use that without an lseek check.

It is a bug that [l]stat() on /dev/stdin sees the device file and not the
actual stdin file.  md5 can't work around this except by not using stat().

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r292809 - head/lib/libc/stdio

2015-12-29 Thread Bruce Evans

On Tue, 29 Dec 2015, John Baldwin wrote:


On Monday, December 28, 2015 01:01:26 PM Warner Losh wrote:

I'll look at that, but I don't think posix_memalign is the right way to go.
The alignment of FILE is more strict than posix_memalign will return. Ian's
idea of __alignof__ is the way to go. We allocate them in one block on
purpose for performance, and posix_memalign would be a one at a time affair.


posix_memalign gives you whatever alignment you ask for.  Using __alignof__
to determine the alignment instead of hardcoding sizeof(int64_t) would
certainly be an improvement.  If you move the glue after the FILE objects
then you can use posix_memalign() directly as so:

void *mem;
int error;

error = posix_memalign(, MAX(ALIGNBYTES, __alignof__(mbstate_t)),
n * sizeof(FILE) + sizeof(*g));


Using __alignof__() involves 2 or 3 layers of style bugs:
- it is a gnu-ish spelling (full gnu would also have a space before the
  left parentheses).  The FreeBSD spelling is __alignof().  FreeBSD
  defines a macro for this, but only for compatiblity with gcc < 2.95.
  Later versions apparently support both __alignof and __alignof__()
- C++ apparently spells this as both _Alignof() and alignof() after 2011/03
- FreeBSD defines _Alignof() unconditionally.  The only difference for
  C++ after 2011/03 is it is less careful about namespaces and depends on
  alignof() existing and being part of the language.  The general definition
  using __alignof() should work in this case too.

So it seems that the correct spelling is _Alignof().  _Alignof(),
__alignof() and __alignof__() are all in the implementation namespace
except possibly _Alignof() for C++ after 2011/03, so any use of them
gives undefined behaviour which might be to do the right thing.  But
no one knows what that is or when it is done since none of this is
documented in a man page.

sys/cdefs.h is now about 8.5 times as large and more than that many
times as complicated and ugly as an FreeBSD-1 where it only had __P(())
and a few other portability macros to hide the differences between K
and C90.  It should be 8.5 times smaller (11 lines).  It contains a
mixture of old and new portability macros and perhaps some standard
macros for newer C++ and C.  I checked that it doesn't define anything
without at least 1 leading underscore except for const, inline, signed
and volatile in old compatibility modes.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r292809 - head/lib/libc/stdio

2015-12-29 Thread Bruce Evans

On Wed, 30 Dec 2015, David Chisnall wrote:


On 30 Dec 2015, at 00:48, Bruce Evans <b...@optusnet.com.au> wrote:


- C++ apparently spells this as both _Alignof() and alignof() after 2011/03


This is not correct.  C++ spells it alignof.  C spells it _Alignof, unless you 
include , in which case C spells it alignof and defines _ 
_alignof_is_defined.

On FreeBSD, we define _Alignof in C++ mode, because it???s in the reserved 
identifier space and gives us something that works in C and C++.


So it is more broken than first appeared :-).  Extra spellings are a bug
since users don't know which one to use and prefer the worst one unless
they are experts in at least 3 versions of 3 standards (C-K, C90, C99,
C11, C++-mumble, gnu89, gnu99, gnu11) and FreeBSD variations on these.

There are also syntactical problems.  stdalign.h uses _Alignas and _Alignof,
but FreeBSD only defines _Alignas(x) and _Alignof(x).  The former is because
alignof is like sizeof so it doesn't need parentheses.  However,
alignof(typename) needs the parentheses and 'alignof expression' is 
apparently only a gnu extension, so it is difficult to construct an

example of Standard code without the parentheses.

_Alignas is more broken than _Alignof.  In the C case, _Alignas(x) is as
__aligned(x), but this only works if x is an expression.  __aligned(x)
is often used in FreeBSD code, but the same code in C++ with __aligned(x)
replaced by alignas(x) with any spelling is a syntax error.

Bruce___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r292777 - in head: lib/libc/sys sys/kern

2015-12-29 Thread Bruce Evans

On Tue, 29 Dec 2015, Dmitry Chagin wrote:


So, my point was:
a)  for a long time we have broken settimeofday() which does not allow us
to set the system time before the Epoch


We still have a broken settimeofday that doesn't allow us to set the system
time to the Epoch in the Western hemisphere if the clock is on local time,
and doesn't allow as to set the system time to a year > .  More precisely
it does allow us to set these and usually panics on x86 before completing.


b) as you already mentioned POSIX doesn't require times before the Epoch to
work


In fact, the representation of times is undefined for times before the
Epoch in general.

The specification of clock_settime() is interesting.  At least in an
old POSIX-2001 draft, It is required to set to the specified time and is
not limited to the current time, whatever that is.  It is only required
to work for CLOCK_REALTIME.  It is required to fail if the specified
time is outside of the "legal range" for the clock id.  "legal range"
is bad wording and is unspecified.


??) Linux does not allows negative seconds also


Old (~2004) versions of Linux seem to have less error checking for i386
settimeofday() FreeBSD then, so they seem to allow it.  The implementation
seems to be:
- copy the requested value to the kernel software time without much error
  checking.  Don't touch the hardware yet
- update the RTC every 11 minutes if the clock is externally synchronized.

So with 64-bit time_t anyone could easily set the time to year +-292g, or
just to year +-10k.  The AT RTC cannot represent this, and it is difficult
to make it try since it is difficult to find external servers running that
far off the current time.  With 32-bit time_t the range is 1906-2037.
Linux seems to have no range checking for the conversion to BCD, but it
doesn't panic because the BCD macros are expressions.  These macros of
course don't work for arbitrary inputs.  They just convert valid but
unusual years to garbage.  Before that, negative times are converted to
garbage using a type pun: set_rtc_mmss(xtime.tv_sec) starts with a time_t,
but the function takes an unsigned long.  So 1 second before the Epoch
becomes year +584g.  The usual divisions by 60 are used to reduce the
type-punned time.  Since it is unsigned, these now give garbage in-range
values.


d) we have settimeofsay(2) that consistent with POSIX and in my
understanding phrase "The time is expressed in seconds and microseconds
since midnight (0 hour), January 1, 1970." is a strict definition, which
prohibits time before the Epoch


TImes before the Epoch give undefined behaviour, so they are allowed to
work.  They are even allowed to work better than times after the Epoch,
since they can have any representation and are not require to be missing
support for leap seconds.


I do not understand why we should have our own (separate from the rest
world) behavior.


It would be good to not have such broken bounds checkng as the rest of
the world.

Updating the AT RTC synchronously in settimeofday() is already more
significantly different than limiting the range of accepted times.  One
reason that Linux updates asynchronously is that a correct synchronous
update (which FreeBSD doesn't do) requires waiting for a second or two.
The asynchronous update gives the usual problems handling errors --
even if the interrupt handler detected representation errors, it wouldn't
be able to return them to settimeofday().  So errors of a measly 292g
years go unreported.

Bruce___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r292777 - in head: lib/libc/sys sys/kern

2015-12-28 Thread Bruce Evans

On Mon, 28 Dec 2015, Konstantin Belousov wrote:


On Mon, Dec 28, 2015 at 09:35:11AM +1100, Bruce Evans wrote:

If this causes a panic, then it is from a sanity check detecting the
invalid conversion later.  A negative value in days breaks the loop
logic but seems to give premature exit from the loops instead of many
iterations.

It causes the panic due to out of bound accesses to bin2bcd_data[] array.
The issue affects around twenty rtc drivers, according to the quick grep
for clock_ts_to_ct() usage.  It probably also affects geom raid modules,
but I did not looked there at all.

As I understand, people prefer to have ability to test practically
useless values for the current time, by the cost the unplugged easy
kernel panic, in the production systems ? Am I right ?


It is not unreasonable to panic when such tests fail, just like for other
settings of unreasonable values.  Only the superuser can make them, and
the superuser should know better than to run them on production systems.

Of course, the correct fix is to fail for unrepresentable values.  Different
subsystems have different limits, so valid negative timevals may become
invalid for certain RTC hardware or software.  Valid positive timevals
may also become invalid.  The RTC might not have a century register, or
one that it has might be unreliable as on x86 (the option USE_RTC_CENTURY
tells atrtc to use the century register, but this is so little used that
it is not a supported option).  We use a hack to translate to a range of
years including the current time.  This used to be done almost correctly.
It gave the range 1970-2069, so the Epoch as representable except in the
Western hemisphere when the RTC is on local time.  Now the range is 1980-
2079, so the Epoch is never representable.  If someone tries to set the
time to the Epoch, they get 2070 instead of 1970.


The commit gave
the immediate relief for the issue. If somebody have the correct fix
for clock_ts_to_ct(), I am happy to see this commit reverted after the
proper fix.


It only avoids for some out of bounds value.  I already pointed out that
negative seconds still occur in the Western hemisphere for times near the
Epoch, etc.

Other overflow bugs are now easier to see.  64-bit time_t's allow really
large times.  The maximum with signed ones is about 292 billion years.
'year' has the wrong type (int) so it cannot represent that many.  I
think the loop to reduce 'days' iterates that many times and 'year'
overflows every 2**32'nd iteration and the result is normally as if
calculated correctly but then assigned to a 32-bit signed int.  It can
overflow to any 32-bit signed value.  Fortunately, the calculation
doesn't overrun a table of leap years.  It just uses a simple rule for
leap years which is expected to work until 2037.  Years larger than
that are not supported elsewhere.  Next, the overflowed year is assigned
to ct->year.  This is also int, so no further overflow occurs.  ct->year
is misnamed (missing a prefix) and has a comment saying that it is a
4-digit year, but the code can produce any value.  Next, ct->year is
trusted by at least the atrtc driver.  This gives an out of bound
accesses to bin2bcd_data[] in about 49.5% of cases if USE_RTC_CENTURY
is not configured, and in almost all cases if USE_RTC_CENTURY is configured:

X   writertc(RTC_YEAR, bin2bcd(ct.year % 100)); /* Write back Year*/
X #ifdef USE_RTC_CENTURY
X   writertc(RTC_CENTURY, bin2bcd(ct.year / 100));  /* ... and Century*/
#endif

ct.year overflows to a negative value in about 50% of the cases.  Then
ct.year % 100 is negative in 99% of the subcases and 0 in 1% of the
subcases.

ct.year / 100 is negative in about 50% of the cases.  When it is positive,
it is too large for the array except for the small range 0- for ct.year.
The comment on ct->year is correct that the year must have only 4 digits.
Anything larger than that takes more than 2 bin2bcd steps and more than
2 bcd registers to represent.  Probably other drivers are sloppy as above,
but they should be able to handle 4 digits by throwing away the century
digits as in the usual case above.

The adjustment to give times between 1980 and 2079 is for the other
direction.  That direction is more careful since the time comes from
untrusted hardware instead of the trusted superuser.  I pointed out
some missing checks on lower bounds in clock_ct_to_ts().  The one
for the month is not missing and is most important since the month
is used as an array index and has base 1.  The month day also has base 1
but is not used as an array index.  Buggy hardware might produce
0 but can't produce a negative value for seconds minutes or hours if it
uses bcd encoding.  This leaves the year as most in need of a lower
bounds check.  The code is accidentally fail-safe for years -- years
before 1970 are converted to 1970 except for the leap year caclulation.

subr_clock.c is a little under-engineered.  subr_fattime.c is grossly
over-engineered, bu

Re: svn commit: r292777 - in head: lib/libc/sys sys/kern

2015-12-28 Thread Bruce Evans

On Mon, 28 Dec 2015, Garrett Cooper wrote:




On Dec 28, 2015, at 02:17, Bruce Evans <b...@optusnet.com.au> wrote:


...


It is not unreasonable to panic when such tests fail, just like for other
settings of unreasonable values.  Only the superuser can make them, and
the superuser should know better than to run them on production systems.


On a development system, this is perfectly reasonable. However, on systems in 
production, dying on asserts or panicking when unexpected input is encountered 
instead of erroring out appropriately is not ideal: it causes unnecessary 
downtime and can confuse others (customers, lower level admins) who are not 
fully aware of how UNIX and FreeBSD works (and in turn generate support calls 
and bug reports).


This is sort of backwards.  Unexpected cases by definition have no code to
handle them.  If you are lucky then they will be detected by an assertion
or a bad pointer and cause a panic before they do further damage.  Bad
pointers works better for this since they cannot be turned off on
production systems.

Out of bounds input that is expected can be handled, but the handling can
still reasonably be to panic.  For example, add assertions at the lowest
level that values are within bounds for the BCD array.  This "handles" the
bad input that is "expected" from buggy upper layers.

Panics for expected cases are easier to avoid (or cause) than for unexpected
cases.  The super user should avoid them.  Malicious users should try to
caus them.  But when the syscall that causes them is privileged, the
malicious users can't get far.

Another easy way to trigger the panic is:
- abuse the adjkerntz sysctl to set a huge negative offset.  utc_offset()
  is normally small, so it can only produce negative times from positive
  times in the Western hemisphere for times near the Epoch.  The
  adjkerntz offset is added in utc_offset() so it can reach 68 years
  back from the current time, i.e., to 1947.

The adjkerntz sysctl, like most sysctls, is correctly implemented with
no bounds checking.  Callers are trusted to know more about the correct
bounds than the kernel.

A not so way way to trigger the panic is:
- set the time in the BIOS to before the Epoch if the BIOS supports that.
  Say 1969/01/01.
- set this time in the kernel by booting if you can do that.  On x86, this
  requires configuring with USE_RTC_CENTURY to prevent 1969 being
  interpreted as 2069.  You now have a negative time.
- find or create an ntp server that supports times before the Epoch, and
  set its time to 1969/01/01
- wait to sync with the ntp server with the fake time.  The dubious
  periodic refresh of the TODR when the time is synced with ntp is not
  even an option and the default period is 1800 seconds.  Also wait
  for the refresh.  resettodr() is then called with a negative time.

ntpd normally uses microadjustments so it is not affected by restrictions
in settime().  I don't know of any other way of trying to write back the
current time, and ntpd with a non-fake time doesn't do it because 1969
can only be in sync with a fake time.

The periodic refresh is dubious because at least on x86 the RTC reading
and setting code is sloppy and has an error of about 1 second each,
and races.  So setting the RTC more accurately than 1-2 seconds is
worse than useless (it rarely helps, and may lose to races).  It can be
kept that accurate by setting it much less often than twice per hour,
except for the initial setting after booting which is best done by
ntpdate.  Everything is best handled in userland.  adjkerntz(8) handles
more complicated things, and the adjkerntz sysctl can be used at any
time for its side effect of calling resettodr().

The badly named machdep.disable_rtc_set sysctl still exists and is still
used by adjkerntz(8) to make resettodr() do nothing.  This is normally
enabled, but adjkerntz(8) disables it transiently to avoid setting the
RTC too often.  This could be disabled more permantly to stop buggy
things like settime() and the ntpd refresh ever setting the RTC.  IIRC,
adjkerntz(8) is fairly careful with this sysctl and restores it to
its previous state after using it.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r292777 - in head: lib/libc/sys sys/kern

2015-12-27 Thread Bruce Evans

On Sun, 27 Dec 2015, Ian Lepore wrote:


On Sun, 2015-12-27 at 12:05 -0800, NGie Cooper wrote:

On Dec 27, 2015, at 11:30, Slawa Olhovchenkov 
wrote:


?


I have no idea what you mean by that -- I didn't say anything at
all
about panic.


As I understund commit log -- this is prevent kernel panic at some
call (with illegal arguments). This accpetable irrelevant to bugs
in
calling code.


This also makes us POSIX compliant and more compatible with
Linux:
http://pubs.opengroup.org/onlinepubs/007908799/xsh/clock_settime.html
 (search for ?negative?).
Thanks kib!
-NGie


This thread just keeps becoming more and more surrealistic.  First
someone tries to reply to the original commit (I guess?) by replying
with a complete non sequitur to my reply.  Now you cite a document that
says nothing directly related to the commit at all.

The only reference to "negative" in what you cited is part of
specifying how to truncate/round fractional values that fall between
two representable values given the resolution of the clock you're
setting.


It also has an obfuscated verbose spelling of negative as "less than
zero" in the description of [EINVAL].  This is the specification of
a invalid timespec which is repeated ad nauseum.  The upper limit is
spelled even more verbosely as "greater than or equal to 1000 million".
The correct spelling of this is ">= 10" but that is hard to
read in another way (too many 0's to count quickly).  Spelling this
value is remarkably difficult.  There are about 10 different spellings
that are no good since they depend on the locale or language (natural
or programming).  Mixing digits and words is ugly.  1 billion is
shorter but is off by a factor of 1000 in some locales.

I stared at this description many times.  It doesn't allow considering
negative times as invalid generally.  setitimer(2) has to be specially
broken to disallow them.  This bug is missing for nanosleep().  FreeBSD
still documents a non-POSIX limit of 1 seconds for setitimer(2),
but its implementation has been broken to overflow instead of enforcing
this limit.  Note that this is 1 followed by 8 zeros and applies to
the seconds value, while the limit for nanoseconds os 1 followed by 9
zeros.

Different spellings of 1 followed by a large number of zeros makes thes
value hard to grep for.  1 followed by 8 zeros is in about 50 man pages
(counting links).  It is documented as the limit on seconds in
get/setitimer(2).  mtree(8) misspells 1 followed by 9 zeros as 1 followed
by 8 zeros.  alarm(3) is implemented using itimers and documents the
same limit.  ularm(3) documents the bizarre limit of 1 followed by 14
zeros "in case this value fits in an the unsigned integer".  This is
alarm(3)'s documented but not actual limit of 10**8 seconds converted
to microseconds.  It is reachable on systems with >= 47 bit longs.
This spelling is not used in any man page for the limit on the number
of nanoseconds (as in POSIX).


Later in that document they specifically require EINVAL for negative
fractional second values.  If they intended to to prohibit negative
whole-second values, that would certainly have been the place to
mention it, and they don't.


This is the "obfuscated verbose spelling" part.  This is not really
a restriction on negative fractions.  It is just that negative fractions
are represented as a negative integer plus a proper fraction, where by
definition a proper fraction is nonnegative and less than 1.
Normalization gives this (except when it would overflow).  The
requirement is essentially that callers don't pass unnormalized values.

Bruce___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r292777 - in head: lib/libc/sys sys/kern

2015-12-27 Thread Bruce Evans

On Sun, 27 Dec 2015, Ian Lepore wrote:


On Sun, 2015-12-27 at 15:37 +, Dmitry Chagin wrote:

Author: dchagin
Date: Sun Dec 27 15:37:07 2015
New Revision: 292777
URL: https://svnweb.freebsd.org/changeset/base/292777

Log:
  Verify that tv_sec value specified in settimeofday() and
clock_settime()
  (CLOCK_REALTIME case) system calls is non negative.
  This commit hides a kernel panic in atrtc_settime() as the
clock_ts_to_ct()
  does not properly convert negative tv_sec.

  ps. in my opinion clock_ts_to_ct() should be rewritten to properly
handle
  negative tv_sec values.

  Differential Revision:https://reviews.freebsd.org/D4714
  Reviewed by:  kib

  MFC after:1 week


IMO, this change is completely unacceptable.  If there is a bug in
atrtc code, then by all means fix it, but preventing anyone from
setting valid time values on the system because one driver's code can't
handle it is just wrong.


I agree.  Even (time_t)-1 should be a valid time for input, although it
is an error indicator for output.
  (This API makes correctly using functions like time(1) difficult or
  impossible (impossible for time(1) specifically.  The implementation
  might reserve (time_t)-1 for representing an invalid time.  But
  nothing requires it to.
(POSIX almost requires the reverse.  It requires a broken
implementation that represents times as seconds since the Epoch.  I
think POSIX doesn't require times before the Epoch to work.  But
FreeBSD and the ado time package tries to make them work.)
  So if the representation of the current time is (time_t)-1, then
  time(1) can't do anything better than return this value.  But this
  value is also the error value.  There is no way to distinguish this
  value from the error value, since time(1) is not required to set
  errno.)

I think the change also doesn't actually work, especially in the Western
hemisphere, but it was written in the Eastern hemisphere.  Suppose
that the time is the Epoch.  This is 0, so it is pefectly valid.  Then
if the clock is on local time, utc_offset() is added before calling
clock_cs_to_ct() and the result is a negative time_t in the Western
hemisphere.  Similarly in the Eastern hemisphere when you test with
with Western settings.

The main bug in clock_ts_ct() is due to division being specified as
broken in C:

X void
X clock_ts_to_ct(struct timespec *ts, struct clocktime *ct)
X {
X   int i, year, days;
X   time_t rsec;/* remainder seconds */
X   time_t secs;
X 
X 	secs = ts->tv_sec;

X   days = secs / SECDAY;
X   rsec = secs % SECDAY;

Division of negative numbers used to be implementation-defined in C, but
C90 or C99 broke this by requiring the broken alternative of rounding
towards 0 like most hardware does.  The remainder operation is consistently
broken.  So when secs < 0, this always gives days < 0 and rsec either 0
or < 0.

If this causes a panic, then it is from a sanity check detecting the
invalid conversion later.  A negative value in days breaks the loop
logic but seems to give premature exit from the loops instead of many
iterations.

Another bug here is the type of rsec.  This variable is a small integer
(< SECDAY = 86400), not a time_t.

Code like this is probably common in userland.  w(1) uses it, but w(1)
only deals with uptimes which should be positive.

clock_ct_to_ts() is also buggy:

X int
X clock_ct_to_ts(struct clocktime *ct, struct timespec *ts)
X {
X   int i, year, days;
X ...
X   /* Sanity checks. */
X   if (ct->mon < 1 || ct->mon > 12 || ct->day < 1 ||
X   ct->day > days_in_month(year, ct->mon) ||
X   ct->hour > 23 ||  ct->min > 59 || ct->sec > 59 ||
X   (sizeof(time_t) == 4 && year > 2037)) {  /* time_t overflow */
X   if (ct_debug)
X   printf(" = EINVAL\n");
X   return (EINVAL);
X   }

The limit of 2037 is bogus with 64-bit time_t's or even with 32-bit
unsigned time_t's.

Years before 1970 are insane due to the C bug, and years before ~1906
are insanse due to representability problems, but there is no check
for the lower limit of 'year'.

There is also no check for the lower limit of ct->hour, ct->min or
ct->sec.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r292817 - head/usr.bin/systat

2015-12-27 Thread Bruce Evans

On Mon, 28 Dec 2015, Marcelo Araujo wrote:


Log:
 Add on systat -vm the ability to display the physical and kernel memory
 percent usage.

 PR:bin/203917
 Submitted by:  ota 
 Approved by:   bapt (mentor)
 Differential Revision: https://reviews.freebsd.org/D4281


This has lots of style bugs in both the output and the source code.  There
was negative space for expansion, but some space is used to squeeze memory-
related variables in between the general stats line and the memory lines.
I think they would fit in the general stats line and not look quites as bad
there.  This line now uses 67 columns and only needs about 50 of these.

Style bugs start with not updating the comment about STATROW.  It still
says that that this (really the STAT window) uses 1 row and 67 columns,
but it now uses 2 row.  This use is an abuse.  It would be clearer to
put the new memory statistics in a new window.  They don't really fit
in tje MEM window either, and putting them there would change all the
existing line numbers there.

Some other statistics are squeezed into out of the way places and not
printed if the terminal is not large enough.  nbuf is one of these.
Displaying it is not very useful since (like the variables in this commit
and unlike almost all the other variables in the display), it is constant.
It is also less useful than the variables in this commit.  So it is put
on line 24 of 0-24 and not displayed unless the terminal has > 25 lines
since the last line is reserved for input.

I recently noticed that the interrupt lines are misformatted by letting
them run into the interactive i/o line (probably also outside the
window where they are not displayed of course) if there are many of
them.  Then if you do input, it makes a mess by erasing the last
displayed interrupt line, but not completely.  The line is first erased
completely, but later the count is refreshed but the description is
not.  Switching to another display and back gives the same mess, with
the description displayed for an instant on switching back before it
is erased by info on the i/o line.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r292620 - head/sys/kern

2015-12-24 Thread Bruce Evans

On Thu, 24 Dec 2015, Konstantin Belousov wrote:


On Wed, Dec 23, 2015 at 08:02:10AM +1100, Bruce Evans wrote:

On Tue, 22 Dec 2015, Konstantin Belousov wrote:


Log:
 If we annoy user with the terminal output due to failed load of
 interpreter, also show the actual error code instead of some
 interpretation.


This and nearby messages are of annoyingly low quality.  They don't
even print the program name(s).

I use the following partial fixes.  I forget if they print the program
name or the interpeter name.

X Index: imgact_elf.c
X ===
X RCS file: /home/ncvs/src/sys/kern/imgact_elf.c,v
X retrieving revision 1.151
X diff -u -2 -r1.151 imgact_elf.c
X --- imgact_elf.c  5 Jun 2004 02:18:28 -   1.151
X +++ imgact_elf.c  5 Jun 2004 06:51:25 -
X @@ -694,6 +693,6 @@
X   brand_info = __elfN(get_brandinfo)(hdr, interp);
X   if (brand_info == NULL) {
X - uprintf("ELF binary type \"%u\" not known.\n",
X - hdr->e_ident[EI_OSABI]);
X + uprintf("%s: ELF binary type \"%u\" not known.\n",
X + imgp->stringbase, hdr->e_ident[EI_OSABI]);

This cannot be a fix.  there is no stringbase member in struct imgact.


There was in FreeBSD-5.


In fact, there is no available path to the executable in the image activation
routine, and due to things like fexecve(2), it may have been not passed
to the kernel at all.


argv[0] and the string being interpreted must be known at some point.


The present structure of the messages is forced by this fact, and it is
usually obvious what is the image kernel complained about, since it is
the image that was just executed.


Not if it is one of many commands in a script.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r292620 - head/sys/kern

2015-12-22 Thread Bruce Evans

On Tue, 22 Dec 2015, Konstantin Belousov wrote:


Log:
 If we annoy user with the terminal output due to failed load of
 interpreter, also show the actual error code instead of some
 interpretation.


This and nearby messages are of annoyingly low quality.  They don't
even print the program name(s).

I use the following partial fixes.  I forget if they print the program
name or the interpeter name.

X Index: imgact_elf.c
X ===
X RCS file: /home/ncvs/src/sys/kern/imgact_elf.c,v
X retrieving revision 1.151
X diff -u -2 -r1.151 imgact_elf.c
X --- imgact_elf.c  5 Jun 2004 02:18:28 -   1.151
X +++ imgact_elf.c  5 Jun 2004 06:51:25 -
X @@ -694,6 +693,6 @@
X   brand_info = __elfN(get_brandinfo)(hdr, interp);
X   if (brand_info == NULL) {
X - uprintf("ELF binary type \"%u\" not known.\n",
X - hdr->e_ident[EI_OSABI]);
X + uprintf("%s: ELF binary type \"%u\" not known.\n",
X + imgp->stringbase, hdr->e_ident[EI_OSABI]);
X   error = ENOEXEC;
X   goto fail;
X @@ -828,5 +827,6 @@
X   >entry_addr, sv->sv_pagesize);
X   if (error != 0) {
X - uprintf("ELF interpreter %s not found\n", interp);
X + uprintf("%s: ELF interpreter %s not found\n",
X + imgp->stringbase, interp);
X   goto fail;
X   }

Other uprintf()s in imgact_elf.c are worse -- they print a fixed string that
gives no hint about the intepreter either (except it sometimes says ELF or
elf or PT_).

Y   uprintf("elf_load_section: truncated ELF file\n");
Y   uprintf("Program headers not in the first page\n");
Y   uprintf("Unaligned program headers\n");
Y   uprintf("Invalid PT_INTERP\n");
Y   uprintf("i/o error PT_INTERP\n");
Y   uprintf("Cannot execute shared object\n");
Y   uprintf("%s\n", err_str);
Y   uprintf("ELF interpreter %s not found\n", interp);
Y   uprintf("i/o error PT_NOTE\n");

The "ELF binary type \"%u\" not known.\n", message is the only one with
the style bug of a terminating ".".  My patch is missing the fix for this.

"elf_load_section: truncated ELF file\n" messages is the only one that
prints the function name.  This is not very useful for users.

All kernel printfs and KASSERT()s in the file use __func__ to obfuscate
the function name and have many other style bugs.

uprintf() is rarely used.  Aproximately 100 times.  Most uses don't
print enough context.  In kern_exec.c there is a related one that
prints a pathname, but not in err() format with the name first.  The
problematic pathname or interpreter name may be nested.  I'm not sure how
to find the complete sequence of names.

tprintf() is only used 7 times.  Some uprintf()s should probably be
tprintf()s to get them logged.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r292621 - head/sys/fs/devfs

2015-12-22 Thread Bruce Evans

On Tue, 22 Dec 2015, Konstantin Belousov wrote:


Log:
 Keep devfs mount locked for the whole duration of the devfs_setattr(),
 and ensure that our dirent is instantiated.

 Reported and tested by:bde
 Sponsored by:  The FreeBSD Foundation
 MFC after: 1 week


Thanks.

This is part of fixing revoke(2).  Even stat() doesn't work right when
it races revoke().  setattr() is used surprisingly often since it is used
for opening with O_TRUNC.  open() racing with revoke() caused problems
doing the truncation even though truncation is a no-op for devfs.
Truncation is not atomic for opening with O_TRUNC.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r292443 - head/usr.sbin/boot0cfg

2015-12-18 Thread Bruce Evans

On Fri, 18 Dec 2015, John Baldwin wrote:


Log:
 Fix the precious change to check the pointer returned by malloc().

 Submitted by:  luke 
 Pointy hat to: jhb


Of course, malloc can't fail, especially for small utilities like boot0cfg,
especially in -current.


Modified: head/usr.sbin/boot0cfg/boot0cfg.c
==
--- head/usr.sbin/boot0cfg/boot0cfg.c   Fri Dec 18 17:39:54 2015
(r292442)
+++ head/usr.sbin/boot0cfg/boot0cfg.c   Fri Dec 18 17:52:08 2015
(r292443)
@@ -337,7 +337,7 @@ read_mbr(const char *disk, u_int8_t **mb
return (mbr_size);
}
*mbr = malloc(sizeof(buf));
-if (mbr == NULL)
+if (*mbr == NULL)
errx(1, "%s: unable to allocate MBR buffer", disk);


Style bug.  Someone suggested using the 'if ((*mbr = malloc(...)) ==
NULL)' style.  That might be less clear, but it is what is used elswhere
in the program, including for the old malloc check.  Also, the error
messages are too different.  The other one calls the buffer a "read"
buffer although it spells its variable name mbr.


memcpy(*mbr, buf, sizeof(buf));
close(fd);


Program for checking that malloc can't fail:

X #define   SIZE16
X 
X #include 

X #include 
X #include 
X 
X int

X main(void)
X {
X   char *p;
X   size_t tot;
X 
X 	tot = 0;

X   while ((p = malloc(SIZE)) != NULL) {
X   bzero(p, SIZE);
X   tot += SIZE;
X   if (tot > 4 * 1024 * 1024)
X   errx(1, "giving up after allocating %zu bytes", tot);
X   }
X   errx(1, "malloc failed after allocating %zu bytes", tot);
X }

It is hard to get this to fail in -current.  ulimit -Sm 4096 no longer
works for limiting data size (ulimit -m never worked).  Virtual memory must
be limited in -current.  But ulimit -Sv 4096 prevents most programs from
starting, due to bugs in dynamic linkage:

  pts/9:bde@freefall:~/zz> ulimit -Sv 4096
  pts/9:bde@freefall:~/zz> ls
  /lib/libutil.so.9: mmap of entire address space failed: Cannot allocate memory
  pts/9:bde@freefall:~/zz> ./z
  /lib/libc.so.7: mmap of entire address space failed: Cannot allocate memory

"./z" is this program.  Linking the program statically makes it run
in the measly 4096k of virtual and mallocable memory.  It then allocates
4194320 bytes using the small allocation size.  Of course, cc can't
run in 4096k.  However, it is not necessary to map the _entire_ address
space like the error messages say.  Dynamic "./z" runs in a measly
1k.  ls fails as above up to at least 12000k, but dumps core with
14000k.  ls starts working with a measly 16000k.  cc needs more still.
I didn't find the exact requirement, and used 16k to compile this
little program.  Not so measly.

The program is careful to touch all the memory that it allocates.  Without
that, I think things are even more broken.  Somehow the process virtual
size doesn't grown above 6.2M.  Apparently even the virtual memory is
not completely allocated until it is used.  So malloc() returns success,
but failures may happen much later when the memory is used.  Similarly
with SIZE = 16MB but only the first byte of each allocation touched.
OTOH, with size = 16MB all touched and no limits, the process grows
very large and takes too long to respond to job control symbols, as
if it is swapping.  The -d and -m limits don't work for preventing
growth of its real memory.

ulimit -Sv should be used in testing since ulimit -v sets the hard
limit and prevents restoring a usable soft limit.  Setting -v to a low
value (lower than 16000k for ls) is not useful except for testing
malloc failure since it gives an unusable shell.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r292153 - head/lib/libc/regex/grot

2015-12-13 Thread Bruce Evans

On Sun, 13 Dec 2015, Garrett Cooper wrote:


Log:
 Add -static to CFLAGS to unbreak the tests by using a libc.a with
 the xlocale private symbols exposed which aren't exposed publicly
 via the DSO


This is an interesting hack.

I think there are some bad bugs in static libraries from exposing
public private (sic) symbols in them, and from using private public
(sic) symbols.  Or perhaps the reverse.

A public private symbol is one that is public (extern) in C but not
exported from the shared library.  A private public symbol is one
that is public (extern) in C and is also exported by the shared 
library, but is a weak symbol so it is sort of private for both.


An example of the latter is 'open' or syslog.  Both are in the
user namespace for Standard C.  syslog is in the user namespace
for some versions of POSIX.  libc is supposed to use renamed
versions so that it never uses a user version.

It mostly does this for _open.  open is a weak symbol so that
sloppy parts of libc and POSIX applications can see it.  Non-POSIX
applications can see it too unless they replace it.  According to
nm on libc.a, the only thing in the library with a sloppy reference
to open is citrus_mmap.o.  It also references close, fstat, mmap
and munmap.  This is probably OK since it is an extension of POSIX.

There is the following thicket of complications in names for what
should be the simple open syscall:

open.o:
X  U __libc_interposing
X  U __stack_chk_fail
X  U __stack_chk_guard
X  W open

_open.o:
X  U .cerror
X  T __sys_open
X  W _open

Even _open is a weak symbol.  That makes no sense.  Similarly for all
syscalls except ones like _exit whose primary syscall name has the
single underscore.  I think it is just a bug in the PSEUDO macro in SYS.h.

Functions like syslog() are not handled so carefully.  They are never
renamed.  This was not a problem, since they were not called much from
other parts of libc.  Perhaps never for syslog(), or just from associated
parts with the closure of the parts being entirely inside or entirely
outside APIs that have syslog().  But this was broken by the stack
protector code.  The stack protector code is not called from all over
the library, e.g., for open as shown above.  This turns the careful
renaming for open into just an obfuscation (except the obfuscations also
give pessimizations).

Test program for this.

X #include 
X #include 
X 
X #ifndef NO_DEBLOAT

X void
X openlog(void)
X {
X   puts("opensyslog() is not in the Standard C library()");
X }
X 
X void

X syslog(void)
X {
X   puts("syslog() is not in the Standard C library()");
X   /* I am careful not to call this, but stack_protector.c isn't. */
X   // system("rm -rf /");
X }
X 
X void

X vsyslog(void)
X {
X   puts("vsyslog() is not even in POSIX");
X }
X 
X void

X closelog(void)
X {
X   puts("opensyslog() is not in the Standard C library()");
X }
X 
X off_t

X lseek(int fd, off_t offset, int whence)
X {
X   puts("lseek() is not in the Standard C library()");
X   return -1;
X }
X #endif
X 
X int

X main(void)
X {
X #ifdef FORCE_USE
X   openlog();
X   vsyslog();
X   closelog();
X #else
X   open("/dev/null", 0); /* warm up for debugging */
X   open("/dev/null", 0); /* try to get it to call us */
X #endif
X   puts("hello world is bloated");
X }

This must be run under gdb.  Manually corrupt the stack so that
__stack_chk_fail is called.  Then the private syslog() is called iff
the linkage is static.  The other functions are normally not called
since __stack_chk_fail kills the program with SIGABRT after calling
syslog().

Debugging this shows another bug (bogusness at least): _open uses the
open syscall, but open uses the openat syscall.

Debugging this is especially difficult when it is dynamically linked.
Then the __libc_interposing, __stack_chk_fail and __stack_chk_guard
symbols are so private that even the debugger can't see them, at least
with the library not compiled with -g.  With static linkage, they are
normal public symbols.

This program was originally for testing reduction of library bloat.
The bloat is so large that the null program main(){} now links to
syslog and might even call it if the stack gets corrupted.  So the
null program now has size more than 460KB on amd64 (more than 4
times larger than /bin/sh in FreeBSD-1).

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r292013 - head/lib/libc/stdio

2015-12-09 Thread Bruce Evans

On Wed, 9 Dec 2015, Garrett Cooper wrote:


Log:
...
 Use stdint.h instead of inttypes.h as the latter pollutes namespace more
...
Modified: head/lib/libc/stdio/open_memstream.c
==
--- head/lib/libc/stdio/open_memstream.cWed Dec  9 08:53:41 2015
(r292012)
+++ head/lib/libc/stdio/open_memstream.cWed Dec  9 09:14:57 2015
(r292013)
@@ -31,10 +31,10 @@ __FBSDID("$FreeBSD$");
#include "namespace.h"
#include 
#include 
+#include 
#ifdef DEBUG
-#include 
+#include 
#endif
-#include 
#include 
#include 
#include 


Thanks.  It wou;d be noice to fix some other cases of excessive includes.
There aren't many for newer headers like stdint.h and inttypes.h.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r291994 - head/include

2015-12-09 Thread Bruce Evans

On Thu, 10 Dec 2015, Hajimu UMEMOTO wrote:


On Wed, 9 Dec 2015 18:19:16 +1100 (EST)
Bruce Evans <b...@optusnet.com.au> said:


brde> resolv.h already had massinve namespace pollution and style bugs in
brde> its includes.  One more include of a header that is relatively clean
brde> since it is tiny and was designed for minimising namespace pollution
brde> makes little difference.

I understood.  Thank you for your detailed explanation.
However, I realized that r289315 changed the size of struct
__res_state.  It broke binary backward compatibility.  I think we
still need to revert its change in struct __res_state and move them
into struct __res_state_ext.


I see.

Most of the pollution in resolv.h dates from 1993, so is hard to fix
now.  It was not in 4.4BSD.  In 4.4BSD,  was a prerequisite,
and undocumented APIs like fp_query() that used FILE were named with
leading underscores.  The ABI changes in __res_state are mostly newer
than than 1993.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r291994 - head/include

2015-12-08 Thread Bruce Evans

On Wed, 9 Dec 2015, Hajimu UMEMOTO wrote:


Hi,


On Tue, 8 Dec 2015 16:09:48 + (UTC)
Eric van Gyzen  said:

...
vangyzen> Log:
vangyzen>   resolver: fix the build of some ports, broken by r289315
vangyzen>
vangyzen>   r289315 required time_t and struct timespec to be defined before
vangyzen>   including .  This broke the build of net-mgmt/sx, at 
least.
vangyzen>
vangyzen>   Include  in resolv.h to fix this with minimal 
pollution.
...
It seems wrong.  We should not have pollution from timespec.h, here.
You should consider moving them into __res_state_ext like NetBSD does.


resolv.h already had massinve namespace pollution and style bugs in
its includes.  One more include of a header that is relatively clean
since it is tiny and was designed for minimising namespace pollution
makes little difference.

Old namespace pollution and style bugs:

X #include 

This includes the kitchen sink via nested includes (not as many as in the
kernel).  Almost none of its namespace is documented.  Even the names that
it is supposed to define are mostly undocumented, so no one knows what these
are.

X #include 

This was already included recursively at least ones.  style(9) explicitly
forbids including this again.

X #include 

This was already included recursively approximitately  times.  Include guards actually prevent it being
parsed more than the first time.  This is the only included header that is
supposed to have no namespace pollution.  style(9) has less to say about
this.  It makes more sense to include it first, and that is the correct
style (for __FBSDID) in .c files.  Headers with no pollution usually need
to include it directly.  Others should depend on primary headers like
 including it.

X #include 
X #include 
X #include 

Further pollution.

sys/timespec.h adds to this just:

Y #include 

Example of a correct use of sys/cdefs.h.  It is to get the definition
of __BSD_VISIBLE.  However, use of that is bogus.

Y #include 

sys/_timespec was originally correctly designed have no namespace
pollution.  It declared a struct __timespec and use __time_t in it,
since timespec and time_t would be namespace pollution in some cases.
It should be included in places like  where __timespec
etc. would be pollution.

sys/timespec.h was originally correctly designed turn __timespec into
timespec and __time_t into timespec (except it also defines the bogus
conversion macros -- see below).  It should be included in the very
few places where __timespec would not be pollution in any supported
version, and for traditional pollution in places like time.h.

Using these little headers is delicate and usually does nothing except
increase compilation times by including both of them nested.  Having
2 for just struct timespec was a bit much, and has been turned into
nonsense.  Someone changed __timespec to timespec and __time_t to
time.h in _timespec.h, and removed the careful anti-pollution ifdefs
in , with the justification that timespec is now standard
in POSIX.  But it isn't standard in old versions of POSIX.  So this
change mainly broke  and turned _timespec.h's existence
into nonsense.  It didn't break much else since not much else was
careful about this.  POSIX now requires timespec to be declared in
many headers and allows but doesn't require it to be declared (as
pollution) in many others.

Y 
Y #if __BSD_VISIBLE

Y #define   TIMEVAL_TO_TIMESPEC(tv, ts) 
\
Y   do {\
Y   (ts)->tv_sec = (tv)->tv_sec;  \
Y   (ts)->tv_nsec = (tv)->tv_usec * 1000; \
Y   } while (0)
Y #define   TIMESPEC_TO_TIMEVAL(tv, ts) 
\
Y   do {\
Y   (tv)->tv_sec = (ts)->tv_sec;  \
Y   (tv)->tv_usec = (ts)->tv_nsec / 1000; \
Y   } while (0)
Y 
Y #endif /* __BSD_VISIBLE */


These macros are bogus.  They are bad enough in the kernel.  Of course
they are not documented in any man page.  sys/timespec.h was original
clean except for defining these.

Y 
Y /*

Y  * Structure defined by POSIX.1b to be like a itimerval, but with
Y  * timespecs. Used in the timer_*() system calls.
Y  */
Y struct itimerspec {
Y   struct timespec  it_interval;
Y   struct timespec  it_value;
Y };

This pollution was added later.  It turns timespec.h into half-nonsense.
timespec.h was supposed to match its name and declare only struct timespec.
The pollution here implements the POSIX requirement that struct itimerspec
is declared in .

 and  have much worse namespace and organization
problems than resolv.h.  There is also sys/_timeval.h which does for
timevals what sys/timespec.h should do for timespecs.  It is relatively
simple and clean.  In fact, it has always been almost as perfect as
possible 

Re: svn commit: r292004 - head/lib/libc/stdio

2015-12-08 Thread Bruce Evans

On Tue, 8 Dec 2015, Garrett Cooper wrote:


Author: ngie
...

Log:
 Fix compilation when -DDEBUG is defined by adding inttypes.h #include
 for intmax_t


Wrong include.  intmax_t is declared in .   declares
much more and is usually only needed for functions like strtoimax().
 also declares PRI* and SCN*, but those are only needed in
the garbage.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r291936 - head/sys/ufs/ufs

2015-12-07 Thread Bruce Evans

On Mon, 7 Dec 2015, Konstantin Belousov wrote:


Log:
 Update ctime when atime or birthtime are updated.

 Cleanup setting of ctime/mtime/birthtime: do not set IN_ACCESS or
 IN_UPDATE, then clear them with ufs_itimes(), making transient
 (possibly inconsistent) change to the times, and then copy
 user-supplied times into the inode.  Instead, directly clear IN_ACCESS
 or IN_UPDATE when user supplied the time, and copy the value into the
 inode.

 Minor inconsistency left is that the inode ctime is updated even when
 birthtime update attempt is performed on a UFS1 volume.

 Submitted by:  bde
 MFC after: 2 weeks


Thanks.

I forgot to remind yo to make the same change in other file systems.
Especially ext2fs.  msdosfs already has as much as possible of this.

POSIX generally requires even null changes of attributes to update the
ctime, so it is not wrong to always update the ctime when the only change
is for an unsupported birthtime.   This also happens for other file systems
that don't support birthtimes.  Other file systems might also not support
other timestamps.  msdosfs doesn't support ctimes so it doesn't need to
worry about setting them for null and ignored changes to other timestamps.
It ignores attempts to set unsupported times, like ffs does for birthtimes.
Some versions of msdosfs support birthtimes, but msdosfs_setattr() doesn't
support them.

But POSIX also requires omitted changes of attributes to update the ctime.
For timestamp attributes, it even requires the security hole of updating
the ctime when changing the other times fails (perhaps due to no permission
to change them), and the nonsense of updating the ctime when the file
doesn't exist.  The security hole and the nonsense are because POSIX is
missing "successful" in the usual wording "Upon successful completion"
in one place.

The cases with omitted changes occur mainly for utimensat().  UTIME_OMIT
for both times gives 2 omitted changes and not 2 null changes.  Clearly
the ctime should not be changed when all changes are omitted.  But
POSIX requires utimensat() to mark the ctime for update on all completions.
This gives the security hole but not the nonsense of acting on a nonexistent
file in a way in a less unintentional way than the missing "successful".
POSIX requires not doing the permissions tests for omitted changes.  So
2 omitted changes normally result in success for files that would fail
the permissions tests for non-omitted changes.  Then the correct
"Upon successful completion" wording requires marking the ctime.

Some other syscalls like chown() provide another way of omitting
changes: Uid (uid_t)-1 means to use the current uid of the file, etc.
This is closer to a null change than an omitted change.

FressBSD has a different set of bugs in this area.  For utimes() and also
for utimensat(), it has the undocumented behaviour of turning times
with tv_sec == -1 into the equivalent of UTIME_OMIT, the standard
utimes() doesn't have such a feature and standard utimensat() is only
supposed to have this feature by magic values in tv_nsec.  Similarly for
chown().  For file times, this is the accidental result of an
implementation detail.  For chown(), it is more intentional.  POSIX
chown() has smaller wording bugs and doesn't require the security hole.

Most appplications do extra work to avoid null changes.  It is unclear
if this is allowed.  Maybe chown -R root:wheel / is require to update
the ctime for all files, not just the ones it makes non-null changes to.
This is negatively useful, but not disallowed.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r291936 - head/sys/ufs/ufs

2015-12-07 Thread Bruce Evans

On Mon, 7 Dec 2015, Andrey Chernov wrote:


On 07.12.2015 17:42, Andrey Chernov wrote:

On 07.12.2015 15:09, Konstantin Belousov wrote:

Author: kib
Date: Mon Dec  7 12:09:04 2015
New Revision: 291936
URL: https://svnweb.freebsd.org/changeset/base/291936

Log:
  Update ctime when atime or birthtime are updated.


Who calls ufs_itimes() and when to process IN_CHANGE flag in the new
version?


UFS_UPDATE() is called next and it call ufs_itimes().  Converting
IN_CHANGE from a mark to a time could be delayed safely, but the other
changes should be written soon.  We use a delayed write, so they are
not actually written very soon, and this could be optimized a little.
Setting IN_LAZYACCESS might work as a hack (just some flag that the
syncer checks and calls UFS_UPDATE() to clear).  Only do this for
unimportant attributes like times.  For ids, I would go the other
way and tell UFS_UPDATE() to do an async update.  It currently only
supports a waitfor boolean flag which selects bwrite() if set and
normally bdwrite() if clear unless under load when it selects
bawrite() if clear.


-   ufs_itimes(vp);
+   ip->i_flag |= IN_CHANGE | IN_MODIFIED;




New version also forces IN_MODIFIED flag old one tends to avoid (only
for birthtime, and only for non-suspended systems in ufs_itimes()).


That was mostly obfuscation in the old version:
- for settings of the atime only, we modified the atime (possibly with
  a null change) but don't set IN_MODIFIED.  ufs_itimes() fixeed this
  up by translating IN_ACCESS to a modification of the atime and a
  setting of IN_MODIFIED.  This is depends on not being in the suspended
  case.  Then we overwrote the modification with the final one before
  writing
- for settings of the mtime only, ufs_itimes() fixes up IN_MODIFIED in
  the same way except this is obviously not affected by the suspended
  flag
- for settings of the birthtime only, ufs_itimes() had no effect unless
  some update marks were already set, so we had to set IN_MODIFIED
  directly.

The suspended case makes the side effects of ufs_itimes() even harder to
understand, except it doesn't actually occur for calls from VOP_SETATTR(),
since setattr is like write() -- it arranges for exclusive access and not
suspended.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r291741 - head/sys/geom

2015-12-04 Thread Bruce Evans

On Thu, 3 Dec 2015, Bryan Drewery wrote:


On 12/3/2015 7:38 PM, Kenneth D. Merry wrote:

 g_disk_limit(struct disk *dp, struct bio *bp)
 {
bool limited = false;
-   off_t d_maxsize;
-
-   d_maxsize = (bp->bio_cmd == BIO_DELETE) ?
-   dp->d_delmaxsize : dp->d_maxsize;
+   off_t maxsz = g_disk_maxsize(dp, bp);


This looks like a style issue.


This looks like 5 style bugs and 0 issues:

3 old style bugs:
- use of bool
- initialization in declaration
- unsorted declarations

2 new style bugs:
- another initialization in declaration.  A larger one
- lost blank line after declarations.

sys/geom had no instances of the bool style bug as recently as last
month.  At the top level, it didn't even use boolean_t.  In subdirectories,
it used boolean_t on 33 lines.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r291741 - head/sys/geom

2015-12-04 Thread Bruce Evans

On Fri, 4 Dec 2015, Bryan Drewery wrote:


On 12/4/2015 10:03 AM, Bruce Evans wrote:

This is specified by not giving an example of using it.  style(9) was


I really take issue with this stance. I don't think you really mean what
this implies. For example, can I even use 'x++' or mtx_lock(9) or
uintptr_t since they are not shown in an example?


mtx_lock() is just another function, so it doesn't need an example any
more than printf().  printf() also happens to have no example, but there
is a formal rule for it an example for fprintf().

'++' is just a standard operator.  style(9) happens to have an example
of it, though not on a variable named x.  It even has the bad example
of ++p instead of p++ where the result is not used.  ++p is more logical
and I used to prefer it, but only p++ is KNF-normal.

style(9) has even more detailed rules for uintXX_t, but no examples,
and nothing for uintptr_t.  uintptr_t can be considered as just another
typedef.  The rule about uintXX_t is mainly part of deprecating the
old spelling u_intXX_t.

I wouldn't trust style(9) for anything except simple formatting, but
look at the subsystem(s) style.  Just the top level of geom has 11000+
lines which can be considered as giving about 20 times more examples
than style(9).

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r290605 - in head/lib/msun: . man

2015-12-04 Thread Bruce Evans

On Fri, 4 Dec 2015, NGie Cooper wrote:


On Nov 9, 2015, at 04:03, Bruce Evans <b...@optusnet.com.au> wrote:

On Mon, 9 Nov 2015, Garrett Cooper wrote:


Log:
Document powl(3)


powl was garbage that was intentionally undocumented.  At least, I
intentionally ignored its non-documentation together with it.


POSIX documents it and I noticed it was missing when porting the msun testcases 
from NetBSD. That???s the reason why I filed the bug.


C99 requires it, but FreeBSD doesn't support it.  Compilers still don't
support delicate floating point things for C90, so I don't feel too bad
about this.


powl doesn't compute the value of .Ar x to the exponent .Ar y.  It computes
the value of (double)(.Ar x) to the exponent (double)(.Ar y), converted to
double.


Hmmm? The types look ok per the function signatures in lib/msun:

lib/msun/src/imprecise.c:imprecise_powl(long double x, long double y)
lib/msun/src/math.h:long double powl(long double, long double);


That is just the types.  The following misimplementation also has the
correct types: '#define powl(x, y) 1.0L'.  This is actually more precise
than than the current one on average (it gets almost all cases completely
wrong, but the current one gets more than 99% of cases wrong and has more
errors of infinity instead of large and finite).  It is only worse on
average for non-exceptional cases like powl(2, 2).


...
These are bugs that need to be fixed then in the longterm, but in the short 
term should be documented under CAVEATS.


The details are large and not very interesting.  A polite version of what
I think of this might be "The the powl() function is not of production
quality.  It is similar to '#define powl(x, y) (pow((x), (y)))'".

Bruce___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r291741 - head/sys/geom

2015-12-04 Thread Bruce Evans

On Fri, 4 Dec 2015, Andrew Turner wrote:


On Fri, 4 Dec 2015 23:42:51 +1100 (EST)
Bruce Evans <b...@optusnet.com.au> wrote:
...

This looks like 5 style bugs and 0 issues:

3 old style bugs:
- use of bool


I don't seem to see where in style(9) we disallow the use of bool. Can
you point me where this is specified?


This is specified by not giving an example of using it.  style(9) was
mostly written 10+ years before bool existed, and bool is so
(un)important that style(9) wasn't changed to allow it in its 16+
years of existence.

This is also a style bug since it was not used anywhere in nearby code,
but nearby code used the older boolean_t.  I don't mind using bool in
new code.

I typedefed everything too much in code that I wrote 25-30 years ago.

For C99 types more useful than bool, but leading to a deeper morasse,
use [u]int_fastN_t and [u]int_leastN_t types in .  These are
so useable that they are used approximately 0 times in the kernel and
10 times in userland (perhaps more than 10 in contrib).  But they
should almost always be used where fixed-width types are now used.
E.g., using uint16_t asks for a type of width exactly 16 bits at all
costs.  The costs are:
- this type might not be supported.  Then nothing would work.  But
  uint_least16_t should always be supported, and anything that doesn't
  need precisely 16 bits would work using it.  It only needs to be
  larger than 16 bits, so it can be u_int.
- this type might be supported, but might be much slower than
  uint_fast16_t.  This is probably the case on arches like old alpha
  where the hardware only does wider or narrower memory accesses so
  has do loads and stores of different sizes and repack
- this type might be supported, but might be slower than uint_fast16_t.
  This is the case on arches like old i386 where movz was slow and
  instruction fetch bandwidth was also inadquate (so the extra movz's
  were even slower)
- this type might be supported, but might be a little slower than
  uint_fast16_t.  This was the case on arches like current x86.
  Extra movz instructions are still required, but these excecute fast
  on current x86.

Kernel code needs fixed-width types in many places in hardware and network
storage layouts.

Fixed-width types are also useful for saving space in software storage
layouts.  But they are just logically wrong for that.  The
[u]int_leastN_t are logically right for that.  But in practice, all
the optional [u]intN_t types exist the [u]int_leastN_t types are not
needed.  Using either leads to a type morasses so is rarely done.  It
is never done in the kernel.

Fixed-width types should not be used except for storage.  They should
not be part of ABIs except (quite often) to force a specific ABI (in
this case, only "fast" fixed-width types should be used in the ABI,
but then the ABI won't be optimal forever).  They should be converted
to "fast" types when they are loaded and not converted back until they
are stored to non-temporary variables.  This leads to type morasses
so is rarely done.  It is never done in the kernel.

Before  existed, the kernel hard-coded uint8_t as u_char,
etc.  The 8-bit cases of this are now standard (u_char must be uint8_t
on newer POSIX).  The kernel now uses fixed-width types excessively.
Not only in storage layouts, but also for function calls and local
variables using types copied from the storage structs.  Fixed-width
instead of "fast" integer types are also pessimal for function calls.
Inlining of functions reduces this pessimization.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r291074 - in head: share/man/man9 sys/kern sys/vm

2015-11-19 Thread Bruce Evans

On Thu, 19 Nov 2015, Jonathan T. Looney wrote:


On 11/19/15, 12:58 PM, "John Baldwin"  wrote:

Note that if you are going to document in each section 9 manpage which
APIs
are not safe to call in a critical section, you will need to update just
about every section 9 manpage.  A more prudent approach would probably be
to
instead go the sigaction(2) route and instead document the subset of APIs
which are permissible to call in critical_enter(9).  The list is probably
not
very long.  Off the top of my head I can think of sched_*, swi_sched,
taskqueue_enqueue_fast, and little else.


Point taken.


Both the man page and the KASSERT()s with redundant (yet incomplete) "or"
clause about spinlocks.

The change also has some style bugs (extra blank lines).

The above list is similar to the list of APIs that may be called by a
"fast" interrupt handler.  It is much smaller than the full list, but
still unmanageably large.  The closure of the list is difficult to
determined, and the closure of the list of calls actually made is
infinite due to bugs like console drivers calling kdb_alt_break() which
calls everything via reboot() and panic() (also via kdb_enter(), but
the only correct reason for existence of kdb_alt_break() is to enter
kdb, and kdb is a little more careful about not calling everything).


As noted above, the point wasn't to enable checking when WITNESS was
disabled; rather, the point was to make the existing checks more
consistent. Basically, if you do something that engenders a panic at high
scale, you should get consistent behavior at low scale, too.


Checking that is actually done is also likely to cause recursive panics
when panic() is called with in a critical section or in another bad
state.  This may be considered as a feature.  It limits the damage
when panic() wanders into normal code does unsafe things for the
bad state, and inhibits bugs like kdb_alt_break() calling panic().

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r290014 - in stable/10: lib/libthr/arch/amd64 lib/libthr/arch/i386 libexec/rtld-elf/amd64 libexec/rtld-elf/i386 share/mk

2015-11-15 Thread Bruce Evans

On Sun, 15 Nov 2015, Konstantin Belousov wrote:


On Sat, Nov 14, 2015 at 06:30:13PM +, David Chisnall wrote:

On 26 Oct 2015, at 16:21, Eric van Gyzen  wrote:


One counter-argument to this change is that most applications already
 use SIMD, and the number of applications and amount of SIMD usage
 are only increasing.


Note that SSE and SIMD are not the same thing.  The x86-64 ABI uses SSE 
registers for floating point arguments, so even a purely scalar application 
that uses floating point will end up faulting in the SSE state.  This is not 
the case on IA32, where x87 registers are used (though when compiling for i686, 
SSE is used by default because register allocation for x87 is a huge pain).


Is it ?  If SSE is used on i686 (AKA >= Pentium Pro) by default,
this is a huge bug.


clang is not as broken as that.  It needs excessive setting of -march to
get SSE instructions and of course a runtime arch that has SSE to execute
them.  I usually see it by forcing -march=core2 or -march=native on a
host arch that has SSE.

Using SSE instead of x87 on i386 is a usually a small pessimization except
in large functions where the x87 register set is too small or non-scalar
SSE can be used, since the i386 ABI requires returning results in x87
registers and the conversions between SSE and x87 for this have large
latency.  But clang doesn't understand the x87 very well, so it tends to
be faster using SSE despite this.  Strangely, it appears to understand the
x87 better on amd64 than on i386 -- better than gcc on amd64, but worse
than gcc on i386.  I think this is mostly because to kill SSE for
arithmetic on i386 on arches that support it, all use of SSE must be
killed using -mno-sse or -march=lower.  -mfpmath=387 to give fine control
of this is still broken in clang: -march=i386 -mfpmath=387 works, but
-march=core2 -mfpmath=387 fails with "'387' not supported.  Similarly for
any arch that supports sse, or with -mfpmath=sse on an arch where clang
wants to use x87.

gcc supports -mfpmath=387 even on amd64.  This is just slower, and usually
not more accurate, since the ABI forces conversons on function return.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r290711 - head/sys/ofed/drivers/infiniband/core

2015-11-13 Thread Bruce Evans

On Fri, 13 Nov 2015, David Chisnall wrote:


On 13 Nov 2015, at 08:35, Konstantin Belousov  wrote:


On Fri, Nov 13, 2015 at 09:18:54AM +0100, Hans Petter Selasky wrote:

Hi,

On 11/12/15 18:17, Conrad Meyer wrote:

These should cast through (u)intptr_t rather than unsigned long.


This is Linux code, and they use "unsigned long" for pointer casts
everywhere, trying to not break their style.

BTW: I added to linux_compat.c:

CTASSERT(sizeof(unsigned long) == sizeof(uintptr_t));

And it survived my "tinderbox" build and I was surprised!


This is not surprising.  "long" is broken on all supported systems
since no 16-bit systems are supported and misimplementations with longs
the same size as int or register_t are not unusable on 32+ bit systems.
Longs should actually be long.  That means that they should be longer
than register_t, so they are also longer than pointers except on systems
with large pointers like x86 with far pointers.


FreeBSD (at least currently) runs on two kinds of ABIs: ILP32 and LP64.
ILP32 means that sizeof(int) == sizeof(long) == sizeof(void *) == 4.
For LP64, sizeof(long) == sizeof(void *) == 8, while sizeof(int) == 4.
We do not support anything else.


Note that this is not true of all downstreams.  We currently have 128 and 
256-bit void*s with 64-bit longs on CHERI, and I believe that bde???s version 
has 32-bit longs on all platforms.  This kind of code *is* broken for us and 
we???d greatly appreciate people not writing new code that intentionally relies 
on undefined behaviour (round tripping a pointer via any integer type other 
than intptr_t is undefined in C), when a well-defined mechanism exists, just 
because Linux decides to do the wrong thing.


Does CHERI have far pointers or fat pointers?  Does it have newlines?

32-bit longs on all platforms is the last thing that I would have.  I
have them on 16-bit systems (but I last ran one of those in 2009 after
getting it out of the attic).  i386 with correctly-sized longs has
64-bit longs (IP32L64).  I never finished that.  The tiny amount of
support for this needed in i386/include/_limits.h was removed in 2011
(r217128).  gcc just needed to be configured with -DLONG_TYPE_SIZE=64.
Other places mostly just need to use the correct type.  Often the fix
is simply to replace long by int or a typedefed type.  i386/include/
_types.h was already correct or fixed (mostly by using fixed-width
typedefed types).  But sometimes there is an ABI problem.  long in an
API literally asks for pessimal double-width sizes, but sometimes what
it needs is precisely 32 ot 64 bits.

Bruce___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r290613 - head/sys/compat/linuxkpi/common/include/linux

2015-11-10 Thread Bruce Evans

On Tue, 10 Nov 2015, Hans Petter Selasky wrote:


On 11/09/15 22:17, Bruce Evans wrote:

...
This shouldn't compile either.



 static int
-sysctl_root_handler_locked(struct sysctl_oid *oid, void *arg1, intptr_t 
arg2,
+sysctl_root_handler_locked(struct sysctl_oid *oid, void *arg1, intmax_t 
arg2,

 struct sysctl_req *req, struct rm_priotracker *tracker)


Given that the second argument is sometimes used for pointers, maybe we 
should keep it intptr_t. Or add a compile time assert that sizeof(intmax) >= 
sizeof(intptr_t) which I think doesn't hold?


Then it wouldn't be large enough to hold int64_t on i386 or intmax_t on
all arches.  intmax_t is already not large enough to hold uintmax_t.

intmax_t can hold more than intptr_t, but its size and rank may be smaller.
See another reply.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r290613 - head/sys/compat/linuxkpi/common/include/linux

2015-11-10 Thread Bruce Evans

On Tue, 10 Nov 2015, Conrad Meyer wrote:


On Tue, Nov 10, 2015 at 7:08 AM, Ian Lepore  wrote:

On Tue, 2015-11-10 at 08:44 +0100, Hans Petter Selasky wrote:

-sysctl_root_handler_locked(struct sysctl_oid *oid, void *arg1,
intptr_t arg2,
+sysctl_root_handler_locked(struct sysctl_oid *oid, void *arg1,
intmax_t arg2,
 struct sysctl_req *req, struct rm_priotracker *tracker)


Given that the second argument is sometimes used for pointers, maybe
we
should keep it intptr_t. Or add a compile time assert that
sizeof(intmax) >= sizeof(intptr_t) which I think doesn't hold?


If intmax_t is the "maximum width integer type" and intptr_t is
"integer type capable of holding a pointer", I think by definition
sizeof(intmax_t) must be >= sizeof(intptr_t).


intmax_t isn't the "maximum width integer type".  It is a "signed integer
type capable of representing any value of any signed integer type (including
extended ones).  This imples that INTMAX_MAX >= INTPTR_MAX but not that
sizeof(intmax_t) >= sizeof(intptr_t) or that
rankof(intmax_t) >= rankof(intptr_t).  intptr_t may have more padding bits
than intmax_t.  In FreeBSD, rankof(intmax_t) >= rankof(intptr_t), but
rankof(intmax_t) doesn't have maximum rank.  It is smaller than the long
long abomination in rank on all 64-bit arches  All arches declare intmax_t
as int64_t, and this has to be long_long_abomination_t on 32-bit arches
(except I made it a magic gcc type in old versions of FreeBSD and C where
long long was a syntax error), but intmax_t is always plain long on 64
bit arches.  In terms of rank, intptr_t == int64_t == intmax_t <
long_long_abomination_t on 64-bit arches, and
intptr_t < int64_t == intmax_t == long_long_abomination_t on 32-bit arches.


On the other hand, given
the perverse way standards-writers think, I'm not sure "big enough" is
all it takes to qualify as "capable of holding a pointer".  But I think
in reality it'll work out right anyway.


Only on vaxes, unless you write correct code.  E.g., consider the following
reasonable implementation:
- pointers have 64 bits, consisting of 32 high address bits and 32 low
  low type bits
- the type bits for void have value 1 
- intmax_t is also 64 bits

- intptr_t is 32 bits (the type bits need not be represented because they
  are always 1)
- conversion from void * to intmax_t then can and should be to copy all
  the bits.  The void * pointer kernbase with bits 0xc001
  gives the same bit when converted to intmax_t.  The reversion conversion
  also copies the bits.
- conversion from void * to intmax_t must reduce to 32 bits.  It can and
  should do this in the obvious way by discarding the low type bits and
  shifting the top bits to the low bits.  The kernbase becomes
  0xc when converted to intptr_t.  The reverse conversion shifts
  the addres bits back and sets the type bits to their known value for
  void *.
Then wrong code breaks nicely.  (intptr_t)(intmax_t)kernbase gives 1
and there is obviously no way to convert this back to kernbase.  To
use intmax_t for kernbase, you have write (intmax_t)(intptr_t)kernbase.
This has value 0xc000.  Then to get back to void *, first cast
back to intptr_t.

Qualified pointers to void are required to work too.  In the above
implementation, they should have different type bits and some only
difference is that casting back must restore different type bits.
Conversions of other pointers are not required to work, but would
work accidentally.  All the type info needed to recover the original
type bits is in the cast back (provided it is really back to the same
type).


+1 to what Ian said.

In any C99 implementation where intptr_t is defined, I believe
intmax_t must be at least as big.  See ?? 7.18.1.5, "Greatest-width
integer types," and ?? 7.18.1.4, "Integer types capable of holding
object pointers."


The following type designates a signed integer type with the property that any 
valid pointer to void can be converted to this type, then converted back to 
pointer to void, and the result will compare equal to the original pointer: 
intptr_t


Note that it only requires working when the conversion is to this type.
Any conversion from a pointer to int may involve some reduction or
reordering of the bits.  The conversion might depend on the type of
both the integer and the pointer in more complicated ways than the
above.  The requirement that the conversions are not required to work
for all "large enough" types makes things simpler for everyone.


The following type designates a signed integer type capable of representing any 
value of any signed integer type: intmax_t


Given that intptr_t exists in our implementation and is a signed
integer type, I see no reason why intmax_t could possibly not
represent any such value.  Same argument for the unsigned variants.


It can only represent integers directly.  intptr_t is only required to
represent pointers with some unique encoding.  Other integer types are
only required to represent 

Re: svn commit: r290613 - head/sys/compat/linuxkpi/common/include/linux

2015-11-10 Thread Bruce Evans

On Tue, 10 Nov 2015, Justin Hibbits wrote:


On Tue, Nov 10, 2015 at 9:42 AM, Conrad Meyer  wrote:

...
Given that intptr_t exists in our implementation and is a signed
integer type, I see no reason why intmax_t could possibly not
represent any such value.  Same argument for the unsigned variants.


I may be wrong on this, but I *think* uintptr_t/intptr_t are required
to be *precisely* the same size as a pointer, which explains why you
can't cast directly from uintmax_t on 32-bit architectures.


Good compilers use the size for a simple portability check.  They
should also complain if the types are different but have the same size,
as is done for int vs long format mismatches on i386.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r290613 - head/sys/compat/linuxkpi/common/include/linux

2015-11-09 Thread Bruce Evans

On Mon, 9 Nov 2015, Conrad E. Meyer wrote:


Log:
 linuxkpi/sysfs.h: Cast arg2 through intptr_t to avoid GCC warning

 The code compiles fine under Clang, but GCC on PPC is less permissive about
 integer and pointer sizes.  (An intmax_t is clearly *large enough* to hold a
 pointer value.)

 Another follow-up to r290475.


This shouldn't compile either.


Modified: head/sys/compat/linuxkpi/common/include/linux/sysfs.h
==
--- head/sys/compat/linuxkpi/common/include/linux/sysfs.h   Mon Nov  9 
15:59:42 2015(r290612)
+++ head/sys/compat/linuxkpi/common/include/linux/sysfs.h   Mon Nov  9 
16:50:42 2015(r290613)
@@ -80,7 +80,7 @@ sysctl_handle_attr(SYSCTL_HANDLER_ARGS)
ssize_t len;

kobj = arg1;
-   attr = (struct attribute *)arg2;
+   attr = (struct attribute *)(intptr_t)arg2;


This can have any result (except undefined behviour) since the pointer type
is not void *.  No warning is required but a good compiler would give 1.


if (kobj->ktype == NULL || kobj->ktype->sysfs_ops == NULL)
return (ENODEV);
buf = (char *)get_zeroed_page(GFP_KERNEL);


This shouldn't compile either.  The pointer type is not void *, and the
integer type is neither intptr_t nor uintptr_t except accidentally on
some arches (it is unsigned long), while uintptr_t is unsigned on i386
and unsigned long on amd64.  No warning is required, but a good compiler
would give 1 and a half.

This works on x86 of course, and the code isn't required to work on anything
else, but the compiler doesn't know this so should warn about logical
type mismatches.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r290605 - in head/lib/msun: . man

2015-11-09 Thread Bruce Evans

On Mon, 9 Nov 2015, Garrett Cooper wrote:


Log:
 Document powl(3)


powl was garbage that was intentionally undocumented.  At least, I
intentionally ignored its non-documentation together with it.


Modified: head/lib/msun/man/exp.3
==
--- head/lib/msun/man/exp.3 Mon Nov  9 10:35:33 2015(r290604)
+++ head/lib/msun/man/exp.3 Mon Nov  9 10:40:16 2015(r290605)
@@ -99,9 +102,10 @@ functions compute the value exp(x)\-1 ac
.Fa x .
.Pp
The
-.Fn pow
+.Fn pow ,
+.Fn powf ,
and the
-.Fn powf
+.Fn powl
functions compute the value
of
.Ar x


powl doesn't compute the value of .Ar x to the exponent .Ar y.  It computes
the value of (double)(.Ar x) to the exponent (double)(.Ar y), converted to
double.


@@ -122,9 +126,10 @@ Otherwise the error in these functions i
These functions will return the appropriate computation unless an error
occurs or an argument is out of range.


powl() almost never returns the appropriate computation.  Its bugs are
most obvious when an argument is too large for double but not out of range.


The functions
-.Fn pow x y
+.Fn pow x y ,
+.Fn powf x y ,
and
-.Fn powf x y
+.Fn powl x y
raise an invalid exception and return an \*(Na if
.Fa x
< 0 and


I doubt that the rest of this section (about exception handling) is
correct for powl().

The section on errors wasn't changed since it uses generic pow().  It
claims that the error is generally less than 1 ulp.  But for powl(),
the error is generally more that 4096 ulps.  Much more when the
error exponentiates.  powf() and pow() do well to avoid exponentiation
of roundoff errors.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r290475 - in head: share/man/man9 sys/kern sys/sys

2015-11-07 Thread Bruce Evans

On Sat, 7 Nov 2015, Konstantin Belousov wrote:


On Sat, Nov 07, 2015 at 09:25:32PM +1100, Bruce Evans wrote:

...
I put intptr_t in  long ago, since it was more needed and
less magic than intmax_t.  This was obfuscated by moving it to
 and including that in various places.  intmax_t is
still only in  which is much larger.  It and uintmax_t
should be together with intptr_t.  That is more polluting in theory
but less in practice.


In other words, do you suggest the following change to fix the compilation ?

diff --git a/sys/sys/_stdint.h b/sys/sys/_stdint.h
index d0f9249..a0fe0ad 100644
--- a/sys/sys/_stdint.h
+++ b/sys/sys/_stdint.h
@@ -78,5 +78,13 @@ typedef  __intptr_t  intptr_t;
typedef __uintptr_t uintptr_t;
#define _UINTPTR_T_DECLARED
#endif
+#ifndef _INTMAX_T_DECLARED
+typedef__intmax_t  intmax_t;
+#define_INTMAX_T_DECLARED
+#endif
+#ifndef _UINTMAX_T_DECLARED
+typedef__uintmax_t uintmax_t;
+#define_UINTMAX_T_DECLARED
+#endif

#endif /* !_SYS__STDINT_H_ */
...


Yes, but some source file is apparently not including .
It is probably trying to be too smart and only including .
This bug was detected as a side effect of  growing a
dependency on .

The most exotic type used in  is counter_u64_t.  The magic
for getting this defined without much namespace pollution seems to be
that it is only used in a macro that is only used by source files that
include .  Macros often work like that.  Here intmax_t
is always needed since it is in an unconditional declaration.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r290475 - in head: share/man/man9 sys/kern sys/sys

2015-11-07 Thread Bruce Evans

On Sat, 7 Nov 2015, Conrad E. Meyer wrote:


Log:
 Round out SYSCTL macros to the full set of fixed-width types

 Add S8, S16, S32, and U32 types;  add SYSCTL*() macros for them, as well
 as for the existing 64-bit types.  (While SYSCTL*QUAD and UQUAD macros
 already exist, they do not take the same sort of 'val' parameter that
 the other macros do.)


Actually, the *QUAD macros do take a 'val' parameter (that didn't
work), except for the dynamic _ADD forms.  SYSCTL_ADD_INT() also takes
this parameter.  It works there, but is even less useful than for
SYSCTL_INT() (it is always possibly to point to a variable).
SYSCTL_ADD_LONG() is most inconsistent.  It could take a 'val' parameter
that would work like for SYSCTL_ADD_INT(), but is like SYSCTL_ADD_QUAD()
and doesn't.


 Clean up the documented "types" in the sysctl.9 document.  (These are
 macros and thus not real types, but the manual page documents intent.)

 The sysctl_add_oid(9) arg2 has been bumped from intptr_t to intmax_t to
 accommodate 64-bit types on 32-bit pointer architectures.


This arg is handled poorly in many places.

This is a large ABI change.  arg2 is rarely used for scalar sysctls so
its use for 64-bit types should be disallowed for all cases instead of
fixed or allowed for more cases.

Support for 'val' was already left out for the ADD*QUAD macros, perhaps
because stronger type checking in these macros made its brokenness
more obvious.  This arg is not very useful, but was made available at
essentially no cost to save a few bytes.  Its cost was negative since
there was a spare variable in the oid struct that could be repurposed
without too many type hacks (it should really be void *, but was
intptr_t to represent both void * and small integers).  Now its costs
are thousands of bytes on 32-bit systems by expanding all oid structs,
and larger type hacks.  I can't actually find any use of arg2 for a
pointer.  In most cases where it is really needed, it is a small
integer giving a size.  sysctl_handle_opaque() is such a case.


 This is just the kernel support piece; the userspace sysctl(1) support
 will follow in a later patch.


You mean sysctl(8).  There is also sysctl(3).  Both have type info, with
too much duplication and too many abstract types in sysctl(3) where the
raw C types are of interest.  The lists of sysctls in sysctl(3) and
sysctl(8) are almost useless since they are so incomplete, but they
are in a nicer format than other lists.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r290475 - in head: share/man/man9 sys/kern sys/sys

2015-11-07 Thread Bruce Evans

On Sat, 7 Nov 2015, Svatopluk Kraus wrote:


You broke buildkernel. The following patch helps:

diff --git a/sys/sys/sysctl.h b/sys/sys/sysctl.h
index 950e712..a34c890 100644
--- a/sys/sys/sysctl.h
+++ b/sys/sys/sysctl.h
@@ -37,6 +37,7 @@
#define _SYS_SYSCTL_H_

#include 
+#include 

struct thread;
/*


This adds namespace pollution.  It shouldn't be needed.  
is standard namespace pollution in .  (It has a bogus
comment there describing only one thing that it is for.)

This pollution is already in dnv.h, nv.h and racct.h.


@@ -949,7 +950,7 @@ extern char kern_ident[];
/* Dynamic oid handling */
struct sysctl_oid *sysctl_add_oid(struct sysctl_ctx_list *clist,
 struct sysctl_oid_list *parent, int nbr, const char *name, int kind,
- void *arg1, intptr_t arg2, int (*handler)(SYSCTL_HANDLER_ARGS),
+ void *arg1, intmax_t arg2, int (*handler)(SYSCTL_HANDLER_ARGS),
 const char *fmt, const char *descr);
int sysctl_remove_name(struct sysctl_oid *parent, const char *name, int del,
 int recurse);


Apparently the original change wasn't tested on 32-bit arches.

I put intptr_t in  long ago, since it was more needed and
less magic than intmax_t.  This was obfuscated by moving it to
 and including that in various places.  intmax_t is
still only in  which is much larger.  It and uintmax_t
should be together with intptr_t.  That is more polluting in theory
but less in practice.

 doesn't declare much else directly, but includes
 for the most important types (fixed-width ones).  It
mainly declares least-width and fast-width types directly.  These
should be used in most cases where fixed-width types are now used,
especially in kernels, but their usability is shown by them almost
never being used.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r290505 - in head/sys: kern sys

2015-11-07 Thread Bruce Evans

On Sat, 7 Nov 2015, Conrad E. Meyer wrote:


Log:
 Flesh out sysctl types further (follow-up of r290475)

 Use the right intmax_t type instead of intptr_t in a few remaining
 places.

 Add support for CTLFLAG_TUN for the new fixed with types.  Bruce will be
 upset that the new handlers silently truncate tuned quad-sized inputs,
 but so do all of the existing handlers.


I think I have complained about the getenv_*() integer functions before.
All of them truncate or otherwise corrupt the value.  getenv_quad()
does non-blind clamping using strtoq() followed by blind scaling in
the suffix case.  All others use getenv_quad() with blind truncation,
except on 64-bit arches long == quad so getenv_long() only has the same
errors as getenv_quad().


Modified: head/sys/kern/kern_sysctl.c
==
--- head/sys/kern/kern_sysctl.c Sat Nov  7 18:26:02 2015(r290504)
+++ head/sys/kern/kern_sysctl.c Sat Nov  7 18:26:32 2015(r290505)
...
+   case CTLTYPE_S32:
+   if (getenv_long(path + rem, _long) == 0)
+   return;
+   val_32 = val_long;
+   req.newlen = sizeof(val_32);
+   req.newptr = _32;
+   break;


This should use getenv_int().  FreeBSD never supported 16-bit ints, and
POSIX now requires >= 32-bit ints.


@@ -250,6 +274,27 @@ sysctl_load_tunable_by_oid_locked(struct
req.newlen = sizeof(val_64);
req.newptr = _64;
break;
+   case CTLTYPE_U8:
+   if (getenv_uint(path + rem, (unsigned int *)_int) == 0)
+   return;
+   val_8 = val_int;
+   req.newlen = sizeof(val_8);
+   req.newptr = _8;
+   break;
+   case CTLTYPE_U16:
+   if (getenv_uint(path + rem, (unsigned int *)_int) == 0)
+   return;
+   val_16 = val_int;
+   req.newlen = sizeof(val_16);
+   req.newptr = _16;
+   break;


These could use getenv_int() since int is larger than int17_t.  Will null
error checking, there would be little difference for the overflows caused
by negative values.  With non-null error checking, the checking would be
slightly different.


+   case CTLTYPE_U32:
+   if (getenv_ulong(path + rem, (unsigned long *)_long) == 0)
+   return;
+   val_32 = val_long;
+   req.newlen = sizeof(val_32);
+   req.newptr = _32;
+   break;


Like for S32.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r290221 - head/sys/powerpc/powerpc

2015-10-31 Thread Bruce Evans

On Fri, 30 Oct 2015, Justin Hibbits wrote:


On Oct 30, 2015, at 9:27 PM, Conrad Meyer wrote:


Comments inline.

On Fri, Oct 30, 2015 at 7:08 PM, Justin Hibbits  
wrote:

Author: jhibbits
Date: Sat Oct 31 02:08:39 2015
New Revision: 290221
URL: https://svnweb.freebsd.org/changeset/base/290221

Log:
 Print unsigned memory sizes, to handle >2GB RAM on 32-bit powerpc.
...
= 
=
--- head/sys/powerpc/powerpc/machdep.c  Sat Oct 31 02:07:30 2015 
(r290220)
+++ head/sys/powerpc/powerpc/machdep.c  Sat Oct 31 02:08:39 2015 
(r290221)

@@ -176,12 +176,12 @@ cpu_startup(void *dummy)
#ifdef PERFMON
   perfmon_init();
#endif
-   printf("real memory  = %ld (%ld MB)\n", ptoa(physmem),
+   printf("real memory  = %lu (%lu MB)\n", ptoa(physmem),
   ptoa(physmem) / 1048576);


Shouldn't this be "real memory = %ju (%lu MB)\n" and
(uintmax_t)ptoa(physmem), or it may overflow on >4GB RAM systems?


No.  Any overflow that may occur happens in ptoa() before this cast
can effect it.  Such overflow is supposed to be prevented by bogusly
casting to the undocumented type unsigned long in the undocumented
macro ptoa().  %lu is correct since it matches this type.  Any mismatch
(except the old sign mismatch) would be detected by printf format
checking.  Casting to uintmax_t would have no effect except to bloat
the code from 32 bits to 64 bits in the 32-bit case and simplify
matching the printf format with the type.

There are about 100 different combinations of bogus types, bogus casts,
no casts, and printf formats in various arches.  Perhaps the bogus
cast in powerpc32 ptoa() doesn't actually work.  It only works up to
4GB.  i386 used to have that limit and had to be changed for PAE.  It
now uses no cast in ptoa().  Callers must cast the arg to vm_paddr_t
for efficiency or uintmax_t for simplicity.  That is best since it
forces them to be careful with types.  The i386 code corresponding to
the above uses uintmax_t for everything for simplicity.

The bogus types start with physmem being long.  That is more than large
enough, at least on 32-bit arches, since its units are pages.  But it
is inconsistent with vm's page counters mostly having type u_int except
where.  On 64-bit arches, long is much longer than u_int, so using it
tends to mask bugs, but on 32-bit arches it is shorter than u_int, so
using it tends to unmask bugs.  In the above, its signedness makes little
difference since the bogus cast in the macro kills its sign bit.

Yes, it should, and it will.  However, currently ptoa() casts to unsigned 
long, and I didn't want to change that yet (I will eventually).  In fact, the 
machine I'm testing on has 8GB, which I've had to artificially limit to <4GB 
because of the various memory accounting size limits.


Similar to i386 with PAE.


   realmem = physmem;

   if (bootverbose)
-   printf("available KVA = %zd (%zd MB)\n",
+   printf("available KVA = %zu (%zu MB)\n",
   virtual_end - virtual_avail,
   (virtual_end - virtual_avail) / 1048576);



This has logical type mismatches.  The types are vm_offset_t, and it is
assumed that this is the same as size_t so that it matches size_t.  This
defeats the reason for existence of fancy types like vm_offset_t.  MD
code can more easily just hard-code vm_offset_t and size_t as u_int ot
u_long everywhere.  This only works for virtual addresses.  The types
remain useful in MI code and for keeping really different things like
physical memory sizes different.

%zu is also logically wrong for the size in MB.  size_t is only logically
correct for sizes in bytes.  The expression for the size in MB normally
has type size_t too, but this is MD.  vm has a vm_size_t type to add
to the confusion.  It is hard to remember to convert between offsets and
sizes when they are physically identical.  But IIRC, vm doesn't have
a type for its page counts where the differences are physical.


@@ -199,7 +199,7 @@ cpu_startup(void *dummy)
   #ifdef __powerpc64__
   printf("0x%016lx - 0x%016lx, %ld bytes (%ld 
pages)\n",

   #else
-   printf("0x%08x - 0x%08x, %d bytes (%ld pages)\n",
+   printf("0x%08x - 0x%08x, %u bytes (%lu pages)\n",


It seems wrong that bytes is only %u here and pages is a %lu.  I think
bytes should probably be %ju too?  What is the type of size1?


This is perfectly backwards.  Byte counts are large so they need to have
type vm_paddr_t or larger.  Page counts are small so they can be signed
int.

But surely the types are correct here since the printf format checker
would have detected any mismatch except for signs?  Bogus long format
specifiers are likely to be needed to match the bogus long type of
physmem.

All of the code just before here has type errors.  It starts with the
style bugs of a nested declaration of 

Re: svn commit: r289765 - in head/sys: conf libkern sys

2015-10-23 Thread Bruce Evans

On Thu, 22 Oct 2015, Conrad E. Meyer wrote:


Log:
 Add libkern ffsll() for parity with flsll()

 Sponsored by:  EMC / Isilon Storage Division



This fills a much needed gap, like the one filled by flsll().

It uses the long long abomination.

uintmax_t has been Standard since 1999, but is still not supported
by the ffs() family.

It further expands the design error in ffs().
  (ffs() only supports ints.  Its behaviour on negative values is
  undocumented, but needs documentation more than for non-negative
  values because even in the 2's complement case, the sign bit can
  be anywhere).  The bad BSD documented is for ffs() is duplicated
  in POSIX in at least old versions.  I think this requires ffs()
  to operate on values, so it cannot work in the 1's complement
  case.  The -0 has all bits 1 but value 0.  ffs(-0) by value must
  be -1, and ABIs are probably allowed to pass -0 as +0 so ffs(-0)
  is probably -1 even if ffs() is by bits.)
Due to this design error, it is not obvious that even u_char is
directly supported by the ffs() family.  It is supported because
POSIX requires 8-bit chars so chars get promoted to non-negative
ints when passed to ffs().  This doesn't happen for u_int, but
you can promote u_int manually to long long to get correct and
pessimal code.  This doesn't work for u_foo that is already
of larger rank than long long.


Added: head/sys/libkern/ffsll.c
==
--- /dev/null   00:00:00 1970   (empty, because file is newly added)
+++ head/sys/libkern/ffsll.cThu Oct 22 20:28:37 2015(r289765)
+ ...
+/*
+ * Find First Set bit
+ */
+int
+ffsll(long long mask)
+{
+   int bit;
+
+   if (mask == 0)
+   return (0);
+   for (bit = 1; !(mask & 1); bit++)
+   mask = (unsigned long long)mask >> 1;
+   return (bit);
+}


This has the usual implementation for negative values.  They are cast
to the correct unsigned type to get defined behaviour for the shift.
The cast itself gives implementation-defined behaviour.  Even when the
compiler was gcc so that it was documented in man pages and info pages,
the documentation for this was hard to find.  Whatever it is, it gives
the undocumented behaviour of the function.

This implementation is pessimal unless the compiler can convert it to
special instructions.  On i386, 2 32-bit ffs()'s are more portable and
much faster, since each one uses a special instruction and it is not
possible to do much better.  (But fls64() can be done better in userland
using the FPU or perhaps SSE.)

clang optimizes similar simple loops for popcount() but doesn't optimize
the ones used in libc for the ffs() family.  If it did optimize these
loops, 2 32-bit ffs()'s with asms inside them wouldn't be so good, but
for bswap64() we arranged for compilers to good combining for even more
complicated variations.  (For bswap*, clang converts the complicated C
expression giving the rearrangement to minimal bswap instruction(s),
but we hard-code these instructions in asm because gcc can't do this
in the old versions in FreeBSD, and then arrange so that clang can
combine the asms efficiently on i386.)

Efficiency of the ffs() family is rarely important.  bitset.h uses ffsl()
but bitstring.h uses a simple loop.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r289773 - in head: sbin/sysctl sys/kern sys/sys

2015-10-22 Thread Bruce Evans

On Thu, 22 Oct 2015, Conrad E. Meyer wrote:


Log:
 Sysctl: Add common support for U8, U16 types

 Sponsored by:  EMC / Isilon Storage Division


This uses similar low-quality code to other sysctls, and more.


Modified: head/sbin/sysctl/sysctl.c
==
--- head/sbin/sysctl/sysctl.c   Thu Oct 22 22:29:25 2015(r289772)
+++ head/sbin/sysctl/sysctl.c   Thu Oct 22 23:03:06 2015(r289773)
@@ -96,6 +96,8 @@ static int ctl_size[CTLTYPE+1] = {
[CTLTYPE_ULONG] = sizeof(u_long),
[CTLTYPE_S64] = sizeof(int64_t),
[CTLTYPE_U64] = sizeof(uint64_t),
+   [CTLTYPE_U8] = sizeof(uint8_t),
+   [CTLTYPE_U16] = sizeof(uint16_t),
};


Unsorting.  The table was sorted on size, then alphabetically.
Alphabetical ordering also happens to give ordering on rank.  (This
ordering is the inverse of the ordering for declarations.)



static const char *ctl_typename[CTLTYPE+1] = {
@@ -105,6 +107,8 @@ static const char *ctl_typename[CTLTYPE+
[CTLTYPE_ULONG] = "unsigned long",
[CTLTYPE_S64] = "int64_t",
[CTLTYPE_U64] = "uint64_t",
+   [CTLTYPE_U8] = "uint8_t",
+   [CTLTYPE_U16] = "uint16_t",
};


Consistent unsorting.



static void
@@ -221,6 +225,8 @@ parse(const char *string, int lineno)
int len, i, j;
const void *newval;
const char *newvalstr = NULL;
+   uint8_t u8val;
+   uint16_t u16val;
int intval;
unsigned int uintval;
long longval;


Locally consistent, but backwards sorting.  The *val variables were sorted
on rank, and the addition matches this, but style(9) requires inverse
sorting on rank.  Other local variables are randomly ordered.


@@ -322,6 +328,8 @@ parse(const char *string, int lineno)
}

switch (kind & CTLTYPE) {
+   case CTLTYPE_U8:
+   case CTLTYPE_U16:
case CTLTYPE_INT:
case CTLTYPE_UINT:
case CTLTYPE_LONG:


Consistent sorting.  The types are sorted on rank.  Style(9) requires
sorting case stylements alpahbetically, but it is good to break that
here and sort on rank or inverse rank to match other orderings.


@@ -345,6 +353,16 @@ parse(const char *string, int lineno)
errno = 0;

switch (kind & CTLTYPE) {
+   case CTLTYPE_U8:
+   u8val = (uint8_t)strtoul(newvalstr, , 0);
+   newval = 
+   newsize = sizeof(u8val);
+   break;


Bug: blind truncation gives overflow before overflow can be checked for.
The overflow checking done by the subsequent errno check only checks for
representability of the return value.

Style bug: bogus cast to make the overflow more explicit and/or prevent
the compiler from detecting the bug and misleading the reader into thinking
that overflow is intentionally allowed.  The cast usually has no other
effect (the behaviour on conversion is only implementation-defined in
the overflow case.  Perhaps it can be different with the cast).

The other assignments are mostly similarly broken:
- CTLTYPE_INT: similar bogus cast to int
- CTLTYPE_UINT: unsimilar bogus cast to int.  The correct cast to u_int
  would have no effect, but casting to int gratuitously gives
  implementation defined behaviour even for representable values like
  UINT_MAX
- CTLTYPE_LONG, CTLTYPE_ULONG: correct (null conversion with no cast)
- CTLTYPE_S64: no cast, though overflow can occur if int64_t < intmax_t
- CTLTYPE_U64: no cast.  Correct since the type is unsigned
- CTLTYPE_IMAX, CTLTYPE_UIMAX: unsupported.


+   case CTLTYPE_U16:
+   u16val = (uint16_t)strtoul(newvalstr, , 
0);
+   newval = 
+   newsize = sizeof(u16val);
+   break;


As a bug, but another style bug from the cast (ling too long).


case CTLTYPE_INT:
if (strncmp(fmt, "IK", 2) == 0)
intval = strIKtoi(newvalstr, , 
fmt);


The error handling is somewhat obfuscated by having general code after the
switch statement to avoid almost repeating it for each case.  It could be
simplified by always assigning to uintmax_t uimaxval.  Then do general
error handling.  Then do specific error handling (range checking for each
case).

The error handling also has a silly check for empty values before the
main check.  The general error handling finds empty values and much
more as a special case of disallowing values with no digits (e.g.,
nonempty leading whitespace and/or sign before the empty value).  So
all the silly check does is give a special error message for empty
values.  Not worth it.

Style bug: you forgot to add the new types to the silly check.


free(oval);

Re: svn commit: r289332 - head/tools/regression/lib/msun

2015-10-17 Thread Bruce Evans

On Sat, 17 Oct 2015, Konstantin Belousov wrote:


On Thu, Oct 15, 2015 at 10:12:03PM +1100, Bruce Evans wrote:

On Thu, 15 Oct 2015, Konstantin Belousov wrote:


On Wed, Oct 14, 2015 at 08:22:12PM +, Garrett Cooper wrote:

Author: ngie
Date: Wed Oct 14 20:22:12 2015
New Revision: 289332
URL: https://svnweb.freebsd.org/changeset/base/289332

Log:
  Fix test-fenv:test_dfl_env when run on some amd64 CPUs

  Compare the fields that the AMD [1] and Intel [2] specs say will be
  set once fnstenv returns.

  Not all amd64 capable processors zero out the env.__x87.__other field
  (example: AMD Opteron 6308). The AMD64/x64 specs aren't explicit on what the
  env.__x87.__other field will contain after fnstenv is executed, so the values
  in env.__x87.__other could be filled with arbitrary data depending on how the
  CPU-specific implementation of fnstenv.

No Intel or AMD CPU write to __other field at all.


No, they all do.

Test on old i386 on old A64:
...

No, I did not thought about fegetenv() as executing FXSAVE instruction.
I did knew that fegetenv() is a wrapper around FNSAVE, and I was completely
sure that FNSAVE in the long mode, when executed by a 64bit program,
never writes anything into the %eip/data offset portion of the FNSAVE
area.  I suspect that this was motivated by unavoidable 32-bitness of
the store format.


fegetenv() is actually a wrapper around FNSTENV, and FNSTENV is a subset
of FNSAVE.


I was unable to find a reference in either Intel SDM or in AMD APM which
would support my statement.  The closest thing is the claim that the FOP
field is not filled, in the SDM.  Still, I wonder how things are really
arranged by hardware for FNSAVE in 64bit mode.


Let's check FOP below.


Are your experiments below were done for the 32bit programs, or for 64bit ?
Both FXSAVE and XSAVE area formats and rules would be irrelevant for the
FreeBSD ABI.


For amd64, I used the freefall default which I think is long mode, but
there is some magic for the pointer size (small model?)

BTW, -m32 has been broken on freefall for years.  At least libgcc.a is
incompatible or not installed.


...
Modified state for fegetenv():
X 005C  7F 12 00 00 00 00 80 1F FF FF FF FF 63 82 04 08
 --cw- -mxhi --sw- -mxlo --tw- -pad- fip
  -
X 006C  1F 00 1D 05 A0 92 04 08 2F 00 FF FF
 -fcs- -opc- foff--- -fds- -pad-
other[16]--


FOP (opc) is clearly filled on i386 (32-bit mode).


...
Test on -current amd64 on Xeon (freefall):
...
Later fegetenv():
X 0060  7f 03 ff ff 00 00 ff ff  ff ff ff ff 75 08 40 00
X 0070  43 00 1c 05 20 62 60 00  3b 00 ff ff 80 1f 00 00


FOP is filled to 1c 05 on freefall and to 1D 05 on my old i386.  But the
instruction is the same (fstpl).  The difference is a different encoding
of the direct address mode.

Futher testing:

Only small model seems to be supported.  I got relocation errors with
messages about R_X86_64_32S for an array of size 4G.

malloc() works to allocate arrays larger than 4G.  Writing to addresses
above 4G in such arrays or on the stack never gave 64-bit offsets.
It truncated the offsets to 32 bits and still printed the segment
register.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r289405 - head/sys/ufs/ffs

2015-10-16 Thread Bruce Evans

On Fri, 16 Oct 2015, Warner Losh wrote:


Log:
 Do not relocate extents to make them contiguous if the underlying drive can do
 deletions. Ability to do deletions is a strong indication that this
 optimization will not help performance. It will only generate extra write
 traffic. These devices are typically flash based and have a limited number of
 write cycles. In addition, making the file contiguous in LBA space doesn't
 improve the access times from flash devices because they have no seek time.

 Reviewed by: mckusick@


Actually, making the file contiguous does improve the access time, probably
by relatively more for flash devices, since for fast devices the number of
i/o's per second is a bottleneck and discontiguous files give many more
i/o's per second.

E.g., suppose the block size is 16K and the transfer rate is 1GB/sec.
This requires 64K i/o's per second (iops) to keep up with the data and
many more to keep up with the metadata.  Completely discontiguous files
are limited to this rate.  But clustering of large contiguous files
increases the block size to 128K, so you only need 8K iops to keep up
with the data.

I think turning of reallocation always gives 1 discontiguous block
for medium-sized files, but not many more than that.  That still
doubles ot triples the number of data i/o's for files of size about
128K (1 block is often split into 3 by a seek in the middle).


Modified: head/sys/ufs/ffs/ffs_alloc.c
==
--- head/sys/ufs/ffs/ffs_alloc.cFri Oct 16 03:03:04 2015
(r289404)
+++ head/sys/ufs/ffs/ffs_alloc.cFri Oct 16 03:06:02 2015
(r289405)
@@ -481,9 +481,19 @@ ffs_reallocblks(ap)
struct cluster_save *a_buflist;
} */ *ap;
{
+   struct ufsmount *ump;

-   if (doreallocblks == 0)


The correct way to configure this is a mount option, not this sysctl
variable.  I think the variable was only intended for turning off
reallocation when it was buggy.  In 4.4SD-Lite2, this variable wasn't
even private for ffs, and old versions of FreeBSD misused it in ext2fs.

The related sysctl variables noclusterr and noclusterw were misconfigured
similarly in 4.4BSD-Lite2, but FreeBSD fixed this by turning them into
mount options, despite them probably being less important than
doreallocblocks.  I only use them to see if vfs clustering is still
useful.  Unfortunately, it still is in most cases.  It is too complicated,
and too heavyweight.  But its weight is still smaller than more i/o's
for smaller blocks, at least on non-memory disks.  I think the cleanup
was motivated mainly for non-automatic use of the flags on memory disks
in pc98.

Configuration of memory disks in main memory is also badly supported.
I think md(4) still gives double-caching for all types of backing store,
so if iops is not a problem then doreallocblks and cluster[rw] should
be turned off in all cases for md to recover a small part of the loss
from the double-caching, but md doesn't know anything about this.

Oops, actually md does try to avoid the double-caching, but it does
this for all reads (by using IO_DIRECT) for all types of backing store,
and this destroys performance for at least the case of vnode-backed
disks with the vnode on a hard disk.  IO_DIRECT certainly prevents
clustering.  Then if it works as intended to avoid double-caching, it
also gives many more i/o's than necessary if there is a block size
mismatch.  Perhaps 128 times as many for a 64K:512 mismatch (128
reads of different virtual 512-blocks are mapped to 128 reads of the
same physical 64K-block.  IO_DIRECT prevents caching of the physical
block.  The virtual blocks should be clustered into part of 1 128K-
block, but don't seem to be.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r289405 - head/sys/ufs/ffs

2015-10-16 Thread Bruce Evans

On Fri, 16 Oct 2015, Hans Petter Selasky wrote:


On 10/16/15 08:21, Bruce Evans wrote:

[Bruce Evans didn't write:]
In addition, making the file contiguous in LBA space doesn't
  improve the access times from flash devices because they have no seek
time.


This is not exactly true, like Bruce pointed out too. Maybe there should be a 
check, that if the block is too small reallocate it, else leave it for the 
sake of the flash. Doing 1K accesses versus 64K accesses will typically show 
up in the performance benchmark regardless of how fast the underlying medium 
is.


Now I don't unerstand the whole point of the change.  Anything that reduces
i/o's is good, but AFAIK ffs_doreallocblks() is all in software.  Writes
should be delayed so that it doesn't have to do extra i/o's to back out of
committed writes.  Often it reduces the number of writes and increases
their size by making blocks contiguous so that the write can be clustered.
Increasing the write size is especially good for flash devices, but maybe
ffs's default block size is already large enough.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r289405 - head/sys/ufs/ffs

2015-10-16 Thread Bruce Evans

On Fri, 16 Oct 2015, Warner Losh wrote:


On Oct 16, 2015, at 2:12 AM, Hans Petter Selasky <h...@selasky.org> wrote:

On 10/16/15 08:21, Bruce Evans wrote:

In addition, making the file contiguous in LBA space doesn't
 improve the access times from flash devices because they have no seek
time.


This is not exactly true, like Bruce pointed out too. Maybe there should be a 
check, that if the block is too small reallocate it, else leave it for the sake 
of the flash. Doing 1K accesses versus 64K accesses will typically show up in 
the performance benchmark regardless of how fast the underlying medium is.


But that?s not what this does. It isn?t the defrag code that takes the 2-8k 
fragments and squashes them into 16-64k block size for the device. This takes 
the larger blocks and makes sure they are next to each other. This takes large 
contiguous space (like 2MB) and puts as much as possible in a cylinder group. 
That?s totally useless on a flash drive.

Since the sizes of the blocks are so large, moving them won?t change any 
benchmarks.


Not true.  First, the fragment size is actually 512-4K and the block size
is 4K-64K and the block size is that of ffs, not of the device (but the
ffs size should be chosen to be a multiple of the device size).

ffs_doreallocblks() doesn't take large (2MB) blocks and put them in in
a cylinder group.  It takes not very large (4K-64K) ffs blocks and tries
to make them contiguous.  Sometimes this gives large (2MB) contiguous
sets of blocks).  Cylinder groups are irrelevant except that contiguity
is impossible across them.

The pessimization that I was talking about was expanding the number of
i/o's by a large (anything more than 1%) factor by not grouping ffs
blocks.

I know too much about this from benchmarking and fixing the 10-20%
performance loss in soft updates for _reading_ of medium-size files
from pessimizal block allocation.  The loss was larger for soft
updates than for other cases because ffs used its delayed allocation
to perfectly pessimize the location of the indirect block.  10%-20%
is for files in /usr/src with an ffs block size smaller than its
current default of 32K, on a not very fast hard disks.  Most of
these files don't need an indirect block when the ffs block size is
32K or even 16K.  It is surprising that the effect is so large when
the block size is even 16K.  Then only files larger than 192K need
an indirect block.  The size of the effect also depends on the
amount of caching in disks.  With none, any seek kills performance
by much more than 10-20%.  With some, small seeks are almost free.
They just take another i/o to access the disk's cache provided the
disk does good read-ahead.

The bug seems to be that ffs_doreallocblks() does trimming for _all_
blocks that it looks at although most of them were just written with
a delayed write and never reached the disk.  Even the extra i/o's
for this are not good.  Hopefully the disk does nothing for trims
on unwritten blocks.

Bruce___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"

Re: svn commit: r289332 - head/tools/regression/lib/msun

2015-10-15 Thread Bruce Evans

On Thu, 15 Oct 2015, Konstantin Belousov wrote:


On Wed, Oct 14, 2015 at 08:22:12PM +, Garrett Cooper wrote:

Author: ngie
Date: Wed Oct 14 20:22:12 2015
New Revision: 289332
URL: https://svnweb.freebsd.org/changeset/base/289332

Log:
  Fix test-fenv:test_dfl_env when run on some amd64 CPUs

  Compare the fields that the AMD [1] and Intel [2] specs say will be
  set once fnstenv returns.

  Not all amd64 capable processors zero out the env.__x87.__other field
  (example: AMD Opteron 6308). The AMD64/x64 specs aren't explicit on what the
  env.__x87.__other field will contain after fnstenv is executed, so the values
  in env.__x87.__other could be filled with arbitrary data depending on how the
  CPU-specific implementation of fnstenv.

No Intel or AMD CPU write to __other field at all.


No, they all do.

Test on old i386 on old A64:

Initial state for fegetenv():
X   7F 12 00 00 00 00 80 1F FF FF FF FF 52 3F 67 C0
--cw- -mxhi --sw- -mxlo --tw- -pad- fip
  -
X 0010  08 00 05 01 48 5F 74 C0 10 00 FF FF
-fcs- -opc- foff--- -fds- -pad-
other[16]--

This always fails the test:
cw: passes
mxhi:   passes
sw: passes
pad+tw: passes
fip:always fails.  The test wants 0, but the value is always the kernel
eip for the instruction that is used to avoid the security hole.
(The A64 CPU gives the security hole, and the kernel code is
pessimal so it apparently runs always on the first use of the
NPX by a thread.)
fcs:always fails.  The test wants 0, but the value is always the kernel
cs, as above.
opc:always fails.  The value is the opcode for the kernel instruction
foff:   always fails.  The value is the data address for the kernel instruction
fds:always fails.  The value is the kernel ds extended by padding with 1's

Modified state for fxsave:
X 001C  7F 12 00 38 80 00 F0 3F EE EE EE EE EE EE EE EE
X 002C  EE EE EE EE EE EE EE EE 80 1F 00 00 FF FF 00 00

Partial result of an fxsave instruction after fld1; fstl in the test
program to try to change the kernel data.  A byte value of EE indicates
that the byte was written by memset() and not overwritten by fxsave.
There are 16 such bytes.  These correspond to __other[16].  You are
probably thinking of fxsave not writing these, but fegetenv() doesn't
use fxsave and always writes these.  I think fxsave on all Intel CPUs
write these (freefall's Xeon does), and all AMD CPUs write these after
an error.

The format is slightly different.  All the other values seem to be the
same as for fegetenv().

Modified state for fegetenv():
X 005C  7F 12 00 00 00 00 80 1F FF FF FF FF 63 82 04 08
--cw- -mxhi --sw- -mxlo --tw- -pad- fip
  -
X 006C  1F 00 1D 05 A0 92 04 08 2F 00 FF FF
-fcs- -opc- foff--- -fds- -pad-
other[16]--

This is from the same state as the fxsave.  It has the following
modifications relative to the initial state:
- all kernel data changed to user addresses, as intended

Modified state for fnsave:
X 003C  7F 12 FF FF 00 00 FF FF FF FF FF FF 63 82 04 08
X 004C  1F 00 1D 05 A0 92 04 08 2F 00 FF FF

This is from the same state as the fxsave.  It has the same format
and values as the state saved by fegetenv() except 2 padding
fields are padded with 1's and not repurposed for mxcsr.

Modified state for fnstenv:
X 0078  7F 12 FF FF 00 00 FF FF FF FF FF FF 63 82 04 08
X 0088  1F 00 1D 05 A0 92 04 08 2F 00 FF FF EE EE EE EE

This is from the same state as the fxsave.  It has the same format
and values as the state saved by fnsave.

So last instruction/data values are always in fenv_t on i386.  They
are undocumented, but might be used.  On amd64 in 64-bit mode, they
don't fit in the hardware format used by fnstenv.  fenv_t extends
this format only to append mxcsr.  amd64 should have used the env
part of the better-designed 64-bit fxsave format.

Test on -current amd64 on Xeon (freefall):

early fegetenv():
X   7f 03 ff ff 00 00 ff ff  ff ff ff ff 00 00 00 00
--cw- -pad- --sw- -pad-  --tw- -pad- ---
X 0010  00 00 00 00 00 00 00 00  00 00 ff ff 80 1f 00 00
---other[16] ---mxcsr---

The test passes.  The 2 bytes at the end of other[20] are reserved
and not part of the last instruction/data.  They are padded normally
with 1's and not with 0's.  The default env hard-codes the assumption
that these reserved bits are 1's.

Since Xeons are not AMD CPUs, the kernel doesn't execute the code that
closes the security hole and all the last instruction/data values are
0.  These values are written.

later fxsave:
X 0020  7f 03 00 00 00 00 1c 05  75 08 40 00 43 00 00 00
--cw- --sw- tw 00 -opc-  fip fcs
X 0030  20 62 60 00 3b 00 00 

Re: svn commit: r289332 - head/tools/regression/lib/msun

2015-10-15 Thread Bruce Evans

On Wed, 14 Oct 2015, Garrett Cooper wrote:


Log:
 Fix test-fenv:test_dfl_env when run on some amd64 CPUs

 Compare the fields that the AMD [1] and Intel [2] specs say will be
 set once fnstenv returns.

 Not all amd64 capable processors zero out the env.__x87.__other field
 (example: AMD Opteron 6308). The AMD64/x64 specs aren't explicit on what the
 env.__x87.__other field will contain after fnstenv is executed, so the values
 in env.__x87.__other could be filled with arbitrary data depending on how the
 CPU-specific implementation of fnstenv.


This is not very amd64-specific.  All of the state is basically
uninitialized.  The ifdef should be the same for i386, so as to only
check the part that we use.  This part is not properly initialized
either, but is less volatile and we need to check that its uninitialized
value is what we want.


Modified: head/tools/regression/lib/msun/test-fenv.c
==
--- head/tools/regression/lib/msun/test-fenv.c  Wed Oct 14 19:30:04 2015
(r289331)
+++ head/tools/regression/lib/msun/test-fenv.c  Wed Oct 14 20:22:12 2015
(r289332)
@@ -133,8 +133,35 @@ test_dfl_env(void)
fenv_t env;

fegetenv();


The fields were initialized in the kernel, not necessarily to our hard-
coded value, but we want to check that they were.  Then they may have
been changed by early code in the program.  There is a printf() there,
and printf() could easily use the FPU although it probably doesn't.

If anything earlier uses the FPU, the the state may be changed in
many MD ways.  E.g.:
- if the part of the FPU used is the NPX, then the state has
  last-insruction data that should be changed by any non-control
  operation
- the last-instruction data actually works on Intel CPU, but is
  broken by AMD optimizations
- the amd64 optimizations give a security hole.  Closing this hole
  modifies the last-instruction data, so the data may change in
  unexpected ways
- amd64 binaries are unlikely to use the NPX even if they use the
  FPU, since they use SSE except for long doubles
- i386 binaries should use the NPX if they use the FPU, but this is
  broken by clang optimizations in some cases
- the hardware format stored by fnstenv doesn't have enough space for
  all the last-instruction data in long mode.  Apparently, fstenv
  in long mode stores the same number of bytes as in 32-bit mode,
  with the leading bytes the same but the tailing bytes containing
  either truncated last-instruction data or garbage.  This is probably
  the bug fixed by this commit (it doesn't take earlier use to give
  MD garbage).  The kernel doesn't define any format for this case.
  fenv(3)'s fenv_t is a private format that has 12 leading bytes in
  common with the hardware format, then 16 opaque bytes in common with
  the unusable part of the hardware format, then an extension by 4
  bytes.  Testing this confused me.  I thought that FreeBSD userland
  was in long mode, but tests gave the 32-bit format even for fxsave.
  This format has fields for segment registers, and shows the nonzero
  %cs and the zero %ds.


+
+#ifdef __amd64__
+   /*
+* Compare the fields that the AMD [1] and Intel [2] specs say will be
+* set once fnstenv returns.
+*
+* Not all amd64 capable processors implement the fnstenv instruction
+* by zero'ing out the env.__x87.__other field (example: AMD Opteron
+* 6308). The AMD64/x64 specs aren't explicit on what the
+* env.__x87.__other field will contain after fnstenv is executed, so
+* the values in env.__x87.__other could be filled with arbitrary
+* data depending on how the CPU implements fnstenv.


Probably none do.  Most of the fields that are still tested are documented
in at least old amd64 manuals as being half "reserved, don't use".
Usually the reserved bits are not 0, but are 1.  Our tests usually pass
because we hard-code 1 instead of 0.


+*
+* 1. http://support.amd.com/TechDocs/26569_APM_v5.pdf
+* 2. http://www.intel.com/Assets/en_US/PDF/manual/253666.pdf
+*/


It should go without saying that fields that are unsupported by us
and named __other to inhibit their use should not be tested.

Verbose pointers to the man pages might be useful for the parts that
we use, but these parts are fundamental and well known and used without
comment throughout the implementaton.


+   assert(memcmp(__mxcsr, _DFL_ENV->__mxcsr,
+   sizeof(env.__mxcsr)) == 0);


memcmp is verbose and bogus for comparing scalar values.  All 32 bits
in the scalar are supported, so this is the only correct part of the
test.

Oops, this too is incorrect on i386.  On i386, __mxcsr doesn't exist.
The mxcsr scalar is split into 2 discontiguous parts on i386.  These
parts need to be tested separately, and the amd64 spelling fails at
compile time since neither part is named __mxcsr.


+   assert(memcmp(__x87.__control, 

Re: svn commit: r287217 - head/usr.sbin/syslogd

2015-08-30 Thread Bruce Evans

On Sun, 30 Aug 2015, Jilles Tjoelker wrote:


On Sun, Aug 30, 2015 at 11:53:00AM +0200, Ed Schouten wrote:

2015-08-28 16:38 GMT+02:00 Joerg Sonnenberger jo...@britannica.bec.de:

But the compiler can't tell if it is the *intention* that the function
never returns. The warning behavior exists because that can easily
change with macros etc.



I think it's important to keep in mind what this keyword was designed for.



The idea behind this attribute (and the C11 _Noreturn keyword) is to
allow for propagation of optimisation state across compilation units.
You use it to annotate functions in header files, so that the compiler
does not need to handle function return at the call site. This
knowledge can be automatically be inferred if the function is static.


Although you are right philosophically, in practice there are some
compilers and static analyzers that do benefit from noreturn attributes
on static functions. In particular, gcc 4.2.1 generates better code and
fewer warnings if static functions are annotated with noreturn if
appropriate, and some versions of the Clang/LLVM static analyzer do not
do any cross-procedural analysis at all and depend on noreturn
attributes on static functions to avoid false positives.


If this were important, then __dead2 would be used a lot for important
functions, not for usage() and die().  It is in fact so important that
__dead2 is used a whole 86 times in $(find /usr/src -name *.c):
- 10 uses for non-static functions where there is actually a reason to
  use __dead2, but about half of these 10 are due to missing staticization
- 30 uses for static usage().  All of these uses have the correct syntax
  with __dead2 at the end (some versions of gcc only allow attributes
  near the end)
- 46 other uses.  Mostly for functions like panic(), exit() or err().
  18 of these have syntax errors with __dead2 not at the end (or near
  the end, preceding other attributes).  There are 10
  'static __dead2 void's, 7 'static void __dead2's, and 1
  '__dead2 static void'.

Non-FreeBSD spellings of __dead2 are more popular.  $(find /usr/src -name
*.c) has 200 lines matching 'oreturn'.  About 73 of these use the hard-coded
gccism __attribute(()).  The C11ish _Noreturn is used just 4 times (3 times
in rlogin for static functions that don't need it and 1 time in
libstdthreads).  Outside of contrib, there are just 30 lines matching
'oreturn': 11 in crypto/heimdal, 2 in routed/rtqery, 8 in tools/regression
for static usage(), 4 _Noreturns as above, 1 in xlint, 1 in pkill, 2 in
comments and 1 unrelated match.


In practice this means that many compiler warnings need to be disabled
for the affected compilers.


An example is the LLVM static checker in 2008, but hopefully not later
versions if this or any compiler ever used in FreeBSD.  errexit() in
echo/echo.c always compiled at WARNS=6 with gcc, but was changed in
2008 to define (sic) it as __dead2 with the excuse that this reduces
warnings with the static checker.  The change is quite broken:
- it has the syntax error 'static __dead2 void' -
__dead2 is applied to the definition of the function.  It doesn't
  take -funit-at-a-time to see when a static function which is defined
  before it is used doesn't return.  This application point also makes
  the syntax problems larger.  It is now natural to place the attribute
  declaration where it is, and there is a good chance that the old
  versions of gcc that didn't like it there also didn't like it being
  at the end.
The errexit() function has many style bugs:
- no prototype before the function.  This is not a large style bug.
  The function is unsorted before main() so that it can be its own
  protoype.  But this leaves no natural place to put the attribute.
- initialization in declaration - missing blank line after declarations
- garbage reason for existence of the function.  4.4BSD doesn't have
  the function or its style bugs, but just uses printf().  /bin/echo
  was optimized in FreeBSD to avoid using printf().  But with dynamic
  linkage, the space optimization is almost null, and with libc bloat
  for static linkage, the optimization doesn't work (/bin/echo has
  size 11K in my version of 5.2 where the bloat is smaller, but 415K
  in -current).
The main() function in echo/echo.c has rotted similarly.  To avoid
using printf() in a loop, it uses complicated allocation and writev().
This optimization is about 5% faster for 1 args created by $(jot
1 0), but most uses of echo are with only a couple of args.  With
10 args created by $(jot 10 0), the optimization is about 2% slower.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org


Re: svn commit: r287217 - head/usr.sbin/syslogd

2015-08-29 Thread Bruce Evans

On Fri, 28 Aug 2015, Joerg Sonnenberger wrote:


On Fri, Aug 28, 2015 at 10:17:56PM +1000, Bruce Evans wrote:

-static voiddie(int);
+static voiddie(int) __dead2;


Since the function is static, it is very easy for the compiler to see
that it doesn't return.


But the compiler can't tell if it is the *intention* that the function
never returns. The warning behavior exists because that can easily
change with macros etc.


The compiler should trust the programmer to write correct functions.


Even gcc-4.2.1 does this by default, since
-O implies -funit-at-a-time for gcc-4.2.1.  For clang, there is no way
to prevent this (except possibly -O0) since, since -fno-unit-at-a-time
is broken in clang.


It is not broken. It is loadly ignored as unsupported. The very
existance of the option in GCC has always been a concession to broken
and badly written code, including of course GCC's own CRT.


Unsupported == incompatible == broken.

My use of this option can probably be reduced to -fno-toplevel-reorder,
but that is even more broken in clang (it and -ftoplevel-reorder are
unknown arguments, while -fno-unit-at-a-time is an unsupported
optimization, and -funit-at-a-time works).

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org


Re: svn commit: r287217 - head/usr.sbin/syslogd

2015-08-28 Thread Bruce Evans

On Thu, 27 Aug 2015, Xin LI wrote:


Log:
 die() would never return, mark it as so.


Why?  (Except to add a style bug.)


Modified: head/usr.sbin/syslogd/syslogd.c
==
--- head/usr.sbin/syslogd/syslogd.c Thu Aug 27 17:16:18 2015
(r287216)
+++ head/usr.sbin/syslogd/syslogd.c Thu Aug 27 18:11:00 2015
(r287217)
@@ -324,7 +324,7 @@ static const char *cvthname(struct socka
static void deadq_enter(pid_t, const char *);
static int  deadq_remove(pid_t);
static int  decode(const char *, const CODE *);
-static voiddie(int);
+static voiddie(int) __dead2;


Since the function is static, it is very easy for the compiler to see
that it doesn't return.  Even gcc-4.2.1 does this by default, since
-O implies -funit-at-a-time for gcc-4.2.1.  For clang, there is no way
to prevent this (except possibly -O0) since, since -fno-unit-at-a-time
is broken in clang.

Several other functions in the same file are still missing this style
bug:

- usage().  The style bug is clearly documented for usage() by its
  absence in style(9) and in most files that have usage().  Howvever,
  about 30 declarations of usage() in /usr/src have the style bug.

- timedout().  This is a signal handler, so doesn't really need it.
  This is broken signal handler.  It carefully uses _exit() so as
  to not use stdio in a signal handler, but defeats this by also
  using errx().

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org


Re: svn commit: r286715 - head/lib/libc/string

2015-08-13 Thread Bruce Evans

On Thu, 13 Aug 2015, David Chisnall wrote:


On 13 Aug 2015, at 08:11, Marcelo Araujo araujobsdp...@gmail.com wrote:


The bcopy() was removed in IEEE Std 1003.1-2008 and it is marked as LEGACY in 
IEEE Std 1003.1-2004. However, BSD has its implementation before IEEE Std 
1003.1-2001.

In my understood it is obsolete on POSIX, but not truly obsolete for FreeBSD.
So I believe, this patch now address it in the correct way.


Its use should be strongly discouraged in FreeBSD (or, ideally, replaced with 
the macro from the POSIX man page).  LLVM does a load of optimisations for 
memmove and memcpy - using bcopy is a really good way of bypassing all of these.


That is a good reason to use bcopy.  Compilers are clueless about caches,
so their optimizations tend to be negative.  clang has a nice one now, not
related to caches.  It uses SSE if possible for small fixed-size memcpy's,
so takes a few hundred cycles longer for the first memcpy per context
switch if SSE would not have otherwise be used.  Compilers cannot know
if this is a good optimization.

If this were important, then someone except me would have noticed that
switching to -ffreestanding in the kernel turned off all builtins
including the one for memcpy().  But it really is unimportant.  Most
copying in the kernel is done by pagecopy(), and the possible gains
from optimizing that are on the order of 1% except in micro-benchmarks
that arrange to do excessive pagecopy()s.  The possible gains from
optimizations in the memcpy() builtin are much smaller.  The possible
gains from restoring all builtins may be as much as 1%.

I recently noticed bcmp() showing up in benchmarks.  Only 1% for
makeworld, but that is a lot for a single function.  clang doesn't
bother optimizing memcmp() and calls the library.  gcc-4.2.1 pessimizes
memcmp using rep cmpsb on x86 (gcc used to use a similar pessimization
for memcpy(), but learned better.  Strangely, rep movsb is now the
best method on Haswell except for small counts where gcc used to use it).
The library and kernel memcmp()s are not as bad as gcc's, but they
still use rep movs[lq] and this is still slow on Haswell.  Simple
C code is more than 6 times faster in micro-benchmarks on Haswell.
The FreeBSD library MI C code is a little too simple.  It uses bytewise
compares, so it it beats rep cmpsb but not rep cmpsl.

Bruce
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org


Re: svn commit: r286715 - head/lib/libc/string

2015-08-12 Thread Bruce Evans

On Thu, 13 Aug 2015, Marcelo Araujo wrote:


Author: araujo
Date: Thu Aug 13 02:31:23 2015
New Revision: 286715
URL: https://svnweb.freebsd.org/changeset/base/286715

Log:
 Remove the mention of memcpy(3) that is build on top of bcopy(3).
 Fix some phrases to make it more clear.

 Differential Revision: D3378
 Reported by:   bde@
 Reviewed by:   wblock
 Approved by:   bapt, rodrigc (mentor)
 Sponsored by:  gandi.net


I don't like this version either :-).


Modified: head/lib/libc/string/bcopy.3
==
--- head/lib/libc/string/bcopy.3Thu Aug 13 01:04:26 2015
(r286714)
+++ head/lib/libc/string/bcopy.3Thu Aug 13 02:31:23 2015
(r286715)
@@ -31,7 +31,7 @@
.\ @(#)bcopy.38.1 (Berkeley) 6/4/93
.\ $FreeBSD$
.\
-.Dd August 11, 2015
+.Dd August 13, 2015
.Dt BCOPY 3
.Os
.Sh NAME
@@ -58,16 +58,14 @@ If
.Fa len
is zero, no bytes are copied.
.Pp
-This function is deprecated (marked as LEGACY in
-POSIX.1-2001): use
-.Xr memcpy 3
-or
+This function is obsolete (marked as LEGACY in
+POSIX.1-2001): please use


This function is not obsolete, but is fully supported in FreeBSD.

Its obsoletely doesn't follow at all from the clause in parentheses.
LEGACY doesn't even mean obsolencent.  In computerspeak it means
old stuff that we don't like.

I think you got the obsolete wording from NetBSD.  But I think
NetBSD started deprecating it in 1995.  It might really be obsolete
for them after 20 years of deprecation.

Please don't use pleasant nonsenses like please in man pages or
comments.


.Xr memmove 3
in new programs.


Still not quite right.  New programs should use the best function and
that is usually memcpy().  Only old programs that are being roto-tilled
should convert bcopy() to memmove() (so as to avoid having to understand
them well enough to know if they do overlapped copies).


-Note that the first two arguments are
-interchanged for
-.Xr memcpy 3
-and
+Note that
+.Fn bcopy
+takes its src and dst arguments in the opposite
+order from
.Xr memmove 3 .


OK, except Xr should be Fn.  .Xr memmove 3 is a man page, not a function.
Man pages don't take src and dst arguments :-).  This normally shows up
in the rendering -- in text mode, Fn gives highlighting but Xr doesn't.

Now I notice more markup problems: the src and dst arguments are not marked
up.  In text mode, Fa gives highlighting for args.


POSIX.1-2008 removes the specification of
.Fn bcopy .


Better put this before the LEGACY statement and maybe merge the LEGACY
statement into the HISTORY section.  Since I don't want to deprecate
bcopy(), I would put this in the history section too.

Technically, this function is not deprecated since it is still a standard
BSD function in strings.h.  The ifdef for this is already quite
complicated and broken, thoough it only attempts to keep up with some
of the POSIX churn: from strings.h:

#if __BSD_VISIBLE || __POSIX_VISIBLE = 200112
int  bcmp(const void *, const void *, size_t) __pure;   /* LEGACY */
void bcopy(const void *, void *, size_t);   /* LEGACY */
void bzero(void *, size_t); /* LEGACY */
#endif

It is POSIX LEGACY, but standard BSD.

The POSIX ifdef is quite broken:
- before 1996, POSIX didn't have this function, but it also didn't have
  strings.h so there is no problem (the ifdef is vacuously correct)
- between 1996 and 2001, there was an XSI version there may have been
  a POSIX version that had this function and strings.h too.  The ifdef
  is wrong for any such version since it doesn't mention XSI.  Such versions
  are just unsupported.
- between 2001 and 2008, POSIX apparently required bcopy() (and the other
  functions) but didn't declare them anywhere, except in the XSI case it
  requires them to be declared in strings.h.  The above supports the
  non-XSI case of this.  The XSI case is broken as in any 1996-2001 versions.
- the ifdef is not up to date with changing LEGACY to OBSOLETE, but since
  LEGACY already resulted in omission of the prototypes in the non-XSI case,
  nothing needed to be changed in this case, and since the XSI case is
  unsupported nothing need to be changed to turn off the prototypes for XSI
  either.

Correct code here wouldn't have the POSIX ifdef or LEGACY comments, but
might having a comment saying that XSI is intentionally unsupported.
XSI messes elswhere are supported, and I am suprised that not supporting
it here doesn't cause problems.

Ifdefs for historical mistakes would cause much larger messes if they
were complete.  Just near here, there is the mess of the string.h
and strings.h.  1980's source code had messy ifdefs in applications
for this.  C90 didn't exactly help by standardizing the SYSVish
string.h.  POSIX.1-2001 actively harmed by restoring the BSDish
strings.h.  FreeBSD handles this problem by including strings.h
in string.h if __BSD_VISIBLE and not documenting 

Re: svn commit: r286601 - head/usr.bin/patch

2015-08-11 Thread Bruce Evans

On Mon, 10 Aug 2015, Xin Li wrote:


On 8/10/15 22:51, Ed Schouten wrote:

2015-08-11 1:20 GMT+02:00 Xin Li delp...@delphij.net:

Do you have a better solution for this?  No, this is not ideal
but I didn't find a better alternative (or maybe I have missed
something obvious?)

I think with the current POSIX definition of the interface, we
have only two options: either strdup() or cast away const (I'd
rather hide this kind of uglyness in the library), no?


Casting const away should be all right in this case.


It is the Standard way.


Committed as r286617, please let me know if you want other
changes, etc.

Perhaps we should convert the tree over time to use __DECONST instead
then?


Gak!.  I use rmrf on code with __DECONST in it.

The misimplementation of __DECONST only works because -Wcast-qual is
too broken to detect removal of qualifiers by casting a pointer to an
integer.  Casting a pointer to an integer is much less safe than
casting away only a qualifer for a pointer, but both are Standard,
so they work unless the compiler is directed to be nonstandard using
-Wcast-qual -Werror.

Compilers should know about the language defects that make it impossible
to declare functions like execv() with the correct number of const's
and turn off -Wcast-qual warnings for these functions.

strchr() is another function that needs special.  It can take a pointer
const char and return a related pointer with the const qualifier removed
(subtract an offset to get the original pointer with its const qualifier
removed).  The warning about this from -Wcast-qual is not yet killed
using __DECONST() because libc is still compiled with a low WARNS.
The implementation uses a cast in the Standard way.  Only the
implementation would be affected by -Wcast-qual.  But a flag like
-Wcast-qual should direct the compiler to warn about removal of casts
by abusing strchr() too.

Bruce
___
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org


Re: svn commit: r286651 - head/lib/libc/string

2015-08-11 Thread Bruce Evans

On Wed, 12 Aug 2015, Marcelo Araujo wrote:


Log:
 Describe that bcopy(3) is deprecated and marked as LEGACY in
 POSIX.1-2001 and removed from the specification in POSIX.1-2008.


Only new and old POSIX software and all Standard C software.  Standard
C never had bcopy().  POSIX didn't have it until API bloat restored
many old mistakes.  But BSD has it.  It is memmove(9undoc) that is
deprecated (but brought back by KPI bloat) in the kernel in FreeBSD,
but this is about userland.  Only portability requires preferring
mememove(3).


 New softwares shall use memcpy(3) or memmove(3).


Bad advice.  bcopy() is only similar to memmove().


Modified:
 head/lib/libc/string/bcopy.3

Modified: head/lib/libc/string/bcopy.3
==
--- head/lib/libc/string/bcopy.3Tue Aug 11 22:43:28 2015
(r286650)
+++ head/lib/libc/string/bcopy.3Wed Aug 12 00:49:20 2015
(r286651)
@@ -31,7 +31,7 @@
.\ @(#)bcopy.38.1 (Berkeley) 6/4/93
.\ $FreeBSD$
.\
-.Dd June 4, 1993
+.Dd August 11, 2015
.Dt BCOPY 3
.Os
.Sh NAME
@@ -57,6 +57,20 @@ The two strings may overlap.


The strings must not overlap for memcpy().


If
.Fa len
is zero, no bytes are copied.
+.Pp
+This function is deprecated (marked as LEGACY in
+POSIX.1-2001): use
+.Xr memcpy 3
+or
+.Xr memmove 3
+in new programs.


Bad advice, since these functions are not similar, and it doesn't follow
from deprecation all the versions of POSIX that have it that it is
deprecated in FreeBSD.  It follows from the nonexistence of the function
in POSIX before 2001 and after 2008 that the function is more than
deprecated.


+Note that the first two arguments are
+interchanged for
+.Xr memcpy 3
+and
+.Xr memmove 3 .


The first 2 args are not interchanged for memcpy() and memmove().  They
are only interchanged for bcopy() and memmove().



+POSIX.1-2008 removes the specification of
+.Fn bcopy .
.Sh SEE ALSO
.Xr memccpy 3 ,
.Xr memcpy 3 ,


Removing all mention of memcpy() (except the one in the Xr) would fix half
of the bugs.

POSIX has much better wording for this, as for most things.  From a 2001
draft:

X 5112 APPLICATION USAGE
X 5113memmove( ) is preferred over this function.

When bcopy() was only deprecated, it was un-preferred but not banned.


X 5114The following are approximately equivalent (note the order of 
the arguments):
X 5115bcopy(s1,s2,n)  = memmove(s2,s1,n)
X 5116For maximum portability, it is recommended to replace the 
function call to bcopy( ) as follows:
X 5117#define bcopy(b1,b2,len) (memmove((b2), (b1), (len)), (void) 
0)

No mention of memcpy(), but an example of how to translate with so much
attention to details that it is hard to read.  It even translates the
return type.

X 
X 5118 RATIONALE

X 5119None.
X 
X 5120 FUTURE DIRECTIONS

X 5121This function may be withdrawn in a future version.

It was apparently withdrawn in 2008.

X 
X 5122 SEE ALSO

X 5123memmove( ), the Base Definitions volume of IEEE Std 1003.1-200x, 
strings.h

No mention of memcpy() here either.  I don't like long lists of Xr's to
vaguely related man pages, with no hint of the exact relation, in FreeBSD
man pages.  A reader wishing to know any relation at all would have to
read all the man pages in the long list to see some relation, and understand
all their details to see the exact relation.

X 
X 5124 CHANGE HISTORY

X 5125First released in Issue 4, Version 2.
X 
X 5126 Issue 5

X 5127Moved from X/OPEN UNIX extension to BASE.
X 
X 5128 Issue 6

X 5129This function is marked LEGACY.

So bcopy() was apparently XSI in Issue 4, BASE in Issue 5, and
BASE plus LEGACY in Issue 6.  So why is it still XSI in 2001?
I don't know the dates of the Issues.  Oops, this is because only
the strings.h include with the prototype for bcopy() is XSI in
2001.  It doesn't exist in POSIX.1-1996.

Bruce
___
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org


Re: svn commit: r286519 - head/contrib/binutils/gas/config

2015-08-09 Thread Bruce Evans

On Sun, 9 Aug 2015, Warner Losh wrote:


Since when is LEFT shifting a signed number undefined. It is RIGHT
shifting that???s undefined???


Always.  It always shifts out the sign bit in the usual 2's complement
representation, so must be undefined in that case.  It is undefined by
definition in other representions where the sign bit could have been
anywhere before C99.  Left shifting of a positive value is undefined
if the shift moves a bit into or through the sign bit.

Strangely, right shifting is never undefined.  It is implementation-
defined.  Now the sign bit may or may not be preserved or take part
in the shift, but other bits are never shifted into the sign bit
except possibly in weird pre-C99 representations, so overflow doesn't
occur and the usual reason for undefined behaviour of an expression --
that the correct result is unrepresentable -- doesn't apply.

I remembered this backwards too.

Bruce___
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org

Re: svn commit: r286266 - head/usr.bin/ypmatch

2015-08-09 Thread Bruce Evans

On Mon, 10 Aug 2015, Marcelo Araujo wrote:


2015-08-09 19:51 GMT+08:00 Ed Schouten e...@nuxi.nl:


Hi Marcelo,

2015-08-04 4:34 GMT+02:00 Marcelo Araujo ara...@freebsd.org:

  Sync the code with the OpenBSD version.


That's a shame. It looks like improvements that we made to our version
have been undone because of this.
...
Though the style(9) conformance of tool wasn't ideal, this change made it
worse.


Agree with you!

These tools are original from OpenBSD, I'm trying first to sync the code
and then will rework them. I will try to send it back to upstream(OpenBSD)
as soon as I have time.


In fact, the changes were more than half to back out FreeBSD cleanups by
syncing with the 1994 OpenBSD version.  The cleanups were small (less
than 50 lines changed), and OpenBSD doesn't seem to have changed much
either.

Bruce
___
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org


Re: svn commit: r286293 - head/sys/x86/include

2015-08-05 Thread Bruce Evans

On Tue, 4 Aug 2015, Jung-uk Kim wrote:


Log:
 Fix style(9) bugs.


Thanks.


Modified:
 head/sys/x86/include/_types.h

Modified: head/sys/x86/include/_types.h
==
--- head/sys/x86/include/_types.h   Tue Aug  4 17:47:11 2015
(r286292)
+++ head/sys/x86/include/_types.h   Tue Aug  4 18:59:54 2015
(r286293)
@@ -154,12 +154,11 @@ typedef   int ___wchar_t;
typedef __builtin_va_list   __va_list;  /* internally known to gcc */
#else
#ifdef __LP64__
-typedefstruct {
-   unsigned int__gpo;
-   unsigned int__fpo;
-   void*__oaa;
-   void*__rsa;
-} __va_list;
+struct __s_va_list {
+   __uint32_t  pad1[2];/* gp_offset, fp_offset */
+   __uint64_t  pad2[2];/* overflow_arg_area, reg_save_area */
+};


Er, the struct members need at least 1 underscore, since pad* is in the
application namespace and this is a very public file.


+typedef struct __s_va_list __va_list;


I said to use #define, but now prefer this since it is better for debuggers
and all the nearby __va_list's are typedefs.

This typedef should be followed by a tab like all the nearby ones.


#else
typedef char *  __va_list;
#endif


Bruce
___
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org


Re: svn commit: r286281 - head/sys/kern

2015-08-04 Thread Bruce Evans

On Tue, 4 Aug 2015, Ed Schouten wrote:


2015-08-04 10:51 GMT+02:00 Edward Tomasz Napierala tr...@freebsd.org:

  Mark vgonel() as static.  It was already declared static earlier;
  no idea why compilers don't warn about this.


That's because according to the standard, those keywords are meant to pile up:

static void foo(void);
_Noreturn void foo(void);
void foo(void) {
 ...
}

is the same as:

static _Noreturn void foo(void) {
 ...
}


This is a bug in the standard.  You can build up enormously complicated
declarations, with the full declaration not visible in any one of the
parts.  E.g.:

int foo();
int foo(int (*foo1()));
int foo(int (*foo1(int (*foo2();
int foo(int (*foo1(int (*foo2(int (*foo3)());
/* ... */

Every declaration in this sequence completes the type a little more,
but the type is never complete.

To obfuscate this further, remove the function parameter names, or
use different ones in each prototype.

To complicate this further, there are many possibilities.  E.g., make
foo() have several function args and (partially) complete each of these
independently.  Or use attributes to give even more independence in
the steps.

I don't know of any useful use for this except to complete function
definitions in KR or bad code.  The code might have int foo(); in
it and you want to add a complete prototype without changing the
original declaration.  The complete prototypes should be declared in
one step to avoid further obfuscations.

Compilers have very bad support for warning about this.  -Wredundant-decls
is broken in gcc from when it was born up to at least gcc-4.2.1.  The
above declarations are non-redundant non-redeclarations, but gcc warns
that they are redundant redaclaration.  -Wredundant-decls is even more
broken in clang.  It doesn't warn even if all of the above declarations
are repeated so that every declaration in the second copy is redundant.

Building up doesn't work for all parts of declarations.  In particular,
the rules for linkage and storage class are confusing without building
up, and building up doesn't work for them.  Consider:

static int foo;
int foo;

vs

int foo;
static int foo;

According to TenDRA C90, the former is valid, but according to gcc and
clang it is invalid.  The latter is invalid according to all 3 compilers,
so building up by adding the static is always invalid.  I only trust
TenDRA here.  The former is a valid obfuscation (the opposite of building
up) for all 3 compilers if foo is replaced by foo().

Now change 'int foo;' to 'extern int foo;' in the former.  Then all 3
compilers accept it.

extern is generally more confusing than static, especially in
combination with inline.  For static inline, the most common obfuscation
is like this one for vgonel.  You declare the function as inline in its
prototype but not in its function body.  This makes it inline if it is
static in both.  With static in neither and extern in either, both or
neither, the behaviour is even more confusing.

Bruce
___
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org


Re: svn commit: r286268 - head/usr.bin/wall

2015-08-04 Thread Bruce Evans

On Tue, 4 Aug 2015, Pedro F. Giffuni wrote:


Log:
 Revert r286144 leaving the original fix to the buffer overflow.

 Some developers consider the new code unnecessarily obfuscated.
 There was also a benign off-by-one.

 Discussed with:bde, vangyzen, jmallett


It is this version that is unnecessarily obfuscated.


Modified: head/usr.bin/wall/ttymsg.c
==
--- head/usr.bin/wall/ttymsg.c  Tue Aug  4 02:41:14 2015(r286267)
+++ head/usr.bin/wall/ttymsg.c  Tue Aug  4 02:56:31 2015(r286268)
@@ -62,7 +62,7 @@ ttymsg(struct iovec *iov, int iovcnt, co
struct iovec localiov[7];
ssize_t left, wret;
int cnt, fd;
-   char device[MAXNAMLEN];
+   char device[MAXNAMLEN] = _PATH_DEV;
static char errbuf[1024];
char *p;
int forked;


Half of the string initialization is done by an auto initializer.
Initializations in declarations are specifically forbidden in style(9),
and this is a good example of how to obfuscate code using them.  The
original version using a static initializer was almost as obfuscated,
but it was at least maximally efficient and had a technical reason
for using the initializer.


@@ -71,9 +71,8 @@ ttymsg(struct iovec *iov, int iovcnt, co
if (iovcnt  (int)(sizeof(localiov) / sizeof(localiov[0])))
return (too many iov's (change code in wall/ttymsg.c));

-   strlcpy(device, _PATH_DEV, sizeof(device));
+   strlcat(device, line, sizeof(device));


The other half of the initialization is done using strlcat().


p = device + sizeof(_PATH_DEV) - 1;
-   strlcpy(p, line, sizeof(device) - sizeof(_PATH_DEV));
if (strncmp(p, pts/, 4) == 0)
p += 4;
if (strchr(p, '/') != NULL) {


In private mail, I pointed out lots more bad code here.  The pointer
'p' was introdiced to avoid writing out the expression for it in
more places, especially in the pts/ check.  But the pts/ check
is badly written.  It checks the copy of 'line' in 'device'.  This
is an obfuscated way of checking 'line' itself.  It might be more
secure the check the final string, but it is actually more secure
to check 'line' since this will detect slashes that lost by strlcat()
with no error checking.

This method becomes even more unnatural when combined with the strcat().
We use the knowledge of the prefix to find the place to start for the
check, but don't use this to find the place to start for the concatenation.
These places are the same, and unwinding the obfuscation in the check
requires knowing that they are the same.

Bruce
___
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to svn-src-all-unsubscr...@freebsd.org


<    5   6   7   8   9   10   11   12   13   14   >