Re: [Bug 225626] r325865 malloc vs bzero

2018-02-04 Thread Bruce Evans

On Sat, 3 Feb 2018, Conrad Meyer wrote:


On Sat, Feb 3, 2018 at 3:52 AM, Bruce Evans  wrote:

On Fri, 2 Feb 2018 a bug that doesn't want repl...@freebsd.org wrote:


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225626

--- Comment #1 from Brooks Davis  ---

...

Note that memset should be used in preference to bzero as the compiler
should
be able to elide most of the cost of the memset since it can emit it
inline and
then delete the dead stores.


(To Brooks:)
Not in -ffreestanding by default, unfortunately.  We can give it that
hint back by defining memset() in terms of __builtin_memset(), though.
We have done so for some sizes of constant bzero(), but not for bcopy,
memcpy, or memmove, unfortunately.


Note that memset() should _not_ be used in preference to bzero() since:
- using memset() in the kernel is a style bug, except possibly with a
nonzero
  fill byte
- the existence of memset() in the kernel is an umplementation style bug,
  except possibly with a nonzero fill byte.


This is total nonsense.


This is total sense.  memset() was intentionally left out until someone broke
this.


...
- using memset() instead of bzero() in the kernel is a pessimization.  Since
  memset() is only compatibilty cruft and should not be used, it is
  intentionally not as optimized as bzero().
...


This can and should be fixed.


That would reward using the style bug.


Not so simlarly for memcpy().  Its use in the kernel is now just a style
bug, since the compiler is not allowed to inline it (except in my version
of course).


This should be fixed.


Yes, it is easier to fix (by removing it) than for memset(), because it has
no functionality that is not in bcopy().

My version actually translates memset() with fill byte 0 to bzero() and then
bzero with compile-time-constant size <= 32 to __builtin_memset().  It
doesn't remove memcpy(), but translates it to __builtin_memcpy() for all
sizes.

memcmp() is another pessimized KPI.  When memset() and memcmp() were first
misimplemented in the kernel, memcmp() was broken.  It called bcmp(), but
bcmp() returns 2 states while memcmp() returns 3 states.  memcmp(9) now uses
the fairly slow generic C version from libc/string.  bcmp(9) on x86 has
always been misoveroptimized.  It is rarely used, so its efficiency is
unimportant, so it shouldn't be MD, and the x86 version of it is only
optimized for the original i386 (or maybe the 8088 with 32-bit extensions).
So the pessimization doesn't matter.


...
FreeBSD was changed to use -ffreestanding because without it the compiler
is allowed to inline functions like printf() and gcc started doing that
(it converts printf(3) into puts() galore, and puts() doesn't exist in
the kernel).  This broke all inlining, but no one cared (except me of
course).


Isn't the other issue that non-freestanding links libgcc (GPL) into
the kernel?  We could work around puts() by adding a puts()
implementation, of course.


No, the kernel never used libgcc.  The kernel always used libkern, which
must contain all the libgcc functions that are actually used (not many,
and none of the more complicated FP ones.  The main complicated one is
64-bit division on 32-bit arches (__[u]divdi3())).

Also, I think we didn't care about (GPL2?) libgcc in the kernel any more
than in applications.

Bruce
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


Re: [Bug 225626] r325865 malloc vs bzero

2018-02-03 Thread Conrad Meyer
Hi Bruce, Brooks,

On Sat, Feb 3, 2018 at 3:52 AM, Bruce Evans  wrote:
> On Fri, 2 Feb 2018 a bug that doesn't want repl...@freebsd.org wrote:
>
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225626
>>
>> --- Comment #1 from Brooks Davis  ---
> ...
>> Note that memset should be used in preference to bzero as the compiler
>> should
>> be able to elide most of the cost of the memset since it can emit it
>> inline and
>> then delete the dead stores.

(To Brooks:)
Not in -ffreestanding by default, unfortunately.  We can give it that
hint back by defining memset() in terms of __builtin_memset(), though.
We have done so for some sizes of constant bzero(), but not for bcopy,
memcpy, or memmove, unfortunately.

> Note that memset() should _not_ be used in preference to bzero() since:
> - using memset() in the kernel is a style bug, except possibly with a
> nonzero
>   fill byte
> - the existence of memset() in the kernel is an umplementation style bug,
>   except possibly with a nonzero fill byte.

This is total nonsense.

> ...
> - using memset() instead of bzero() in the kernel is a pessimization.  Since
>   memset() is only compatibilty cruft and should not be used, it is
>   intentionally not as optimized as bzero().
> ...

This can and should be fixed.

> Not so simlarly for memcpy().  Its use in the kernel is now just a style
> bug, since the compiler is not allowed to inline it (except in my version
> of course).

This should be fixed.

> ...
> FreeBSD was changed to use -ffreestanding because without it the compiler
> is allowed to inline functions like printf() and gcc started doing that
> (it converts printf(3) into puts() galore, and puts() doesn't exist in
> the kernel).  This broke all inlining, but no one cared (except me of
> course).

Isn't the other issue that non-freestanding links libgcc (GPL) into
the kernel?  We could work around puts() by adding a puts()
implementation, of course.

Conrad
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


Re: [Bug 225626] r325865 malloc vs bzero

2018-02-03 Thread Bruce Evans

On Fri, 2 Feb 2018 a bug that doesn't want repl...@freebsd.org wrote:


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225626

--- Comment #1 from Brooks Davis  ---
I'd agree it's pointless, but there's seriously nothing wrong with the fix
other than making a path that isn't performance relevant slightly slower.  If
you want to submit a patch that would likely be fine.  Bug reports aren't the
places for this discussion.


Changing to using malloc() is correct, since the data is too large to put on
the kernel stack.


Note that memset should be used in preference to bzero as the compiler should
be able to elide most of the cost of the memset since it can emit it inline and
then delete the dead stores.


Note that memset() should _not_ be used in preference to bzero() since:
- using memset() in the kernel is a style bug, except possibly with a nonzero
  fill byte
- the existence of memset() in the kernel is an umplementation style bug,
  except possibly with a nonzero fill byte.  It was intentionally left out in
  4.4BSD and in old versions of FreeBSD.  It is mainly compatibilty cruft
  for contrib'ed code that doesn't know kernel APIs and used to have private
  definitions of it duplicated ad nauseum.
- using memset() instead of bzero() in the kernel is a pessimization.  Since
  memset() is only compatibilty cruft and should not be used, it is
  intentionally not as optimized as bzero().  One of the optimizations is that
  bzero() is optimized to let the compiler inline it (up to a too-hard-coded
  size of 64 bytes), while memset() is pessimized to not let the compiler
  inline it.  The kernel is compiled with -ffreestanding.  This turns off
  all builtins, since a kernel function named foo() is in general unrelated
  to a standard function named foo().  None are turned back on in 
  headers, but bzero() is optimized using __builtin_memset().

Not so simlarly for memcpy().  Its use in the kernel is now just a style
bug, since the compiler is not allowed to inline it (except in my version
of course).  However, in old versions of FreeBSD which were not compiled
with -ffreestanding, memcpy() was supposed to be used instead of bcopy()
for all small fixed-sized copies that the compiler would inline up to a
MD size, but for no other cases.  The compatibility cruft of an extern
memcpy() was added for cases where the compiler didn't inline memcpy().
Since memcpy() was unimportant, it was intentionally not as optimized as
bzero().  It wasn't pessimized enough to prevent it being used as a style
bug.  Perhaps a linker warning like the one for gets() should have been
used to inhibit its use.  Warnings from -Winline are related.  This should
have been implemented like bcopy() is now, with an internal conversion to
__builtin_memcpy() for small fixed-sized copies, but with a fallback to
an implementation-detail function like __memcpy() to keep memcpy() out of
the KPI.

FreeBSD was changed to use -ffreestanding because without it the compiler
is allowed to inline functions like printf() and gcc started doing that
(it converts printf(3) into puts() galore, and puts() doesn't exist in
the kernel).  This broke all inlining, but no one cared (except me of
course).

It turns out that inlining and other optimizations and pessimizations make
little difference.  bcopy() was only significantly faster than memcpy() for
large copies on Pentium-1 in ~1997, using special optimizations for Pentium-1
that are pessimizations on most later x86 CPUs.  After removing these
optimizations, bcopy() is almost the same as memcpy() on x86.  bcopy() has
more setup overhead, so tends to be slower.  Another development is
"fast strings" on newer x86.  With this, "rep movsb" is faster than
"rep movsd" since it is has less setup and finishup overhead.  The
implementation of both bcopy() and memcpy() is still generic with some tuning
for the original i386, so it doesn't benefit much from this.  "rep movs*"
still has a lot of internal setup overhead, so it is a bad method for small
copies.  "fast strings" also affects inlining.  Compilers used to generate
lots of "[rep] movs*"s, but this was a bad method for almost all sizes so
compilers stopped doing this long ago.  For small sizes, it is bad because
of high internal setup overhead, and for large sizes it is bad because the
library function might be able to do it better and the compiler doesn't know
how much better or worse the library function is.  Now with "fast strings",
bot the compiler and library can just use "rep movsb" for large copies, but
this is hard to configure.  "large" is quite large -- normally 1K or even
4K.  The compiler can and does have zillions of variants depending on -march.
Having zillions of variants in the library is not so easy, and might end up
as a pessimization unless the correct variant is selected at compile time.

Bruce
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To uns

[Bug 225626] r325865 malloc vs bzero

2018-02-02 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225626

Eugene Grosbein  changed:

   What|Removed |Added

 CC||eu...@freebsd.org

--- Comment #4 from Eugene Grosbein  ---
(In reply to aler from comment #0)

> I see no reason why it was required to use malloc()

Previous revision abused kernel stack allocating large structure. This is bad
and may provoke kernel panic.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 225626] r325865 malloc vs bzero

2018-02-02 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225626

Conrad Meyer  changed:

   What|Removed |Added

 CC||c...@freebsd.org

--- Comment #3 from Conrad Meyer  ---
(In reply to Mark Millard from comment #2)
The structure in question is kld_file_stat, which is pretty large (2080 bytes).

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 225626] r325865 malloc vs bzero

2018-02-02 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225626

Mark Millard  changed:

   What|Removed |Added

 CC||marklmi26-f...@yahoo.com

--- Comment #2 from Mark Millard  ---
(In reply to aler from comment #0)

sizeof(*stat)?
sizeof(*stat32)?

Is the kernel stack-space usage at issue
here? As I understand such allocations
and frees are frequently for avoiding
overuse of the limited kernel-stack spaces.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 225626] r325865 malloc vs bzero

2018-02-02 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225626

Brooks Davis  changed:

   What|Removed |Added

 CC||bro...@freebsd.org
 Resolution|--- |Works As Intended
 Status|New |Closed

--- Comment #1 from Brooks Davis  ---
I'd agree it's pointless, but there's seriously nothing wrong with the fix
other than making a path that isn't performance relevant slightly slower.  If
you want to submit a patch that would likely be fine.  Bug reports aren't the
places for this discussion.

Note that memset should be used in preference to bzero as the compiler should
be able to elide most of the cost of the memset since it can emit it inline and
then delete the dead stores.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 225626] r325865 malloc vs bzero

2018-02-02 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225626

a...@playground.ru changed:

   What|Removed |Added

Version|11.0-RELEASE|CURRENT

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 225626] r325865 malloc vs bzero

2018-02-02 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225626

Bug ID: 225626
   Summary: r325865 malloc vs bzero
   Product: Base System
   Version: 11.0-RELEASE
  Hardware: Any
OS: Any
Status: New
  Severity: Affects Only Me
  Priority: ---
 Component: kern
  Assignee: freebsd-bugs@FreeBSD.org
  Reporter: a...@playground.ru

https://svnweb.freebsd.org/base?view=revision&revision=325865

This patch fixing kldstat kernel stack bytes leak. I see no reason why it was
required to use malloc() and make the code slightly more complicated while it
was enough just to add bzero() for structs in question and not using heap
allocations at all.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"