Re: Header symbols that shouldn't be visible to ports?

2022-09-08 Thread David Chisnall
On 7 Sep 2022, at 15:55, Cy Schubert  wrote:
> 
> This is exactly what happened with DMD D. When 64-bit statfs was introduced 
> all DMD D compiled programs failed to run and recompiling didn't help. The 
> DMD upstream failed to understand the problem. Eventually the port had to 
> be removed.

I’m not sure that I understand the problem.  This should matter only for 
libraries that pass a statbuf across their ABI boundary.  Anyone using libc 
will see the old version of the symbol and just use the old statbuf.  Anyone 
using the old syscall number and doing system calls directly will see the 
compat version of the structure.  Anyone taking the statbuf and passing it to a 
C library compiled with the newer headers will see compat problems (but the 
same is true for a C library asking a C program to pass it a statbuf and having 
the two compiled against different kernel versions).
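
To make the last case concrete, here is a minimal sketch with two made-up 
layouts.  `struct statbuf_old` and `struct statbuf_new` are hypothetical and 
are not the real FreeBSD structures; the sketch only shows why code compiled 
against one layout cannot safely read a buffer filled in by code compiled 
against the other:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "old" layout with a 32-bit inode number (not the real struct). */
struct statbuf_old {
	uint32_t ino;
	uint32_t size;
};

/* Hypothetical "new" layout: the inode number was widened to 64 bits. */
struct statbuf_new {
	uint64_t ino;
	uint32_t size;
};

/* Returns nonzero if the two layouts disagree on the offset of `size`
 * or on the overall structure size. */
static int
ex_layouts_disagree(void)
{
	return (offsetof(struct statbuf_old, size) !=
	    offsetof(struct statbuf_new, size) ||
	    sizeof(struct statbuf_old) != sizeof(struct statbuf_new));
}
```

Each side is internally consistent; the mismatch only appears when a buffer 
crosses between components built against different definitions.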

There’s a lot that we could do in system headers to make them more 
FFI-friendly.  For example:

 - Use `enum`s rather than `#define`s for constants.

 - Add the flags-enum attribute for flags, so that FFI layers that can parse 
attributes get more semantic information.

 - Add non-null attributes on all arguments and returns that can never be null.

 - Use `static inline` functions instead of macros where possible, and spell 
`static inline` through a macro so that an FFI layer can compile the headers in 
a mode that emits these as real functions it can link against.  For Rust, they 
can be compiled to LLVM IR, linked against, and inlined into the Rust code, so 
things like the Capsicum permissions bitmap setting code wouldn’t need 
duplicating in Rust.

 - Mark functions with availability attributes so that FFI knows when it’s 
using deprecated / unstable values and can make strong ABI guarantees.

 - Add tests for the headers to the tree.
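
As a rough illustration of the `enum`, flags-attribute and `static inline` 
points above, a header written in this style might look something like the 
sketch below.  All of the `EX_`/`ex_` names are invented for the example and do 
not exist in the tree.  Clang documents a `flag_enum` attribute; the sketch 
assumes other compilers may not, so it is applied only when the compiler 
reports support for it.

```c
#include <stdint.h>

/* Use flag_enum only where the compiler says it understands it. */
#if defined(__has_attribute)
#if __has_attribute(flag_enum)
#define EX_FLAG_ENUM __attribute__((flag_enum))
#endif
#endif
#ifndef EX_FLAG_ENUM
#define EX_FLAG_ENUM
#endif

/* Hypothetical flag set: a real enum instead of a list of #defines. */
enum EX_FLAG_ENUM ex_open_flags {
	EX_READ   = 1 << 0,
	EX_WRITE  = 1 << 1,
	EX_APPEND = 1 << 2
};

/* Normally static inline; an FFI build may predefine EX_INLINE differently. */
#ifndef EX_INLINE
#define EX_INLINE static inline
#endif

/* Return `flags` with flag `f` set. */
EX_INLINE uint32_t
ex_flags_set(uint32_t flags, enum ex_open_flags f)
{
	return (flags | (uint32_t)f);
}

/* Return 1 if flag `f` is set in `flags`, otherwise 0. */
EX_INLINE int
ex_flags_test(uint32_t flags, enum ex_open_flags f)
{
	return ((flags & (uint32_t)f) != 0);
}
```

An FFI layer could then, for example, build the header with `-DEX_INLINE=` to 
get ordinary external definitions it can link against, while normal C 
consumers keep the `static inline` behaviour.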

In 12.0, someone decided to rewrite a load of kernel headers to use macros 
instead of inline functions, which then broke C++ code in the kernel by 
changing properly namespaced things into macros that replace every identifier 
of the same name.  I’d love to see a concerted effort to use a post-1999 style 
for our headers.
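
A small sketch of the namespacing point, using invented names rather than any 
of the actual kernel headers.  With an inline function, a structure member (or 
a C++ method) of the same name is unaffected, because the preprocessor never 
touches it:

```c
/* Inline-function style: `ex_enter` is an ordinary function name. */
static inline int
ex_enter(int depth)
{
	return (depth + 1);
}

/* Unrelated code may still have its own member called `ex_enter`. */
struct ex_ops {
	int (*ex_enter)(int);
};

static int
ex_custom_enter(int depth)
{
	return (depth + 100);
}

/*
 * Had the header instead said `#define ex_enter(d) ((d) + 1)`, the
 * preprocessor would rewrite the member call below as well and the
 * result would no longer compile.
 */
static int
ex_call_through_ops(const struct ex_ops *ops, int depth)
{
	return (ops->ex_enter(depth));
}

/* Calls the member (1 + 100) and the inline function (1 + 1). */
static int
ex_demo(void)
{
	struct ex_ops ops = { ex_custom_enter };

	return (ex_call_through_ops(&ops, 1) + ex_enter(1));
}
```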

David




Re: Header symbols that shouldn't be visible to ports?

2022-09-07 Thread Alan Somers
On Wed, Sep 7, 2022 at 8:55 AM Cy Schubert  wrote:
>
> Alan Somers writes:
> > On Sat, Sep 3, 2022 at 11:10 PM Konstantin Belousov wrote:
> > >
> > > On Sat, Sep 03, 2022 at 10:19:12AM -0600, Alan Somers wrote:
> > > > Our /usr/include headers define a lot of symbols that are used by
> > > > critical utilities in the base system like ps and ifconfig, but aren't
> > > > stable across major releases.  Since they aren't stable, utilities
> > > > built for older releases won't run correctly on newer ones.  Would it
> > > > make sense to guard these symbols so they can't be used by programs in
> > > > the ports tree?  There is some precedent for that, for example
> > > > _WANT_SOCKET and _WANT_MNTOPTNAMES.
> > > _WANT_SOCKET is clearly about exposing parts of the kernel definitions
> > > for userspace code that wants to dig into kernel structures.  Similarly
> > > for _WANT_MNTOPTNAMES, but in fact this thing is quite stable.  The
> > > > definitions are guarded by additional defines not due to their instability,
> > > > but because using them in userspace requires (much) more preparation from
> > > > userspace environment, which is either not trivial (_WANT_SOCKET) or
> > > > contradicts the standardized use of the header (_WANT_MNTOPTNAMES +
> > > sys/mount.h).
> > >
> > > >
> > > > > In particular, I'm thinking about symbols like the following:
> > > > MINCORE_SUPER
> > > Why this symbol should be hidden?  It is implementation-defined and
> > > intended to be exposed to userspace.  All MINCORE_* not only MINCORE_SUPER
> > > are under BSD_VISIBLE braces, because POSIX does not define the symbols.
> >
> > Because it isn't stable.  It changed for example in rev 847ab36bf22
> > for 13.0.  Programs using the older value (including virtually every
> > Rust program) won't work on 13.0 and later.
> >
> > >
> > > > TDF_*
> > > These symbols coming from non-standard header sys/proc.h.  If userspace
> > > includes the header, it is already outside any formal standard, and I
> > > do not see a reason to make the implementation more convoluted there.
> > >
> > > > PRI_MAX*
> > > > PRI_MIN*
> > > > PI_*, PRIBIO, PVFS, etc
> > > > IFCAP_*
> > > These are all implementation-specific and come from non-standard headers,
> > > unless I am mistaken, then please correct me.
> > >
> > > > RLIM_NLIMITS
> > > > IFF_*
> > > Same.
> > >
> > > > *_MAXID
> > > This is too broad.
> >
> > I'm talking about symbols like IPV6CTL_MAXID, which record the size of
> > sysctl lists.  Obviously, these symbols can't be stable, and probably
> > aren't useful outside of the base system.
> >
> > >
> > > >
> > > > Clearly delineating private symbols like this would ease the
> > > > maintenance burden on languages that rely on FFI, like Ruby and Rust.
> > > > FFI basically assumes that symbols once defined will never change.
> > >
> > > Why e.g. sys/proc.h is ever consumed by FFI wrappers?
> >
> > I should add a little detail.  Rust uses FFI to access C functions,
> > and #define'd constants are redefined in the Rust bindings.  For most
> > Rust programs, the build process doesn't check the contents of
> > /usr/include in any way.  Instead, all of that stuff is hard-coded in
> > the Rust bindings.  That makes cross-compiling a breeze!  But it does
> > cause problems when the C library changes.  Adding a new symbol, like
> > copy_file_range, isn't so bad.  If your Rust program doesn't use it,
> > then the Rust binding will become an unused symbol and get eliminated
> > by the linker.  If your Rust program does use it OTOH, then it will be
> > resolved by the dynamic linker at runtime - if you're running on
> > FreeBSD 13 or newer.  Otherwise, your program will fail to run.  A
> > bigger problem is with symbols that change.  For example, the 64-bit
> > inode stuff.  Rust programs still use a FreeBSD 11 ABI (we're working
> > on that).  But other symbols change more frequently.  Things like
> > PRI_MAX_REALTIME can change between any two releases.  That creates a
> > big maintenance burden to keep track of them in the FFI bindings.  And
> > they also aren't very useful in cross-compiled programs targeting a
> > FreeBSD 11 ABI.  Instead, they really need to have bindings
> > automatically generated at build time.  That's possible, but it's not
> > the default.
>
> This is exactly what happened with DMD D. When 64-bit statfs was introduced
> all DMD D compiled programs failed to run and recompiling didn't help. The
> DMD upstream failed to understand the problem. Eventually the port had to
> be removed.

Ouch.  Does DMD not use ELF symbol versioning?  That's what Rust does.
So all Rust programs are still using the FreeBSD 11 version of "struct
statfs", and the libc function they link to is "statfs@FBSD_1.0"
instead of "statfs".

>
> >
> > So what the Rust community really needs is a way to know which symbols
> > will be stable across releases, and which might vary.  Are you
> > suggesting that anything from a non-POSIX header 

Re: Header symbols that shouldn't be visible to ports?

2022-09-07 Thread Cy Schubert
Alan Somers writes:
> On Sat, Sep 3, 2022 at 11:10 PM Konstantin Belousov wrote:
> >
> > On Sat, Sep 03, 2022 at 10:19:12AM -0600, Alan Somers wrote:
> > > Our /usr/include headers define a lot of symbols that are used by
> > > critical utilities in the base system like ps and ifconfig, but aren't
> > > stable across major releases.  Since they aren't stable, utilities
> > > built for older releases won't run correctly on newer ones.  Would it
> > > make sense to guard these symbols so they can't be used by programs in
> > > the ports tree?  There is some precedent for that, for example
> > > _WANT_SOCKET and _WANT_MNTOPTNAMES.
> > _WANT_SOCKET is clearly about exposing parts of the kernel definitions
> > for userspace code that wants to dig into kernel structures.  Similarly
> > for _WANT_MNTOPTNAMES, but in fact this thing is quite stable.  The
> > definitions are guarded by additional defines not due to their instability,
> > but because using them in userspace requires (much) more preparation from
> > userspace environment, which is either not trivial (_WANT_SOCKET) or
> > > contradicts the standardized use of the header (_WANT_MNTOPTNAMES +
> > sys/mount.h).
> >
> > >
> > > > In particular, I'm thinking about symbols like the following:
> > > MINCORE_SUPER
> > Why this symbol should be hidden?  It is implementation-defined and
> > intended to be exposed to userspace.  All MINCORE_* not only MINCORE_SUPER
> > are under BSD_VISIBLE braces, because POSIX does not define the symbols.
>
> Because it isn't stable.  It changed for example in rev 847ab36bf22
> for 13.0.  Programs using the older value (including virtually every
> Rust program) won't work on 13.0 and later.
>
> >
> > > TDF_*
> > These symbols coming from non-standard header sys/proc.h.  If userspace
> > includes the header, it is already outside any formal standard, and I
> > do not see a reason to make the implementation more convoluted there.
> >
> > > PRI_MAX*
> > > PRI_MIN*
> > > PI_*, PRIBIO, PVFS, etc
> > > IFCAP_*
> > These are all implementation-specific and come from non-standard headers,
> > unless I am mistaken, then please correct me.
> >
> > > RLIM_NLIMITS
> > > IFF_*
> > Same.
> >
> > > *_MAXID
> > This is too broad.
>
> I'm talking about symbols like IPV6CTL_MAXID, which record the size of
> sysctl lists.  Obviously, these symbols can't be stable, and probably
> aren't useful outside of the base system.
>
> >
> > >
> > > Clearly delineating private symbols like this would ease the
> > > maintenance burden on languages that rely on FFI, like Ruby and Rust.
> > > FFI basically assumes that symbols once defined will never change.
> >
> > Why e.g. sys/proc.h is ever consumed by FFI wrappers?
>
> I should add a little detail.  Rust uses FFI to access C functions,
> and #define'd constants are redefined in the Rust bindings.  For most
> Rust programs, the build process doesn't check the contents of
> /usr/include in any way.  Instead, all of that stuff is hard-coded in
> the Rust bindings.  That makes cross-compiling a breeze!  But it does
> cause problems when the C library changes.  Adding a new symbol, like
> copy_file_range, isn't so bad.  If your Rust program doesn't use it,
> then the Rust binding will become an unused symbol and get eliminated
> by the linker.  If your Rust program does use it OTOH, then it will be
> resolved by the dynamic linker at runtime - if you're running on
> FreeBSD 13 or newer.  Otherwise, your program will fail to run.  A
> bigger problem is with symbols that change.  For example, the 64-bit
> inode stuff.  Rust programs still use a FreeBSD 11 ABI (we're working
> on that).  But other symbols change more frequently.  Things like
> PRI_MAX_REALTIME can change between any two releases.  That creates a
> big maintenance burden to keep track of them in the FFI bindings.  And
> they also aren't very useful in cross-compiled programs targeting a
> FreeBSD 11 ABI.  Instead, they really need to have bindings
> automatically generated at build time.  That's possible, but it's not
> the default.

This is exactly what happened with DMD D. When 64-bit statfs was introduced 
all DMD D compiled programs failed to run and recompiling didn't help. The 
DMD upstream failed to understand the problem. Eventually the port had to 
be removed.
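
For what "bindings automatically generated at build time" in the quoted text 
could mean in practice, one possible approach is a small C helper that a build 
script compiles against the installed headers and runs, so that the value comes 
from /usr/include rather than from a hard-coded table.  This is a sketch under 
assumptions, not how the Rust libc crate works: the function name 
`ex_emit_rlim_nlimits` is invented, and `i32` as the Rust type is a guess.

```c
#include <sys/types.h>
#include <sys/resource.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format one Rust constant using the value that the installed header
 * defines for RLIM_NLIMITS.  Returns what snprintf() returns: the
 * number of characters the full line needs, excluding the NUL.
 */
static int
ex_emit_rlim_nlimits(char *buf, size_t len)
{
	return (snprintf(buf, len, "pub const RLIM_NLIMITS: i32 = %d;\n",
	    (int)RLIM_NLIMITS));
}
```

The trade-off is the one already described in the quoted text: the build now 
depends on the target's headers, which makes cross-compiling harder.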

>
> So what the Rust community really needs is a way to know which symbols
> will be stable across releases, and which might vary.  Are you
> suggesting that anything from a non-POSIX header file should be
> considered variable?
>

Rust and every other community.


-- 
Cheers,
Cy Schubert 
FreeBSD UNIX: Web:  http://www.FreeBSD.org
NTP:   Web:  https://nwtime.org

e^(i*pi)+1=0





Re: Header symbols that shouldn't be visible to ports?

2022-09-06 Thread Konstantin Belousov
On Tue, Sep 06, 2022 at 10:36:52AM -0600, Alan Somers wrote:
> On Tue, Sep 6, 2022 at 9:07 AM Warner Losh  wrote:
> >
> >
> >
> > On Tue, Sep 6, 2022 at 7:34 AM Konstantin Belousov wrote:
> >>
> >> On Mon, Sep 05, 2022 at 08:41:58AM -0600, Alan Somers wrote:
> >> > On Sat, Sep 3, 2022 at 11:10 PM Konstantin Belousov wrote:
> >> > >
> >> > > On Sat, Sep 03, 2022 at 10:19:12AM -0600, Alan Somers wrote:
> >> > > > Our /usr/include headers define a lot of symbols that are used by
> >> > > > critical utilities in the base system like ps and ifconfig, but aren't
> >> > > > stable across major releases.  Since they aren't stable, utilities
> >> > > > built for older releases won't run correctly on newer ones.  Would it
> >> > > > make sense to guard these symbols so they can't be used by programs in
> >> > > > the ports tree?  There is some precedent for that, for example
> >> > > > _WANT_SOCKET and _WANT_MNTOPTNAMES.
> >> > > _WANT_SOCKET is clearly about exposing parts of the kernel definitions
> >> > > for userspace code that wants to dig into kernel structures.  Similarly
> >> > > for _WANT_MNTOPTNAMES, but in fact this thing is quite stable.  The
> >> > > definitions are guarded by additional defines not due to their instability,
> >> > > but because using them in userspace requires (much) more preparation from
> >> > > userspace environment, which is either not trivial (_WANT_SOCKET) or
> >> > > contradicts the standardized use of the header (_WANT_MNTOPTNAMES +
> >> > > sys/mount.h).
> >> > >
> >> > > >
> >> > > > In particular, I'm thinking about symbols like the following:
> >> > > > MINCORE_SUPER
> >> > > Why this symbol should be hidden?  It is implementation-defined and
> >> > > intended to be exposed to userspace.  All MINCORE_* not only MINCORE_SUPER
> >> > > are under BSD_VISIBLE braces, because POSIX does not define the symbols.
> >> >
> >> > Because it isn't stable.  It changed for example in rev 847ab36bf22
> >> > for 13.0.  Programs using the older value (including virtually every
> >> > Rust program) won't work on 13.0 and later.
> >> As Mark replied, older values still mostly work.  It was not considered
> >> an unacceptable ABI change.
> >>
> >> >
> >> > >
> >> > > > TDF_*
> >> > > These symbols coming from non-standard header sys/proc.h.  If userspace
> >> > > includes the header, it is already outside any formal standard, and I
> >> > > do not see a reason to make the implementation more convoluted there.
> >> > >
> >> > > > PRI_MAX*
> >> > > > PRI_MIN*
> >> > > > PI_*, PRIBIO, PVFS, etc
> >> > > > IFCAP_*
> >> > > These are all implementation-specific and come from non-standard headers,
> >> > > unless I am mistaken, then please correct me.
> >> > >
> >> > > > RLIM_NLIMITS
> >> > > > IFF_*
> >> > > Same.
> >> > >
> >> > > > *_MAXID
> >> > > This is too broad.
> >> >
> >> > I'm talking about symbols like IPV6CTL_MAXID, which record the size of
> >> > sysctl lists.  Obviously, these symbols can't be stable, and probably
> >> > aren't useful outside of the base system.
> >> The programs are not forced to use the symbols.  FFI bindings should not
> >> provide them, why do we need to specifically hide such defines?
> 
> Because if anybody ever adds it to the libc crate, then it's basically
> stuck there forever.  There's precedent for hiding defines like this:
> https://reviews.freebsd.org/D25816
> 
> >>
> >> >
> >> > >
> >> > > >
> >> > > > Clearly delineating private symbols like this would ease the
> >> > > > maintenance burden on languages that rely on FFI, like Ruby and Rust.
> >> > > > FFI basically assumes that symbols once defined will never change.
> >> > >
> >> > > Why e.g. sys/proc.h is ever consumed by FFI wrappers?
> >> >
> >> > I should add a little detail.  Rust uses FFI to access C functions,
> >> > and #define'd constants are redefined in the Rust bindings.  For most
> >> > Rust programs, the build process doesn't check the contents of
> >> > /usr/include in any way.  Instead, all of that stuff is hard-coded in
> >> > the Rust bindings.  That makes cross-compiling a breeze!
> >> Well, at the cost of maintaining the Rust libc crate.
> >> [Sorry, cannot refrain https://kib.kiev.ua/kib/rust_c_ffi.png ]
> >>
> >> > But it does
> >> > cause problems when the C library changes.  Adding a new symbol, like
> >> > copy_file_range, isn't so bad.  If your Rust program doesn't use it,
> >> > then the Rust binding will become an unused symbol and get eliminated
> >> > by the linker.  If your Rust program does use it OTOH, then it will be
> >> > resolved by the dynamic linker at runtime - if you're running on
> >> > FreeBSD 13 or newer.  Otherwise, your program will fail to run.
> >> The program would either fail at start if it does not reference the
> >> symbol version in some other way (due to other symbol), or at runtime
> >> when trying to do dynamic binding to that symbol otherwise.
> >>
> >> 

Re: Header symbols that shouldn't be visible to ports?

2022-09-06 Thread Alan Somers
On Tue, Sep 6, 2022 at 9:07 AM Warner Losh  wrote:
>
>
>
> On Tue, Sep 6, 2022 at 7:34 AM Konstantin Belousov wrote:
>>
>> On Mon, Sep 05, 2022 at 08:41:58AM -0600, Alan Somers wrote:
>> > On Sat, Sep 3, 2022 at 11:10 PM Konstantin Belousov wrote:
>> > >
>> > > On Sat, Sep 03, 2022 at 10:19:12AM -0600, Alan Somers wrote:
>> > > > Our /usr/include headers define a lot of symbols that are used by
>> > > > critical utilities in the base system like ps and ifconfig, but aren't
>> > > > stable across major releases.  Since they aren't stable, utilities
>> > > > built for older releases won't run correctly on newer ones.  Would it
>> > > > make sense to guard these symbols so they can't be used by programs in
>> > > > the ports tree?  There is some precedent for that, for example
>> > > > _WANT_SOCKET and _WANT_MNTOPTNAMES.
>> > > _WANT_SOCKET is clearly about exposing parts of the kernel definitions
>> > > for userspace code that wants to dig into kernel structures.  Similarly
>> > > for _WANT_MNTOPTNAMES, but in fact this thing is quite stable.  The
>> > > definitions are guarded by additional defines not due to their instability,
>> > > but because using them in userspace requires (much) more preparation from
>> > > userspace environment, which is either not trivial (_WANT_SOCKET) or
>> > > contradicts the standardized use of the header (_WANT_MNTOPTNAMES +
>> > > sys/mount.h).
>> > >
>> > > >
>> > > > In particular, I'm thinking about symbols like the following:
>> > > > MINCORE_SUPER
>> > > Why this symbol should be hidden?  It is implementation-defined and
>> > > intended to be exposed to userspace.  All MINCORE_* not only MINCORE_SUPER
>> > > are under BSD_VISIBLE braces, because POSIX does not define the symbols.
>> >
>> > Because it isn't stable.  It changed for example in rev 847ab36bf22
>> > for 13.0.  Programs using the older value (including virtually every
>> > Rust program) won't work on 13.0 and later.
>> As Mark replied, older values still mostly work.  It was not considered
>> an unacceptable ABI change.
>>
>> >
>> > >
>> > > > TDF_*
>> > > These symbols coming from non-standard header sys/proc.h.  If userspace
>> > > includes the header, it is already outside any formal standard, and I
>> > > do not see a reason to make the implementation more convoluted there.
>> > >
>> > > > PRI_MAX*
>> > > > PRI_MIN*
>> > > > PI_*, PRIBIO, PVFS, etc
>> > > > IFCAP_*
>> > > These are all implementation-specific and come from non-standard headers,
>> > > unless I am mistaken, then please correct me.
>> > >
>> > > > RLIM_NLIMITS
>> > > > IFF_*
>> > > Same.
>> > >
>> > > > *_MAXID
>> > > This is too broad.
>> >
>> > I'm talking about symbols like IPV6CTL_MAXID, which record the size of
>> > sysctl lists.  Obviously, these symbols can't be stable, and probably
>> > aren't useful outside of the base system.
>> The programs are not forced to use the symbols.  FFI bindings should not
>> provide them, why do we need to specifically hide such defines?

Because if anybody ever adds it to the libc crate, then it's basically
stuck there forever.  There's precedent for hiding defines like this:
https://reviews.freebsd.org/D25816

>>
>> >
>> > >
>> > > >
>> > > > Clearly delineating private symbols like this would ease the
>> > > > maintenance burden on languages that rely on FFI, like Ruby and Rust.
>> > > > FFI basically assumes that symbols once defined will never change.
>> > >
>> > > Why e.g. sys/proc.h is ever consumed by FFI wrappers?
>> >
>> > I should add a little detail.  Rust uses FFI to access C functions,
>> > and #define'd constants are redefined in the Rust bindings.  For most
>> > Rust programs, the build process doesn't check the contents of
>> > /usr/include in any way.  Instead, all of that stuff is hard-coded in
>> > the Rust bindings.  That makes cross-compiling a breeze!
>> Well, at the cost of maintaining the Rust libc crate.
>> [Sorry, cannot refrain https://kib.kiev.ua/kib/rust_c_ffi.png ]
>>
>> > But it does
>> > cause problems when the C library changes.  Adding a new symbol, like
>> > copy_file_range, isn't so bad.  If your Rust program doesn't use it,
>> > then the Rust binding will become an unused symbol and get eliminated
>> > by the linker.  If your Rust program does use it OTOH, then it will be
>> > resolved by the dynamic linker at runtime - if you're running on
>> > FreeBSD 13 or newer.  Otherwise, your program will fail to run.
>> The program would either fail at start if it does not reference the
>> symbol version in some other way (due to other symbol), or at runtime
>> when trying to do dynamic binding to that symbol otherwise.
>>
>> > A
>> > bigger problem is with symbols that change.  For example, the 64-bit
>> > inode stuff.  Rust programs still use a FreeBSD 11 ABI (we're working
>> > on that).
>> We did not change symbols for ino64.  Old symbols were retained, the new
>> symbols were added under the new version.

Yes, I spoke imprecisely.  I 

Re: Header symbols that shouldn't be visible to ports?

2022-09-06 Thread Warner Losh
On Tue, Sep 6, 2022 at 7:34 AM Konstantin Belousov wrote:

> On Mon, Sep 05, 2022 at 08:41:58AM -0600, Alan Somers wrote:
> > On Sat, Sep 3, 2022 at 11:10 PM Konstantin Belousov wrote:
> > >
> > > On Sat, Sep 03, 2022 at 10:19:12AM -0600, Alan Somers wrote:
> > > > Our /usr/include headers define a lot of symbols that are used by
> > > > critical utilities in the base system like ps and ifconfig, but aren't
> > > > stable across major releases.  Since they aren't stable, utilities
> > > > built for older releases won't run correctly on newer ones.  Would it
> > > > make sense to guard these symbols so they can't be used by programs in
> > > > the ports tree?  There is some precedent for that, for example
> > > > _WANT_SOCKET and _WANT_MNTOPTNAMES.
> > > _WANT_SOCKET is clearly about exposing parts of the kernel definitions
> > > for userspace code that wants to dig into kernel structures.  Similarly
> > > for _WANT_MNTOPTNAMES, but in fact this thing is quite stable.  The
> > > definitions are guarded by additional defines not due to their instability,
> > > but because using them in userspace requires (much) more preparation from
> > > userspace environment, which is either not trivial (_WANT_SOCKET) or
> > > contradicts the standardized use of the header (_WANT_MNTOPTNAMES +
> > > sys/mount.h).
> > >
> > > >
> > > > In particular, I'm thinking about symbols like the following:
> > > > MINCORE_SUPER
> > > Why this symbol should be hidden?  It is implementation-defined and
> > > intended to be exposed to userspace.  All MINCORE_* not only MINCORE_SUPER
> > > are under BSD_VISIBLE braces, because POSIX does not define the symbols.
> >
> > Because it isn't stable.  It changed for example in rev 847ab36bf22
> > for 13.0.  Programs using the older value (including virtually every
> > Rust program) won't work on 13.0 and later.
> As Mark replied, older values still mostly work.  It was not considered
> an unacceptable ABI change.
>
> >
> > >
> > > > TDF_*
> > > These symbols coming from non-standard header sys/proc.h.  If userspace
> > > includes the header, it is already outside any formal standard, and I
> > > do not see a reason to make the implementation more convoluted there.
> > >
> > > > PRI_MAX*
> > > > PRI_MIN*
> > > > PI_*, PRIBIO, PVFS, etc
> > > > IFCAP_*
> > > These are all implementation-specific and come from non-standard headers,
> > > unless I am mistaken, then please correct me.
> > >
> > > > RLIM_NLIMITS
> > > > IFF_*
> > > Same.
> > >
> > > > *_MAXID
> > > This is too broad.
> >
> > I'm talking about symbols like IPV6CTL_MAXID, which record the size of
> > sysctl lists.  Obviously, these symbols can't be stable, and probably
> > aren't useful outside of the base system.
> The programs are not forced to use the symbols.  FFI bindings should not
> provide them, why do we need to specifically hide such defines?
>
> >
> > >
> > > >
> > > > Clearly delineating private symbols like this would ease the
> > > > maintenance burden on languages that rely on FFI, like Ruby and Rust.
> > > > FFI basically assumes that symbols once defined will never change.
> > >
> > > Why e.g. sys/proc.h is ever consumed by FFI wrappers?
> >
> > I should add a little detail.  Rust uses FFI to access C functions,
> > and #define'd constants are redefined in the Rust bindings.  For most
> > Rust programs, the build process doesn't check the contents of
> > /usr/include in any way.  Instead, all of that stuff is hard-coded in
> > the Rust bindings.  That makes cross-compiling a breeze!
> Well, at the cost of maintaining the Rust libc crate.
> [Sorry, cannot refrain https://kib.kiev.ua/kib/rust_c_ffi.png ]
>
> > But it does
> > cause problems when the C library changes.  Adding a new symbol, like
> > copy_file_range, isn't so bad.  If your Rust program doesn't use it,
> > then the Rust binding will become an unused symbol and get eliminated
> > by the linker.  If your Rust program does use it OTOH, then it will be
> > resolved by the dynamic linker at runtime - if you're running on
> > FreeBSD 13 or newer.  Otherwise, your program will fail to run.
> The program would either fail at start if it does not reference the
> symbol version in some other way (due to other symbol), or at runtime
> when trying to do dynamic binding to that symbol otherwise.
>
> > A
> > bigger problem is with symbols that change.  For example, the 64-bit
> > inode stuff.  Rust programs still use a FreeBSD 11 ABI (we're working
> > on that).
> We did not change symbols for ino64.  Old symbols were retained, the new
> symbols were added under the new version.
>
> > But other symbols change more frequently.  Things like
> > PRI_MAX_REALTIME can change between any two releases.  That creates a
> > big maintenance burden to keep track of them in the FFI bindings.  And
> > they also aren't very useful in cross-compiled programs targeting a
> > FreeBSD 11 ABI.  Instead, they really need to have bindings
> > automatically generated 

Re: Header symbols that shouldn't be visible to ports?

2022-09-06 Thread Konstantin Belousov
On Mon, Sep 05, 2022 at 08:41:58AM -0600, Alan Somers wrote:
> On Sat, Sep 3, 2022 at 11:10 PM Konstantin Belousov wrote:
> >
> > On Sat, Sep 03, 2022 at 10:19:12AM -0600, Alan Somers wrote:
> > > Our /usr/include headers define a lot of symbols that are used by
> > > critical utilities in the base system like ps and ifconfig, but aren't
> > > stable across major releases.  Since they aren't stable, utilities
> > > built for older releases won't run correctly on newer ones.  Would it
> > > make sense to guard these symbols so they can't be used by programs in
> > > the ports tree?  There is some precedent for that, for example
> > > _WANT_SOCKET and _WANT_MNTOPTNAMES.
> > _WANT_SOCKET is clearly about exposing parts of the kernel definitions
> > for userspace code that wants to dig into kernel structures.  Similarly
> > for _WANT_MNTOPTNAMES, but in fact this thing is quite stable.  The
> > definitions are guarded by additional defines not due to their instability,
> > but because using them in userspace requires (much) more preparation from
> > userspace environment, which is either not trivial (_WANT_SOCKET) or
> > contradicts the standardized use of the header (_WANT_MNTOPTNAMES +
> > sys/mount.h).
> >
> > >
> > > In particular, I'm thinking about symbols like the following:
> > > MINCORE_SUPER
> > Why this symbol should be hidden?  It is implementation-defined and
> > intended to be exposed to userspace.  All MINCORE_* not only MINCORE_SUPER
> > are under BSD_VISIBLE braces, because POSIX does not define the symbols.
> 
> Because it isn't stable.  It changed for example in rev 847ab36bf22
> for 13.0.  Programs using the older value (including virtually every
> Rust program) won't work on 13.0 and later.
As Mark replied, older values still mostly work.  It was not considered
an unacceptable ABI change.

> 
> >
> > > TDF_*
> > These symbols coming from non-standard header sys/proc.h.  If userspace
> > includes the header, it is already outside any formal standard, and I
> > do not see a reason to make the implementation more convoluted there.
> >
> > > PRI_MAX*
> > > PRI_MIN*
> > > PI_*, PRIBIO, PVFS, etc
> > > IFCAP_*
> > These are all implementation-specific and come from non-standard headers,
> > unless I am mistaken, then please correct me.
> >
> > > RLIM_NLIMITS
> > > IFF_*
> > Same.
> >
> > > *_MAXID
> > This is too broad.
> 
> I'm talking about symbols like IPV6CTL_MAXID, which record the size of
> sysctl lists.  Obviously, these symbols can't be stable, and probably
> aren't useful outside of the base system.
The programs are not forced to use the symbols.  FFI bindings should not
provide them, why do we need to specifically hide such defines?

> 
> >
> > >
> > > Clearly delineating private symbols like this would ease the
> > > maintenance burden on languages that rely on FFI, like Ruby and Rust.
> > > FFI basically assumes that symbols once defined will never change.
> >
> > Why e.g. sys/proc.h is ever consumed by FFI wrappers?
> 
> I should add a little detail.  Rust uses FFI to access C functions,
> and #define'd constants are redefined in the Rust bindings.  For most
> Rust programs, the build process doesn't check the contents of
> /usr/include in any way.  Instead, all of that stuff is hard-coded in
> the Rust bindings.  That makes cross-compiling a breeze!
Well, at the cost of maintaining the Rust libc crate.
[Sorry, cannot refrain https://kib.kiev.ua/kib/rust_c_ffi.png ]

> But it does
> cause problems when the C library changes.  Adding a new symbol, like
> copy_file_range, isn't so bad.  If your Rust program doesn't use it,
> then the Rust binding will become an unused symbol and get eliminated
> by the linker.  If your Rust program does use it OTOH, then it will be
> resolved by the dynamic linker at runtime - if you're running on
> FreeBSD 13 or newer.  Otherwise, your program will fail to run.
The program would either fail at start if it does not reference the
symbol version in some other way (due to other symbol), or at runtime
when trying to do dynamic binding to that symbol otherwise.
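
For comparison, a C program that wants to tolerate a missing symbol instead of
hitting either failure can use a weak reference and test it before the call.
This is only a sketch of that general ELF technique, not something the Rust
bindings are known to do; the helper name `ex_have_copy_file_range` is
invented, and the prototype is written out by hand on the assumption that it
matches the one in the FreeBSD manual page.

```c
#include <stddef.h>
#include <sys/types.h>

/*
 * Weak declaration: if the libc found at run time does not export the
 * symbol, the reference is expected to resolve to NULL rather than
 * stopping the program.  unistd.h is deliberately not included, so this
 * hand-written prototype is the only declaration in the file.
 */
extern ssize_t copy_file_range(int, off_t *, int, off_t *, size_t,
    unsigned int) __attribute__((weak));

/* Returns 1 if copy_file_range() can be called, 0 if it is absent. */
static int
ex_have_copy_file_range(void)
{
	return (copy_file_range != NULL);
}
```

A caller would check `ex_have_copy_file_range()` first and fall back to a
read/write loop when it returns 0.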

> A
> bigger problem is with symbols that change.  For example, the 64-bit
> inode stuff.  Rust programs still use a FreeBSD 11 ABI (we're working
> on that).
We did not change symbols for ino64.  Old symbols were retained, the new
symbols were added under the new version.

> But other symbols change more frequently.  Things like
> PRI_MAX_REALTIME can change between any two releases.  That creates a
> big maintenance burden to keep track of them in the FFI bindings.  And
> they also aren't very useful in cross-compiled programs targeting a
> FreeBSD 11 ABI.  Instead, they really need to have bindings
> automatically generated at build time.  That's possible, but it's not
> the default.
> 
> So what the Rust community really needs is a way to know which symbols
> will be stable across releases, and which might vary.
Symbols, as something exported from libc/libthr/libm, are stable.
We 

Re: Header symbols that shouldn't be visible to ports?

2022-09-05 Thread Alan Somers
On Mon, Sep 5, 2022 at 8:53 AM Mark Johnston  wrote:
>
> On Mon, Sep 05, 2022 at 08:41:58AM -0600, Alan Somers wrote:
> > On Sat, Sep 3, 2022 at 11:10 PM Konstantin Belousov wrote:
> > >
> > > On Sat, Sep 03, 2022 at 10:19:12AM -0600, Alan Somers wrote:
> > > > Our /usr/include headers define a lot of symbols that are used by
> > > > critical utilities in the base system like ps and ifconfig, but aren't
> > > > stable across major releases.  Since they aren't stable, utilities
> > > > built for older releases won't run correctly on newer ones.  Would it
> > > > make sense to guard these symbols so they can't be used by programs in
> > > > the ports tree?  There is some precedent for that, for example
> > > > _WANT_SOCKET and _WANT_MNTOPTNAMES.
> > > _WANT_SOCKET is clearly about exposing parts of the kernel definitions
> > > for userspace code that wants to dig into kernel structures.  Similarly
> > > for _WANT_MNTOPTNAMES, but in fact this thing is quite stable.  The
> > > > definitions are guarded by additional defines not due to their instability,
> > > > but because using them in userspace requires (much) more preparation from
> > > > userspace environment, which is either not trivial (_WANT_SOCKET) or
> > > > contradicts the standardized use of the header (_WANT_MNTOPTNAMES +
> > > > sys/mount.h).
> > >
> > > >
> > > > > In particular, I'm thinking about symbols like the following:
> > > > > MINCORE_SUPER
> > > > Why should this symbol be hidden?  It is implementation-defined and
> > > > intended to be exposed to userspace.  All MINCORE_* symbols, not only
> > > > MINCORE_SUPER, are under BSD_VISIBLE braces, because POSIX does not define them.
> >
> > Because it isn't stable.  It changed for example in rev 847ab36bf22
> > for 13.0.  Programs using the older value (including virtually every
> > Rust program) won't work on 13.0 and later.
>
> Why won't they work?  Code that tests (vec[i] & MINCORE_SUPER) using the
> old value will still give the same result when running on a newer
> kernel, since MINCORE_PSIND(1) is 0x20, the old MINCORE_SUPER value.
> This isn't to say that the change was perfectly backwards compatible,
> but I haven't seen an example of code which was broken by the change.

Well, from mincore(2):

In particular, applications compiled using the old value of
MINCORE_SUPER will not identify large pages with size index 2 as being
large pages.
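
A minimal C sketch of that arithmetic, for readers following along.  The
macros below are local stand-ins, not the ones from <sys/mman.h>: the old
value 0x20 and the fact that MINCORE_PSIND(1) equals it come from this
thread, while the 0x60 mask and the shift by 5 are assumptions about the
post-change header.  The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copies for illustration; real code gets these from <sys/mman.h>. */
#define OLD_MINCORE_SUPER	0x20	/* value before the 13.0 change */
#define NEW_MINCORE_SUPER	0x60	/* assumed mask after the change */
#define NEW_MINCORE_PSIND(i)	(((i) << 5) & NEW_MINCORE_SUPER)

/* The test as a binary built against the old headers performs it. */
bool
is_super_old(unsigned char v)
{
	return ((v & OLD_MINCORE_SUPER) != 0);
}

/* The same test using the wider mask from the newer headers. */
bool
is_super_new(unsigned char v)
{
	return ((v & NEW_MINCORE_SUPER) != 0);
}
```

Under these assumptions a page with size index 1 is reported as 0x20 and
both tests agree, while a page with size index 2 is reported as 0x40, which
only the new mask catches.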



Re: Header symbols that shouldn't be visible to ports?

2022-09-05 Thread Mark Johnston
On Mon, Sep 05, 2022 at 08:41:58AM -0600, Alan Somers wrote:
> On Sat, Sep 3, 2022 at 11:10 PM Konstantin Belousov wrote:
> >
> > On Sat, Sep 03, 2022 at 10:19:12AM -0600, Alan Somers wrote:
> > > Our /usr/include headers define a lot of symbols that are used by
> > > critical utilities in the base system like ps and ifconfig, but aren't
> > > stable across major releases.  Since they aren't stable, utilities
> > > built for older releases won't run correctly on newer ones.  Would it
> > > make sense to guard these symbols so they can't be used by programs in
> > > the ports tree?  There is some precedent for that, for example
> > > _WANT_SOCKET and _WANT_MNTOPTNAMES.
> > _WANT_SOCKET is clearly about exposing parts of the kernel definitions
> > for userspace code that wants to dig into kernel structures.  Similarly
> > for _WANT_MNTOPTNAMES, but in fact this thing is quite stable.  The
> > definitions are guarded by additional defines not due to their instability,
> > but because using them in userspace requires (much) more preparation from
> > userspace environment, which is either not trivial (_WANT_SOCKET) or
> > > contradicts the standardized use of the header (_WANT_MNTOPTNAMES +
> > sys/mount.h).
> >
> > >
> > > > In particular, I'm thinking about symbols like the following:
> > > > MINCORE_SUPER
> > > Why should this symbol be hidden?  It is implementation-defined and
> > > intended to be exposed to userspace.  All MINCORE_* symbols, not only
> > > MINCORE_SUPER, are under BSD_VISIBLE braces, because POSIX does not define them.
> 
> Because it isn't stable.  It changed for example in rev 847ab36bf22
> for 13.0.  Programs using the older value (including virtually every
> Rust program) won't work on 13.0 and later.

Why won't they work?  Code that tests (vec[i] & MINCORE_SUPER) using the
old value will still give the same result when running on a newer
kernel, since MINCORE_PSIND(1) is 0x20, the old MINCORE_SUPER value.
This isn't to say that the change was perfectly backwards compatible,
but I haven't seen an example of code which was broken by the change.



Re: Header symbols that shouldn't be visible to ports?

2022-09-05 Thread Alan Somers
On Sat, Sep 3, 2022 at 11:10 PM Konstantin Belousov  wrote:
>
> On Sat, Sep 03, 2022 at 10:19:12AM -0600, Alan Somers wrote:
> > Our /usr/include headers define a lot of symbols that are used by
> > critical utilities in the base system like ps and ifconfig, but aren't
> > stable across major releases.  Since they aren't stable, utilities
> > built for older releases won't run correctly on newer ones.  Would it
> > make sense to guard these symbols so they can't be used by programs in
> > the ports tree?  There is some precedent for that, for example
> > _WANT_SOCKET and _WANT_MNTOPTNAMES.
> _WANT_SOCKET is clearly about exposing parts of the kernel definitions
> for userspace code that wants to dig into kernel structures.  Similarly
> for _WANT_MNTOPTNAMES, but in fact this thing is quite stable.  The
> definitions are guarded by additional defines not due to their instability,
> but because using them in userspace requires (much) more preparation from
> userspace environment, which is either not trivial (_WANT_SOCKET) or
> contradicts the standardized use of the header (_WANT_MNTOPTNAMES +
> sys/mount.h).
>
> >
> > In particular, I'm thinking about symbols like the following:
> > MINCORE_SUPER
> Why should this symbol be hidden?  It is implementation-defined and
> intended to be exposed to userspace.  All MINCORE_* symbols, not only
> MINCORE_SUPER, are under BSD_VISIBLE braces, because POSIX does not define them.

Because it isn't stable.  It changed for example in rev 847ab36bf22
for 13.0.  Programs using the older value (including virtually every
Rust program) won't work on 13.0 and later.

>
> > TDF_*
> These symbols come from the non-standard header sys/proc.h.  If userspace
> includes the header, it is already outside any formal standard, and I
> do not see a reason to make the implementation more convoluted there.
>
> > PRI_MAX*
> > PRI_MIN*
> > PI_*, PRIBIO, PVFS, etc
> > IFCAP_*
> These are all implementation-specific and come from non-standard headers.
> If I am mistaken, please correct me.
>
> > RLIM_NLIMITS
> > IFF_*
> Same.
>
> > *_MAXID
> This is too broad.

I'm talking about symbols like IPV6CTL_MAXID, which record the size of
sysctl lists.  Obviously, these symbols can't be stable, and probably
aren't useful outside of the base system.
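
To illustrate why a *_MAXID value cannot be stable, here is a small C
sketch.  Every identifier in it (DEMO_FOO, DEMO_MAXID_V1, demo_name_v1 and
so on) is hypothetical and only mimics the pattern of a list terminated by
a MAXID entry.  It is not taken from any real header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical identifier list as shipped in release N. */
enum { DEMO_FOO = 1, DEMO_BAR = 2, DEMO_MAXID_V1 = 3 };
/* The same list in release N+1, after one identifier was appended. */
enum { DEMO_BAZ = 3, DEMO_MAXID_V2 = 4 };

/* A name table sized by the MAXID value that was current at build time. */
static const char *const names_v1[DEMO_MAXID_V1] = { NULL, "foo", "bar" };

/* Lookup as performed by a binary that was built against release N. */
const char *
demo_name_v1(int id)
{
	if (id <= 0 || id >= DEMO_MAXID_V1)
		return (NULL);	/* not known when this binary was built */
	return (names_v1[id]);
}
```

The bound is baked into the binary at compile time, so an identifier
appended in a later release falls outside the table even though the running
system supports it.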

>
> >
> > Clearly delineating private symbols like this would ease the
> > maintenance burden on languages that rely on FFI, like Ruby and Rust.
> > FFI basically assumes that symbols once defined will never change.
>
> Why would e.g. sys/proc.h ever be consumed by FFI wrappers?

I should add a little detail.  Rust uses FFI to access C functions,
and #define'd constants are redefined in the Rust bindings.  For most
Rust programs, the build process doesn't check the contents of
/usr/include in any way.  Instead, all of that stuff is hard-coded in
the Rust bindings.  That makes cross-compiling a breeze!  But it does
cause problems when the C library changes.  Adding a new symbol, like
copy_file_range, isn't so bad.  If your Rust program doesn't use it,
then the Rust binding will become an unused symbol and get eliminated
by the linker.  If your Rust program does use it OTOH, then it will be
resolved by the dynamic linker at runtime - if you're running on
FreeBSD 13 or newer.  Otherwise, your program will fail to run.  A
bigger problem is with symbols that change.  For example, the 64-bit
inode stuff.  Rust programs still use a FreeBSD 11 ABI (we're working
on that).  But other symbols change more frequently.  Things like
PRI_MAX_REALTIME can change between any two releases.  That creates a
big maintenance burden to keep track of them in the FFI bindings.  And
they also aren't very useful in cross-compiled programs targeting a
FreeBSD 11 ABI.  Instead, they really need to have bindings
automatically generated at build time.  That's possible, but it's not
the default.

So what the Rust community really needs is a way to know which symbols
will be stable across releases, and which might vary.  Are you
suggesting that anything from a non-POSIX header file should be
considered variable?
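
As a rough sketch of what generating bindings at build time could mean, the
C helper below formats a constant from whatever headers it is compiled
against as a Rust declaration.  This is only an illustration under stated
assumptions: emit_rust_const, EMIT and DEMO_MAXID are hypothetical names,
and this is not how the Rust libc crate or any existing generator is
implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for a constant that would normally come from a system header. */
#define DEMO_MAXID	7

/*
 * Format one constant as a Rust declaration.  Returns the number of
 * characters written, or -1 if the buffer was too small.
 */
int
emit_rust_const(char *buf, size_t len, const char *name, long value)
{
	int n = snprintf(buf, len, "pub const %s: i64 = %ld;\n", name, value);

	return ((n < 0 || (size_t)n >= len) ? -1 : n);
}

/* Stringify the macro name and pass the value it has in these headers. */
#define EMIT(buf, len, sym)	emit_rust_const((buf), (len), #sym, (long)(sym))
```

A build script could compile such a helper against the target's
/usr/include and write the resulting lines into a generated Rust source
file, so the values track the headers instead of being hard-coded.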



Re: Header symbols that shouldn't be visible to ports?

2022-09-03 Thread Konstantin Belousov
On Sat, Sep 03, 2022 at 10:19:12AM -0600, Alan Somers wrote:
> Our /usr/include headers define a lot of symbols that are used by
> critical utilities in the base system like ps and ifconfig, but aren't
> stable across major releases.  Since they aren't stable, utilities
> built for older releases won't run correctly on newer ones.  Would it
> make sense to guard these symbols so they can't be used by programs in
> the ports tree?  There is some precedent for that, for example
> _WANT_SOCKET and _WANT_MNTOPTNAMES.
_WANT_SOCKET is clearly about exposing parts of the kernel definitions
for userspace code that wants to dig into kernel structures.  Similarly
for _WANT_MNTOPTNAMES, but in fact this thing is quite stable.  The
definitions are guarded by additional defines not due to their instability,
but because using them in userspace requires (much) more preparation from
userspace environment, which is either not trivial (_WANT_SOCKET) or
contradicts the standardized use of the header (_WANT_MNTOPTNAMES +
sys/mount.h).
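
For readers unfamiliar with the pattern, an opt-in guard of this kind can
be sketched in C as follows.  The macro _WANT_DEMO_INTERNALS and both
structures are hypothetical and only mirror the shape of the real guards.
The actual headers differ in detail.

```c
#include <assert.h>

/* A consumer opts in before including the header (hypothetical macro). */
#define _WANT_DEMO_INTERNALS

/* What the header itself would contain starts here. */
struct demo_public {
	int	dp_flags;		/* always visible */
};

#if defined(_KERNEL) || defined(_WANT_DEMO_INTERNALS)
/* Only visible to the kernel and to userspace that explicitly asked. */
struct demo_internal {
	struct demo_public di_pub;
	long	di_private_state;
};
#define DEMO_INTERNALS_VISIBLE	1
#else
#define DEMO_INTERNALS_VISIBLE	0
#endif
```

Code that does not define the opt-in macro never sees struct demo_internal,
so it cannot come to depend on its layout by accident.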

> 
> In particular, I'm thinking about symbols like the following:
> MINCORE_SUPER
Why should this symbol be hidden?  It is implementation-defined and
intended to be exposed to userspace.  All MINCORE_* symbols, not only
MINCORE_SUPER, are under BSD_VISIBLE braces, because POSIX does not define them.

> TDF_*
These symbols come from the non-standard header sys/proc.h.  If userspace
includes the header, it is already outside any formal standard, and I
do not see a reason to make the implementation more convoluted there.

> PRI_MAX*
> PRI_MIN*
> PI_*, PRIBIO, PVFS, etc
> IFCAP_*
These are all implementation-specific and come from non-standard headers.
If I am mistaken, please correct me.

> RLIM_NLIMITS
> IFF_*
Same.

> *_MAXID
This is too broad.

> 
> Clearly delineating private symbols like this would ease the
> maintenance burden on languages that rely on FFI, like Ruby and Rust.
> FFI basically assumes that symbols once defined will never change.

Why would e.g. sys/proc.h ever be consumed by FFI wrappers?