Re: makesyscalls (moving forward)

2020-06-16 Thread Martin Husemann
On Mon, Jun 15, 2020 at 10:56:04PM +, David Holland wrote:
> A less glib example: line 3186 of vfs_syscalls.c, in stat or more
> precisely sys___stat50, has a handwritten call to copyout() to
> transfer the stat results back to userspace.

OK, this could be improved if we had the IOR/IOW/IORW flags for each
syscall pointer arg available in the description (like all interface
description languages have).

> The headline benefit of generating it rather than writing and
> maintaining it by hand (besides the obvious, lower maintenance costs)
> is that it becomes much harder to accidentally mix user and kernel
> pointers.

Since we have many architectures that just crash on such mistakes, I don't
think there is a big gain. But generated code is good anyway (I'm a lazy
programmer).

>  > >- compat_otheros translation as well
>  > 
>  > I have no idea how that would work (or what exactly you mean).
> 
> A lot of it is the same kind of logic as compat32: for example the
> first half of ultrix_sys_open() is about translating the Ultrix
> representation of the arguments to the NetBSD representation.

Ok, so you want to read the syscall specification for NetBSD, the one
for Ultrix (both in the new input format), where the Ultrix one has
additional references into the NetBSD one for syscalls that map
nearly 1:1, and then auto-generate the conversion code. This is a step
further than for the netbsd32 (or in general: same syscall, different ABI)
compat.

>  > Rump and anything that needs to serialize/deserialize syscalls are
>  > different beasts, and they could benefit from a common syscall
>  > "protocol" definition, and maybe in the end it could turn out that
>  > we do want to make that description the master source of our own
>  > syscall definitions.
> 
> The demarshaling done with copyout and the marshaling/demarshaling
> done to serialize system calls into a byte stream are the same kind of
> thing, even though the representations are different; if you can
> generate one, you can generate the others.

Yes, sure. But does it need to be the same tool? The output format will
be quite different. Same input specs: sure.

>  > I have no idea how sanitizers fit in here.
> 
> They need wrapper functions for system calls, which need to know much
> the same things as marshaling/demarshaling code.

OK, but again they need totally different output, so might just be a different
tool (and also an upstream problem). We probably would have to provide some
tool to convert the shiny new syscall definition format into whatever the
upstream tool accepts as input.

>  > Maybe start with the basics and explain things from ground up
>  > before diverging into the hard issues (like the name and install
>  > location).
> 
> Well... sure except now I'm not sure where to start. Didn't want to
> start with lengthy explanations of things everyone already knows. :-/

IMO you did jump over quite a few steps and assume some special handling/
process model in your imaginary solution already, and I am not sure
everyone understands your model, the design decisions leading to it and
agrees with them.

Martin


Re: digest so far? (Re: makesyscalls (moving forward))

2020-06-15 Thread David Holland
On Tue, Jun 16, 2020 at 12:21:33AM +0100, Roy Marples wrote:
 > On 16/06/2020 00:07, David Holland wrote:
 > > Kamil thinks that. I don't see why. Compiling data into tools just
 > > complicates everything.
 > 
 > libterminfo.so has some terms compiled into it if the databases are
 > unreadable for any reason. A handy fallback.
 > 
 > dhcpcd also embeds dhcpcd.conf definitions for all RFC DHCP/DHCPv6 and
 > applicable ND options. There is an option to install it as an external file
 > (not enabled in NetBSD), but then it's just another moving part to
 > maintain.

Those aren't build-time tools and have much more severe operational
requirements :-)

-- 
David A. Holland
dholl...@netbsd.org


Re: makesyscalls (moving forward)

2020-06-15 Thread David Holland
On Mon, Jun 15, 2020 at 10:56:04PM +, David Holland wrote:
 > A less glib example: line 3186 of vfs_syscalls.c, in stat or more
 > precisely sys___stat50, has a handwritten call to copyout() to
 > transfer the stat results back to userspace.

To amplify:

Currently syscalls.master says:

439  STD  RUMP  { int|sys|50|stat(const char *path, struct stat *ub); }

Currently that generates the following objects related to handling
system calls:

   --

struct sys___stat50_args {
    syscallarg(const char *) path;
    syscallarg(struct stat *) ub;
};

struct sysent sysent[] = {
    ...
    {
        .sy_narg = sizeof(struct sys___stat50_args) / sizeof(register_t),
        .sy_argsize = sizeof(struct sys___stat50_args),
        .sy_flags = SYCALL_ARG_PTR,
        .sy_call = (sy_call_t *)sys___stat50
    },  /* 439 = __stat50 */
    ...

   --

and everything beyond that is handwritten. The current structure of
the handwritten code is:

   sys___stat50 -> do_sys_statat -> namei and vn_stat

where sys___lstat50, sys_fstatat (recall that POSIX misnamed it, it's
what you'd expect to be called "statat") and several compat entry
points via do_sys_stat share do_sys_statat.

sys___stat50, sys___lstat50, sys_fstatat, and the various compat entry
points handle the copyout() for the stat buffer. The copyin() for the
path happens in do_sys_statat.

In a world where all this is generated, the entry point in
vfs_syscalls.c is do_sys_statat, and it receives only kernel
pointers. The system call table points to an autogenerated
sys___stat50 that looks something like this:

int
sys___stat50(struct lwp *l,
    const struct sys___stat50_args *uap,
    /*
        syscallarg(const char *) path;
        syscallarg(struct stat *) ub;
    */
    register_t *retval)
{
    struct pathbuf *pb;
    struct stat sb;
    int error;

    error = pathbuf_copyin(SCARG(uap, path), &pb);
    if (error) {
        return error;
    }

    error = do_sys_statat(l, AT_FDCWD, pb, FOLLOW, &sb);
    if (error) {
        pathbuf_destroy(pb);
        return error;
    }

    error = copyout(&sb, SCARG(uap, ub), sizeof(sb));

    pathbuf_destroy(pb);
    return error;
}

sys___lstat50 would be the same except that it passes NOFOLLOW, and so
would the various compat entry points except that they'd also contain
a call to translate the output stat structure before copying it out.

One disadvantage is that there's now one copy of the pathbuf_copyin
call for each entry point instead of a single shared one; but (a) I'm
not convinced this is important and (b) it could be avoided in the
code generator if we really wanted to.

However, there are a number of advantages:
   - userspace pointers are not exposed to or handled by handwritten
     code and the chance of mishandling them (which is a security
     issue) decreases sharply;
   - the code generator is far less likely to produce wrong or missing
     error path code;
   - once this is all in place, nobody needs to think about this code
     again.
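
As a rough userspace illustration of the split described above (not code from the NetBSD tree), the sketch below keeps all handling of "user" pointers in a generated-style wrapper, while the handwritten function only ever sees kernel-side copies. Everything here is a hypothetical stand-in: mock_copyinstr and mock_copyout imitate the kernel copy routines by treating a NULL pointer as a bad address, and struct kstat, do_stat_k and gen_stat are invented names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define KPATH_MAX 64

struct kstat { long size; };    /* stand-in for struct stat */

/* Stand-in for a copyinstr-like routine: NULL models a bad user address. */
static int
mock_copyinstr(const char *uaddr, char *kaddr, size_t len)
{
    if (uaddr == NULL)
        return EFAULT;
    if (strlen(uaddr) >= len)
        return ENAMETOOLONG;
    strcpy(kaddr, uaddr);
    return 0;
}

/* Stand-in for a copyout-like routine: NULL models a bad user address. */
static int
mock_copyout(const void *kaddr, void *uaddr, size_t len)
{
    if (uaddr == NULL)
        return EFAULT;
    memcpy(uaddr, kaddr, len);
    return 0;
}

/* Handwritten part: may rely on receiving valid kernel-side data only. */
static int
do_stat_k(const char *kpath, struct kstat *ksb)
{
    assert(kpath != NULL && ksb != NULL);
    ksb->size = (long)strlen(kpath);    /* fake result for the sketch */
    return 0;
}

/* Shape a generator could emit: copy in, call, copy out. */
int
gen_stat(const char *upath, struct kstat *uub)
{
    char kpath[KPATH_MAX];
    struct kstat ksb;
    int error;

    error = mock_copyinstr(upath, kpath, sizeof(kpath));
    if (error)
        return error;
    error = do_stat_k(kpath, &ksb);
    if (error)
        return error;
    return mock_copyout(&ksb, uub, sizeof(ksb));
}
```

Calling gen_stat() with a bad path or a bad output pointer fails in the wrapper, before or after do_stat_k() runs, so the handwritten function never has to decide whether a pointer is a user or a kernel one.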

The current syscalls.master can't express everything needed to do
this; I would expect the declaration to look something like

   err statat(fd fd, pathname path, struct stat *OUT ub, atflags flags);
   err stat(pathname path, struct stat *OUT ub) =
  statat(AT_FDCWD, path, ub, 0);

then for the table entry, something like

   439 nb50 rump stat

not to mention

   38 compat43 modular stat
   188 compat12 modular stat
   278 compat30 nb13 modular stat
   387 compat50 nb30 modular rump stat

and in the compat_ultrix table,

   38 ultrix modular stat

as there have been quite a few versions of stat over the years.

Note that it distinguishes "err" and the particular type of flags from
"int", knows what a pathname is, etc., all of which is necessary or at
least helpful for generating the code one wants in various
circumstances.
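
To make the point about typed declarations concrete, here is a minimal sketch of how a generator might represent such argument types internally and turn them into marshaling statements. It is only an assumption about one possible design: enum arg_kind, struct arg_desc, emit_arg and emit_marshaling are hypothetical names, and the emitted text merely imitates the style of the sys___stat50 example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical argument kinds, loosely following the proposed declaration. */
enum arg_kind { ARG_FD, ARG_PATHNAME, ARG_STRUCT_OUT, ARG_FLAGS };

struct arg_desc {
    enum arg_kind kind;
    const char *name;    /* argument name, e.g. "path" */
    const char *ctype;   /* C type for struct arguments, else NULL */
};

/* Emit the marshaling statement for one argument (snprintf-style return). */
static int
emit_arg(char *buf, size_t len, const struct arg_desc *a)
{
    switch (a->kind) {
    case ARG_PATHNAME:
        return snprintf(buf, len,
            "error = pathbuf_copyin(SCARG(uap, %s), &%s_pb);\n",
            a->name, a->name);
    case ARG_STRUCT_OUT:
        return snprintf(buf, len,
            "error = copyout(&%s_k, SCARG(uap, %s), sizeof(%s));\n",
            a->name, a->name, a->ctype);
    case ARG_FD:
    case ARG_FLAGS:
        return snprintf(buf, len, "/* %s: passed by value */\n", a->name);
    }
    return -1;
}

/* Concatenate the statements for all arguments; -1 if buf is too small. */
int
emit_marshaling(char *buf, size_t len, const struct arg_desc *args,
    size_t nargs)
{
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < nargs; i++) {
        int n = emit_arg(buf + used, len - used, &args[i]);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Because "pathname" and "struct stat *OUT" are distinct kinds rather than plain pointers, the generator can pick the copy direction and routine by itself; with only "const char *" and "struct stat *" that information is not available.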

There are probably some wrinkles with the compat declarations that I'm
not on top of yet, but it'll become clear later. (This has to be done
by incrementally replacing handwritten with generated code, or it'll
never work properly; the system and the interactions of all the compat
stuff is too complicated.)

-- 
David A. Holland
dholl...@netbsd.org


Re: digest so far? (Re: makesyscalls (moving forward))

2020-06-15 Thread Roy Marples

On 16/06/2020 00:07, David Holland wrote:
> Kamil thinks that. I don't see why. Compiling data into tools just
> complicates everything.

libterminfo.so has some terms compiled into it if the databases are unreadable
for any reason. A handy fallback.

dhcpcd also embeds dhcpcd.conf definitions for all RFC DHCP/DHCPv6 and
applicable ND options. There is an option to install it as an external file (not
enabled in NetBSD), but then it's just another moving part to maintain.


Roy


Re: digest so far? (Re: makesyscalls (moving forward))

2020-06-15 Thread David Holland
On Mon, Jun 15, 2020 at 10:15:09PM +0200, Reinoud Zandijk wrote:
 > It would be great if that can be done for the ioctls dispatch code
 > as well and cover their copyin/copyout stuff, it's now all over the
 > place; I think that would save a lot of hidden bugs. Is there a
 > specification that resembles the syscalls.master for ioctls?
 > Shouldn't there be one?

There is not and there should be, that's part of the goal.

 > Reading the discussion, I got the idea that the preferred place to
 > store the definitions for userland usage is internally in the
 > compiled makesyscalls; compiled with the source tables in
 > BSDSRCDIR/sys and not externally on disc.

Kamil thinks that. I don't see why. Compiling data into tools just
complicates everything.

Expect the installed description file to be something generated during
the system build, not a copy of syscalls.master.

Whether it matches the kernel is a red herring. If you have updated
the system properly, it will match /usr/include, and that is expected
to be consistent with the kernel the same way that it always has been.

-- 
David A. Holland
dholl...@netbsd.org


Re: makesyscalls (moving forward)

2020-06-15 Thread David Holland
On Mon, Jun 15, 2020 at 09:19:19AM +0200, Martin Husemann wrote:
 > > It seems to me that all of the following is mechanical and should be
 > > automatically generated, beyond what makesyscalls already does:
 > >- all the code that calls copyin/copyout
 > 
 > It is probably too early and I had too little coffee - but could
 > you point me at an example line of code that does copyin/copyout for
 > syscall args that you think should be replaced with automatically
 > generated code? How much of that generated code would be not from a verbatim
 > C block in the syscall description file?

Glib answer:
   % grep -w copyin kern/*.c | wc -l
151

A less glib example: line 3186 of vfs_syscalls.c, in stat or more
precisely sys___stat50, has a handwritten call to copyout() to
transfer the stat results back to userspace.

None of it needs to be verbatim blocks in the description file. The
last time I did this the code generator covered everything except argv
blocks, and that was with a pretty simpleminded awk script.

Generating this code has been standard practice in research systems
since the 90s.

The headline benefit of generating it rather than writing and
maintaining it by hand (besides the obvious, lower maintenance costs)
is that it becomes much harder to accidentally mix user and kernel
pointers.

 > >- compat32 translation for all syscalls and all ioctls
 > 
 > Tricky, but maybe doable. Not sure it will work for all.

I'm pretty sure it will. It's just a matter of reading the type
declarations and generating code that moves from 32 to 64 and back.
The hard part is getting all the type declarations in place so that
the tool can know what needs to be done, but having that is worthwhile
and part of the point of this exercise.
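
For illustration only, this is the kind of code such a tool could derive from a pair of type declarations. The structures are hypothetical (they are not the real netbsd32 types), and rec32_to_64 and rec64_to_32 are invented names; the sketch assumes widening always succeeds and narrowing reports EOVERFLOW when a value does not fit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record as laid out for a 32-bit ABI. */
struct rec32 {
    int32_t sec;
    int32_t usec;
};

/* The same record as laid out for the native 64-bit ABI. */
struct rec64 {
    int64_t sec;
    int64_t usec;
};

/* 32 -> 64: every field widens, so this cannot fail. */
void
rec32_to_64(const struct rec32 *in, struct rec64 *out)
{
    assert(in != NULL && out != NULL);
    out->sec = in->sec;
    out->usec = in->usec;
}

/* Returns nonzero if v is representable as int32_t. */
static int
fits_i32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* 64 -> 32: fails with EOVERFLOW if a field does not fit. */
int
rec64_to_32(const struct rec64 *in, struct rec32 *out)
{
    assert(in != NULL && out != NULL);
    if (!fits_i32(in->sec) || !fits_i32(in->usec))
        return EOVERFLOW;
    out->sec = (int32_t)in->sec;
    out->usec = (int32_t)in->usec;
    return 0;
}
```

Both functions follow mechanically from the field list, which is why having complete type declarations is the hard part rather than the code generation itself.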

 > >- compat_otheros translation as well
 > 
 > I have no idea how that would work (or what exactly you mean).

A lot of it is the same kind of logic as compat32: for example the
first half of ultrix_sys_open() is about translating the Ultrix
representation of the arguments to the NetBSD representation. Other
parts are the same as the native NetBSD call handling code, except
using the Ultrix types. Some parts probably can't be done
automatically, like the second part of ultrix_sys_open() that
implements the vintage controlling tty behavior.

 > Rump and anything that needs to serialize/deserialize syscalls are
 > different beasts, and they could benefit from a common syscall
 > "protocol" definition, and maybe in the end it could turn out that
 > we do want to make that description the master source of our own
 > syscall definitions.

The demarshaling done with copyout and the marshaling/demarshaling
done to serialize system calls into a byte stream are the same kind of
thing, even though the representations are different; if you can
generate one, you can generate the others.
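
A small sketch of the byte-stream side, to show how similar it is in shape to the copyin/copyout case. The wire format here (big-endian syscall number, big-endian path length, then the path bytes) is purely an assumption made for the example, not the rump protocol, and marshal_path_call and unmarshal_path_call are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value in big-endian byte order. */
static void
put_u32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Load a 32-bit big-endian value. */
static uint32_t
get_u32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
        ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Serialize "syscall number + pathname"; returns bytes used, 0 on error. */
size_t
marshal_path_call(uint8_t *buf, size_t len, uint32_t sysno, const char *path)
{
    size_t plen;

    assert(buf != NULL && path != NULL);
    plen = strlen(path);
    if (len < 8 || plen > len - 8 || plen > UINT32_MAX)
        return 0;
    put_u32(buf, sysno);
    put_u32(buf + 4, (uint32_t)plen);
    memcpy(buf + 8, path, plen);
    return 8 + plen;
}

/* Inverse of marshal_path_call(); returns 0 on success, -1 on bad input. */
int
unmarshal_path_call(const uint8_t *buf, size_t len, uint32_t *sysno,
    char *path, size_t pathlen)
{
    uint32_t plen;

    assert(buf != NULL && sysno != NULL && path != NULL);
    if (len < 8)
        return -1;
    plen = get_u32(buf + 4);
    if (plen > len - 8 || plen >= pathlen)
        return -1;
    *sysno = get_u32(buf);
    memcpy(path, buf + 8, plen);
    path[plen] = '\0';
    return 0;
}
```

The generator needs the same facts in both cases: which arguments are strings, which are buffers, and in which direction each one travels.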

 > I have no idea how sanitizers fit in here.

They need wrapper functions for system calls, which need to know much
the same things as marshaling/demarshaling code.

 > Maybe start with the basics and explain things from ground up
 > before diverging into the hard issues (like the name and install
 > location).

Well... sure except now I'm not sure where to start. Didn't want to
start with lengthy explanations of things everyone already knows. :-/

-- 
David A. Holland
dholl...@netbsd.org


Re: makesyscalls (moving forward)

2020-06-15 Thread Jason Thorpe


> On Jun 15, 2020, at 12:15 PM, Reinoud Zandijk  wrote:
> 
> Trying to build LLVM or gcc from pkgsrc? That'll need NetBSD specific patches
> anyway and they can be created on package creation/update by the developer who
> has the source tree anyway or do you want the packages to create a bunch of
> NetBSD specific files on the fly using the tool on package compilation on
> NetBSD systems?

I think eventually that should not be the case.  We've been pushing hard to 
upstream local changes to the toolchain.

-- thorpej



digest so far? (Re: makesyscalls (moving forward))

2020-06-15 Thread Reinoud Zandijk
On Mon, Jun 15, 2020 at 05:16:01PM -, Christos Zoulas wrote:
> In article <20200615120806.gb1...@diablo.13thmonkey.org>, Reinoud Zandijk
>  wrote:
> >LLVM code and its fuzzing tools should be in tree anyway so it can be
> >created there on the fly too if requested.
> 
> How about strace, or any other program that wants access to the system call
> table information? Everything in the tree?
> 
> We have an opportunity here to do this better than we have been.  But we are
> trying to answer "how" first, not "why". Let's set up some goals and some
> properties that the new system call description table should have. What
> kinds of output should we be able to generate?

Sounds like a good plan. I tried starting by at least enumerating the users
but the discussion got a bit unwieldy.

The current info in syscalls.master (and other sources?) seems to be enough to
create syscall.h and syscallargs.h which is impressive. I guess this is the
base usage, to create these structs and constants. We can then, as per request
from David, use it for the in-kernel dispatch C code generation for these
system calls complete with their copyin/copyout.

It would be great if that can be done for the ioctls dispatch code as well and
cover their copyin/copyout stuff, it's now all over the place; I think that
would save a lot of hidden bugs. Is there a specification that resembles the
syscalls.master for ioctls? Shouldn't there be one?

A second usage request is to use it for ktrace output decoding etc. I'd
welcome that for the current output is not really that parseable. A full
decoding of system call arguments and results complete with named structs
would greatly improve it. As above, please also do the ioctls, these are quite
incomprehensible in the current form and only a few are decoded IIRC.
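
As a sketch of what description-driven decoding could look like, the following formats a call with named arguments from a small table. The table layout and the names struct sc_desc and format_call are assumptions for the example; only the entry for 439 (__stat50 with arguments path and ub) comes from the syscalls.master line quoted elsewhere in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SC_MAXARGS 4

/* Hypothetical per-syscall description as a tracer might consume it. */
struct sc_desc {
    int num;
    const char *name;
    int nargs;
    const char *argnames[SC_MAXARGS];
};

static const struct sc_desc sc_table[] = {
    { 439, "__stat50", 2, { "path", "ub" } },
};

/* Format "name(arg=0x..., ...)" into buf; returns 0, or -1 if truncated. */
int
format_call(char *buf, size_t len, int num, const unsigned long *args)
{
    const struct sc_desc *d = NULL;
    size_t used;
    int n;

    assert(buf != NULL && len > 0);
    for (size_t i = 0; i < sizeof(sc_table) / sizeof(sc_table[0]); i++)
        if (sc_table[i].num == num)
            d = &sc_table[i];

    if (d == NULL) {
        /* No description available: fall back to the bare number. */
        n = snprintf(buf, len, "syscall#%d()", num);
        return (n < 0 || (size_t)n >= len) ? -1 : 0;
    }

    n = snprintf(buf, len, "%s(", d->name);
    if (n < 0 || (size_t)n >= len)
        return -1;
    used = (size_t)n;
    for (int i = 0; i < d->nargs; i++) {
        n = snprintf(buf + used, len - used, "%s%s=0x%lx",
            i > 0 ? ", " : "", d->argnames[i], args[i]);
        if (n < 0 || (size_t)n >= len - used)
            return -1;
        used += (size_t)n;
    }
    n = snprintf(buf + used, len - used, ")");
    return (n < 0 || (size_t)n >= len - used) ? -1 : 0;
}
```

A real decoder would also want the argument types, so that a pathname is printed as a string and a struct stat as named fields; the table above carries only names to keep the sketch short.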

A third usage request is to create code for pkgsrc call tracers/dumpers at
compilation time. I guess it's for decoding arguments not unlike ktruss/kdump
does.

A fourth usage request is to create interfaces on-the-fly for non-C languages
when compiling the languages from pkgsrc. I presume creating 'C' like stubs or
data structures for them to parse and create code with?

Reading the discussion, I got the idea that the preferred place to store the
definitions for userland usage is internally in the compiled makesyscalls;
compiled with the source tables in BSDSRCDIR/sys and not externally on disc.
This has the advantage that it's not corruptible and always compatible with its
sole user, and BSDSRCDIR is not needed for usage. This of course has the
downside that it is installed tailored to the userland and not to the
running kernel version unless it's updated; a bit like the config(1) dependency
on the format of the config files.

As for the storage of templates etc, those can be stored next to where they
are used. In BSDSRCDIR for kernel work, in a pkg when it's doing its thing etc.
Sounds like a logical thing to do.

I must have missed a lot, and missed use cases but I hope this summary will
help the discussion a bit.

With regards,
Reinoud



Re: makesyscalls (moving forward)

2020-06-15 Thread Reinoud Zandijk
On Mon, Jun 15, 2020 at 02:58:31PM +0200, Kamil Rytarowski wrote:
> LLVM is an external project and only in a special case part of the
> basesystem. While there, there is the same issue with GCC sanitizers. We
> definitely don't want to request regular LLVM or GCC users building the
> toolchain to depend on TOOLDIR / BSDSRCDIR.

If you mean developers that work on the toolchain, you work with patches in
the tree in one way or another, doing pullups and the like but then the source
tree is already there so what is the problem then?

Trying to build LLVM or gcc from pkgsrc? That'll need NetBSD specific patches
anyway and they can be created on package creation/update by the developer who
has the source tree anyway or do you want the packages to create a bunch of
NetBSD specific files on the fly using the tool on package compilation on
NetBSD systems?

> > 4) some syscall bashing tools for testing etc. They are tailored anyway so
> > using a $BSDSRCDIR specific program that is not installed is not that
> > relevant.
> I don't know what a syscall bashing tool for testing is.

Wrong term, I think; I meant sending the kernel random, or deliberately random,
ioctls and system calls to test input parameter checking. You mean this?
> We work on rumpkernel syscall fuzzers during the ongoing GSoC.

> As of today GDB, but other similar programs can/shall follow.
> 
> syscall tracers (I wrote picotrace, truss - both distributed in pkgsrc;
> there is strace)

I don't know these. Sounds like ktrace binary output displaying? Or do they
collect info in another way? Why not fix or enhance ktrace/kdump/ktruss ? ;)

> Language runtime, basically everything that is not using libc could use it
> (go, rust, D, etc).

I forgot that one yes, but this is serious feature creep going on! You want at
compilation time (of those languages) to auto generate the equivalent header
and system call stubs using the data? See below.

> For a developer it is fine to request BSDSRCDIR to be available, but for
> users (of e.g. GDB) this is certainly an overkill. makesyscalls(1) will be
> maybe up to 1MB. Just $BSDSRCDIR/sys takes around 450MB. If we want to
> depend on BSDSRCDIR for programs in pkgsrc, this is IMO a blocker.

Why would a runtime use of say gdb make use of the tool?

> Prebuilt picotrace takes less than 100kb. Adding a hard dependency on
> BSDSRCDIR would be severe overkill.

For compilation of picotrace and the like, you mean? A built-in syscall table,
which you also prefer IIRC, would hardcode it to the currently installed userland
version, not the kernel version; is that preferable in pkgsrc settings?

Reinoud



Re: makesyscalls (moving forward)

2020-06-15 Thread Christos Zoulas
In article <20200615120806.gb1...@diablo.13thmonkey.org>,
Reinoud Zandijk   wrote:
>Small addendum,
>
>On Mon, Jun 15, 2020 at 01:44:19PM +0200, Reinoud Zandijk wrote:
>> What about not installing it at all? It's only going to be used during
>> definition updates or fixes. Compare it to the pcidevs.h and pcidevs_data.h
>> creation only this time it creates the relevant kernel/rump/fuzzer files. The
>> program can optionally be compiled and linked in /tmp and then called from
>> there to create all the variants using the templates or just be created in
>> place and cleaned up later.  No need to install it in base. The resulting
>> files can then be committed as `regen' just like the pcidevs variants.
>
>LLVM code and its fuzzing tools should be in tree anyway so it can be created
>there on the fly too if requested.

How about strace, or any other program that wants access to the system call
table information? Everything in the tree?

We have an opportunity here to do this better than we have been.
But we are trying to answer "how" first, not "why". Let's set up
some goals and some properties that the new system call description
table should have. What kinds of output should we be able to generate?

christos



Re: makesyscalls (moving forward)

2020-06-15 Thread Jason Thorpe


> On Jun 15, 2020, at 5:49 AM, Mouse  wrote:
> 
> I considered suggesting something like /usr/tools, but I don't really
> think that's a good idea.

I think it's reasonable to draw a distinction between "tools that are commonly 
used for non-system development" and "tools that are very specific to the 
system".

I think /usr/sys might be reasonable... /usr/sys/bin, /usr/sys/libdata, etc.

-- thorpej



Re: makesyscalls (moving forward)

2020-06-15 Thread Johnny Billquist

On 2020-06-15 15:39, Kamil Rytarowski wrote:
> On 15.06.2020 15:21, Johnny Billquist wrote:
>> Anyway. Who here does not modify their path at login anyway.
>
> The path has to be readily available for pkgsrc users with unprepared
> environment.
>
> However if we install the utility into /usr/sys (similar to /usr/games),
> we can use a full path to the program and it will be good enough (for
> me). Are there other programs that would be moved to this directory?


Using explicit paths is sometimes a good idea no matter what.

Obviously I think something like config should be moved out.

I would tend to look at /sbin as the place for tools required to manage the
system or to get information that "normal" users commonly would not care
about. /bin would be where the programs used by a large portion of ignorant
users could be found. Even things like compilers would make sense to
have there, as would things like passwd. However, ifconfig, for example, I
would keep out. It can be used even by normal users, in order to look at
interface details, but it's outside of what I think a naive user would
go for. Same for arp.


However, things needed to build kernel and userland are neither normal
user tools, nor system administration tools in the normal sense, so
neither /bin nor /sbin really feels like the right place.


/sys would make more sense, but I'm not totally clear about if there are 
aspects I haven't thought about yet.


(And I'm currently saying /bin, /sbin and so on, without adding the 
corresponding /usr/... paths. Traditionally, the things under root were 
things you expected to need to be able to do without even having file 
systems mounted and so on, and I like that distinction. So I guess most 
things related to system building would only need to be in /usr/sys, if 
that was the path we'd go for.)



> I have got a feeling that too many programs already rely on specific
> kernel internals so making a distinction would only confuse people and
> impose unclear conditions what belongs where. fsdb(8) or crash(8) are
> definitely not going to be very usable with mixed kernel and userland
> versions.
>
> Something we possibly agree upon is that makesyscalls(1) would not be a
> tool for administering a computer/server, so /usr/sbin or /sbin is not a good
> place.


I agree that /sbin also does not feel natural for makesyscalls. If I 
were to have to choose between /bin or /sbin, I would however go for 
/sbin. I have fewer issues creating kind of weird views for a system
admin, which I expect should know a little more about what he is doing 
and why, than an ignorant user, for which I think a cohesive and 
meaningful environment is the most important. As such, something like 
this under /bin would just be a very confusing and not useful tool for 
that user.


It all boils down to what the purpose of the different directories is.
And I cannot agree that things under /sbin should somehow only be tools
for root, or require that you are root. It is a different directory
from /bin so that it doesn't have to be in the path of various people. If
we are not interested in that path separation/distinction then we do not
need the separate directories at all. Having /sbin as the place where mainly
system admin tools sit does make sense. It is even hinted at, if you read sbin
as system binaries.


But there are just some tools that are very specific to kernel 
development, and that is really a kind of binaries for which I don't 
think we have a good place right now.
/libexec is another interesting place, but I consider that to be a place 
where binaries that are invoked by other binaries are located. And your 
description of your planned use of makesyscalls makes it sound like you
are planning to use it directly, and not just have it as a binary 
invoked by other tools...


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: makesyscalls (moving forward)

2020-06-15 Thread Jason Thorpe


> On Jun 15, 2020, at 6:39 AM, Kamil Rytarowski  wrote:
> 
> However if we install the utility into /usr/sys (similar to /usr/games),
> we can use a full path to the program and it will be good enough (for
> me). Are there other programs that would be moved to this directory?

The Device Tree compiler (/usr/bin/dtc) seems to fall into a similar category.

As has already been mentioned, /usr/bin/config (and maybe it gets a new name 
while we're at it).

There's also the collection of elf2{ecoff,aout} tools that we have now that we 
no longer use objcopy for those tasks; they are used only for transmogrifying 
boot loaders and/or kernel binaries.

-- thorpej



Re: makesyscalls (moving forward)

2020-06-15 Thread Kamil Rytarowski
On 15.06.2020 15:21, Johnny Billquist wrote:
> 
> Anyway. Who here does not modify their path at login anyway.

The path has to be readily available for pkgsrc users with unprepared
environment.

However if we install the utility into /usr/sys (similar to /usr/games),
we can use a full path to the program and it will be good enough (for
me). Are there other programs that would be moved to this directory?

I have got a feeling that too many programs already rely on specific
kernel internals so making a distinction would only confuse people and
impose unclear conditions what belongs where. fsdb(8) or crash(8) are
definitely not going to be very usable with mixed kernel and userland
versions.

Something we possibly agree upon is that makesyscalls(1) would not be a
tool for administering a computer/server, so /usr/sbin or /sbin is not a good
place.





Re: makesyscalls (moving forward)

2020-06-15 Thread Johnny Billquist
Sorry for the terse top-posted reply. On my phone as I'm in a meeting at work
right now.

Anyway. Who here does not modify their path at login anyway. So arguments like
"I need it in bin because I am going to use it a lot" seem like weak
arguments. Think of what makes sense for people that don't know much. For
yourself, whatever the location, you should not have a hard time getting it
set up to work nicely for you. No matter where the binary ends up. Really.

  Johnny 


Kamil Rytarowski wrote: (15 June 2020 14:30:49 CEST)
>On 15.06.2020 14:16, Johnny Billquist wrote:
>> On 2020-06-15 14:12, Kamil Rytarowski wrote:
>>> On 15.06.2020 14:11, Johnny Billquist wrote:
>>>>
>>>> We should not clutter the directories that are in the normal users
>>>> path with things that a normal user would never care about.
>>>
>>> I never used 90% of the programs from /usr/bin /usr/sbin /bin /sbin.
>>> but I definitely would use makesyscall(1). If you have other argument
>>> that "I don't use it" please speak up.
>>
>> I'm not convinced you are particularly representative of "users".
>>
>
>NetBSD is my daily driver so I'm a user!
>
>> But it would be interesting to hear how and when you are planning to
>> use makesyscalls.
>>
>
>I work with the syscall layer almost continuously in various projects
>(debuggers, fuzzers, syscall tracers, sanitizers, non-libc language
>runtimes etc). Reiterating over the same list 10 times just increases
>the frustration and perception of lost time of repeating the same
>process in an incompatible way for another program. The tool shall
>centralize the whole knowledge about passed arguments, structs and
>export it to users through a flexible code generation.
>
>We already distribute to users /usr/include/sys/syscalls.h (and it is
>used e.g. by GDB to parse the syscalls, as parsing syscalls.master in
>that case was harder). makesyscalls(1) is intended to be a more
>specialized and generic version of the same functionality as
>distributed by this header.
>
>With some sort of fanciness, we could generate these lists on the fly
>in some projects (e.g. for GDB) and we would want the utility to be
>available in place. If it is restricted to build-only phase of various
>programs (that definitely shall be free from BSDSRCDIR dependency) it
>will be good enough.
>
>I'm for adding this program in PATH and I would be a user on a regular
>basis. I basically need it for pretty much everything (2 GSoC ongoing
>projects are about covering the same syscalls in 2 different ways).
>Asking me for a use-case is odd to me as it is an elementary program
>that belongs to /usr/bin.
>
>>   Johnny
>> 

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: makesyscalls (moving forward)

2020-06-15 Thread Kamil Rytarowski
On 15.06.2020 14:35, Reinoud Zandijk wrote:
> On Mon, Jun 15, 2020 at 02:06:00PM +0200, Kamil Rytarowski wrote:
>> On 15.06.2020 13:44, Reinoud Zandijk wrote:
>>>  No need to install it in base. The resulting files can then be committed
>>>  as `regen' just like the pcidevs variants.
>>
>> I disagree as we don't want to pull ${BSDSRCDIR} dependency for users, for
>> building an application.
> 
> Let's try to make it clear then: who are the users?
> 
> 1) Kernel syscall and compat (module) code; only when updating calls
> 
> 2) ktrace (and friends) system calls decode. That would greatly increase
> readability ! Esp. if passed arguments could be automatically dumped too.
> 

The above are good for TOOLDIR.

Below:

> 3) (llvm) fuzzers for testing; this is intree too so no big deal
> 

LLVM is an external project and only in a special case part of the
basesystem. While there, there is the same issue with GCC sanitizers. We
definitely don't want to request regular LLVM or GCC users building the
toolchain to depend on TOOLDIR / BSDSRCDIR.

> 4) some syscall bashing tools for testing etc. They are tailored anyway so
> using a $BSDSRCDIR specific program that is not installed is not that relevant.
> 

I don't know what a syscall bashing tool for testing is.

> But what else? There are IMHO no other valid users.
> 

As of today GDB, but other similar programs can/shall follow.

syscall tracers (I wrote picotrace, truss - both distributed in pkgsrc;
there is strace)

Language runtime, basically everything that is not using libc could use
it (go, rust, D, etc).

kernel fuzzers (syzkaller)

We work on rumpkernel syscall fuzzers during the ongoing GSoC.

>> This utility shall receive ATF testing and thus shall be part of $PATH.
> 
> ATF testing sounds like a good idea; but does a utility have to be installed
> to be able to test code?
> 

Yes.

That could be worked around for ATF, but generally it needs testing.

> Also the generated files need to be updated in the kernel source tree and are
> tightly coupled to the kernel code templates.
> 

Yes.

>> Putting it to /kern would be bad as we will gain another kernel ABI
>> dependency and this program won't be usable in TOOLDIR neither when working
>> with different target NetBSD release than the developer's computer.
>>
>> I personally think that the definition file shall be embedded directly into
>> the program to avoid any issues with incompatible script version vs
>> makesyscalls(1) program.
> 
> You have a point there, and embedding it would make sense, yes; but I still
> wouldn't install the program or its definition files, as it's kernel source
> version dependent, and when building tools etc. $BSDSRCDIR is obviously
> available anyway.
> 

For a developer it is fine to require BSDSRCDIR to be available, but for
users (of e.g. GDB) this is certainly overkill. makesyscalls(1) will
be maybe up to 1MB, while $BSDSRCDIR/sys alone takes around 450MB. If we
want to depend on BSDSRCDIR for programs in pkgsrc, this is IMO a blocker.

Prebuilt picotrace takes less than 100kb. Adding a hard dependency on
BSDSRCDIR would be severe overkill.

The intention of this tool is to export its functionality to regular
programs that can be built in pkgsrc and there are plenty of them.

> Reinoud
> 

I'm definitely going to be a user of this program (in default PATH,
without BSDSRCDIR) and whenever possible, I will wire pkgsrc packages to
depend on it as soon as possible. None of the pkgsrc programs will need
BSDSRCDIR.



signature.asc
Description: OpenPGP digital signature


Re: makesyscalls (moving forward)

2020-06-15 Thread Mouse
>> We should not clutter the directories that are in the normal users
>> path with things that a normal user would never care about.
> I never used 90% of the programs from /usr/bin /usr/sbin /bin /sbin.

Me neither.  calendar, indxbib, texi2dvi, newsyslog, lastcomm, sdiff,
innetgr...actually, newsyslog strikes me as a sysadmin command that
should be in /usr/sbin, not /usr/bin.

> but I definitely would use makesyscall(1).  If you have other
> argument that "I don't use it" please speak up.

None of us are "normal user"s in the sense in which it's being used
here.

But I'd also point out that the division between "sysadmin" and "normal
user" is not as clear-cut as this discussion is making it sound.  It's
a spectrum, all the way from tourist types who find changing working
directory to be a complicated and dangerous concept to people who
routinely grub around in device drivers, locore, and pmap code.

I considered suggesting something like /usr/tools, but I don't really
think that's a good idea.  There are plenty of users who have no use
for compilers, either, yet would anyone suggest moving them out of
/usr/bin?  I'd personally prefer to see the distinction between bin and
sbin (ie, between /bin and /sbin, and between /usr/bin and /usr/sbin)
go away; no matter how you draw that line, there will be people who
fall on the "wrong" side of it, people with reason to keep directories
in their path which your line says they shouldn't.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: makesyscalls (moving forward)

2020-06-15 Thread Reinoud Zandijk
On Mon, Jun 15, 2020 at 02:06:00PM +0200, Kamil Rytarowski wrote:
> On 15.06.2020 13:44, Reinoud Zandijk wrote:
> >  No need to install it in base. The resulting files can then be committed
> >  as `regen' just like the pcidevs variants.
> 
> I disagree as we don't want to pull ${BSDSRCDIR} dependency for users, for
> building an application.

Let's try to make it clear then: who are the users?

1) Kernel syscall and compat (module) code; only when updating calls

2) ktrace (and friends) system calls decode. That would greatly increase
readability ! Esp. if passed arguments could be automatically dumped too.

3) (llvm) fuzzers for testing; this is intree too so no big deal

4) some syscall bashing tools for testing etc. They are tailored anyway so
using a $BSDSRCDIR specific program that is not installed is not that relevant.

But what else? There are IMHO no other valid users.

> This utility shall receive ATF testing and thus shall be part of $PATH.

ATF testing sounds like a good idea; but does a utility have to be installed
to be able to test code?

Also the generated files need to be updated in the kernel source tree and are
tightly coupled to the kernel code templates.

> Putting it to /kern would be bad as we will gain another kernel ABI
> dependency and this program won't be usable in TOOLDIR neither when working
> with different target NetBSD release than the developer's computer.
> 
> I personally think that the definition file shall be embedded directly into
> the program to avoid any issues with incompatible script version vs
> makesyscalls(1) program.

You have a point there, and embedding it would make sense, yes; but I still
wouldn't install the program or its definition files, as it's kernel source
version dependent, and when building tools etc. $BSDSRCDIR is obviously
available anyway.

Reinoud



Re: makesyscalls (moving forward)

2020-06-15 Thread Kamil Rytarowski
On 15.06.2020 14:16, Johnny Billquist wrote:
> On 2020-06-15 14:12, Kamil Rytarowski wrote:
>> On 15.06.2020 14:11, Johnny Billquist wrote:
>>>
>>> We should not clutter the directories that are in the normal users path
>>> with things that a normal user would never care about.
>>
>> I never used 90% of the programs from /usr/bin /usr/sbin /bin /sbin. but
>> I definitely would use makesyscall(1). If you have other argument that
>> "I don't use it" please speak up.
> 
> I'm not convinced you are particularly representative of "users".
> 

NetBSD is my daily driver so I'm a user!

> But it would be interesting to hear how and when you are planning to use
> makesyscalls.
> 

I work with the syscall layer almost continuously in various projects
(debuggers, fuzzers, syscall tracers, sanitizers, non-libc language
runtimes etc). Going over the same syscall list ten times, once per
program and each time in an incompatible way, only adds frustration
and a sense of lost time. The tool shall centralize the whole knowledge
about passed arguments and structs, and export it to users through
flexible code generation.
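
As an illustration of the "describe once, generate many outputs" idea
above, here is a minimal sketch in Python. The table layout, the helper
names c_prototype and name_table, and the two outputs chosen are all
assumptions made for illustration; this is not the input format or the
output of the proposed makesyscalls(1), neither of which had been
decided at this point in the thread.

```python
# Hypothetical in-memory syscall description. The numbers and types are
# illustrative; a real tool would read them from its definition file.
SYSCALLS = [
    {"number": 3, "name": "read", "ret": "ssize_t",
     "args": [("int", "fd"), ("void *", "buf"), ("size_t", "nbyte")]},
    {"number": 4, "name": "write", "ret": "ssize_t",
     "args": [("int", "fd"), ("const void *", "buf"), ("size_t", "nbyte")]},
]


def c_prototype(sc):
    """Render one description entry as a C prototype string."""
    params = ", ".join(
        # Pointer types already end in "*", so no extra space is needed.
        ctype + ("" if ctype.endswith("*") else " ") + name
        for ctype, name in sc["args"]
    ) or "void"
    return f'{sc["ret"]} {sc["name"]}({params});'


def name_table(syscalls):
    """Render a C array mapping syscall numbers to names, as a tracer might use."""
    lines = ["static const char *const syscall_names[] = {"]
    for sc in sorted(syscalls, key=lambda s: s["number"]):
        lines.append(f'\t[{sc["number"]}] = "{sc["name"]}",')
    lines.append("};")
    return "\n".join(lines)


if __name__ == "__main__":
    for sc in SYSCALLS:
        print(c_prototype(sc))
    print(name_table(SYSCALLS))
```

Both outputs come from the same table, so a debugger, a tracer and a
fuzzer would each add a small rendering function rather than a new
hand-maintained copy of the syscall list.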

We already distribute to users /usr/include/sys/syscalls.h (and it is
used e.g. by GDB to parse the syscalls, as parsing syscalls.master in
that case was harder). makesyscalls(1) is intended to be a more
specialized and generic version of the same functionality as distributed
by this header.
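
For comparison, consuming the installed header today amounts to
scraping preprocessor lines. The sketch below assumes the header
contains lines of the form "#define SYS_<name> <number>"; the sample
text and the function name parse_syscall_numbers are made up for
illustration and are not taken from GDB or from the real header.

```python
import re

# Made-up excerpt in the assumed "#define SYS_<name> <number>" layout.
SAMPLE_HEADER = """\
/* syscall: exit */
#define SYS_exit 1

/* syscall: fork */
#define SYS_fork 2

#define SYS_MAXSYSCALL 500
"""

# Only lower-case names are accepted, so bookkeeping macros such as
# SYS_MAXSYSCALL are skipped.
DEFINE_RE = re.compile(r"^#define\s+SYS_([a-z_0-9]+)\s+(\d+)\s*$", re.MULTILINE)


def parse_syscall_numbers(text):
    """Return a {name: number} mapping scraped from header text."""
    return {name: int(num) for name, num in DEFINE_RE.findall(text)}


if __name__ == "__main__":
    print(parse_syscall_numbers(SAMPLE_HEADER))
```

This recovers names and numbers only. Argument types, pointer
directions and struct layouts are not available this way, which is the
gap a dedicated generator would be meant to close.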

With some sort of fanciness, we could generate these lists on the fly in
some projects (e.g. GDB) and we would want the utility to be
available in place. If it is restricted to build-only phase of various
programs (that definitely shall be free from BSDSRCDIR dependency) it
will be good enough.

I'm for adding this program in PATH and I would be a user on a regular
basis. I basically need it for pretty much everything (2 GSoC ongoing
projects are about covering the same syscalls in 2 different ways).
Asking me for a use-case is odd to me as it is an elementary program
that belongs to /usr/bin.

>   Johnny
> 






Re: makesyscalls (moving forward)

2020-06-15 Thread Greg Troxel
David Holland  writes:

> Meanwhile it doesn't belong in sbin because it doesn't require root,
> nor does doing something useful with it require root, and it doesn't
> need to be on /, so... usr.bin. Unless we think libexec is reasonable,
> but if 3rd-party code is going to be running it we really want it on
> the $PATH, so...

I agree with that logic, that makesyscalls is kind of like config, and
that /usr/bin makes sense.  There's nothing admin-ish about it, as
building an operating system is not about configuring the host.

We could have a directory for tools used only for building NetBSD that
are not otherwise useful, and put config and makesyscalls there, but
given that we aren't overwhelming bin in a way that causes trouble, that
doesn't seem like a good idea.


Re: makesyscalls (moving forward)

2020-06-15 Thread Johnny Billquist




On 2020-06-15 14:16, Johnny Billquist wrote:
> On 2020-06-15 14:12, Kamil Rytarowski wrote:
>> On 15.06.2020 14:11, Johnny Billquist wrote:
>>>
>>> We should not clutter the directories that are in the normal users path
>>> with things that a normal user would never care about.
>>
>> I never used 90% of the programs from /usr/bin /usr/sbin /bin /sbin. but
>> I definitely would use makesyscall(1). If you have other argument that
>> "I don't use it" please speak up.
>
> I'm not convinced you are particularly representative of "users".


Sorry, I left out "normal" in there. :-)
It should have said "normal users".

And I agree, there are things in /bin or /usr/bin that I never use 
either. No two persons are the same, and there has to be a line drawn 
somewhere.


But at the moment, that line seems to be drawn by some in a way that for
me feels totally crazy.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: makesyscalls (moving forward)

2020-06-15 Thread Johnny Billquist

On 2020-06-15 14:12, Kamil Rytarowski wrote:
> On 15.06.2020 14:11, Johnny Billquist wrote:
>>
>> We should not clutter the directories that are in the normal users path
>> with things that a normal user would never care about.
>
> I never used 90% of the programs from /usr/bin /usr/sbin /bin /sbin. but
> I definitely would use makesyscall(1). If you have other argument that
> "I don't use it" please speak up.


I'm not convinced you are particularly representative of "users".

But it would be interesting to hear how and when you are planning to use 
makesyscalls.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: makesyscalls (moving forward)

2020-06-15 Thread Johnny Billquist

On 2020-06-15 14:08, Reinoud Zandijk wrote:
> On Mon, Jun 15, 2020 at 01:44:19PM +0200, Reinoud Zandijk wrote:
> As for config(1), I never understood why it is installed in /usr/bin and is
> called with such a generic name, but i guess thats historical.


It's not that historical. And I really think it's totally wrong that it 
is in /usr/bin.


In 2.11BSD, config is actually located in the conf directory. (But it's 
also just a shellscript, so a bit simpler than in NetBSD.)


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: makesyscalls (moving forward)

2020-06-15 Thread Kamil Rytarowski
On 15.06.2020 14:11, Johnny Billquist wrote:
> 
> We should not clutter the directories that are in the normal users path
> with things that a normal user would never care about.

I have never used 90% of the programs in /usr/bin, /usr/sbin, /bin and
/sbin, but I definitely would use makesyscalls(1). If you have an
argument other than "I don't use it", please speak up.





Re: makesyscalls (moving forward)

2020-06-15 Thread Johnny Billquist

On 2020-06-15 12:25, Kamil Rytarowski wrote:
> On 15.06.2020 00:57, Johnny Billquist wrote:
>> On 2020-06-15 00:52, Kamil Rytarowski wrote:
>>> On 15.06.2020 00:26, Johnny Billquist wrote:
>>>> But that's just me. I'll leave the deciding to you guys...
>>>
>>> This is only me, but /sbin and /usr/sbin are for users with root
>>> privileges, while /bin and /usr/bin for everybody. makesyscalls(1)
>>> intends to be an end-user program that aids building software and this
>>> is just another specialized program similar to flex(1) or yacc(1), just
>>> a more domain specific code generator.
>>
>> Is ping only for people with root privileges???
>
> ping needs setuid so yes.


What kind of silly argument is that?
First of all, I don't understand at all how people can think that stuff
under /sbin is for root only. Second, setuid exists precisely so that
non-root people can run some things that require root privileges.


It seems we are starting to decide where programs should go based on
whether you need to be root to run them or not, which for me is a
totally crazy idea.


Should passwd then move to /sbin?

Just look around, there are plenty of programs under /bin and /usr/bin 
which are setuid root. They should *not* move into /sbin because of that.


/sbin and /usr/sbin hold tools that are generally not used by anyone
other than system administrators. Tools that are often needed to bring
up a system at boot time. Tools that commonly are not in normal users'
path.


We should not clutter the directories that are in the normal users path 
with things that a normal user would never care about.


If we basically place any kind of tool in /bin or /sbin just because it
doesn't require people to be root, then you will have all kinds of stuff
there that is totally meaningless for most users, while a lot of
meaningful things will be in /sbin and /usr/sbin, at which point
everyone will need to have all directories in their paths, at which
point there is no point in even having a separate /bin and /sbin.


There has to be a reason to split binaries into different directories. A
reason that has some practical meaning. Or else it's not actually
useful, and instead becomes just some obscure inner circle "thing" of
obscurity.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: makesyscalls (moving forward)

2020-06-15 Thread Reinoud Zandijk
Small addendum,

On Mon, Jun 15, 2020 at 01:44:19PM +0200, Reinoud Zandijk wrote:
> What about not installing it at all? It's only going to be used during
> definition updates or fixes. Compare it to the pcidevs.h and pcidevs_data.h
> creation only this time it creates the relevant kernel/rump/fuzzer files. The
> program can optionally be compiled and linked in /tmp and then called from
> there to create all the variants using the templates or just be created in
> place and cleaned up later.  No need to install it in base. The resulting
> files can then be committed as `regen' just like the pcidevs variants.

LLVM code and its fuzzing tools should be in tree anyway so it can be created
there on the fly too if requested.

Or is the external pkgsrc'd LLVM also to use this? Who can guarantee then
that (1) the syscall file that is installed, (2) the installed generator or
(3) its templates are even compatible or up to date? It would be a blind
guess.
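
One conceivable way to turn that blind guess into a detectable error
is a format stamp that the generator checks before using a definition
file. The sketch below is only an illustration in Python: the
"#format N" first line, the constant GENERATOR_FORMAT_VERSION and the
function check_compatible are hypothetical and were not proposed
anywhere in this thread.

```python
# Hypothetical: the format version this build of the generator understands.
GENERATOR_FORMAT_VERSION = 2


def check_compatible(def_text, supported=GENERATOR_FORMAT_VERSION):
    """Return the stamped version, or raise ValueError if it is missing or differs."""
    lines = def_text.splitlines()
    first = lines[0] if lines else ""
    prefix = "#format "
    if not first.startswith(prefix):
        raise ValueError("definition file has no format stamp")
    version = int(first[len(prefix):])
    if version != supported:
        raise ValueError(
            f"definition file is format {version}, generator expects {supported}"
        )
    return version


if __name__ == "__main__":
    print(check_compatible("#format 2\n3 read\n4 write\n"))
```

A stamp like this only catches mismatches between the file and the
generator. It says nothing about whether either matches the running
kernel, which is the other half of the concern raised above.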

> I wouldn't install the file in the FS; kernel and userland are often out of
> sync, maybe even versions apart with say NetBSD-8 userland but NetBSD-current
> kernel. If anywhere, request the data from the kernel by exposing it in /kern.
> Exposing it one way or another might be an attack vector too ...

If one really wants to install the program, it could be stored somewhere in
/usr/libexec or the like, with the definition file exposed in /kern; maybe
that would help. The file could be stored compressed internally or be
installed in /stand/$arch/$version with a symlink in /kern.

As for config(1), I never understood why it is installed in /usr/bin and is
called with such a generic name, but I guess that's historical.

Reinoud



Re: makesyscalls (moving forward)

2020-06-15 Thread Kamil Rytarowski
On 15.06.2020 13:44, Reinoud Zandijk wrote:
>  No need to install it in base. The resulting
> files can then be committed as `regen' just like the pcidevs variants.

I disagree as we don't want to pull ${BSDSRCDIR} dependency for users,
for building an application.

This utility shall receive ATF testing and thus shall be part of $PATH.

On 15.06.2020 13:44, Reinoud Zandijk wrote:
> I wouldn't install the file in the FS; kernel and userland are often out of
> sync, maybe even versions apart with say NetBSD-8 userland but NetBSD-current
> kernel. If anywhere, request the data from the kernel by exposing it in /kern.
> Exposing it one way or another might be an attack vector too ...


Putting it in /kern would be bad as we will gain another kernel ABI
dependency and this program won't be usable in TOOLDIR either when
working with a different target NetBSD release than the developer's computer.

I personally think that the definition file shall be embedded directly
into the program to avoid any issues with incompatible script version vs
makesyscalls(1) program.
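
A minimal sketch of what "embedded directly into the program" could
look like, written in Python for brevity. The two-column "number name"
text format, the constant EMBEDDED_DEFINITIONS and the function
load_definitions are assumptions made for illustration; the real
definition format had not been decided.

```python
# Hypothetical definition text compiled into the tool itself, so the
# parser and the data it parses always ship as one unit.
EMBEDDED_DEFINITIONS = """\
# number name
1 exit
3 read
4 write
"""


def load_definitions(text=EMBEDDED_DEFINITIONS):
    """Parse the built-in definitions into a {number: name} table."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        number, name = line.split()
        table[int(number)] = name
    return table


if __name__ == "__main__":
    print(load_definitions())
```

Because the default argument is the built-in text, there is no
external file to go missing or drift out of step with the parser. The
cost is that the data can only change when the program is rebuilt.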





Re: makesyscalls (moving forward)

2020-06-15 Thread Reinoud Zandijk
On Sun, Jun 14, 2020 at 09:07:45PM +, David Holland wrote:
> As I mentioned a few days ago the reason I was prodding
> makesyscalls.sh is that I've been looking at generating more of the
> system call argument handling code.
...
> This raises two points that need to be bikeshedded:
> 
> (1) What's the new tool called, and where does it live in the tree?
> "usr.bin/makesyscalls" is fine with me but ymmv.

What about not installing it at all? It's only going to be used during
definition updates or fixes. Compare it to the pcidevs.h and pcidevs_data.h
creation only this time it creates the relevant kernel/rump/fuzzer files. The
program can optionally be compiled and linked in /tmp and then called from
there to create all the variants using the templates or just be created in
place and cleaned up later.  No need to install it in base. The resulting
files can then be committed as `regen' just like the pcidevs variants.

> (2) What is the installed syscall description file called and where
> does it go? It is likely to be derived from (i.e., not the same as)
> syscalls.master. It isn't a C header file so it doesn't belong in
> /usr/include. It isn't necessarily particularly human-readable. My
> suggestion would be to add a directory in /usr/share for API
> descriptions and put it there, something like maybe
> /usr/share/api/syscalls.def.

I wouldn't install the file in the FS; kernel and userland are often out of
sync, maybe even versions apart with say NetBSD-8 userland but NetBSD-current
kernel. If anywhere, request the data from the kernel by exposing it in /kern.
Exposing it one way or another might be an attack vector too ...

With regards,
Reinoud



Re: makesyscalls (moving forward)

2020-06-15 Thread Kamil Rytarowski
On 15.06.2020 00:57, Johnny Billquist wrote:
> On 2020-06-15 00:52, Kamil Rytarowski wrote:
>> On 15.06.2020 00:26, Johnny Billquist wrote:
>>> But that's just me. I'll leave the deciding to you guys...
>>>
>>
>> This is only me, but /sbin and /usr/sbin are for users with root
>> privileges, while /bin and /usr/bin for everybody. makesyscalls(1)
>> intends to be an end-user program that aids building software and this
>> is just another specialized program similar to flex(1) or yacc(1), just
>> a more domain specific code generator.
> 
> Is ping only for people with root privileges???
> 

ping needs setuid so yes.

>   Johnny
> 






Re: makesyscalls (moving forward)

2020-06-15 Thread Martin Husemann
On Sun, Jun 14, 2020 at 09:07:45PM +, David Holland wrote:
> It seems to me that all of the following is mechanical and should be
> automatically generated, beyond what makesyscalls already does:
>- all the code that calls copyin/copyout

It is probably too early and I have had too little coffee - but could
you point me at an example line of code that does copyin/copyout for
syscall args that you think should be replaced with automatically
generated code? How much of that generated code would not come from a
verbatim C block in the syscall description file?

>- compat32 translation for all syscalls and all ioctls

Tricky, but maybe doable. Not sure it will work for all.

>- compat_otheros translation as well

I have no idea how that would work (or what exactly you mean).

Rump and anything that needs to serialize/deserialize syscalls
are different beasts, and they could benefit from a common syscall "protocol" 
definition, and maybe in the end it could turn out that we do want to make
that description the master source of our own syscall definitions.
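
To make "serialize/deserialize syscalls" concrete, here is a toy
sketch in Python that packs a syscall number and integer arguments
into a byte stream and unpacks them again. The wire layout (two
little-endian 32-bit header fields followed by 64-bit arguments) and
the names marshal_call and unmarshal_call are invented for
illustration; this is not the format used by rump or by any existing
tool.

```python
import struct


def marshal_call(number, args):
    """Pack a syscall number and integer arguments into bytes.

    Layout (assumed): uint32 number, uint32 argument count, then one
    signed 64-bit value per argument, all little-endian.
    """
    return struct.pack(f"<II{len(args)}q", number, len(args), *args)


def unmarshal_call(data):
    """Inverse of marshal_call: return (number, [args])."""
    number, nargs = struct.unpack_from("<II", data, 0)
    args = struct.unpack_from(f"<{nargs}q", data, 8)
    return number, list(args)


if __name__ == "__main__":
    wire = marshal_call(4, [1, 4096, 12])
    print(len(wire), unmarshal_call(wire))
```

Pointer arguments are the hard part this sketch leaves out: the bytes
they point to would have to travel with the call, and knowing how many
bytes and in which direction is exactly the information a shared
syscall description would need to carry.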

I have no idea how sanitizers fit in here.

Maybe start with the basics and explain things from the ground up before
diverging into the hard issues (like the name and install location).

Martin


Re: makesyscalls (moving forward)

2020-06-14 Thread David Holland
On Sun, Jun 14, 2020 at 02:21:07PM -0700, Paul Goyette wrote:
 > > (2) What is the installed syscall description file called and where
 > > does it go? It is likely to be derived from (i.e., not the same as)
 > > syscalls.master. It isn't a C header file so it doesn't belong in
 > > /usr/include. It isn't necessarily particularly human-readable. My
 > > suggestion would be to add a directory in /usr/share for API
 > > descriptions and put it there, something like maybe
 > > /usr/share/api/syscalls.def.
 > 
 > Perhaps /usr/share/sys/syscalls.def ?
 > 
 > I'd suspect we might find more .../sys/... stuff (compared to the
 > amount of .../api... stuff) in the future which could minimize the
 > loneliness of syscalls.def  :)  Or even maybe /usr/share/kern/...
 > might work.

I was thinking that other machine-readable API definitions might
appear in the future, e.g. for parts of libc and for other base
libraries.

Wherever we put it now, it's going to be lonely to begin with :-/

-- 
David A. Holland
dholl...@netbsd.org


Re: makesyscalls (moving forward)

2020-06-14 Thread David Holland
On Sun, Jun 14, 2020 at 11:59:54PM +0200, Johnny Billquist wrote:
 > > "usr.bin/makesyscalls" sounds good to me.
 > 
 > Uh? usr.bin is where stuff for /usr/bin is located, right? Anything there
 > should be pretty normal tools that any user might be interested in. Don't
 > seem to me as makesyscalls would be a tool like that?
 > 
 > Possibly some sbin thing, but in all honestly, wouldn't this make more
 > sense to have somewhere under sys? Don't we have some other tools and bits
 > which are specific for kernel and library building?

So this was, in fact, a discussion I was intending to provoke, because
right now we have no place to put tools that are part of the system
build but don't make any sense to install in /usr, and that seems like
a gap.

However, Kamil convinced me that there will be external users of this
thing. As I mentioned in the original mail, apparently the llvm
sanitizers already have a netbsd-specific script that tries to read
the existing syscalls.master, and any future sanitizer-like thing will
need something similar if we don't provide a facility. Having
3rd-party sources like this groping in our tree trying to read an
internal file whose format and semantics we don't support is not the
right way.

Meanwhile it doesn't belong in sbin because it doesn't require root,
nor does doing something useful with it require root, and it doesn't
need to be on /, so... usr.bin. Unless we think libexec is reasonable,
but if 3rd-party code is going to be running it we really want it on
the $PATH, so...

-- 
David A. Holland
dholl...@netbsd.org


Re: makesyscalls (moving forward)

2020-06-14 Thread Johnny Billquist

On 2020-06-15 00:07, Kamil Rytarowski wrote:
> On 14.06.2020 23:59, Johnny Billquist wrote:
>> On 2020-06-14 23:21, Paul Goyette wrote:
>>> On Sun, 14 Jun 2020, David Holland wrote:
>>>
>>>> This raises two points that need to be bikeshedded:
>>>>
>>>> (1) What's the new tool called, and where does it live in the tree?
>>>> "usr.bin/makesyscalls" is fine with me but ymmv.
>>>
>>> "usr.bin/makesyscalls" sounds good to me.
>>
>> Uh? usr.bin is where stuff for /usr/bin is located, right? Anything
>> there should be pretty normal tools that any user might be interested
>> in. Don't seem to me as makesyscalls would be a tool like that?
>>
>> Possibly some sbin thing, but in all honestly, wouldn't this make more
>> sense to have somewhere under sys? Don't we have some other tools and
>> bits which are specific for kernel and library building?
>
> /usr/bin is appropriate and there are already similar tools (like
> ioctlprint(1)). It's already in PATH and definitely in interest of some
> end-users (like me) and I do want to have it.


It could certainly be questioned if ioctlprint should be in /usr/bin as 
well.


If we think tools like ping, which arguably a lot of people have heard 
about, and actually use, are in /sbin, what makes ioctlprint such a more 
commonly looked for, and used tool? I would say this is really a tool 
for pretty advanced users. But even so, I could more easily see 
ioctlprint in /usr/bin than I could makesyscalls. Anyone writing code 
that sits in the kernel is definitely in the area of users who have 
privileges and abilities that don't apply to normal users.


And the question was not about having it or not, but where it should be 
located.


Looking at hier(7), we have:

     /sbin/     System programs and administration utilities used in both
                single-user and multi-user environments.

     /usr/sbin/ System daemons and system utilities (normally
                executed by the super-user).

and

     /bin/      Utilities used in both single and multi-user environments.

     /usr/bin/  Common utilities, programming tools, and
                applications.


How "common" would you say makesyscalls is (or ioctlprint for that matter)?

I don't mind the name, but I also agree that this is mostly something
for build.sh, and I wonder whether it wouldn't fit more appropriately
somewhere under sys.

Definitely not something I would expect a normal user to ever make use of.

But that's just me. I'll leave the deciding to you guys...

  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: makesyscalls (moving forward)

2020-06-14 Thread Johnny Billquist

On 2020-06-15 00:50, Robert Swindells wrote:
> Johnny Billquist  wrote:
>> On 2020-06-14 23:21, Paul Goyette wrote:
>>> On Sun, 14 Jun 2020, David Holland wrote:
>>>
>>>> This raises two points that need to be bikeshedded:
>>>>
>>>> (1) What's the new tool called, and where does it live in the tree?
>>>> "usr.bin/makesyscalls" is fine with me but ymmv.
>>>
>>> "usr.bin/makesyscalls" sounds good to me.
>>
>> Uh? usr.bin is where stuff for /usr/bin is located, right? Anything
>> there should be pretty normal tools that any user might be interested
>> in. Don't seem to me as makesyscalls would be a tool like that?
>
> As config(1) is in /usr/bin that seems the best place for makesyscalls
> too.


Ouch! What a rabbit hole! I should be quiet now. :-)

(I don't really think config makes any sense at all to have in 
/usr/bin... ;-) )



> I would expect that the generated files would have the developer uid and
> gid, I wouldn't want them owned by root.


I guess I fail to see the problem there. That all depends on who is
running it, and where the files are placed. It doesn't really matter
where the tool is located.


  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: makesyscalls (moving forward)

2020-06-14 Thread Mouse
> This is only me, but /sbin and /usr/sbin are for users with root
> privileges, while /bin and /usr/bin for everybody.

Maybe mostly.  But there are exceptions enough that I think keeping
this with the likes of config(1) and genassym(1) makes sense,
especially in view of cross-builds.

Personally, the biggest exception in practice is probably ping(8),
which I routinely use as a non-root user.  Others of use to me even as
non-root include iostat(8), mount(8), and ntpdc(8).  In the other
direction, chflags(1) and lastcomm(1) strike me as among the most
likely to belong in /usr/sbin instead.

And there is enough history - config(1), genassym(1) - behind build
tools being in /usr/bin that I think this one belongs there too.  Given
its typical, and apparent design, use, revoke(1) too.

Looking at the lists of commands, I don't think it's so much "root" and
"non-root" as it is "sysadmin" and "non-sysadmin" - which is also part
of why it's significantly harder to find exceptions in /usr/bin than in
/usr/sbin, since "root commands" is closer to being a subset of
"sysadmin commands" than the other way around.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: makesyscalls (moving forward)

2020-06-14 Thread Johnny Billquist

On 2020-06-15 00:52, Kamil Rytarowski wrote:
> On 15.06.2020 00:26, Johnny Billquist wrote:
>> But that's just me. I'll leave the deciding to you guys...
>
> This is only me, but /sbin and /usr/sbin are for users with root
> privileges, while /bin and /usr/bin for everybody. makesyscalls(1)
> intends to be an end-user program that aids building software and this
> is just another specialized program similar to flex(1) or yacc(1), just
> a more domain specific code generator.


Is ping only for people with root privileges???

  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: makesyscalls (moving forward)

2020-06-14 Thread Kamil Rytarowski
On 15.06.2020 00:26, Johnny Billquist wrote:
> But that's just me. I'll leave the deciding to you guys...
> 

This is only me, but /sbin and /usr/sbin are for users with root
privileges, while /bin and /usr/bin for everybody. makesyscalls(1)
intends to be an end-user program that aids building software and this
is just another specialized program similar to flex(1) or yacc(1), just
a more domain specific code generator.

I don't see any reason to restrict makesyscalls(1) to root-only.
/usr/bin is a settled path for native programs and changing it is not
worth it (and I personally see no reason). If I want to plug
makesyscalls(1) into LLVM or GDB or some fuzzers, it would certainly be
cumbersome to pass the full path to some /sys/bin or similar.





Re: makesyscalls (moving forward)

2020-06-14 Thread Robert Swindells


Johnny Billquist  wrote:
>On 2020-06-14 23:21, Paul Goyette wrote:
>> On Sun, 14 Jun 2020, David Holland wrote:
>> 
>> 
>> 
>>> This raises two points that need to be bikeshedded:
>>>
>>> (1) What's the new tool called, and where does it live in the tree?
>>> "usr.bin/makesyscalls" is fine with me but ymmv.
>> 
>> "usr.bin/makesyscalls" sounds good to me.
>
>Uh? usr.bin is where stuff for /usr/bin is located, right? Anything 
>there should be pretty normal tools that any user might be interested 
>in. Don't seem to me as makesyscalls would be a tool like that?

As config(1) is in /usr/bin that seems the best place for makesyscalls
too.

I would expect that the generated files would have the developer uid and
gid, I wouldn't want them owned by root.


Re: makesyscalls (moving forward)

2020-06-14 Thread Kamil Rytarowski
On 14.06.2020 23:59, Johnny Billquist wrote:
> On 2020-06-14 23:21, Paul Goyette wrote:
>> On Sun, 14 Jun 2020, David Holland wrote:
>>
>> 
>>
>>> This raises two points that need to be bikeshedded:
>>>
>>> (1) What's the new tool called, and where does it live in the tree?
>>> "usr.bin/makesyscalls" is fine with me but ymmv.
>>
>> "usr.bin/makesyscalls" sounds good to me.
> 
> Uh? usr.bin is where stuff for /usr/bin is located, right? Anything
> there should be pretty normal tools that any user might be interested
> in. Don't seem to me as makesyscalls would be a tool like that?
> 
> Possibly some sbin thing, but in all honestly, wouldn't this make more
> sense to have somewhere under sys? Don't we have some other tools and
> bits which are specific for kernel and library building?
> 

/usr/bin is appropriate and there are already similar tools (like
ioctlprint(1)). It's already in PATH and definitely of interest to some
end users (like me), and I do want to have it.

makesyscalls(1) sounds like a good name.

/usr/share/sys/syscalls.def should be an internal detail of
makesyscalls(1). I actually think that syscalls.def should be built
into the program and we should avoid an external file dependency, as it
is expected to be usable with only one kernel ABI release +
makesyscalls(1) version. No external consumers of this .def file are
expected, and all we need and want is to pass rules for how to generate
syscall definitions.

makesyscalls(1) will likely quickly turn into a ./build.sh tool, so
reducing the management of an external file is an especially good idea.

There is already prior art: ioctlprint(1) has a built-in database for
the ioctl codes and it works well.

>   Johnny
> 






Re: makesyscalls (moving forward)

2020-06-14 Thread Johnny Billquist

On 2020-06-14 23:21, Paul Goyette wrote:
> On Sun, 14 Jun 2020, David Holland wrote:
>
>> This raises two points that need to be bikeshedded:
>>
>> (1) What's the new tool called, and where does it live in the tree?
>> "usr.bin/makesyscalls" is fine with me but ymmv.
>
> "usr.bin/makesyscalls" sounds good to me.

Uh? usr.bin is where stuff for /usr/bin is located, right? Anything
there should be pretty normal tools that any user might be interested
in. Don't seem to me as makesyscalls would be a tool like that?

Possibly some sbin thing, but in all honesty, wouldn't this make more
sense to have somewhere under sys? Don't we have some other tools and
bits which are specific for kernel and library building?

  Johnny

--
Johnny Billquist  || "I'm on a bus
  ||  on a psychedelic trip
email: b...@softjar.se ||  Reading murder books
pdp is alive! ||  tryin' to stay hip" - B. Idol


Re: makesyscalls (moving forward)

2020-06-14 Thread Paul Goyette

On Sun, 14 Jun 2020, David Holland wrote:

> This raises two points that need to be bikeshedded:
>
> (1) What's the new tool called, and where does it live in the tree?
> "usr.bin/makesyscalls" is fine with me but ymmv.

"usr.bin/makesyscalls" sounds good to me.

> (2) What is the installed syscall description file called and where
> does it go? It is likely to be derived from (i.e., not the same as)
> syscalls.master. It isn't a C header file so it doesn't belong in
> /usr/include. It isn't necessarily particularly human-readable. My
> suggestion would be to add a directory in /usr/share for API
> descriptions and put it there, something like maybe
> /usr/share/api/syscalls.def.

Perhaps /usr/share/sys/syscalls.def ?

I'd suspect we might find more .../sys/... stuff (compared to the
amount of .../api... stuff) in the future which could minimize the
loneliness of syscalls.def  :)  Or even maybe /usr/share/kern/...
might work.


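+--------------------+--------------------------+-----------------------+
| Paul Goyette       | PGP Key fingerprint:     | E-mail addresses:     |
| (Retired)          | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com     |
| Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org   |
+--------------------+--------------------------+-----------------------+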


makesyscalls (moving forward)

2020-06-14 Thread David Holland
As I mentioned a few days ago the reason I was prodding
makesyscalls.sh is that I've been looking at generating more of the
system call argument handling code.

It seems to me that all of the following is mechanical and should be
automatically generated, beyond what makesyscalls already does:
   - all the code that calls copyin/copyout
   - compat32 translation for all syscalls and all ioctls
   - compat_otheros translation as well

The first of these has been routine in research OSes for 25+ years and
offers no particular difficulties other than it'll take a fair amount
of coding. The second and third follow directly from a good solution
to the first, modulo some semantic concerns about schema translation
that I believe aren't relevant for the things we need to do. (As in,
anything semantically complicated is going to end up being written by
hand anyway and the translation itself is about different
representations of the same data.)
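
To make "mechanical" concrete, here is a minimal sketch of generating the copyin/copyout boilerplate for one system call from a small description. Everything about the description is an assumption made for illustration: the `PtrArg` tuple, its `direction` field, the `gen_stub` helper, and the `do_<name>()` convention are hypothetical, and the emitted C is only meant to show the shape of the output, not to compile as-is. Only copyin(), copyout() and SCARG() are existing kernel interfaces.

```python
from collections import namedtuple

# Hypothetical description of one pointer argument of a system call:
# its C type, its name, and the direction the data flows
# ("in" = user to kernel, "out" = kernel to user, "inout" = both).
PtrArg = namedtuple("PtrArg", "ctype name direction")


def gen_stub(name, args):
    """Return C-like text for a wrapper that does the copyin/copyout
    around a hand-written do_<name>() that only sees kernel copies."""
    out = [
        "int",
        "sys_%s(struct lwp *l, const struct sys_%s_args *uap, register_t *retval)"
        % (name, name),
        "{",
    ]
    for a in args:
        out.append("\t%s k_%s;" % (a.ctype, a.name))
    out += ["\tint error;", ""]
    # User-to-kernel transfers are emitted before the real work.
    for a in args:
        if a.direction in ("in", "inout"):
            out.append("\terror = copyin(SCARG(uap, %s), &k_%s, sizeof(k_%s));"
                       % (a.name, a.name, a.name))
            out += ["\tif (error)", "\t\treturn error;"]
    call = ", ".join(["l"] + ["&k_" + a.name for a in args] + ["retval"])
    out.append("\terror = do_%s(%s);" % (name, call))
    out += ["\tif (error)", "\t\treturn error;"]
    # Kernel-to-user transfers are emitted after the success check.
    for a in args:
        if a.direction in ("out", "inout"):
            out.append("\terror = copyout(&k_%s, SCARG(uap, %s), sizeof(k_%s));"
                       % (a.name, a.name, a.name))
            out += ["\tif (error)", "\t\treturn error;"]
    out += ["\treturn 0;", "}"]
    return "\n".join(out)


if __name__ == "__main__":
    print(gen_stub("example_stat", [PtrArg("struct stat", "ub", "out")]))
```

Because the hand-written do_<name>() in this sketch only ever receives pointers to kernel copies, the user/kernel pointer distinction lives entirely in generated code.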

Past experience (this will not be the first or I think even the second
syscall marshalling generator I've had the pleasure of writing) says
that doing this well requires real compiler tooling and
infrastructure. The copyin/copyout code can be, but should not be,
generated with just an awk script. Going beyond that requires a real
tool with sources that get compiled.

Meanwhile looking at the large amount of cutpaste in the current
makesyscalls.sh, the number of similar output files already generated,
and talking with Kamil about what's needed by sanitizers, makes me
think that at least part of the generation should be driven by output
templates. That is, you run some tool and you feed it an output
template that describes one set of things and you get the in-kernel
copyin/copyout code. Then, you feed it some other template and out
comes the rump version, a third template and you get sanitizer
wrappers, etc. etc. I will need to think some about how to do this
effectively (it is not as straightforward as just spewing code out)
but I don't think it is going to be particularly difficult.
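
A toy version of the template idea, for illustration only: one parsed description, several output templates, and the same driver for all of them. The table contents, the template names, and the output shapes are assumptions chosen to resemble the kind of files generated today; a real tool would need templates with loops and conditionals, which Python's `string.Template` (used here) does not provide.

```python
from string import Template

# Hypothetical parsed description shared by every output.
SYSCALLS = [
    {"num": 3, "name": "read", "ret": "ssize_t",
     "args": "int fd, void *buf, size_t nbyte"},
    {"num": 4, "name": "write", "ret": "ssize_t",
     "args": "int fd, const void *buf, size_t nbyte"},
]

# One template per kind of output; each is applied once per syscall.
TEMPLATES = {
    "numbers": Template("#define\tSYS_$name\t$num"),
    "prototypes": Template("$ret sys_$name($args);"),
    "names": Template('\t/* $num */\t"$name",'),
}


def render(kind):
    """Apply the template called `kind` to every syscall description."""
    tmpl = TEMPLATES[kind]
    return "\n".join(tmpl.substitute(sc) for sc in SYSCALLS)


if __name__ == "__main__":
    for kind in TEMPLATES:
        print(render(kind))
        print()
```

Adding a new consumer (a sanitizer wrapper, say) then means adding a template rather than another copy of the driver loop.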

However, it means that the tool needs to be installed in the system
and so does the master description file for at least NetBSD's own
system calls. (Currently, it seems that the llvm sanitizers build
reads syscalls.master itself from a source tree, so it needs a system
source tree to build, which is bad.)

This raises two points that need to be bikeshedded:

(1) What's the new tool called, and where does it live in the tree?
"usr.bin/makesyscalls" is fine with me but ymmv.

(2) What is the installed syscall description file called and where
does it go? It is likely to be derived from (i.e., not the same as)
syscalls.master. It isn't a C header file so it doesn't belong in
/usr/include. It isn't necessarily particularly human-readable. My
suggestion would be to add a directory in /usr/share for API
descriptions and put it there, something like maybe
/usr/share/api/syscalls.def.


Note that the eventual tool that gets committed will be written in C,
but the development version will probably be written in something with
stronger types. For this reason I expect to not be committing
development versions until things are pretty much done.

-- 
David A. Holland
dholl...@netbsd.org


Re: makesyscalls

2020-06-14 Thread David Holland
On Fri, Jun 12, 2020 at 04:40:28PM +0200, Reinoud Zandijk wrote:
 > > Yes, it can be rewritten in C as a subsequent step. *After* quite a
 > > bit of tidying. And no, I'm not doing that now. Among other problems,
 > > compiling it requires bikeshedding where to put it in the tree. Feel
 > > free to sort that out.
 > 
 > I'd say 'C'. If the specification is read in, sanity checking can
 > be added on the read-in data structure too, which is hard to do in awk
 > and `friends'. Things like missing compat definitions, missing
 > syscall numbers etc can be printed out to be noted as non-existent
 > etc. I don't know about structure versioning and system calls as
 > for compat but generating them seems the right way to go.

C is not a great language for writing code generators, but it's what
we've got. There's nothing better that we want in base or that I'd be
willing to suggest accepting an external dependence for.

Anyway, the conclusion seems to me that I'm not going to check in the
Python version of the script.

-- 
David A. Holland
dholl...@netbsd.org


Re: makesyscalls

2020-06-12 Thread Reinoud Zandijk
On Wed, Jun 10, 2020 at 08:58:57AM +, David Holland wrote:
> On Wed, Jun 10, 2020 at 01:25:03AM -0400, Thor Lancelot Simon wrote:
>  > Could you translate your prototype into a
>  > different language, one that's less basically hostile to our build system
>  > and its goals and benefits?
> 
> Like which one? You removed the part of the post that explained that
> there aren't other reasonable choices.
> 
> Yes, it can be rewritten in C as a subsequent step. *After* quite a
> bit of tidying. And no, I'm not doing that now. Among other problems,
> compiling it requires bikeshedding where to put it in the tree. Feel
> free to sort that out.

I'd say 'C'. If the specification is read in, sanity checking can be added on
the read-in data structure too, which is hard to do in awk and `friends'. Things
like missing compat definitions, missing syscall numbers etc. can be printed
out to be noted as non-existent etc. I don't know about structure versioning
and system calls as for compat, but generating them seems the right way to go.

I'm not into the mechanics of the syscall code and compat code generation, but
implementing it in 'C' seems like the most logical choice.
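
The kind of check meant here is easy once the table is an in-memory data structure. A rough sketch follows, written in Python only for brevity (the same logic carries over to C). The `(number, name)` entry format and the `check_table` helper are hypothetical, not the format of syscalls.master.

```python
def check_table(entries):
    """Return a list of problems found in a list of (number, name) pairs.

    Reports numbers that are assigned twice, then numbers that are
    missing from the otherwise contiguous range.
    """
    problems = []
    seen = {}
    for num, name in entries:
        if num in seen:
            problems.append("number %d used by both %s and %s"
                            % (num, seen[num], name))
        else:
            seen[num] = name
    if seen:
        for num in range(min(seen), max(seen) + 1):
            if num not in seen:
                problems.append("number %d is missing" % num)
    return problems


if __name__ == "__main__":
    table = [(0, "syscall"), (1, "exit"), (3, "read"), (3, "write")]
    for problem in check_table(table):
        print(problem)
```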

Reinoud



Re: makesyscalls

2020-06-11 Thread J. Lewis Muir
On 06/10, David Holland wrote:
> On Wed, Jun 10, 2020 at 01:25:03AM -0400, Thor Lancelot Simon wrote:
>  > Could you translate your prototype into a
>  > different language, one that's less basically hostile to our build system
>  > and its goals and benefits?
> 
> Like which one?

How about mruby?

  https://mruby.org/

It's small and lightweight.  IMO, the Ruby language is quite nice.  I
believe mruby can be cross-compiled since in

  https://github.com/mruby/mruby/blob/master/doc/guides/compile.md

it says:

  mruby uses Rake to compile and cross-compile all libraries and
  binaries.

It looks like it supports GCC and Clang.

Unfortunately, one prerequisite it lists is a Ruby 2.0 or later
implementation.  That obviously makes compiling harder. :-(  Given that
it seems to be targeting being linked and embedded in an application, I
guess I can see why they don't mind this dependency.  For the idea of
possibly including it in base, though, it's unfortunate.

Anyway, I'm talking about mruby just because it's a small and
lightweight implementation of the Ruby language that might have a
better chance of getting included in base if that were ever considered
desirable.  Since you're just looking for a reasonable language, though,
I still think Ruby would be an excellent choice, and the de facto
standard CRuby (a.k.a. Matz's Ruby Interpreter or Ruby MRI)

  https://www.ruby-lang.org/

would be a nice implementation.  I believe it can be cross-compiled.
And it's obviously available in pkgsrc.

> You removed the part of the post that explained that there aren't other
> reasonable choices.
> 
> Yes, it can be rewritten in C as a subsequent step. *After* quite a
> bit of tidying. And no, I'm not doing that now. Among other problems,
> compiling it requires bikeshedding where to put it in the tree. Feel
> free to sort that out.
> 
> As for lua: it has the same headline issues as awk, namely it doesn't
> enforce function arguments and doesn't require that variables are
> declared before use. Again, I see no reason to think it'll be any more
> maintainable.

I don't know anything about Lua, and I might be completely
misunderstanding what you're saying, but I think Ruby satisfies both of
those as follows (mruby and CRuby output shown):

* It enforces function arguments:

  
  $ cat func_args.rb
  def add(x, y)
    x + y
  end

  puts(add(5))
  $ mruby func_args.rb
  trace (most recent call last):
  func_args.rb:5: 'add': wrong number of arguments (1 for 2) (ArgumentError)
  $ ruby func_args.rb
  Traceback (most recent call last):
          1: from func_args.rb:5:in `<main>'
  func_args.rb:1:in `add': wrong number of arguments (given 1, expected 2) (ArgumentError)
  

* It requires variables to be assigned before use:

  
  $ cat var_use.rb
  puts(foo)
  $ mruby var_use.rb
  trace (most recent call last):
  var_use.rb:1: undefined method 'foo' (NoMethodError)
  $ ruby var_use.rb
  Traceback (most recent call last):
  var_use.rb:1:in `<main>': undefined local variable or method `foo' for main:Object (NameError)
  

Lewis


Re: makesyscalls

2020-06-10 Thread David Holland
On Wed, Jun 10, 2020 at 01:25:03AM -0400, Thor Lancelot Simon wrote:
 > Could you translate your prototype into a
 > different language, one that's less basically hostile to our build system
 > and its goals and benefits?

Like which one? You removed the part of the post that explained that
there aren't other reasonable choices.

Yes, it can be rewritten in C as a subsequent step. *After* quite a
bit of tidying. And no, I'm not doing that now. Among other problems,
compiling it requires bikeshedding where to put it in the tree. Feel
free to sort that out.

As for lua: it has the same headline issues as awk, namely it doesn't
enforce function arguments and doesn't require that variables are
declared before use. Again, I see no reason to think it'll be any more
maintainable.

-- 
David A. Holland
dholl...@netbsd.org


Re: makesyscalls

2020-06-09 Thread Thor Lancelot Simon
Python is essentially uncrosscompilable and its maintainers have repeatedly
rudely rejected efforts to make it so.

If that weren't the case, and the way installed Python modules were "managed"
(I use the term liberally) were made sane, I'd think it were a fine thing to
use in base.  But it is the case, and that won't be made sane, and so I think
it belongs nowhere near NetBSD.  Could you translate your prototype into a
different language, one that's less basically hostile to our build system
and its goals and benefits?



re: makesyscalls

2020-06-09 Thread matthew green
i'm not very interested in a solution that doesn't use tools
available to the build.  you've not shown that there is
sufficient pain here to force an external solution.  i'm not
sure i buy your claims about awk and size of program.  IME,
it just requires that one is strict to rules.

if you want to use python for creating files in the tree
for _our_ things, then i think you have to propose adding
python to src/tools.  [*]

why aren't you willing to discuss a lua version?  it has
most of the features you complained awk is missing, and would
make it relatively easy to unit-test the components easily.


.mrg.


[*] i support this generally, and as a non-public visible
library that our gdb can use as well.  i do not think we
should add it to /usr/bin (but i might be convinceable that
it can be installed in a non-standard location, using
non-standard paths for libraries, as long as it does not have
anything to do with or interfere with pkgsrc or other ways
of using python.)


Re: makesyscalls

2020-06-09 Thread Kamil Rytarowski
On 10.06.2020 01:13, David Holland wrote:
> The question is: do we want the Python version in the tree now

For this, I would say "NO", at least as long as Python is out of base, and
IMO it should not be there.

But it is fine to put into othersrc/.

On 10.06.2020 01:13, David Holland wrote:
> Rewriting in C is a possible future step. The code generator I have in
> mind going forward should not be done in Python. But again, more on
> that later.

I would like to have makesyscalls (and at some point makeioctls) be much
more flexible and scriptable as a tool. I have had to iterate a dozen times
over all our syscalls in various fuzzers, sanitizers, debuggers, tracers
etc.

Something that is badly needed is knowledge of the full serialized struct
passed as a pointer to each syscall. It's a lot of work to teach the
tool about it, but it could finally be centralized, saving the time spent
repeatedly teaching all the other programs about this property of syscalls.
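
As a sketch of what a consumer such as a tracer could do with that information: if the description records, for each pointer argument, which struct it points to and in which direction the data flows, decoding the raw bytes becomes a table lookup. The syscall name, the struct layout, and the `POINTER_ARGS` table below are all made up for illustration and are not taken from any real ABI.

```python
import ctypes


class ExampleTimespec(ctypes.Structure):
    # Hypothetical layout, for illustration only.
    _fields_ = [("tv_sec", ctypes.c_int64), ("tv_nsec", ctypes.c_int64)]


# Hypothetical central table: (syscall, argument) -> (layout, direction).
POINTER_ARGS = {
    ("example_nanosleep", "rqtp"): (ExampleTimespec, "in"),
    ("example_nanosleep", "rmtp"): (ExampleTimespec, "out"),
}


def bytes_needed(syscall, arg):
    """How many bytes a tracer must read from the traced process."""
    layout, _direction = POINTER_ARGS[(syscall, arg)]
    return ctypes.sizeof(layout)


def decode(syscall, arg, raw):
    """Turn the raw bytes behind a pointer argument into named fields."""
    layout, _direction = POINTER_ARGS[(syscall, arg)]
    value = layout.from_buffer_copy(raw)
    return {name: getattr(value, name) for name, _ctype in layout._fields_}


if __name__ == "__main__":
    raw = bytes(ExampleTimespec(1, 500))
    print(bytes_needed("example_nanosleep", "rqtp"))
    print(decode("example_nanosleep", "rqtp", raw))
```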





makesyscalls

2020-06-09 Thread David Holland
Between various forms of prompting I got tired of waiting for a gsoc
student to take up code generation for syscall argument handling. We
should really be generating all the calls to copyin/copyout, not to
mention all the compat32 and compat_* glue, not maintaining them by
hand. More on that later.

In the meantime the first step on this was to assess the current
makesyscalls.sh and decide whether to (a) leave it alone as a separate
mechanism, (b) integrate it into the new stuff, or (c) ignore it and
write new stuff from scratch.

My conclusion after wading into it is that between additions for rump,
dtrace, and other things it's become quite a bit too large for an awk
script. (Experience suggests that the limit is about 1000 lines,
depending on how well structured the script is and what it's doing.
Beyond this point, between there being no variable declarations,
function arguments being unchecked, and other properties of awk, it
becomes difficult to modify the script safely and alterations made
regardless tend to resort to cutpaste and make the situation worse.
This script is 1200 lines and it's definitely showing signs of
deterioration.)

This means that regarding (a) it shouldn't be left alone because it's
itself becoming a problem. Meanwhile, regarding (c), it does too many
things and has too many tentacles to be safely ignored or easily
reimplemented. Consequently, as a step towards (b) I have translated
it into Python; Python is still untyped but is substantially more
robust than awk and has a decidedly larger size limit.
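
A small illustration of the difference, with made-up function names: Python rejects a call with the wrong number of arguments and a read of a name that was never assigned, where awk would carry on silently. Both checks happen only at run time, though, when the offending line is actually executed.

```python
def add(x, y):
    return x + y


def untested_path():
    # 'total' is never assigned anywhere. Defining this function is
    # fine; the error only appears when the function is called.
    return total + 1


def try_call(func, *args):
    """Call func(*args) and report either the result or the error type."""
    try:
        return "ok: %r" % (func(*args),)
    except (TypeError, NameError) as exc:
        return type(exc).__name__


if __name__ == "__main__":
    print(try_call(add, 2, 3))
    print(try_call(add, 5))
    print(try_call(untested_path))
```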

The question is: do we want the Python version in the tree now, or
should I just treat it as an intermediate development prototype that
nobody has to see?

Arguments pro:
   - this is definitely a step forward for maintainability
   - having it in the tree means other people can work on it readily
   - having it in the tree means it's there even if my future plans
 get derailed or don't materialize

Arguments con:
   - it's Python and we don't have Python in base (and don't
 particularly want it)
   - Python being Python all the code paths I haven't managed to test
 by the time it's committed will probably crash the first time
 they're reached
   - definitely a risk of having introduced bugs (also found some, but
 so far all of them have been cosmetic things that awk doesn't
 trap on, like passing extra arguments to printf)

Arguments I'm not interested in listening to:
   - it's Python and Python is a terrible language (might be so but
 see below)
   - it should be Lua (don't see any reason Lua beats awk for this)
   - sh/awk is fine for this and you're doing it wrong (no I'm not, I
 think I have as much experience using sh and awk on nontrivial
 things as anyone)

There are not many languages to choose from: in base we have sh, awk,
lua, and C (or C++), and of languages not in base I think Python is
the only one sufficiently ubiquitous to justify using it for important
build infrastructure. Well, maybe Perl is, but I'm trying to make the
script _more_ maintainable :-)

Note before flaming that it doesn't actually run during builds, only
when someone regenerates the outputs after changing syscalls.master.
Needing to have Python installed for this is not a showstopper.

Rewriting in C is a possible future step. The code generator I have in
mind going forward should not be done in Python. But again, more on
that later.

-- 
David A. Holland
dholl...@netbsd.org