Re: Time to merge the pgoyette-compat branch (take two)

2018-09-12 Thread Alexander Nasonov
Martin Husemann wrote:
> Also I wonder if we could do some nm digging and awk scripts with tsort
> to find potential symbol collisions or missing symbols not properly
> covered by module dependencies.

BTW, I'm working on Lua bindings for elftoolchain [1]. In principle,
it should be a better alternative to nm+awk hacks but I'm not sure if
the code can handle all required functionality. You're welcome to take
a look :-)

[1] https://www.github.com/xmmswap/luaelftoolchain

-- 
Alex


re: Time to merge the pgoyette-compat branch (take two)

2018-09-09 Thread Paul Goyette

On Mon, 10 Sep 2018, matthew green wrote:

> why bother keeping the monolithic compat module?  i was
> expecting it to be replaced by this merge, not as an
> additional way to get the same thing.


I am in the process of removing both "monolithic" modules - the big
compat and the smaller compat_sysv.



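+------------------+--------------------------+----------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:          |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+------------------+--------------------------+----------------------------+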


re: Time to merge the pgoyette-compat branch (take two)

2018-09-09 Thread matthew green
why bother keeping the monolithic compat module?  i was
expecting it to be replaced by this merge, not as an
additional way to get the same thing.


.mrg.


Re: Time to merge the pgoyette-compat branch (take two)

2018-09-09 Thread Paul Goyette

On Sun, 9 Sep 2018, Martin Husemann wrote:

> On Sun, Sep 09, 2018 at 06:15:01PM +0800, Paul Goyette wrote:
> > On Sun, 9 Sep 2018, matthew green wrote:
> >
> > > Paul, i 100% agree this isn't a new problem. but your branch has
> > > pushed it back into the foreground again :-)
> >
> > Being in the foreground is a good thing, right?  :)  At least now it
> > can get the attention it really needs, rather than just hiding it
> > under the carpet.
>
> I think we should get rid of libcompat for kernel builds (dunno if it is
> used for modules).
>
> Also I wonder if we could do some nm digging and awk scripts with tsort
> to find potential symbol collisions or missing symbols not properly
> covered by module dependencies. Alternatively build two special modular
> kernels (like ALL and MINIMAL) and do runtime load/unload tests with
> all modules we deliver.


After having removed the "alias name" stuff, I've run into some problems
with module dependencies.  Most of this stems from the desire to retain
a "monolithic" compat module as well as the more granular modules for
each version.  As an example of the problem, the compat_netbsd32 module
needs to depend on _either_ the monolithic compat module _or_ some of
the version-specific modules.  (This was easy with the alias mechanism:
I simply had the monolithic module advertise lots of aliases, and then
the compat_netbsd32 module could depend on whichever version-specific
module(s) it needed.)

Our module system doesn't currently know how to do the _either_or_
thing.  So it appears that I'm going to need to completely remove the
monolithic module, and depend entirely on the individual modules.
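
To make the _either_or_ idea concrete, here is a minimal Python sketch of what such a dependency check could look like. It is purely illustrative: the module system does not work this way, and the `resolve` function, its data layout, and the module names in the example are assumptions made up for this sketch.

```python
def resolve(requirements, available):
    """Pick modules that satisfy a list of either/or requirements.

    requirements: list of groups; each group is a list of alternatives,
                  and each alternative is a tuple of module names that
                  must all be present for that alternative to count.
    available:    set of module names that can be loaded.

    Returns the chosen module names in the order they were selected,
    or raises LookupError if some group has no usable alternative.
    (Hypothetical helper, not part of any real module framework.)
    """
    chosen = []
    for group in requirements:
        for alternative in group:
            if all(name in available for name in alternative):
                # First alternative that is fully available wins.
                chosen.extend(n for n in alternative if n not in chosen)
                break
        else:
            raise LookupError(f"no alternative satisfied: {group}")
    return chosen


# Example: depend on either the monolithic module or two
# version-specific ones (names chosen for illustration only).
needs = [[("compat",), ("compat_70", "compat_60")]]
print(resolve(needs, {"compat_60", "compat_70"}))
```

Without a mechanism like this, a module can only list one fixed set of required modules, which is why keeping both the monolithic and the granular modules becomes awkward.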

This is going to take some time to get it right, for both built-in and
loaded modules.




Re: Time to merge the pgoyette-compat branch (take two)

2018-09-09 Thread Martin Husemann
On Sun, Sep 09, 2018 at 06:15:01PM +0800, Paul Goyette wrote:
> On Sun, 9 Sep 2018, matthew green wrote:
> 
> 
> 
> > Paul, i 100% agree this isn't a new problem. but your branch has
> > pushed it back into the foreground again :-)
> 
> Being in the foreground is a good thing, right?  :)  At least now it
> can get the attention it really needs, rather than just hiding it
> under the carpet.

I think we should get rid of libcompat for kernel builds (dunno if it is
used for modules).

Also I wonder if we could do some nm digging and awk scripts with tsort
to find potential symbol collisions or missing symbols not properly
covered by module dependencies. Alternatively build two special modular
kernels (like ALL and MINIMAL) and do runtime load/unload tests with
all modules we deliver.
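
As a rough illustration of the nm/awk/tsort idea (not code from this thread), the check could be sketched in Python along the following lines. It assumes `nm`-style text per module, where `U` marks an undefined symbol and other uppercase type letters mark global definitions; weak and common symbols are not treated specially. The function names, the input layout, and the sample module and symbol names are all hypothetical. `graphlib.TopologicalSorter` (Python 3.9+) plays the role of tsort.

```python
from collections import defaultdict
from graphlib import TopologicalSorter


def parse_nm(text):
    """Return (defined, undefined) global symbol sets from nm-style text."""
    defined, undefined = set(), set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        kind, name = fields[-2], fields[-1]
        if kind == "U":
            undefined.add(name)
        elif kind.isupper():  # simplification: uppercase means global
            defined.add(name)
    return defined, undefined


def required_closure(module, depends, seen=None):
    """All modules reachable through the declared dependencies."""
    seen = set() if seen is None else seen
    for dep in depends.get(module, []):
        if dep not in seen:
            seen.add(dep)
            required_closure(dep, depends, seen)
    return seen


def check_modules(nm_output, depends, kernel_symbols):
    """Report symbols defined by several modules and symbols that a
    module needs but cannot get from the kernel or its dependencies."""
    symbols = {mod: parse_nm(text) for mod, text in nm_output.items()}

    owners = defaultdict(list)
    for mod in sorted(symbols):
        for name in symbols[mod][0]:
            owners[name].append(mod)
    collisions = {n: mods for n, mods in owners.items() if len(mods) > 1}

    missing = {}
    for mod, (defined, undefined) in symbols.items():
        visible = set(kernel_symbols) | defined
        for dep in required_closure(mod, depends):
            visible |= symbols.get(dep, (set(), set()))[0]
        unresolved = sorted(undefined - visible)
        if unresolved:
            missing[mod] = unresolved
    return collisions, missing


def load_order(depends):
    """Order modules so that each one follows the modules it requires."""
    return list(TopologicalSorter(depends).static_order())
```

A static check like this only covers symbols; the runtime load/unload tests would still be needed to catch initialization-order problems.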

Martin


re: Time to merge the pgoyette-compat branch (take two)

2018-09-09 Thread Paul Goyette

On Sun, 9 Sep 2018, matthew green wrote:

> Paul, i 100% agree this isn't a new problem. but your branch has
> pushed it back into the foreground again :-)


Being in the foreground is a good thing, right?  :)  At least now it
can get the attention it really needs, rather than just hiding it
under the carpet.

:)




Re: Time to merge the pgoyette-compat branch (take two)

2018-09-09 Thread David Holland
On Sun, Sep 09, 2018 at 04:09:20PM +1000, matthew green wrote:
 > again, without looking too closely, i think the ultimate problem
 > here is that libcompat in the kernel is the wrong design, and it
 > should not be a kernel library.
 > 
 > it is probably this way to ease maintenance of other lists, since
 > what depends upon what code is fairly interdependent, and if you
 > just shove all the code in a library and link it, the stuff you
 > need is there, and the stuff you don't need isn't.
 > 
 > but really, all the code that depends upon other code should be
 > listed more explicitly, so that config/make can deal with it
 > instead of leaving it for the linker.  christos?  can you think
 > of any other/better ways to avoid this?  i think the only real
 > answer is to abandon kernel libcompat as-is.

If it's going to be modules it's all got to be sorted out explicitly,
so that the module dependence logic can address it, or it will all
never really work properly.

now, it would be great if we could have one config system in which to
describe this instead of three...

-- 
David A. Holland
dholl...@netbsd.org


re: Time to merge the pgoyette-compat branch (take two)

2018-09-09 Thread matthew green
> The original change from .a -> .o was made by maxv, in order to avoid
> having to determine where a couple of support objects were needed, and
> to avoid having to determine if there were any other such objects.  By
> changing from .a to .o method, maxv simply ensured that these support
> objects were always included in the kernel.  Unfortunately, that meant
> that _everything_ in the library was included, whether or not needed
> (based on the COMPAT_xx options present).

again, without looking too closely, i think the ultimate problem
here is that libcompat in the kernel is the wrong design, and it
should not be a kernel library.

it is probably this way to ease maintenance of other lists, since
what depends upon what code is fairly interdependent, and if you
just shove all the code in a library and link it, the stuff you
need is there, and the stuff you don't need isn't.

but really, all the code that depends upon other code should be
listed more explicitly, so that config/make can deal with it
instead of leaving it for the linker.  christos?  can you think
of any other/better ways to avoid this?  i think the only real
answer is to abandon kernel libcompat as-is.

Paul, i 100% agree this isn't a new problem. but your branch has
pushed it back into the foreground again :-)


.mrg.


re: Time to merge the pgoyette-compat branch (take two)

2018-09-09 Thread matthew green
Martin Husemann writes:
> On Sat, Sep 08, 2018 at 02:19:41PM +1000, matthew green wrote:
> > Greg Troxel writes:
> > > I am just barely paying attention, but I think modules working well is
> > > important, and also having minimal code for what's needed.  So if mrg's
> > > main concerns have been addressed (aliases), I'm in favor (in a somewhat
> > > weak, not really clued in sort of way) of this.
> > 
> > indeed, i don't flat out object to the object/library change.
> > it's just wrong, and we should figure out how to fix it, but
> > i don't consider it a show stopper for the branch merge.
> 
> Seconded.
> 
> But I am not quite sure what you think is wrong about the .a method - if
> everything else is correct, it should not make a difference now - or am I
> missing something?

there are a couple of basic rules here:

- modular kernels should use kernel libraries as .o
- static kernels should use kernel libraries as .a

modular kernels can't know what they _might_ need so they should
include all objects (the .o), whereas static kernels can use the
linker to elide unused objects (the .a).
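
The difference between the two methods can be sketched with a small Python simulation. This is a simplified model, not how any particular linker is implemented: it repeats the archive scan until nothing more is pulled in, and the object and symbol names in the example are made up.

```python
def link_archive(roots, archive):
    """'.a' method: include only archive members that define a symbol
    which is still undefined, repeating until nothing changes."""
    defined, undefined = set(), set()
    for obj in roots:
        defined |= obj["defines"]
        undefined |= obj["needs"]
    undefined -= defined

    included = []
    progress = True
    while progress:
        progress = False
        for name, member in archive.items():
            if name in included:
                continue
            if member["defines"] & undefined:
                included.append(name)
                defined |= member["defines"]
                undefined |= member["needs"]
                undefined -= defined
                progress = True
    return included


def link_objects(archive):
    """'.o' method: every member is included, needed or not."""
    return list(archive)


# Hypothetical compat library with three members.
library = {
    "compat_50.o": {"defines": {"compat_50_sys"}, "needs": {"compat_util"}},
    "compat_util.o": {"defines": {"compat_util"}, "needs": set()},
    "compat_20.o": {"defines": {"compat_20_sys"}, "needs": set()},
}
kernel = [{"defines": {"main"}, "needs": {"compat_50_sys"}}]

print(link_archive(kernel, library))  # members reachable from the kernel
print(link_objects(library))          # all members
```

In this model a member that nothing references is dropped by the archive method, which is fine for a static kernel but not for a modular kernel where a module loaded later may need it.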

it sounds like the branch ends up with duplicate symbols in some
cases, and using the .a avoids it, but the real fix would be to
remove the cause of the duplicate symbols in this case directly,
rather than by using the .a.

without looking closely i can't really judge what is up..


.mrg.


Re: Time to merge the pgoyette-compat branch (take two)

2018-09-08 Thread Paul Goyette

On Sat, 8 Sep 2018, Martin Husemann wrote:

> On Sat, Sep 08, 2018 at 02:19:41PM +1000, matthew green wrote:
> > Greg Troxel writes:
> > > I am just barely paying attention, but I think modules working well is
> > > important, and also having minimal code for what's needed.  So if mrg's
> > > main concerns have been addressed (aliases), I'm in favor (in a somewhat
> > > weak, not really clued in sort of way) of this.
> >
> > indeed, i don't flat out object to the object/library change.
> > it's just wrong, and we should figure out how to fix it, but
> > i don't consider it a show stopper for the branch merge.
>
> Seconded.
>
> But I am not quite sure what you think is wrong about the .a method - if
> everything else is correct, it should not make a difference now - or am I
> missing something?


As far as I can tell, it has _always_ made a difference.

The original change from .a -> .o was made by maxv, in order to avoid
having to determine where a couple of support objects were needed, and
to avoid having to determine if there were any other such objects.  By
changing from .a to .o method, maxv simply ensured that these support
objects were always included in the kernel.  Unfortunately, that meant
that _everything_ in the library was included, whether or not needed
(based on the COMPAT_xx options present).

That change caused some problems with the compat_sysv module, which
Christos worked around.  That work-around broke auto-loading of the
compat_sysv module, which was acknowledged in his commit log.

My changes on the pgoyette-compat branch simply undo Christos's change
and revert the build to the original form.  So, I'm simply undoing a
workaround and undoing the commit that introduced the need for a
work-around.  (Someone on IRC - martin@ perhaps? - recently referred
to maxv's original commit as a hack;  I agree with that assessment.)

It is quite possible that further work needs to be done in this area,
to limit the contents of the compat library to only those objects that
are needed, regardless of the library format.  If so, the need already
exists in the current HEAD code;  it is not introduced by my changes
on the branch.






Re: Time to merge the pgoyette-compat branch (take two)

2018-09-08 Thread Martin Husemann
On Sat, Sep 08, 2018 at 02:19:41PM +1000, matthew green wrote:
> Greg Troxel writes:
> > I am just barely paying attention, but I think modules working well is
> > important, and also having minimal code for what's needed.  So if mrg's
> > main concerns have been addressed (aliases), I'm in favor (in a somewhat
> > weak, not really clued in sort of way) of this.
> 
> indeed, i don't flat out object to the object/library change.
> it's just wrong, and we should figure out how to fix it, but
> i don't consider it a show stopper for the branch merge.

Seconded.

But I am not quite sure what you think is wrong about the .a method - if
everything else is correct, it should not make a difference now - or am I
missing something?

Martin


Re: Time to merge the pgoyette-compat branch (take two)

2018-09-07 Thread Paul Goyette

At the request of reviewers, I have removed the "alias names" stuff,
and I've re-written the description of the .o --> .a change.  Here's
the revised request-for-review.

(I was going to just abandon this effort/branch, but the rest of the
changes are IMHO too important to discard.)



After several months of work, it's now (nearly) time to merge the
pgoyette-compat branch.  (Yeah, I know that many/most of you don't
"allow" modules into your environments in the first place, so none
of this really affects you at all.)


This branch includes the following major changes:

* Separation of the single "monolithic" compat module into multiple
 modules, one for each older version of NetBSD.

 With this, you can load only as much compat code as you want, without
 having to build a custom compat module.  (You still need a custom
 kernel to start with, in order not to have all the compat code being
 built-in.)  Sys-calls that are implemented in the compat module will
 now load only the necessary compat code, rather than the entire
 monolithic module.

* Elimination of the limit on the number of "required" modules (or
 "dependencies") for a module.  There was previously a limit of
 MAXMODDEPS (value 10), with a statically-allocated array;  now the
 array is dynamically allocated and expanded as needed.


  * Also eliminate the limit on the depth of recursive loading of
    required modules.  Previously there was a limit of six levels of
    recursion (MODULE_MAX_DEPTH), referencing a statically allocated
    array.  Now, the elements of that array are dynamically allocated
    as needed, and recursion depth has no practical limit.


* Extraction of some COMPAT_xx code that was previously intertwined
 with the main kernel build and not available in non-built-in modules.
 One example of this is the COMPAT_70 code for rtsock (which bit me
 when it was introduced, and was the main reason for undertaking the
 work on this branch).

* Removed linking of the .o kernel compat library into all kernels.

 This was introduced some time ago as a hack for including some
 support routines that weren't being properly included in the
 modules that needed them.  These support routines are now in their
 own module, and we no longer need to use the .o build method.  So
 we return to the original use of a .a compat library.  This also
 allows removal of a work-around that was deliberately introduced
 into the compat_sysv module, which had a side-effect of breaking
 auto-load of compat_sysv.


Other activity is documented in src/doc/COMPAT-branch-notes file.

There are a number of activities that still need to be worked on;  these
are also listed in the src/doc/COMPAT-branch-notes file.  None of these
activities should prevent merging of the branch, as the short-comings are
already present on the mainline.

I welcome any timely review and constructive feedback.  I'd like to get
this committed sometime in the next two or three weeks if possible.


