On Thu 24 Aug 2006 at 04:07AM, Roland Mainz wrote:
> > ``HORRIBLE'' appears to be a rhetorical overstatement
>
> It was not thought as "rhetorical overstatement" - it is what I think
> how it will end. I've logged a few hundred hours getting various
> scenarios tested and the two prototypes set up ("prototype001" was a
> complete failure, partially because it tried to bypass the libcmd
> problem) and I think I can at least outline the scenario of such an
> implementation.
>
> But my first question would be: Who actually benefits from moving the
> functions in the current Solaris libcmd.so to another library ? AFAIK
> no one will benefit - but many [at Sun] will suffer. And the result will
> not even be "more clean" than the current design.

I think Alan, Jim and I have clearly articulated at least two clean ways
to do this with expediency and minimal suffering. As with many problems,
you solve it in stages, not all at the same time.

> > there is a
> > problem here, and I feel that we should solve in an architecturally
> > clean way-- I don't see why merging two completely unrelated libraries
> > makes any architectural sense
>
> I'd suggest to dismantle libraries like "libnspr" or "libc", too - they
> are merging various functionality and fusing them together in one
> library. Or think about "librt" which was consumed by "libc" - that's
> another example of such fusions.

You've quoted one precedent which AFAIK OpenSolaris has no control over
(nspr) and another which is totally orthogonal to the question before us
(rt/aio).

To address the rt/aio point: There are at times compelling architectural
reasons to merge libraries-- in the case of aio and librt you had an
intertwined set of libraries which knew about each other's innards. You
should read 6416832. Here in part is what Roger wrote in that bug
report:

> For one, and most annoying, libaio must interpose on
> libc's sigaction() and close() interfaces, an arrangement
> that breaks if an application dlopen()s libaio.so.1
>
> Likewise, librt interposes on libc's close() interface,
> just so it can call libaio's version, another arrangement
> that breaks if an application dlopen()s librt.so.1
>
> All of the POSIX aio_*() and lio_*() interfaces are defined
> in librt, but the actual implementation is in libaio,
> so we get librt's aio_read(), for example, calling
> libaio's __aio_read(), a consolidation-private interface.
>
> The sigwait() interface is defined in libc, but sigwaitinfo()
> and sigtimedwait() are defined only in librt, even though they
> all call the common underlying libc __sigtimedwait() interface.
>
> Then, finally, there is the annoyance that to call nanosleep()
> one must link with -lrt even though all it does is call the
> libc __nanosleep() interface. The same is true for all of
> the clock_*() and timer_*() interfaces, where they just call
> the corresponding libc __clock_*() and __timer_*() interfaces.
>
> Folding all of libaio and librt into libc would solve these
> dependency problems and yield a simpler and more stable system.

So there you go-- merging these libraries yields a more architecturally
clean whole, and reduces fragile cross-component dependencies. It passes
the test with flying colors.

This is very different from the current situation: "liba and liba have
the same name, but are otherwise authored by different people, are
architecturally distinct, and have distinct consumers" which appears to
me to be the case before us.

> > Your statement is inaccurate. Things in OS/Net can depend upon things
> > not in OS/Net. For example, Zones depends upon libxml2, which is in
> > SFW.
>
> Erm... the last thing I heard is that libxml2 is moved into the OS/Net
> codebase... or do I mix things up ? AFAIK the general policy is that
> such extra dependencies should be avoided, right ?

It is possible that someone is considering such a move. But if you check
the source base, it currently isn't the case. While someone might
correct me, I don't know of a specific policy which dictates that
cross-consolidation dependencies are inherently bad.

> The integration into OS/Net is a requirement for the
> "migration".

Again, why? I don't see why ksh93 must be in OS/Net to someday deliver
the file /usr/bin/ksh. Now, you might make an argument around
preservation of package boundaries (since packages don't span
consolidations) or highlight some other intermingling between ksh and
the rest of the system, but I haven't seen such an explanation yet.

> > > > understand that ksh93 has its own build system, and
> > > > I was wondering which consolidation would best accommodate
> > > > that uniqueness.
> > >
> > > We do not use the original AST build system for ksh93 in OS/Net - I
> >
> > Why not? Doesn't that just increase the maintenance burden when
> > ksh93 revs to a new version? With the SFW build system, you just
> > drop in the new .tar.gz, update some build metadata, and be done.
>
> You're totally wrong at this point.
> The current ksh93 integration into OS/Net is done in a way that you can
> simply make a "unified diff" (e.g. /usr/bin/diff -u ...) between the old
> and the new source version of ksh93, adjust the source paths in the
> patch file, apply that patch and regenerate the include files. That's
> all. I've done that for each single alpha release of ksh93r+_alpha and
> this update scheme works fine.

Good, I'm glad that this is thought out. Thanks for making it clear that
this has been designed in.

> For comparison - building ksh93 with
> the native build system can be VERY tricky as the build system in its
> default configuration probes the underlying build machine (and not any
> proto area etc.) for information which makes it very tricky as even
> minor changes in the build machine can turn features on/off - and that
> would be a perfect "call generator". You can fix that - but then you're
> getting more or less to that point where we are currently with our
> integration into OS/Net... :-)

If that is true, then it's a shame-- as a reference, how do other
software distributions (debian, fedora, etc.) handle building a sane
ksh93?

> Question back: Why does "perl" actually live in OS/Net ? If I apply the
> same pattern as you're using for ksh93 "perl" would be the last thing
> needed in OS/Net... it could - in theory - happily live in /usr/sfw/bin/
> ... =:-)

Well I'll defer to Stephen and Alan Burlison on why perl needs to be in
OS/Net (or if it must at all).
It could be due to build system issues similar to the ones you have
articulated. It may simply be grandfathered. I don't know. [Python OTOH
is not in OS/Net].

> See above... there are items like the future "migration" of the old ksh
> to ksh93 or (for example) Solaris-specific goodies that we're going to
> add a 64bit ksh93 - for the first time Solaris would have a scripting
> language which is capable to deal with large datasets (and ksh93 is NOT
> slow :-) ), something which even "perl" didn't achieve yet. And
> applications in OS/Net may be interested to use libshell.so which would
> make the creation of all the "*adm" tools MUCH easier as they could be
> built on top of libshell's customizable framework (if you ever worked
> with "dbx" - it's based on another variant of "ksh" - and in the long
> term it may be nice to switch "dbx" to use libshell to avoid the massive
> code duplication across the various Sun products).

I'm not opposed per se to using libshell (I had some minor experience
working on a research debugger many years ago which embedded ksh).
Again, I believe that libshell could be in use by various consolidations
while being delivered in the SFW consolidation.

To be clear, I'm glad that we all are getting ksh93; if it is possible
to replace old-ksh with ksh93 in a clean way which doesn't cause
breakage, then I'm for that, too. Onward and upward and all that. I am
just trying to make sure that the plan here makes sense. If you convince
me, I can hopefully be a helpful ally in helping you finish.

        -dp

-- 
Daniel Price - Solaris Kernel Engineering - dp at eng.sun.com - blogs.sun.com/dp