On Fri, Apr 19, 2013 at 6:13 AM, Nick Wellnhofer <[email protected]> wrote:
> I spent a bit of time researching how shared library versioning works on
> different operating systems and how the stand-alone C library could use it.
Thanks for taking the time to hunt down reference materials and assemble
this survey.
Dynamic library versioning is a core concern for Clownfish because of its
potential to either facilitate or impede distributed development. We've spent
a lot of time on ABIs, minimizing the impact that independent groups working
on interrelated projects might have on each other so long as nobody breaks API
compatibility. Now, having largely addressed the "fragile ABI" versioning
problem, it's natural to extend our efforts to API versioning.
None of the DSO loading mechanisms presented by various operating systems
provide satisfactory solutions to the kinds of problems Clownfish sets out to
solve, so I think we should be prepared to layer our own mechanism on top.
The first component of our solution should be a Clownfish::Version class, with
the following characteristics:
* Version numbers are fundamentally arrays of integers. They are not
strings. They are not floating point numbers.
* Auxiliary properties of parcels, such as "alpha", "testing", "release
candidate" and so on, must not affect the result of comparing two Version
objects.
Each Clownfish parcel should be associated with a Version object made
available at runtime via the singleton pattern. The code to produce that
singleton should be autogenerated using the "version" entry in the parcel
specification. These per-parcel objects can then be used to validate
dependency specifications at bootstrap time.
> First, let's recap some basics about versioning. A version usually consists
> of
>
> * Major version. This is increased whenever the API or ABI changes
> in a way that is not backward-compatible. The dynamic linker
> should never load a library with a different major version than
> the one specified in the executable, no matter if it's a newer
> or older version.
>
> * Minor version. This is increased whenever the API changes in a
> backward-compatible way. Executables linked against an older
> version of a library should continue to work with newer versions
> of the library as long as the major version doesn't change.
> On the other side, executables linked against a newer library
> version shouldn't run with older library versions as they might
> use some of the newer features.
>
> * Patch level. This is increased whenever a new version of the
> library is released that doesn't change the API at all,
> typically for bug fixes. An executable should work with every
> library that has the same major or minor version, regardless
> whether the patch level is higher or lower.
Certainly, best practice for library authors would be to produce code which
conforms to these recommendations. It's aggressive to throw an exception on
minor version regression -- but that seems unlikely to happen under normal
circumstances, since dependencies can be checked for minimum version numbers
at install-time and old versions will not ordinarily be installed over newer
ones.
I would also argue that we should support Version objects with an arbitrary
number of integer sub-versions, and that omitted numbers should be treated as
trailing zeroes for the purposes of comparison.

    # all the same
    2
    2.0
    2.0.0
    2.0.0.0
    ...

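To make the trailing-zero rule concrete, here is a minimal sketch in C. The `Version` struct layout and the name `Version_compare` are hypothetical, invented for illustration; nothing like this exists in Clownfish yet.

```c
#include <stddef.h>

/* Hypothetical layout: a version is just an array of integers. */
typedef struct {
    const unsigned *parts;  /* e.g. {2, 0, 1} for 2.0.1 */
    size_t          count;  /* number of entries in parts */
} Version;

/* Returns -1, 0, or 1.  Components missing from the shorter version
 * are treated as zero, so 2, 2.0 and 2.0.0.0 all compare equal. */
static int
Version_compare(const Version *a, const Version *b) {
    size_t max = a->count > b->count ? a->count : b->count;
    for (size_t i = 0; i < max; i++) {
        unsigned x = i < a->count ? a->parts[i] : 0;
        unsigned y = i < b->count ? b->parts[i] : 0;
        if (x != y) {
            return x < y ? -1 : 1;
        }
    }
    return 0;
}
```

Because only the integer array takes part in the comparison, auxiliary labels such as "alpha" or "release candidate" could live in a separate field without affecting the result.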
> So if we have version strings
I know you're just working with what one particular operating system gives you
here, but we should avoid treating version numbers as strings. Let's not
build a system where comparing "5.16.1" against "5.8.8" gives the wrong
result.
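A small C illustration of that failure mode, with hypothetical helper names: a byte-wise string comparison orders "5.16.1" before "5.8.8" because '1' sorts before '8', while comparing the integer components gives the intended order.

```c
#include <string.h>

/* Compare two three-part versions numerically, component by component. */
static int
compare_parts(const int a[3], const int b[3]) {
    for (int i = 0; i < 3; i++) {
        if (a[i] != b[i]) {
            return a[i] < b[i] ? -1 : 1;
        }
    }
    return 0;
}

/* Byte-wise string comparison, normalized to -1, 0, or 1. */
static int
compare_strings(const char *a, const char *b) {
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

With these helpers, `compare_strings("5.16.1", "5.8.8")` reports the first version as older, whereas `compare_parts` on {5, 16, 1} and {5, 8, 8} reports it as newer.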
> of the form "major.minor.patchlevel" and a
> library with version "1.3.4", the following should happen with executables
> linked against different versions:
>
> * Executable version "0.9.2": FAIL
> * Executable version "1.0.3": OK
> * Executable version "1.3.0": OK
> * Executable version "1.3.4": OK
> * Executable version "1.3.9": OK
> * Executable version "1.4.0": FAIL
> * Executable version "2.2.2": FAIL
>
> Or, if the executable version is linked against "1.3.4":
>
> * Library version "0.9.2": FAIL
> * Library version "1.2.5": FAIL
> * Library version "1.3.0": OK
> * Library version "1.3.4": OK
> * Library version "1.3.9": OK
> * Library version "1.4.0": OK
> * Library version "2.2.2": FAIL
I largely agree with the outcomes, but I think we should consider a different
mechanism.
I'd like to propose that major version numbers be considered part of the
parcel identifier and that we use a name mangling scheme to embed them in
exported symbols, while providing macro aliases for programmer convenience.

    // For Lucy version 0.x:
    #define lucy_Indexer_new org_apache_lucy_0_Indexer_new

    // For Lucy version 1.x:
    #define lucy_Indexer_new org_apache_lucy_1_Indexer_new

Unless I've missed something, I believe that scheme will allow multiple major
versions of a library to coexist within the same process. And since the
user-level symbols are the same, client code can be adapted and recompiled to
use different versions of a library with minimal source code churn.
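Here is a rough sketch of how the alias selection might look. `LUCY_MAJOR_VERSION` is a hypothetical build setting, and the two functions are stand-ins that merely return their major version; in practice each would be exported from its own DSO rather than defined side by side in one file.

```c
/* Hypothetical build setting; the real value would presumably be
 * derived from the "version" entry in the parcel specification. */
#ifndef LUCY_MAJOR_VERSION
  #define LUCY_MAJOR_VERSION 1
#endif

/* Stand-ins for the real constructors.  The mangled names differ,
 * so both symbols can be present in one process without clashing. */
int org_apache_lucy_0_Indexer_new(void) { return 0; }
int org_apache_lucy_1_Indexer_new(void) { return 1; }

/* Client code always writes lucy_Indexer_new(); the macro picks the
 * mangled symbol for the major version it was compiled against. */
#if LUCY_MAJOR_VERSION == 0
  #define lucy_Indexer_new org_apache_lucy_0_Indexer_new
#else
  #define lucy_Indexer_new org_apache_lucy_1_Indexer_new
#endif
```

Moving client code from 0.x to 1.x then means rebuilding with a different `LUCY_MAJOR_VERSION` (and fixing whatever API changes the new major version made), not renaming every call site.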
> Now, let's see how different operating systems or, more specifically,
> different object file formats load libraries and support library versioning.
I think the ideal would be to store different DSOs in different directories
derived from the full parcel identifier, including major version number.

    $CLOWNFISHLIB/org/apache/lucy/0/lucy.so
    $CLOWNFISHLIB/org/apache/lucy/1/lucy.so

However, I'm not sure we need that right away.
> For minor versions, some ELF implementations support "version scripts" in
> which versions can be specified for every symbol. This allows for very
> sophisticated version checks. On Linux, you can even specify multiple
> versions of a symbol in a single library. The details are explained in-depth
> in chapter 3 of Ulrich Drepper's DSO Howto:
>
> http://www.akkadia.org/drepper/dsohowto.pdf
The technique described in the Drepper paper for resolving symbols differently
according to the requested version seems fragile and cumbersome -- and of
course it's not portable either.
I can't see how multiple library versions can be supported within the same
process on multiple platforms without embedding the version number into
symbols directly.
> Another solution would be to implement minor version checks explicitly in
> lucy_bootstrap_parcel:
>
>     void lucy_bootstrap_parcel(int minor_version) {
>         if (minor_version > x) {
>             // throw
>         }
>         ...
>     }
+1 to using bootstrap_parcel, though with a slightly different implementation.
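To be clear, the following is only a sketch of the general idea and not a worked-out design. It assumes the major version is already encoded in the symbol name, so the bootstrap routine only has to check the minor version. `LUCY_MINOR_VERSION` and the boolean return value are assumptions made for illustration; real code would throw an exception rather than return false.

```c
#include <stdbool.h>

/* Hypothetical constant; it would presumably be autogenerated from
 * the "version" entry in the parcel specification. */
#define LUCY_MINOR_VERSION 3

/* Returns true if this library can satisfy a client built against
 * required_minor, false if the client expects a newer minor version
 * than the library provides. */
bool
lucy_bootstrap_parcel(int required_minor) {
    if (required_minor > LUCY_MINOR_VERSION) {
        return false;  /* real code would throw here */
    }
    /* ... register classes, initialize the parcel, etc. ... */
    return true;
}
```

With the library at minor version 3, clients built against 1.0 or 1.3 bootstrap successfully and a client built against 1.4 is rejected, which matches the outcomes in your first table.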
Marvin Humphrey