I am porting some in-development code for a client from Win32 to
OS X. Along the way I hit a problem that I think I've now tracked
down. This message is (1) for information and (2) to advocate a
solution. I know what I'm going to do about it; I'm not seeking advice.
== Information ==
I have seen the following error while running unit tests (formatted
with extra line breaks):
    unknown location(0): fatal error in "my_unit_test_function":
    std::exception: NameValuePairs:
    type mismatch for 'InputBuffer',
    stored 'N8CryptoPP23ConstByteArrayParameterE',
    trying to retrieve 'N8CryptoPP23ConstByteArrayParameterE'
This error is generated at cryptlib.h:231 in the constructor for the
ValueTypeMismatch exception. Now it sure looks like that comparison
ought to match, but it doesn't. I should mention now that this same
unit test runs without error under MSVC 8, the environment of its
original writing.
After far too many hours of flailing at this (and to boot, this is
the first Mac development I've done, let alone with Xcode), I've finally
tracked down what appears to be the problem. I don't have a fix; my
diagnosis is that the defect is either in GCC or in the dynamic loader
in OS X, neither of which I'm going to dig into.
The unit test runner program consists of three (relevant) components:
-- a small portion of the Crypto++ library (version 5.2.3), namely
SHA256 and filter chains. This is statically linked as an OS X ".a" library.
-- the application library that uses Crypto++ functions. There are
two functions; they hash strings and files, respectively, and output
a Base64 string. About as simple as it can be. This library is
dynamically linked as an OS X ".dylib". This linkage is how the
component will be deployed.
-- a test program that exercises the library. Crucially, this test
program also links in the Crypto++ library statically. FYI, this is
in order to convert test vectors published by NIST from hex encoding
(as published) to Base64 (as used).
The offending defect in the compiler/run-time system manifests itself
in the definition of "operator==" for std::type_info, at line 108 of
the <typeinfo> header that ships with Xcode 2.4. What's happening is
that the type names (defined as "char *", not "std::string") are being
compared by address, not by content. An explanatory comment says the following:
    // In new abi we can rely on type_info's NTBS being unique,
    // and therefore address comparisons are sufficient.
Starting with GCC 3.0 (from what I can ascertain), there was a
requirement that type_info names be merged at load time. Apparently
this is not happening correctly on the Xcode/GCC/OS X platform where
this error surfaced. There are two copies of
"CryptoPP::ConstByteArrayParameter" (arising from a StringSource),
one in the dylib, one in the test binary.
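To make the failure mode concrete, here is a minimal sketch of the two comparison strategies. The two arrays are hypothetical stand-ins for the duplicated name strings (in the real failure one copy lives in the dylib and one in the test binary), and same_type_by_name is an illustrative helper, not something taken from GCC or Crypto++.

```cpp
#include <cstring>
#include <typeinfo>

// Hypothetical stand-ins for the two unmerged copies of the mangled name.
// They are distinct objects, so they have distinct addresses.
const char name_in_dylib[] = "N8CryptoPP23ConstByteArrayParameterE";
const char name_in_test[]  = "N8CryptoPP23ConstByteArrayParameterE";

// What the "new ABI" fast path amounts to: compare the addresses only.
bool equal_by_address(const char* a, const char* b) {
    return a == b;
}

// Comparison that does not depend on the loader having merged the names.
bool equal_by_content(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

// Illustrative helper (an assumption, not library code): treat two
// type_info objects as equal if operator== says so or the names match.
// Caveat: names of local or anonymous-namespace types need not be unique
// across images, so this is only a sketch for ordinary named types.
bool same_type_by_name(const std::type_info& a, const std::type_info& b) {
    return a == b || std::strcmp(a.name(), b.name()) == 0;
}
```

With these definitions, equal_by_address(name_in_dylib, name_in_test) is false while equal_by_content on the same pair is true, which is the same asymmetry the error message above shows.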
My personal action on this is to rework the test environment to avoid
double linking the Crypto++ library across a dynamic-vs.-static load
boundary. I expect this will make this problem go away for me.
It will not, however, go away for everyone else.
== Advocacy ==
I should state a personal preference about the kind of code I prefer
to work on: I hate working with dynamically typed languages. They
have lower entry costs for writing code, since they can be bashed on
more easily without incurring the same up-front analysis that static
typing often requires. On the other hand, defects are deferred to
run-time rather than eliminated; in the longest run this is rarely
worth it. What's critical is that large amounts of software never
get to the end of this longest run, so dynamic binding has its place,
and permanently so.
That place, however, is not embedded into the back of a utility
library. The "NameValuePairs" class in Crypto++ is a mechanism for
dynamic binding, implemented with RTTI. I can understand why it was
useful at the outset of library development, but it's not nearly so
valuable in a mature library. Changing Crypto++ to eliminate dynamic
binding would entail significant work, likely with lots of templates
(whose implementation unreliability was certainly worse than dynamic
binding a decade ago). While I'd prefer that approach as a matter of
correctness and elegance, it's quite a large change; I can't recommend it.
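For readers who haven't looked at this style of interface, the following is a rough sketch of an RTTI-checked name/value store. It is a hypothetical, much-simplified stand-in (ToyNameValuePairs, Set, and Get are names invented here), not the Crypto++ implementation; it only shows where the run-time type check sits and why a failed std::type_info comparison surfaces as an exception.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <typeinfo>

// Hypothetical toy class, NOT Crypto++'s NameValuePairs.
class ToyNameValuePairs {
public:
    // Records the address of the value together with its static type.
    // The caller must keep `value` alive while the store is in use.
    template <class T>
    void Set(const std::string& name, const T& value) {
        Entry e = { &typeid(T), &value };
        entries_[name] = e;
    }

    // Returns false if the name is absent; throws if the stored type
    // does not compare equal to the requested one.
    template <class T>
    bool Get(const std::string& name, T& out) const {
        std::map<std::string, Entry>::const_iterator it = entries_.find(name);
        if (it == entries_.end())
            return false;
        const std::type_info& stored = *it->second.type;
        if (stored != typeid(T)) {
            // This is the check that misfires when type_info equality
            // is decided by address and the names were not merged.
            throw std::runtime_error(
                "type mismatch for '" + name + "', stored '" + stored.name() +
                "', trying to retrieve '" + typeid(T).name() + "'");
        }
        out = *static_cast<const T*>(it->second.value);
        return true;
    }

private:
    struct Entry {
        const std::type_info* type;
        const void* value;
    };
    std::map<std::string, Entry> entries_;
};
```

The point of the sketch is that the type check cannot happen until the call is made, so any platform quirk in std::type_info comparison becomes a run-time failure rather than a compile-time one.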
What I can recommend is a different form of dynamic binding, namely
using the boost::parameter library. That library does some
wonderfully clever things to enable named parameters in function
calls, which greatly improve code legibility. (Aside: Ada has this
syntax, and it's one of the reasons it has become my implementation
language for personal projects.) The best reason to choose a Boost
library, though, is that the Boost folks have done lots of work to
deal with various compiler defects; their portability is
excellent. (Nevertheless, the parameter library doesn't pass
regression tests on a couple of not-hugely-common compilers, as it
uses significant metaprogramming internally.)
Boost Parameter Library
http://www.boost.org/libs/parameter/doc/html/index.html
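As a rough illustration of the idea only, here is a hand-rolled sketch of named arguments that are resolved at compile time. It deliberately does not use Boost.Parameter and does not reflect that library's actual interface; Keyword, Arg, the two tags, and repeat_text are all hypothetical names invented for this example.

```cpp
#include <string>

// A tagged argument: the tag type identifies the parameter at compile time.
template <class Tag, class T>
struct Arg {
    T value;
};

// A keyword object; writing `keyword = v` yields an Arg tagged with Tag.
template <class Tag>
struct Keyword {
    template <class T>
    Arg<Tag, T> operator=(const T& v) const {
        Arg<Tag, T> a = { v };
        return a;
    }
};

// Hypothetical parameter names for the example.
struct input_buffer_tag {};
struct repeat_tag {};
const Keyword<input_buffer_tag> input_buffer = Keyword<input_buffer_tag>();
const Keyword<repeat_tag> repeat = Keyword<repeat_tag>();

// Hypothetical function taking two named arguments. A misspelled name or
// a wrongly typed value fails to compile instead of throwing at run time.
std::string repeat_text(Arg<input_buffer_tag, std::string> text,
                        Arg<repeat_tag, int> count) {
    std::string out;
    for (int i = 0; i < count.value; ++i)
        out += text.value;
    return out;
}

// Second overload so the named arguments may appear in either order.
std::string repeat_text(Arg<repeat_tag, int> count,
                        Arg<input_buffer_tag, std::string> text) {
    return repeat_text(text, count);
}
```

A call then reads repeat_text(input_buffer = std::string("ab"), repeat = 2), with the arguments accepted in either order. The real library generalizes this considerably (defaults, optional arguments, deduction), which is where the heavy metaprogramming comes in.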
In the longer term, personally, I'd like to see Crypto++ become the
Boost crypto library (they don't have one now). This seems like a
decent first step in that direction. A second such step would be
to start using the Boost iostreams library, which could replace the
existing filter system.
Eric