On 25/04/2017 18:56, Cantor, Scott wrote:
Since we are sharing plans, we (as in Code Synthesis) are planning
to package Xerces-C++ for build2 in the near future (but no
definite time-frame). While I haven't looked into this closely
yet, the options we consider range between just packaging it as
is to pretty much forking it. The main reasons for forking would
be: (1) to switch to git (life is just too short for svn), (2)
to get rid of the Apache bureaucracy, and (3) rip all the legacy
parts out and clean things up (maybe even switching to C++11/14).
(1) doesn't matter to me, but +1000 to (2) and I have very little compunction
about (3), aside from the obvious fact that once you start pulling that thread,
you're on slippery ground.
I wasn't prepared to really go so far as to start tossing things out or
proposing really invasive changes but it sounds like cleaning up and releasing
the trunk would serve both short term and longer term ends here.
Switching to git would be wonderful. We could also enable CI testing
with e.g. Travis or some other CI service on GitHub at that time, so
that all PRs get tested, if that would be acceptable. Or does the
Apache project provide any equivalent services internally?
Regarding (3), it's a bit outside the scope of this CMake ticket. My
intentions here were to get a build system which would provide a working
build on all platforms, including the unit tests. I didn't want to go
down the rabbit hole at the same time. Ideally, if we merge this to the
trunk, branch off a 3.2 and release that, more adventurous changes
could then be done on the trunk. I'd rather have a working release with
the CMake support included than to do both and not have an immediately
usable and API compatible release!
That said, I'd not be averse to including support for standard C++;
using Xerces is often a bugbear due to its age. All our code is now
C++11, with RAII wrappers to make Xerces play nicely. The main pain
points are the lack of RAII, non-standard exception types, odd memory
management semantics and the need to transcode all input. Something
worth noting is that our (optional) ICU dependency switched to requiring
C++11 with ICU 59.1, and now uses the standard char16_t as its UTF-16
character type (UChar). If Xerces were to also switch (or at least use a
suitable typedef), we could be using const char16_t* foo = u"UTF-16
strings" and/or u8"UTF-8" strings directly in both the xerces sources
and in client programs. That would be a major usability improvement.
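
To make the RAII and char16_t points concrete, here is a rough sketch
only. It is not our actual wrapper code and it does not use the real
Xerces headers: FakeNode is a made-up stand-in for any object that has
to be freed via release() rather than delete, and XmlChar, Releaser,
released_ptr, xml_length and example are hypothetical names chosen for
illustration.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stand-in for a Xerces-style object that must be freed by
// calling release() instead of delete. Not a real Xerces class.
struct FakeNode {
    static int live;            // number of objects not yet released
    FakeNode() { ++live; }
    void release() { --live; delete this; }
private:
    ~FakeNode() = default;      // forces callers to go through release()
};
int FakeNode::live = 0;

// Deleter that calls release(), so std::unique_ptr can own such objects.
struct Releaser {
    template <typename T>
    void operator()(T* p) const { if (p) p->release(); }
};

template <typename T>
using released_ptr = std::unique_ptr<T, Releaser>;

// Assumption: what a char16_t-based character typedef could look like.
using XmlChar = char16_t;

// Length of a NUL-terminated UTF-16 string; works on u"" literals directly,
// with no transcoding step.
inline std::size_t xml_length(const XmlChar* s) {
    return std::char_traits<XmlChar>::length(s);
}

// Usage: the node is released automatically when it goes out of scope,
// including when an exception is thrown.
inline std::size_t example() {
    released_ptr<FakeNode> node(new FakeNode);
    const XmlChar* name = u"element";
    return xml_length(name);
}
```

With a typedef like this, string constants need no runtime transcoding
at all, and the unique_ptr alias removes the need for manual release()
calls on every exit path.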
In a recent performance testing exercise at work, profiling with
valgrind's callgrind showed string transcoding inside xerces-c to be a
major time sink: it was one of the largest consumers of CPU time during
parsing and DOM processing. xerces-c was slower than xerces-j for the
same operations, and transcoding was likely to be a major cause.
Certainly cleaning up and releasing trunk would be a step towards any of
that, should there be a consensus for that.