On 25/04/2017 18:56, Cantor, Scott wrote:
Since we are sharing plans, we (as in Code Synthesis) are planning
to package Xerces-C++ for build2[1] in the near future (but no
definite time-frame). While I haven't looked into this closely
yet, the options we consider range between just packaging it as
is to pretty much forking it. The main reasons for forking would
be: (1) to switch to git (life is just too short for svn), (2)
to get rid of the Apache bureaucracy, and (3) rip all the legacy
parts out and clean things up (maybe even switching to C++11/14).

(1) doesn't matter to me, but +1000 to (2) and I have very little compunction 
about (3), aside from the obvious fact that once you start pulling that thread, 
you're on slippery ground.

I wasn't prepared to really go so far as to start tossing things out or 
proposing really invasive changes but it sounds like cleaning up and releasing 
the trunk would serve both short term and longer term ends here.

Switching to git would be wonderful. We could also enable CI testing at that time, e.g. with Travis or some other CI service on GitHub, so that all PRs get tested, if that would be acceptable. Or does the Apache project provide any equivalent services internally?

Regarding (3), it's a bit outside the scope of this CMake ticket. My intention here was to get a build system which would provide a working build on all platforms, including the unit tests. I didn't want to go down the rabbit hole at the same time. Ideally, if we merge this to the trunk, branch off a 3.2 and release that, more adventurous changes could then be done on the trunk. I'd rather have a working release with the CMake support included than attempt both and not have an immediately usable and API-compatible release!

That said, I'd not be averse to including support for standard C++; using Xerces is often a bugbear due to its age. All our code is now C++11, with RAII wrappers to make Xerces play nicely. The main pain points are the lack of RAII, non-standard exception types, odd memory management semantics and the need to transcode all input. Something worth noting is that our (optional) ICU dependency switched to requiring C++11 with ICU 59.1, and now uses the standard char16_t as its UTF-16 character type. If Xerces were to also switch (or at least use a suitable typedef), we could be using const char16_t* foo = u"UTF-16 strings" and/or u8"UTF-8" strings directly in both the Xerces sources and in client programs. That would be a major usability improvement.
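
To make the RAII and char16_t points a little more concrete, here is a rough sketch of the kind of thing I mean. It does not use Xerces at all: XmlChar, duplicate(), release(), OwnedString and example() are names made up for this illustration, standing in for a char16_t-based XMLCh typedef and for a library call that hands back memory the caller has to free. Treat it as an assumption about how such a wrapper could look, not as existing or proposed API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical: what the XML character type could be if it were a
// typedef for the standard char16_t (not the current Xerces definition).
using XmlChar = char16_t;

// Length of a NUL-terminated UTF-16 string, in code units.
std::size_t length(const XmlChar* s) {
    return std::char_traits<XmlChar>::length(s);
}

// Hypothetical stand-in for a library function returning a heap-allocated
// string which the caller is responsible for releasing.
XmlChar* duplicate(const XmlChar* src) {
    const std::size_t n = length(src);
    XmlChar* copy = new XmlChar[n + 1];
    std::char_traits<XmlChar>::copy(copy, src, n + 1);  // includes the NUL
    return copy;
}

// Hypothetical matching release function.
void release(XmlChar* p) {
    delete[] p;
}

// RAII wrapper: the deleter calls release(), so the string is freed
// automatically on scope exit, including when an exception is thrown.
struct Releaser {
    void operator()(XmlChar* p) const { release(p); }
};
using OwnedString = std::unique_ptr<XmlChar[], Releaser>;

// Usage: a u"" literal is passed straight in, with no transcoding step,
// and no explicit release call is needed.
std::size_t example() {
    OwnedString copy(duplicate(u"UTF-16 strings"));
    return length(copy.get());
}
```

The point of the typedef is that string literals in client code become usable directly, and the point of the unique_ptr with a custom deleter is that the "remember to release" rule moves out of every call site and into one type.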

In a recent performance testing exercise at work, profiling with valgrind's callgrind showed string transcoding inside xerces-c to be a major time sink: it was one of the largest consumers of CPU time during parsing and DOM processing. xerces-c was slower than xerces-j for the same operations, and transcoding was likely a major cause.

Certainly cleaning up and releasing trunk would be a step towards any of that, should there be a consensus for that.


Regards,
Roger

