Dear all,

We had a similar conversation to this on the Xalan mailing list a few months 
back:

https://lists.apache.org/list?d...@xalan.apache.org:2022-6 (Future of Xalan-C)
https://lists.apache.org/list?d...@xalan.apache.org:2022-7 (Retire Xalan to the 
Attic) [Xalan-J]

In both cases there was no definitive decision, but if you take the time to 
read through the first thread it provides some overview of the historical 
contributions to the project and the past and current maintainers.  The bottom 
line is this: there are no maintainers, and there haven't been for over a 
decade.  I'm stopping the small effort I made, and the previous people involved 
have moved on to Saxon and no longer use it.  The project, while notionally 
functional, is 
not maintained, and it has no future unless people step up with real commitment 
to do the necessary work.  No real work has been done on the project since 2005 
other than two point releases incorporating small bugfixes.  It's dead, and 
it's been dead for a long time now.  I will revisit this and try to get a 
conclusion, and while any decision made for Xerces-C++ will also affect Xalan 
immediately, it should be retired in any case.

For Xalan-J, this actually spurred someone into action, and some work is 
currently being done.  However, it remains to be seen if this is sufficient to 
revive the project--a short term burst of activity does not mean the project is 
healthy for the longer term.

For Xerces-C++, I think we're in a very similar situation to Xalan-C.  We have 
sporadic maintenance work by Scott and myself, and recently we've had Even 
Rouault provide fixes based upon static analysis to fix some longstanding 
issues.  But other than that, we don't have much activity.  I do think we need 
to be honest with our users about the true status of this project.  If it's not 
maintained, we should consider moving it to the Attic.  I personally think that 
would be the correct course of action.  I don't think we should be encouraging 
new use of Xerces-C++ if it's an unmaintained project with a legacy codebase; 
it's doing them a disservice to pretend that it's viable and well supported, 
which it is not.

I originally used Xerces+Xalan for XML and XSLT processing in C++.  As Boris 
said, they are the only game in town.  But that doesn't necessarily mean they 
are a sensible choice or the right choice.  I only joined the project because I 
was struggling so much to incorporate them into a modern C++11 (now C++14)  
codebase.  It's really painful:

* It predates adoption of Standard C++ (C++98)
* It doesn't use standard exception types
* It doesn't use conventional memory allocation strategies
* It doesn't use conventional character types
* Memory management is a pain, which is exacerbated by the need to do so many 
manual string conversions (e.g. UTF-8 to UTF-16 and vice versa).
* The build system was problematic on modern platforms

A lot of the work I've done has been to gradually address some of these: 
allowing char16_t as XMLCh, adding CMake support, adding Clang support and fixes 
to work with modern compilers, and adding support for other standard library 
features.  But while all of these things are small improvements which have kept 
Xerces-C++ usable on modern platforms and improved its interoperability with 
other libraries, they don't even start to tackle some of the bigger problems.  
It's still an archaic codebase from the '90s, with all of the design problems 
implicit in that.  It's not a library anyone is going to use out of choice, but 
out of necessity in the absence of any other options.  If Xerces-C++ were to be 
retired, projects would need to consider other options, be that other libraries, 
or other languages.

I did want to rectify some of these, which is why I started some clean-ups on 
the 4.0.0 branch.  But we have to be realistic: there is not the will or the 
manpower to do this, and I suspect there isn't much user demand for this 
either--this is a legacy project with legacy users.

I've updated the git statistics I did earlier in the year, which can be viewed 
or downloaded here: 
https://codelibreconsulting.sharepoint.com/:x:/s/Opensourcesoftware/EabAzxgzU3pCjUSKSVvWjZgBlUGZUb91q2PVMkGk1oaIHw?e=MVBvPA
This includes Scott's recent changes for this month to date.  Overall, you 
can see that the major development happened early on, that it reached maturity 
in the late 2000s, and it has been moribund for the last decade.  This is the 
contributor summary since 01 Oct 2012:

$ git shortlog -s --oneline --all --since "01 OCT 2012"
    35  Alberto Massari
     1  Chris Mc
    26  Even Rouault
     1  Fred Hornsey
   183  Roger Leigh
   171  Scott Cantor

I only joined out of necessity to keep it going for use by my work projects, 
continued to do some work afterward, and I'm going to cease that entirely.  
Even is in a similar situation.  I get the impression Scott might be in a 
similar situation as well.  I don't think any of us have a deep understanding 
of the internals--we have done essential maintenance only, tinkering around the 
edges.  There are no active maintainers who can actually maintain or refactor 
any of the core code to address some of the bugs we do not have the 
understanding or the resources to tackle.

Realistically, who are our current userbase, and what is the future for the 
project?  If you want to parse and validate XML, there are other (better) 
choices without all of the legacy baggage.  If you need to use esoteric 
features then Xerces-C++ is the only choice, but the number of people who would 
want that is a tiny and dwindling minority as XML ceases to be the technology 
of choice.  I only used it because 15 years previously, some project architects 
had jammed every obscure XML and XSLT feature possible into the project and 
forced its use as a result.  None of that was strictly necessary, and nowadays 
few people would make those technical choices.  You would keep it simple and 
use a simpler library.  Given a free choice, I would have chosen something 
else.  It was a massive pain for our end users, and I was contracted by one of 
them to replace it with something else because they really didn't 
want it as a dependency (they went with QtXML which has solid support for the 
DOM).

I don't want to sound too negative here.  Xerces-C++ has served the world well, 
was pioneering in its time, and will continue to serve its existing users for 
some time to come.  But I think we also need to be honest and realistic about the 
status of the project today, and let it be retired with grace.


Kind regards,
Roger

> -----Original Message-----
> From: Cantor, Scott <canto...@osu.edu>
> Sent: 06 October 2022 13:14
> To: c-dev@xerces.apache.org
> Subject: Re: Prepping a 3.2.4 release
> 
> >    Getting off to what? Xerces-C++, with all its warts, is the only working,
> >    open source XML Schema implementation for C/C++. And I know for a fact
> >    that it is being used without much problems by quite a few people.
> 
> That's not relevant to the question. "There is nothing" is a real world answer,
> though I thought libxml2 had schema support also. I haven't checked it in a
> long time and it was moving that way even back when I picked Xerces.
> 
> People use EOL software all the time. People don't make rational decisions a
> lot of the time because they think the risk is outweighed by the cost, and that
> goes great until it doesn't.
> 
> >    What I think would be reasonable to state is that the project, due to
> >    limited resources, may not be able to promptly fix bugs and security
> >    vulnerabilities.
> 
> That is disqualifying for production software. But let's just agree to disagree.
> You're not going to convince me and I'm not in the "convincing" business.
> 
> If the rest of the PMC agrees to post such a warning, then I will when I
> update the site for the release.

