> -----Original Message-----
> From: Boris Kolpackov <bo...@codesynthesis.com>
> Sent: 10 October 2022 17:17
> To: c-dev@xerces.apache.org
> Subject: Re: Prepping a 3.2.4 release
> 
> Cantor, Scott <canto...@osu.edu> writes:
> 
> > On 10/10/22, 10:14 AM, "Boris Kolpackov" <bo...@codesynthesis.com> wrote:
> >
> > > What would be the other options for XML Schema validation usable
> > > from C++?
> >
> > Libxml2?
> >
> > Says it supports XML Schema 1.0 (which is all Xerces ever did AFAIK).
> 
> Last time I checked (which was admittedly a few years ago), while they listed
> XML Schema support on their front page, if you dug deeper, it quickly
> became apparent that support was WIP/incomplete.

It might not support every feature, but is the current implementation "good 
enough" for its userbase?  It's worked with the schemas I've validated with it 
in the past.  If there are unimplemented bits [the docs could simply be 
incorrect here], and those are restricted to obscure features which aren't in 
common use, then the lack of these bits has no practical effect upon using the 
library.

Another option is QtXML, which also offers schema validation and which has also 
worked well for me in the past.

There is also Microsoft MSXML.  I've not used it myself, but if you're on 
Windows it's an obvious option.

> > >    And, no, rewriting everything in a different language just because
> > >    Xerces-C++ has some bugs is not a sensible step.
> >
> > That is a matter of opinion, because if a security bug pops up (*)
> > that nobody can fix, you (and I) are going to be in a very, very bad
> > position. Moving to a different language is the only sensible option
> > if in fact there is nothing else to use, and I am doing exactly that,
> > despite the many hours it will take.
> 
> Not every application that uses Xerces-C++ is security sensitive. In fact,
> IMO, it's insane to parse untrusted XML regardless of the
> implementation/language used -- the format is just too complex to have any
> trust in the implementation. Also note that if you think
> Xerces-C++ is somehow exceptionally bad, you are mistaken.

I don't think it's especially relevant to compare Xerces-C++ against other 
libraries in this manner.  Whether other libraries are better or worse than 
Xerces-C++ is a distraction from the question being addressed: is Xerces-C++, as 
an Apache project, a viable ongoing concern with a community that is willing 
and able to support it in the future?

From the discussion so far, it's clear that there are strong differences of 
opinion on the list regarding whether or not it is time to retire Xerces-C++.  
We will need to call for a vote on this after this discussion is completed, 
but before we do that it would be useful to look at this a bit more 
objectively and assess the criteria by which we might judge the activity and 
health of the project.  By what criteria would we judge the project to be in a 
healthy state?  Or an unhealthy state?

As an example of one way to summarise the data, this is a summary of committers 
as of today: 
https://codelibreconsulting.sharepoint.com/:x:/s/Opensourcesoftware/EYoNqCnjG-JBnHq-yZqhhDkBxYx2CDbWNT4j7QAWqL_maA?e=iBwfSf
It lists each unique committer, the number of commits they made and the date of 
their last commit.  Note, however, that "most recent" is quite a crude view of 
a person's contribution history.  I've also just repointed the repo location 
from SVN to Git at https://openhub.net/p/xerces-c, which will generate some 
reasonably detailed contribution graphs and other repository metrics once it 
fetches and processes the current xerces-3.2 branch.

One simple question to ask would be: how many active contributors does the 
project have today?

I would think the answer to this would be: effectively zero.  Neither Scott nor 
I are actively contributing to the project, and we are the two most recent 
contributors of anything more than trivial one-off fixes.  Next after that is 
Even, who has done some static analysis work, all of it bugfixing of resource 
leaks etc.  After that there is Alberto, who has made infrequent and irregular 
commits.  Then there is yourself, with an 8-year gap in between.  Looking at 
the contributors, there has been a period of over a decade during which really 
only one person (Alberto) made any changes at all, and his activity tailed off 
significantly after ~2014.  Scott (starting 2015) and myself (starting 2017) 
did some maintenance releases and some bugfixing, but that's basically it.

I'm leaving; I no longer use Xerces or even much XML at all.  Scott said he was 
migrating away gradually.  Where does that leave the project?  Who is actually 
going to be maintaining it, doing all of the security work, user support, 
maintenance releases, testing, CI upkeep, PR review and testing of contributed 
fixes, etc.?  Both Scott and I have put in a *lot* of work behind the scenes to 
keep this project going for the last 8 years.  Unless someone picks up these 
ongoing commitments, the project will not be a viable ongoing concern.  I've 
already let quite a few of these slip as my other commitments have taken 
priority, and it will only continue to get worse (example: the AppVeyor CI has 
been partially broken for over a year).

Another question to ask would be: Is the niche which Xerces-C++ serves growing, 
or declining?

For this, I would argue that both C++ in general and XML in general are known 
to be declining relative to other languages.  For the intersection of C++ and 
XML, I would argue that the decline is even faster.  For people still using 
XML, it's better supported by Java, C# and Python.  I myself have been using 
Python ElementTree for common stuff; it's quicker and simpler, even if it's 
neither as performant nor as featureful.  So we have declining interest and a 
declining community.  This is also borne out in developer interest and overall 
community engagement, both of which are nearly nonexistent.  It's not as bad as 
for Xalan-C++, which is even more niche, but it's not a pretty picture.  With 
this in mind, a follow-up question would be: at what point would we consider 
the Xerces-C++ project to have made the transition between being a living, 
viable project and being a dead, non-viable project?

This is only my take on this.  But I think Xerces-C++ made that transition in 
2015, based upon the commit records.  That was the end of sustained developer 
interest in Xerces-C++.  Everything after that has been remedial work by Scott 
and myself to keep it chugging along on contemporary platforms.  As soon as we 
stop doing that work it will start to bitrot.  The amount of work that went 
into keeping it going with contemporary compilers, runtimes and dependent 
libraries should not be underestimated.  Between us, we have sunk hundreds of 
man-hours into it to keep it going.

We cannot in good faith let the project drift into unsupported obsolescence 
without clearly acknowledging this reality and properly retiring the project.  
This won't stop anyone from continuing to use it in their products.  But it 
will clearly communicate the true support status to the wider community, and 
dependent projects will need to assess what they need to do as a consequence.  
If that means projects have to look at alternatives, then that is what they 
will have to do.  I already did one port to QtXML; it was not difficult.  It's 
either that or step up to do what Scott and I did when our projects had 
critical dependencies upon Xerces-C++: join the project and do the needed 
maintenance work.


Kind regards,
Roger






---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscr...@xerces.apache.org
For additional commands, e-mail: c-dev-h...@xerces.apache.org
