Re: Xerces-C API changes for XQilla

Boris Kolpackov Wed, 16 Apr 2008 02:27:05 -0700

Hi John,

John Snelson <[EMAIL PROTECTED]> writes:


> 1) Problem:
>
> XPath 2.0 is just different to XPath 1.0. We've therefore got our own
> version of DOMXPathResult (XPath2Result) which makes more sense in this
> context:
>
> http://xqilla.sourceforge.net/docs/dom3-api/classXPath2Result.html
>
> Solution:
>
> It's probably simple enough to either extend DOMXPathResult to include
> the extra functionality in XPath2Result, or to include it as a new class
> called DOMXPath2Result.

I did a quick check and it appears that the DOMXPathResult is very
similar to DOMXPath2Result. I would therefore suggest that we try
to add the missing functionality to DOMXPathResult as non-standard
extensions (though we should try to use names that will likely be
used in the next version of DOM3 when it is updated to include
support for XPath 2, for example getIntegerValue instead of asInt).
What is your feeling on this approach? Also, did you base your
DOMXPath2Result on any draft spec (e.g., where do the asDouble,
asInt, etc., names come from)?
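
To make the idea concrete, here is a rough sketch of a result interface
that carries a non-standard integer accessor next to a DOM3-style one.
All class and function names below (XPathResultSketch,
IntegerResultSketch, toDouble) are hypothetical and are not the actual
Xerces-C++ or XQilla declarations; only the getIntegerValue-versus-asInt
naming comes from the discussion above.

```cpp
#include <cassert>

// Hypothetical stand-in for the result interface; not the actual
// Xerces-C++ DOMXPathResult declaration.
class XPathResultSketch
{
public:
  enum ResultType { NUMBER_TYPE, INTEGER_TYPE };

  virtual ~XPathResultSketch () {}
  virtual ResultType getResultType () const = 0;

  // DOM3-style accessor.
  virtual double getNumberValue () const = 0;

  // Non-standard extension, named in the DOM3 style (getIntegerValue)
  // rather than the XQilla style (asInt).
  virtual int getIntegerValue () const = 0;
};

// Hypothetical implementation holding an integer result.
class IntegerResultSketch: public XPathResultSketch
{
public:
  explicit IntegerResultSketch (int v): value_ (v) {}

  virtual ResultType getResultType () const { return INTEGER_TYPE; }
  virtual double getNumberValue () const { return static_cast<double> (value_); }
  virtual int getIntegerValue () const { return value_; }

private:
  int value_;
};

// Callers dispatch on the result type instead of needing a second
// result class.
inline double toDouble (const XPathResultSketch& r)
{
  assert (r.getResultType () == XPathResultSketch::NUMBER_TYPE ||
          r.getResultType () == XPathResultSketch::INTEGER_TYPE);

  return r.getResultType () == XPathResultSketch::INTEGER_TYPE
    ? static_cast<double> (r.getIntegerValue ())
    : r.getNumberValue ();
}
```

The point of the sketch is only that existing callers keep using the
standard accessors while XPath 2 callers get the extra ones on the same
interface.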


> 2) Problem:
>
> It's necessary to get access to DOMDocumentImpl, which isn't in the
> public API, in order to implement the DOM3 XPath API. Needing access to
> the Xerces-C source code to compile XQilla is a big problem for our
> maintainers. We need DOMDocumentImpl for a number of reasons:
>
> [...]
>
> Solution:
>
> Put DOMDocumentImpl in the public API.

The DOMDocumentImpl.hpp is now installed with the rest of the headers.
I've also changed all private data members and functions to be protected
in all DOM*Impl classes. Is there anything else we need to do?
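
For illustration, this is the kind of derivation that the
private-to-protected change is meant to allow. It is a minimal sketch
with made-up classes (DocumentImplSketch, DerivedDocumentSketch) and a
made-up member, not the real DOMDocumentImpl layout.

```cpp
#include <cassert>

// Hypothetical stand-in for a DOM*Impl class.
class DocumentImplSketch
{
public:
  DocumentImplSketch (): fChanges (0) {}
  virtual ~DocumentImplSketch () {}

  // Record a modification of the document.
  void changed () { ++fChanges; }

protected: // Would have been private before the change.
  int fChanges;
};

// Hypothetical XQilla-side class that needs the implementation state.
class DerivedDocumentSketch: public DocumentImplSketch
{
public:
  // Compiles only because fChanges is protected rather than private.
  int changes () const { return fChanges; }
};

// Small usage example: two modifications are visible to the subclass.
inline int countTwoChanges ()
{
  DerivedDocumentSketch d;
  d.changed ();
  d.changed ();
  assert (d.changes () == 2);
  return d.changes ();
}
```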


> 3) Problem:
>
> We need access to DOMWriterImpl in order to override it to know how to
> write namespace nodes. DOMWriterImpl isn't in the public API.
>
> Solution:
>
> Put DOMWriterImpl in the public API, or implement namespace node
> handling in it.

This is the same situation as with DOMDocumentImpl.hpp. I tend to prefer to
leave the namespace node implementation in XQilla for now since it
is XPath-specific and is not used by the limited built-in XPath
support in Xerces-C++.


> 4) Problem:
>
> XQilla can construct typed DOMDocuments, so we need a way to set the
> type information on these nodes. Currently we use DOMTypeInfoImpl,
> DOMAttrImpl and DOMElementNSImpl to do this, which aren't in the public API.
>
> Solution:
>
> Implement a method of setting the type information on an element or
> attribute, or put DOMTypeInfoImpl, DOMAttrImpl and DOMElementNSImpl in
> the public API.

I don't think an end user would ever need this functionality since the
type information is tied to the grammar being used and can only be set
by a parser that can control both. The *Impl headers are available for
XQilla to use.


> 5) Problem:
>
> RegularExpression is not thread safe or consistent with its use of
> MemoryManager. It's also not quite flexible enough to implement XSLT
> 2.0's analyze-string, and it has bugs in the replace() methods.
>
> http://www.w3.org/TR/xslt20/#analyze-string
>
> Solution:
>
> I have a patch that fixes all of this in Xerces-C 2.8, and I can update
> it to apply to 3.0. I'm in the process of getting permission to sign the
> contributor agreement.

Sounds good.


> 6) Problem:
>
> The socket and WinSock HTTP InputStream implementations have fixed
> buffers which can result in buffer overflow. They needlessly duplicate a
> whole load of code that could be shared. In addition, a lot of
> algorithms need access to the HTTP "Content-Type" header, to decide how
> to parse a file, or what encoding it is in - for instance see XSLT 2.0's
> unparsed-text() function:
>
> http://www.w3.org/TR/xslt20/#unparsed-text
>
> Solution:
>
> I have a patch that implements this functionality for
> UnixHTTPURLInputStream and BinHTTPURLInputStream (WinSock) in Xerces-C
> 2.8. I added BinInputStream::getContentType() to get access to the
> "Content-Type" header. I can update this code for Xerces-C 3.0.

Sounds good. There are also Curl, MacOS, and libWWW net accessors.
Hopefully it will be easy to implement getContentType() for them.
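
As a rough sketch of the shape this could take for the other net
accessors: the base stream gets a virtual getContentType() that returns
0 by default, and each HTTP accessor overrides it with whatever it
parsed from the response headers. Everything below is hypothetical
(InputStreamSketch, HttpStreamSketch, findHeader, and the use of
const char* rather than the library's own string type); it is not the
code from John's patch.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Compare two strings ignoring ASCII case (header names are
// case-insensitive in HTTP).
inline bool equalsIgnoreCase (const std::string& a, const std::string& b)
{
  if (a.size () != b.size ())
    return false;

  for (std::string::size_type i = 0; i < a.size (); ++i)
    if (std::tolower (static_cast<unsigned char> (a[i])) !=
        std::tolower (static_cast<unsigned char> (b[i])))
      return false;

  return true;
}

// Find a header value in a raw block of "Name: value\r\n" lines.
// Returns an empty string if the header is absent.
inline std::string findHeader (const std::string& headers,
                               const std::string& name)
{
  std::string::size_type pos = 0;

  while (pos < headers.size ())
  {
    std::string::size_type eol = headers.find ("\r\n", pos);
    if (eol == std::string::npos)
      eol = headers.size ();

    std::string line = headers.substr (pos, eol - pos);
    std::string::size_type colon = line.find (':');

    if (colon != std::string::npos &&
        equalsIgnoreCase (line.substr (0, colon), name))
    {
      // Skip whitespace between the colon and the value.
      std::string::size_type start = line.find_first_not_of (" \t", colon + 1);
      return start == std::string::npos ? std::string () : line.substr (start);
    }

    pos = eol + 2;
  }

  return std::string ();
}

// Hypothetical stand-in for the base input stream.
class InputStreamSketch
{
public:
  virtual ~InputStreamSketch () {}

  // Default: the content type is unknown (e.g., for local files).
  virtual const char* getContentType () const { return 0; }
};

// Hypothetical HTTP accessor that remembers the Content-Type header.
class HttpStreamSketch: public InputStreamSketch
{
public:
  explicit HttpStreamSketch (const std::string& rawHeaders)
    : contentType_ (findHeader (rawHeaders, "Content-Type")) {}

  virtual const char* getContentType () const
  {
    return contentType_.empty () ? 0 : contentType_.c_str ();
  }

private:
  std::string contentType_;
};
```

With a default that returns 0, accessors that have not been updated yet
keep compiling, and callers only need to handle the "unknown" case.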


> 7) Problem:
>
> GrammarResolver has a bug where it fails to initialize its XSModel if
> the XMLGrammarPool it is created with is locked.
>
> Solution:
>
> We hack this at the moment, but it would be great if this could be fixed.

Would you be willing to work on a patch? Also, I once hit a bug in this
area that may be related. This code works:

    auto_ptr<GrammarResolver> gr (new GrammarResolver (0));

    // load some schemas into gr

    XMLGrammarPool* gp = gr->getGrammarPool ();
    gp->lockPool ();
    XSModel* xsm = gp->getXSModel ();

But if I remove the lockPool() call, the returned XSModel is invalid. Or
maybe this is how it is supposed to work.

Boris

-- 
Boris Kolpackov, Code Synthesis Tools   http://codesynthesis.com/~boris/blog
Open source XML data binding for C++:   http://codesynthesis.com/products/xsd
Mobile/embedded validating XML parsing: http://codesynthesis.com/products/xsde
