See responses inline:

On Jun 28, 2011, at 6:26 PM, Adam Barth wrote:

> A question and a comment:
> 
> 1) Will this let us remove the code for both the libxml2 and the
> QtXml parsers?  I'd certainly much rather have one XML parser than
> three.

This won't replace libxslt or QtXmlPatterns for XSL-T, as they depend on the 
respective XML libraries. The goal for this XML parser is to be able to replace 
the core XML parser itself. XSL-T support would have to come later.

> 2) One thing we found very helpful in working on the HTML parser was a
> good test suite.  Presumably there are existing XML parsing test
> suites.  You might consider landing one (or more) of these test suites
> as a first step.
> 
> Adam

I know that W3C provides a test suite, but it's probably not that 
comprehensive. I can try to find more online; I'm sure that some of the open 
source projects like libxml2 provide some.

Jeffrey Pfau

> 
> On Tue, Jun 28, 2011 at 6:12 PM, Jeffrey Pfau <jp...@apple.com> wrote:
>> Currently, WebCore uses libxml2, or, if available, QtXml to parse incoming 
>> XML. However, QtXml isn't always available, and using libxml2 brings its 
>> own share of problems. As such, I'm writing an XML parser that 
>> uses no external libraries.
>> 
>> The first step to doing this is to add a new flag that switches off the 
>> other two parsers. As the parsers are already independent and are selected 
>> by checking USE(QXMLSTREAM), I am adding explicit USE(LIBXML2) checks in 
>> place of the #else conditionals, plus a new ENABLE check, tentatively 
>> called NEW_XML (although a name such as NATIVE_XML or XML_NATIVE may be 
>> more appropriate).
>> 
>> As there will probably be a new slew of files pertaining to XML parsing, I 
>> will put these files in WebCore/xml/parser, and move the existing 
>> XMLDocumentParser* files into this new directory. As far as I know, the 
>> placement of these files in WebCore/dom/ is legacy, and, assuming the build 
>> on each platform is changed, it makes sense to move them.
>> 
>> Once all the files are in a logical place, I plan to make a new file for a 
>> skeleton of the new XMLDocumentParser, at least to get it to link until a 
>> real one is in place, even if the XML parser at that point is just a data 
>> sink.
>> 
>> From there, I plan to copy and modify a good chunk of the lower level HTML 
>> tokenization and parsing code, and make changes as necessary to make it work 
>> on generalized XML, at least until I can generalize the common code in such 
>> a way that the HTML and XML tokenizers can be subclasses and use common 
>> code. I'd probably do the refactoring at the end.
>> 
>> I'm still exploring the existing parsing code, but I'd probably work my way 
>> up from there. I've read a lot of the XML 1.0 spec in preparation, as well, 
>> but it doesn't have much on implementation itself. If QtWebKit or parsing 
>> people have any comments, concerns, or help, I'd be more than willing to 
>> listen--I'm just starting here, and I'm not completely familiar with the 
>> codebase.
>> 
>> Although no code is checked in so far, I've started on this list already and 
>> have gotten as far as the new flags, a skeleton XMLDocumentParserNew.cpp, 
>> and making a tokenizer that compiles and links, but is completely untested.
>> 
>> Jeffrey Pfau
>> _______________________________________________
>> webkit-dev mailing list
>> webkit-dev@lists.webkit.org
>> http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev
>> 
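
For readers following the flag scheme in the quoted message, here is a rough 
sketch of how the three-way switch and the "data sink" skeleton might fit 
together. It is only an illustration under stated assumptions, not the code 
that was written: the USE() and ENABLE() macros below are simplified 
stand-ins defined locally so the example compiles on its own, NEW_XML is the 
tentative flag name from the message, and the class name 
XMLDocumentParserSketch and its members are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Simplified local stand-ins for the USE()/ENABLE() feature checks named in
// the message. Each flag defaults to 0 unless the build overrides it; only
// the tentative NEW_XML flag is switched on in this sketch.
#ifndef WTF_USE_QXMLSTREAM
#define WTF_USE_QXMLSTREAM 0
#endif
#ifndef WTF_USE_LIBXML2
#define WTF_USE_LIBXML2 0
#endif
#ifndef ENABLE_NEW_XML
#define ENABLE_NEW_XML 1
#endif
#define USE(FEATURE) (WTF_USE_##FEATURE)
#define ENABLE(FEATURE) (ENABLE_##FEATURE)

// Every backend gets an explicit check instead of falling through a bare
// #else, so a build with no parser selected fails loudly.
inline const char* xmlBackendName()
{
#if ENABLE(NEW_XML)
    return "native";
#elif USE(QXMLSTREAM)
    return "QXmlStream";
#elif USE(LIBXML2)
    return "libxml2";
#else
#error "No XML parser backend selected"
#endif
}

#if ENABLE(NEW_XML)
// Hypothetical placeholder parser: it accepts input and discards it (a data
// sink), which is enough to let the build link before a real tokenizer
// exists.
class XMLDocumentParserSketch {
public:
    // Count the incoming characters, then drop them.
    void append(const std::string& chunk) { m_bytesSeen += chunk.size(); }

    // Mark the end of input; nothing is built because nothing was parsed.
    void finish() { m_finished = true; }

    std::size_t bytesSeen() const { return m_bytesSeen; }
    bool isFinished() const { return m_finished; }

private:
    std::size_t m_bytesSeen = 0;
    bool m_finished = false;
};
#endif
```

Checking ENABLE(NEW_XML) first mirrors the stated intent that the new flag 
switches off the other two parsers. A caller would feed chunks through 
append() and call finish() at end of input; no document is produced until a 
real tokenizer replaces the sink.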

