I'm a little negative about developing a new XML parser. I'm afraid that a
new parser would introduce a lot of security/stability problems that existing
parsers have already resolved.

How about importing the Expat parser into the WebKit repository and
maintaining it ourselves?


On Wed, Jun 29, 2011 at 10:12, Jeffrey Pfau <jp...@apple.com> wrote:

Currently, WebCore uses libxml2, or, if available, QtXml to parse incoming
XML. However, QtXml isn't always available, and using libxml2 exposes its
own share of problems. As such, I'm undertaking writing an XML parser that
uses no external libraries.

The first step to doing this is to add a new flag that switches off the
other two parsers. As the parsers are already independent and can be
switched between by checking USE(QXMLSTREAM), I am adding USE(LIBXML2)
checks, replacing the #else conditionals, and also a new ENABLE check,
tentatively called NEW_XML (although names such as NATIVE_XML or XML_NATIVE,
etc, may be more appropriate).
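As a rough illustration of the checks described above, the backend selection
could be sketched as follows. The USE() and ENABLE() macros here are
simplified stand-ins defined locally so the sketch compiles on its own, the
xmlParserBackend() helper is hypothetical, and the order in which the checks
are tried is an assumption, not something the plan specifies.

```cpp
#include <string>

// Simplified stand-ins for WebKit's USE()/ENABLE() feature macros (assumption:
// the real ones come from the platform headers). An undefined feature name
// evaluates to 0 inside #if, so unset features are treated as disabled.
#define USE(FEATURE) WTF_USE_##FEATURE
#define ENABLE(FEATURE) ENABLE_##FEATURE

// Hypothetical build configuration: switch the new parser on.
#define ENABLE_NEW_XML 1

// Hypothetical helper: every backend is named by its own check, so no
// backend hides behind a bare #else.
inline std::string xmlParserBackend()
{
#if ENABLE(NEW_XML)
    return "native";
#elif USE(QXMLSTREAM)
    return "QXmlStream";
#elif USE(LIBXML2)
    return "libxml2";
#else
#error "No XML parser backend selected"
#endif
}
```

With explicit checks on every branch, a configuration that selects no backend
fails at compile time instead of silently falling through to libxml2.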

As there will probably be a new slew of files pertaining to XML parsing, I
will put these files in WebCore/xml/parser, and move the existing
XMLDocumentParser* files into this new directory. As far as I know, the
placement of these files in WebCore/dom/ is legacy, and, assuming the build
on each platform is changed, it makes sense to move them.

Once all the files are in a logical place, I plan to make a new file for a
skeleton of the new XMLDocumentParser, at least to get it to link until a
real one is in place, even if the XML parser at that point is just a data
sink.
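To make the "data sink" idea concrete, a skeleton of this kind might look
roughly like the sketch below. The class name and its methods are hypothetical
and do not reflect WebCore's actual parser interface; the point is only that
input is accepted and discarded so that callers can link against something.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a skeleton parser: it accepts input and throws
// it away, which is enough for callers to compile and link against.
class XMLDataSinkParser {
public:
    // Consume a chunk of source text without tokenizing it.
    void append(const std::string& chunk)
    {
        if (m_finished)
            return; // Input arriving after finish() is ignored.
        m_bytesSeen += chunk.size();
    }

    // Mark the end of the input stream.
    void finish() { m_finished = true; }

    bool isFinished() const { return m_finished; }
    std::size_t bytesSeen() const { return m_bytesSeen; }

private:
    bool m_finished = false;
    std::size_t m_bytesSeen = 0;
};
```

The byte counter is there only so the sink can be observed in a test; a real
skeleton could drop it.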

From there, I plan to copy and modify a good chunk of the lower level HTML
tokenization and parsing code, and make changes as necessary to make it work
on generalized XML, at least until I can generalize the common code in such
a way that the HTML and XML tokenizers can be subclasses and use common
code. I'd probably do the refactoring at the end.
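One possible shape for that eventual refactoring is sketched below, assuming
a shared base class that owns the scanning loop and subclasses that supply
only the language-specific behavior. All class and method names are
hypothetical, and the scanner is deliberately naive (no comments, CDATA, or
attribute parsing). The one difference it models is that HTML tag names are
case-insensitive while XML names are case-sensitive.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical shared base: owns the scanning loop and defers tag-name
// handling to subclasses.
class MarkupTokenizerBase {
public:
    virtual ~MarkupTokenizerBase() = default;

    // Collect the names of start tags in `input`, skipping end tags,
    // declarations, and processing instructions.
    std::vector<std::string> startTagNames(const std::string& input) const
    {
        std::vector<std::string> names;
        for (std::size_t i = 0; i < input.size(); ++i) {
            if (input[i] != '<' || i + 1 >= input.size())
                continue;
            char next = input[i + 1];
            if (next == '/' || next == '!' || next == '?')
                continue;
            // The name runs until '>', '/', or whitespace.
            std::size_t end = i + 1;
            while (end < input.size() && input[end] != '>' && input[end] != '/'
                   && !std::isspace(static_cast<unsigned char>(input[end])))
                ++end;
            if (end > i + 1)
                names.push_back(normalizeName(input.substr(i + 1, end - i - 1)));
            i = end - 1; // Resume scanning at the character that ended the name.
        }
        return names;
    }

protected:
    // Language-specific hook supplied by each subclass.
    virtual std::string normalizeName(const std::string& name) const = 0;
};

// HTML tag names are case-insensitive, so fold them to lowercase.
class HTMLTokenizerSketch : public MarkupTokenizerBase {
protected:
    std::string normalizeName(const std::string& name) const override
    {
        std::string lowered = name;
        for (char& c : lowered)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        return lowered;
    }
};

// XML names are case-sensitive, so keep them untouched.
class XMLTokenizerSketch : public MarkupTokenizerBase {
protected:
    std::string normalizeName(const std::string& name) const override
    {
        return name;
    }
};
```

Keeping the loop in the base class means a fix to the shared scanning logic
reaches both tokenizers, while each subclass stays limited to the rules that
actually differ.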

I'm still exploring the existing parsing code, but I'd probably work my way
up from there. I've read a lot of the XML 1.0 spec in preparation, as well,
but it doesn't have much on implementation itself. If QtWebKit or parsing
people have any comments, concerns, or help, I'd be more than willing to
listen--I'm just starting here, and I'm not completely familiar with the
codebase.

Although no code is checked in so far, I've started on this list already
and have gotten as far as the new flags, a skeleton
XMLDocumentParserNew.cpp, and making a tokenizer that compiles and links,
but is completely untested.

Jeffrey Pfau
_______________________________________________
webkit-dev mailing list
webkit-dev@lists.webkit.org
http://lists.webkit.org/mailman/listinfo.cgi/webkit-dev




--
TAMURA Kent
Software Engineer, Google



