Tim Starling has uploaded a new change for review. ( https://gerrit.wikimedia.org/r/328143 )
Change subject: Better README.md
......................................................................

Better README.md

Change-Id: Ic41dd895a6761d7b2e5c0cb55afbb58f237e2ab2
---
M README.md
1 file changed, 37 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/mediawiki/libs/RemexHtml refs/changes/43/328143/1

diff --git a/README.md b/README.md
index 94b6f24..adfe2af 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,37 @@
-Work in progress on a compliant PHP parser for HTML 5.
+RemexHtml is a parser for HTML 5, written in PHP.
+
+RemexHtml aims to be:
+
+- Modular and flexible.
+- Fast, as opposed to elegant.
+- Robust, aiming for O(N) worst-case performance.
+
+RemexHtml contains the following modules:
+
+- A compliant preprocessor and tokenizer. This generates a token event stream.
+- Compliant tree construction, including error recovery. This generates a tree
+  mutation event stream.
+- A fast integrated HTML serializer, compliant with the HTML fragment
+  serialization algorithm.
+- DOMDocument construction.
+
+RemexHtml presently lacks:
+
+- Encoding support. The input is expected to be valid UTF-8.
+- Scripting.
+- XML infoset coercion and XHTML serialization.
+- Precise compliance with specified parse error generation.
+
+RemexHtml aims to be compliant with W3C recommendation HTML 5.1, except for
+minor backported bugfixes. We chose to implement the W3C standard rather than
+the latest WHATWG draft because our application needs stability more than
+feature completeness.
+
+RemexHtml passes all [html5lib tests](https://github.com/html5lib/html5lib-tests]),
+except for parse error counts and tests which reference a future version of the
+standard.
+
+**WARNING** This is a new project, we are still developing use cases. So the API
+is subject to change.
+
+For example code, see `bin/test.php`.

-- 
To view, visit https://gerrit.wikimedia.org/r/328143
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic41dd895a6761d7b2e5c0cb55afbb58f237e2ab2
Gerrit-PatchSet: 1
Gerrit-Project: mediawiki/libs/RemexHtml
Gerrit-Branch: master
Gerrit-Owner: Tim Starling <tstarl...@wikimedia.org>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits