This latest version of pyparsing has a few minor bug-fixes and enhancements, and a performance improvement that can as much as double parsing speed.
This release also includes some new examples:
- parsePythonValue.py - parses strings representing lists, dicts, and tuples, with nesting support
- sql2dot.py - SQL diagram generator, parsed from schema table definitions
- htmlStripper.py - strips HTML tags from HTML pages, leaving only body text

Download pyparsing 1.4.5 at http://pyparsing.sourceforge.net. The pyparsing Wiki is at http://pyparsing.wikispaces.com

-- Paul

========================================
Pyparsing is a pure-Python class library for quickly developing recursive-descent parsers. Parser grammars are assembled directly in the calling Python code, using classes such as Literal, Word, OneOrMore, Optional, etc., combined with operators '+', '|', and '^' for And, MatchFirst, and Or. No separate code-generation or external files are required. Pyparsing comes with a number of parsing examples, including:
- "Hello, World!" (English, Korean, and Greek)
- chemical formulas
- configuration file parser
- web page URL extractor
- 5-function arithmetic expression parser
- subset of CORBA IDL
- chess portable game notation
- simple SQL parser
- Mozilla calendar file parser
- EBNF parser/compiler
- Python value string parser (lists, dicts, tuples, with nesting) (new)
- HTML tag stripper (new)

Version 1.4.5 - December, 2006
------------------------------
- Removed debugging print statement from QuotedString class. Sorry for not stripping this out before the 1.4.4 release!
- A significant performance improvement, the first one in a while! For my Verilog parser, this version of pyparsing is about double the speed - YMMV.
- Added support for pickling of ParseResults objects. (Reported by Jeff Poole, thanks Jeff!)
- Fixed minor bug in makeHTMLTags that did not recognize tag attributes with embedded '-' or '_' characters. Also, added support for passing expressions to makeHTMLTags and makeXMLTags, and used this feature to define the globals anyOpenTag and anyCloseTag.
- Fixed error in alphas8bit, I had omitted the y-with-umlaut character.
- Added punc8bit string to complement alphas8bit - it contains all the non-alphabetic, non-blank 8-bit characters.
- Added commonHTMLEntity expression, to match common HTML "ampersand" codes, such as "&lt;", "&gt;", "&amp;", "&nbsp;", and "&quot;". This expression also defines a results name 'entity', which can be used to extract the entity field (that is, "lt", "gt", etc.). Also added built-in parse action replaceHTMLEntity, which can be attached to commonHTMLEntity to translate "&lt;", "&gt;", "&amp;", "&nbsp;", and "&quot;" to "<", ">", "&", " ", and "'".
- Added example, htmlStripper.py, that strips HTML tags and scripts from HTML pages. It also translates common HTML entities to their respective characters.
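
As a rough illustration of a few of the items above, here is a minimal sketch. It is not taken from the release or its examples directory. It assumes a recent pyparsing install running under Python 3, where the names used here (parseString, asList, setParseAction, transformString, commonHTMLEntity, replaceHTMLEntity, anyOpenTag, anyCloseTag) are still importable; the input strings and variable names are made up for the example. Only the "&lt;", "&gt;", and "&amp;" entities are used, since the replacement characters for the others may differ between pyparsing versions.

```python
import pickle

from pyparsing import (
    Word,
    alphas,
    anyCloseTag,
    anyOpenTag,
    commonHTMLEntity,
    replaceHTMLEntity,
)

# A grammar assembled directly in Python: '+' joins expressions into an And,
# and bare strings are converted to Literal expressions.
greeting = Word(alphas) + "," + Word(alphas) + "!"
tokens = greeting.parseString("Hello, World!")
print(tokens.asList())

# ParseResults objects can be pickled and restored.
restored = pickle.loads(pickle.dumps(tokens))
print(restored.asList())

# Copy the shared expression before attaching a parse action, so the
# module-level commonHTMLEntity is left unmodified.
entity = commonHTMLEntity.copy().setParseAction(replaceHTMLEntity)
decoded = entity.transformString("a &lt; b &amp;&amp; c &gt; d")
print(decoded)

# Suppressing every opening and closing tag leaves only the text between them.
any_tag = (anyOpenTag | anyCloseTag).suppress()
stripped = any_tag.transformString("<b>bold</b> text")
print(stripped)
```

The htmlStripper.py example mentioned above is the fuller treatment of tag and entity handling; this sketch only shows the individual pieces in isolation.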