I was recently trying to use generateDS with lxml. By default, generateDS parses the XML with a SAX parser and creates a minidom Node object. It then traverses that node, starting from the root, to build the required object according to a schema. The first part (parsing the XML string) is easy to convert to lxml, which returns an lxml etree node object. However, I ran into problems traversing this object with the generateDS code. Although the algorithm is generic and could traverse any kind of node, the code itself is tightly coupled to the minidom node. For example, it uses functions like "getChildren()", attributes like "nodeValue" and "nodeType", and node types like "ELEMENT_NODE" or "TEXT_NODE". These are specific to minidom and are not found on other node types, such as the node returned by lxml parsing.
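To illustrate the mismatch, here is a minimal sketch (not generateDS code) that walks the same document once in the minidom style and once in the ElementTree style. It assumes the standard library's xml.etree.ElementTree as a stand-in for lxml.etree, whose basic element API (tag, get, text, iteration over children) is compatible. The sample XML and the function names are made up for the example.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET  # stand-in for lxml.etree (assumption)

# Hypothetical sample document, not taken from generateDS.
XML = "<root><item id='1'>first</item><item id='2'>second</item></root>"


def minidom_items(xml_text):
    """Collect (tag, id, text) using minidom-specific names."""
    root = minidom.parseString(xml_text).documentElement
    result = []
    for child in root.childNodes:
        # minidom mixes element and text nodes, so filter by nodeType.
        if child.nodeType == child.ELEMENT_NODE:
            text = "".join(
                n.nodeValue
                for n in child.childNodes
                if n.nodeType == n.TEXT_NODE
            )
            result.append((child.tagName, child.getAttribute("id"), text))
    return result


def etree_items(xml_text):
    """Collect the same data using the ElementTree-style API."""
    root = ET.fromstring(xml_text)
    # Iterating an element yields only child elements; there is no
    # nodeType, nodeValue, or TEXT_NODE here.
    return [(child.tag, child.get("id"), child.text or "") for child in root]
```

Both functions return the same data, but none of the names used in the first one exist on the objects used in the second, which is why the traversal code cannot be reused as it stands.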
The core functionality of the generateDS module should be separated from the type of node being operated on, so that the module becomes node-agnostic and the same generateDS functions can be integrated with any parsing module: lxml, SAX, or anything else. This matters because lxml parses much faster than minidom (I noticed speed-ups of almost 100 times), especially for large XML files of over 30-40 MB.
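One possible way to get this separation is a thin adapter layer, sketched below. This is only an illustration of the idea, not how generateDS is or should be implemented: the class names, method names, and the build() function are hypothetical, and xml.etree.ElementTree again stands in for lxml.etree.

```python
from xml.dom import minidom
import xml.etree.ElementTree as ET  # stand-in for lxml.etree (assumption)


class MinidomAdapter:
    """Hypothetical wrapper exposing a small, parser-neutral interface."""

    def __init__(self, node):
        self._node = node

    def tag(self):
        return self._node.tagName

    def attributes(self):
        return dict(self._node.attributes.items())

    def text(self):
        # Concatenate the element's direct text nodes.
        return "".join(
            n.nodeValue
            for n in self._node.childNodes
            if n.nodeType == n.TEXT_NODE
        )

    def children(self):
        return [
            MinidomAdapter(n)
            for n in self._node.childNodes
            if n.nodeType == n.ELEMENT_NODE
        ]


class EtreeAdapter:
    """Same hypothetical interface over an ElementTree-style element."""

    def __init__(self, element):
        self._element = element

    def tag(self):
        return self._element.tag

    def attributes(self):
        return dict(self._element.attrib)

    def text(self):
        # Leading text plus each child's tail equals the direct text nodes.
        parts = [self._element.text or ""]
        parts.extend(child.tail or "" for child in self._element)
        return "".join(parts)

    def children(self):
        return [EtreeAdapter(child) for child in self._element]


def build(node):
    """Parser-neutral traversal: only touches the adapter interface."""
    return {
        "tag": node.tag(),
        "attrs": node.attributes(),
        "text": node.text(),
        "children": [build(child) for child in node.children()],
    }


XML = "<root><item id='1'>first</item><item id='2'>second</item></root>"
from_minidom = build(MinidomAdapter(minidom.parseString(XML).documentElement))
from_etree = build(EtreeAdapter(ET.fromstring(XML)))
```

With this layout, the traversal in build() never mentions nodeType or tagName, so supporting another parser would only require writing one more adapter class.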