> That wouldn't work, as the document contains items that aren't in the schema,
I recall that was the whole point of walking it yourself rather than using a yes/no external oracle of a validator, yes.

To truly validate against a schema, you must recursively parse the file under control of the annotative grammar that is the schema. I don't expect querying the schema by XPath to work in the general case, but your schema may be such that you know it will. If it's as easy as determining that this tag isn't even in our lexicon, or that our tag of this name doesn't have this attribute, or that our tag of this name is only allowed under certain parents, none of which is the parent it has in the current document, you should be able to mark #FAIL as you go, whichever is driving. You could preprocess the schema into a hash of rules to validate against, possibly using XPath as the semantics for magic string values.

If the schema is sufficiently abstract that you need a backtracking search to determine which bits of the document XML have to be excluded for the rest to validate, you have a horror on your hands.

> so you'd have to keep track of which nodes you validated,
> and then still query or walk the document looking for things
> that weren't covered by the schema.

You can delete them when they fail validation, which is what I thought you said you'd do, or keep a list of refs to what fails.

> I haven't tried using XML::Twig::XPath, but it deviates from the DOM
> API, which can make it harder to port your code to a different XML library.

Yes, it's a just-barely-sufficient-magic hack, not a 100% solution. XML was supposed to be less overly general than SGML, but it's still too, too general.

bill
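A minimal sketch of the hash-of-rules idea, for the easy case only (unknown tag, unknown attribute, wrong parent). It is written in Python with the standard library just to stay self-contained. The RULES table, the tag names, and the check/validate helpers are all made up for illustration; a real table would be generated from your schema, and nothing here handles content models or backtracking.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules table, as if preprocessed from a schema:
# tag name -> allowed attributes and allowed parent tags (None = document root).
RULES = {
    "order": {"attrs": {"id"}, "parents": {None}},
    "item": {"attrs": {"sku", "qty"}, "parents": {"order"}},
    "note": {"attrs": set(), "parents": {"order", "item"}},
}


def check(elem, parent_tag, failures):
    """Walk depth-first, appending (element, reason) for each rule violation."""
    rule = RULES.get(elem.tag)
    if rule is None:
        failures.append((elem, "tag not in lexicon"))
        return  # we know nothing about this subtree, so do not descend
    if parent_tag not in rule["parents"]:
        failures.append((elem, f"not allowed under {parent_tag!r}"))
    for name in elem.attrib:
        if name not in rule["attrs"]:
            failures.append((elem, f"unknown attribute {name!r}"))
    for child in elem:
        check(child, elem.tag, failures)


def validate(xml_text):
    """Return a list of (element ref, reason) pairs; empty means no failures."""
    failures = []
    check(ET.fromstring(xml_text), None, failures)
    return failures


DOC = """<order id="7">
  <item sku="A1" qty="2" color="red"><note/></item>
  <bogus><item sku="B2"/></bogus>
  <order id="8"/>
</order>"""

for elem, reason in validate(DOC):
    print(f"#FAIL <{elem.tag}>: {reason}")
```

The failures list holds refs to the offending elements, so you can either report them or delete them from the tree afterwards. Here the document drives the walk; the <bogus> subtree is flagged once and skipped rather than checked child by child.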

