stevedlawrence opened a new pull request, #873: URL: https://github.com/apache/daffodil/pull/873
When using the CLI performance subcommand with the SAX infoset type, the performance loop currently includes serializing SAX events to XML text when parsing and deserializing XML text to SAX events when unparsing. As a result, the performance numbers reported with the -I sax option are inflated, since they include serialization/deserialization work that would normally be handled outside of Daffodil, or might never take place at all in some usages.

This changes the logic of the CLI performance subcommand with -I sax to avoid that. When parsing, we use a DefaultHandler, which is a no-op for all SAX events. When unparsing, we convert the XML infoset to an array of custom SaxEvents outside of the performance loop; inside the loop we use a custom XMLReader that replays those events. This ensures there is no serialization/deserialization of XML in the performance loop.

For the CLI parse/unparse subcommands, we maintain the current behavior so that users can see a visualization of the SAX events as XML text, similar to how we pretty print JDOM, Scala XML, etc.

Additionally, the infoset type logic in Main.scala has gotten pretty complicated, especially with the more recent additions of the SAX API and EXI support. Lots of infoset type logic is spread out and handled with a scattering of Either objects. To make these changes easier, and to make adding new infoset types easier in the future, all the infoset handling is refactored into separate InfosetHandler classes. Each class is specific to an infoset type, with functions split so that we can ensure preprocessing and initialization for infosets happen outside the performance loop.

DAFFODIL-2400
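
To illustrate the record-then-replay idea described for unparsing, here is a minimal sketch using only the JDK SAX API. It is not the code in this PR: the SaxEvent hierarchy, RecordingHandler, CountingHandler, record, and replay are hypothetical names, and the replay step is simplified to a plain function rather than a full XMLReader implementation. The point is only that XML text is parsed once, up front, and the timed loop just walks an array of prebuilt events.

```scala
import java.io.StringReader
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.{Attributes, ContentHandler, InputSource}
import org.xml.sax.helpers.{AttributesImpl, DefaultHandler}
import scala.collection.mutable.ArrayBuffer

object SaxReplaySketch {
  // Hypothetical event model; the PR's actual SaxEvent classes may differ.
  sealed trait SaxEvent
  case object StartDocument extends SaxEvent
  case object EndDocument extends SaxEvent
  final case class StartElement(uri: String, localName: String, qName: String, attrs: Attributes) extends SaxEvent
  final case class EndElement(uri: String, localName: String, qName: String) extends SaxEvent
  final case class Characters(chars: Array[Char]) extends SaxEvent

  // Captures SAX events into a buffer. Intended to run once, outside any timed loop.
  class RecordingHandler extends DefaultHandler {
    val events = ArrayBuffer.empty[SaxEvent]
    override def startDocument(): Unit = events += StartDocument
    override def endDocument(): Unit = events += EndDocument
    override def startElement(uri: String, localName: String, qName: String, attrs: Attributes): Unit =
      // Copy the attributes, since a parser may reuse the Attributes object it passes in.
      events += StartElement(uri, localName, qName, new AttributesImpl(attrs))
    override def endElement(uri: String, localName: String, qName: String): Unit =
      events += EndElement(uri, localName, qName)
    override def characters(ch: Array[Char], start: Int, length: Int): Unit =
      events += Characters(java.util.Arrays.copyOfRange(ch, start, start + length))
  }

  // Preprocessing step: parse the XML text once and keep the events as an array.
  def record(xml: String): Array[SaxEvent] = {
    val factory = SAXParserFactory.newInstance()
    factory.setNamespaceAware(true)
    val recorder = new RecordingHandler
    factory.newSAXParser().parse(new InputSource(new StringReader(xml)), recorder)
    recorder.events.toArray
  }

  // Replays prebuilt events into a handler; no XML text is read or written here.
  def replay(events: Array[SaxEvent], handler: ContentHandler): Unit = {
    var i = 0
    while (i < events.length) {
      events(i) match {
        case StartDocument => handler.startDocument()
        case EndDocument => handler.endDocument()
        case StartElement(u, l, q, a) => handler.startElement(u, l, q, a)
        case EndElement(u, l, q) => handler.endElement(u, l, q)
        case Characters(c) => handler.characters(c, 0, c.length)
      }
      i += 1
    }
  }

  // Small handler used only to check that replay delivers the recorded events.
  class CountingHandler extends DefaultHandler {
    var elements = 0
    override def startElement(uri: String, localName: String, qName: String, attrs: Attributes): Unit =
      elements += 1
  }

  def main(args: Array[String]): Unit = {
    val events = record("<a><b>1</b><c>2</c></a>") // done once, before the loop

    // A plain DefaultHandler ignores every event, like the no-op handler used when parsing.
    val noOp = new DefaultHandler
    for (_ <- 1 to 1000) replay(events, noOp)

    val counter = new CountingHandler
    replay(events, counter)
    println(s"start elements replayed: ${counter.elements}")
  }
}
```

In this sketch, the loop body touches only the in-memory array and the handler callbacks, which is the property the change is after for the performance subcommand.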
