Re: CLI -I sax feature - remove?

Steve Lawrence Mon, 18 Jul 2022 06:44:37 -0700

By this argument, should we also remove the JDOM, W3CDOM, and Scala XML infoset outputters from the CLI? These are effectively APIs as well. The CLI just takes the results and converts them to a string for output.

Maybe the CLI only cares about outputting to XML using the XMLTextInfosetOutputter, and there shouldn't even be an option for other outputters?

One counterargument: the -I option is useful from a testing perspective to make sure we are building the API objects correctly, with the assumption that the .toString of those objects is correct. But our TDML runner already does that, so maybe we can just rely on those tests and this isn't necessary.

If we do remove the -I option, what are thoughts on keeping the option for the performance subcommand? This is a quick way to test the overhead of different InfosetOutputters and is something we have done multiple times in the past. Note that for the performance subcommand, the non-text InfosetOutputters don't actually convert to a string, so it only measures the time to create the API objects.

The exception is SAX, which does convert the SAX events to text during the performance subcommand (using the DaffodilParseOutputStreamContentHandler). That is probably wrong, since it measures the time to create SAX events AND whatever the ContentHandler does, which in most cases will not be the same as what this content handler does. We really just want a measure of how long it takes Daffodil to parse data and convert it to SAX events.
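
One way to measure only event creation is to hand the events to a handler that discards them. Below is a minimal Java sketch using only the JDK's SAX classes. The class NullHandlerSketch and the method emitEvents are hypothetical names; emitEvents is just a stand-in for Daffodil's parse, not the real API. The JDK's DefaultHandler has no-op callbacks, so it can serve as the discarding handler.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class NullHandlerSketch {

    /** Returns a handler that discards every event (DefaultHandler callbacks are no-ops). */
    public static ContentHandler nullHandler() {
        return new DefaultHandler();
    }

    /** Hypothetical stand-in for a Daffodil parse: emits n simple elements as SAX events. */
    public static void emitEvents(ContentHandler h, int n) throws SAXException {
        AttributesImpl noAttrs = new AttributesImpl();
        h.startDocument();
        h.startElement("", "root", "root", noAttrs);
        for (int i = 0; i < n; i++) {
            char[] text = Integer.toString(i).toCharArray();
            h.startElement("", "item", "item", noAttrs);
            h.characters(text, 0, text.length);
            h.endElement("", "item", "item");
        }
        h.endElement("", "root", "root");
        h.endDocument();
    }

    /** Elapsed nanoseconds for creating the events alone; nothing is serialized to text. */
    public static long timeEventCreation(int n) {
        try {
            long start = System.nanoTime();
            emitEvents(nullHandler(), n);
            return System.nanoTime() - start;
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With a handler like this, the measured time excludes any text conversion, which is the part that varies between real SAX applications.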

This is further complicated for the performance --unparse option, where the SAX performance includes the time to parse the infoset AND convert it to SAX events for Daffodil to unparse. Ideally, parsing the infoset would not be included, since that is overhead that most SAX implementations won't have, or that would at least be outside of Daffodil. One way to resolve this is to preprocess the infoset into a list of "events" (similar to TestInfosetEvent in TestJavaAPI.scala), and use a custom XMLReader that uses that list to create SAX events to unparse. It adds a bit of complication though.
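
To make the preprocessing idea concrete, here is a rough Java sketch using only JDK classes. All names (ReplaySketch, Event, record, ReplayReader, countStartElements) are hypothetical and are not taken from Daffodil or from TestJavaAPI.scala. The XML infoset is parsed once, outside the timed region, into a list of events; a custom XMLReader then replays that list to whatever ContentHandler it has been given. Extending XMLFilterImpl is just a shortcut to avoid implementing every XMLReader method by hand.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class ReplaySketch {

    /** One recorded SAX event that can be replayed against any handler. */
    public interface Event { void replay(ContentHandler h) throws SAXException; }

    /** Preprocessing step: parse the XML text once and record its SAX events. */
    public static List<Event> record(String xml) {
        List<Event> events = new ArrayList<>();
        try {
            SAXParserFactory f = SAXParserFactory.newInstance();
            f.setNamespaceAware(true);
            f.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override public void startDocument() { events.add(h -> h.startDocument()); }
                @Override public void endDocument() { events.add(h -> h.endDocument()); }
                @Override public void startPrefixMapping(String p, String uri) {
                    events.add(h -> h.startPrefixMapping(p, uri));
                }
                @Override public void endPrefixMapping(String p) {
                    events.add(h -> h.endPrefixMapping(p));
                }
                @Override public void startElement(String uri, String ln, String qn, Attributes a) {
                    Attributes copy = new AttributesImpl(a); // parser may reuse its Attributes object
                    events.add(h -> h.startElement(uri, ln, qn, copy));
                }
                @Override public void endElement(String uri, String ln, String qn) {
                    events.add(h -> h.endElement(uri, ln, qn));
                }
                @Override public void characters(char[] ch, int start, int len) {
                    char[] copy = Arrays.copyOfRange(ch, start, start + len); // buffer is reused too
                    events.add(h -> h.characters(copy, 0, copy.length));
                }
            });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }

    /** Custom XMLReader that ignores its input and replays the recorded events. */
    public static class ReplayReader extends XMLFilterImpl {
        private final List<Event> events;
        public ReplayReader(List<Event> events) { this.events = events; }
        @Override public void parse(InputSource ignored) throws SAXException {
            ContentHandler h = getContentHandler(); // assumed to have been set by the caller
            for (Event e : events) e.replay(h);
        }
    }

    /** Small demo: replay into a counting handler and return the number of elements seen. */
    public static int countStartElements(List<Event> events) {
        int[] count = {0};
        ReplayReader reader = new ReplayReader(events);
        reader.setContentHandler(new DefaultHandler() {
            @Override public void startElement(String u, String l, String q, Attributes a) { count[0]++; }
        });
        try {
            reader.parse((InputSource) null);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }
}
```

In a performance run, record would be called once up front, and only the ReplayReader's parse call (with Daffodil's unparse content handler set in place of the counting handler) would fall inside the timed loop.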


I'd suggest the changes should be:

1. Remove the -I option from the parse/unparse subcommands only, and always use the XMLTextInfosetInputter/Outputter

2. Keep the -I option the same for the performance subcommand, with these two changes:

2a. Change SAX performance --parse so that it outputs to a "null" content handler

2b. Change SAX performance --unparse so that we preprocess the infoset and convert it to a list of "events". A custom XMLReader is then used to feed these events to Daffodil for unparsing.



On 7/11/22 3:19 PM, Mike Beckerle wrote:

In the CLI, there is the -I option to specify infoset-type.

One of the choices is 'sax'.

This is a mistake I think. This is really "XML text by way of calling the
SAX API". It's effectively a testing mode for us.

SAX usage is inherently an API.

I believe we should remove this feature from the CLI, because it creates a
lot of confusion. It requires test-mode things to be in the main-libraries
where the CLI can find them.

If we require SAX to be used as it is intended, from applications calling
Daffodil via APIs, then all this "xml text to/from SAX-event" code all ends
up in src/test where it belongs.

Thoughts?
