stevedlawrence commented on code in PR #812:
URL: https://github.com/apache/daffodil/pull/812#discussion_r926589378
##########
daffodil-cli/src/main/scala/org/apache/daffodil/Main.scala:
##########
@@ -421,7 +425,7 @@ class CLIConf(arguments: Array[String]) extends
scallop.ScallopConf(arguments) {
val config = opt[File](short = 'c', argName = "file", descr = "XML file
containing configuration items")
val vars = props[String](name = 'D', keyName = "variable", valueName =
"value", descr = "Variables to be used when parsing. Can be prefixed with
{namespace}.")
- val infosetType = opt[InfosetType.Type](short = 'I', argName =
"infoset_type", descr = "Infoset type to output or unparse. Use 'xml',
'scala-xml', 'json', 'jdom', 'w3cdom', 'sax', or 'null'. Defaults to 'xml'.",
default = Some(InfosetType.XML))
+ val infosetType = opt[InfosetType.Type](short = 'I', argName =
"infoset_type", descr = "Infoset type to output or unparse. Use 'xml',
'scala-xml', 'json', 'jdom', 'w3cdom', 'sax', 'exi', or 'null'. Defaults to
'xml'.", default = Some(InfosetType.XML))
Review Comment:
I don't think we should remove SAX from the performance command. We know
there are issues with SAX performance and this is really the only way we
currently have to test that.
##########
daffodil-cli/src/it/scala/org/apache/daffodil/parsing/TestCLIParsing.scala:
##########
@@ -1278,18 +1278,18 @@ class TestCLIparsing {
}
}
- @Test def test_XXX_CLI_Parsing_SimpleParse_sax(): Unit = {
+ @Test def test_XXX_CLI_Parsing_SimpleParse_exi(): Unit = {
val schemaFile =
Util.daffodilPath("daffodil-test/src/test/resources/org/apache/daffodil/section00/general/generalSchema.dfdl.xsd")
val (testSchemaFile) = if (Util.isWindows) (Util.cmdConvert(schemaFile))
else (schemaFile)
val shell = Util.start("")
try {
- val cmd = String.format(Util.echoN("Hello") + "| %s parse -I sax -s %s
-r e1", Util.binPath, testSchemaFile)
+ val cmd = String.format(Util.echoN("Hello") + "| %s parse -I exi -s %s
-r e1 | md5sum", Util.binPath, testSchemaFile)
Review Comment:
Is this going to fail on windows for some users? I don't think md5sum is
usually available on Windows. We might need to output to a tmp file and then
compare with a known good file, like input14.exi or input18.exi?
##########
daffodil-cli/src/main/scala/org/apache/daffodil/Main.scala:
##########
@@ -873,8 +883,6 @@ object Main {
val parseResult = eitherOutputterOrHandler match {
case Right(saxContentHandler) =>
- // reset in case we are streaming
- saxContentHandler.reset()
Review Comment:
Should we error or warning if streaming is enabled? The result of streamed
EXI seems useless since we throwing NUL bytes as separators. And it's going to
be impossible to unparse stream since EXI data likely has NUL bytes in it that
we use for a separator In fact, we use a java.util.Scanner for finding the NUL
delimiter, which is text based. So even if there were no NUL bytes in the input
aside from our delimiter, I'm imagine things still wouldn't work. We'll need a
completely different mechanism for supporting streaming EXI.
##########
daffodil-cli/src/main/scala/org/apache/daffodil/Main.scala:
##########
@@ -1400,8 +1408,15 @@ object Main {
private def unparseWithSAX(
is: InputStream,
- contentHandler: DFDL.DaffodilUnparseContentHandler): UnparseResult = {
- val xmlReader = DaffodilSAXParserFactory().newSAXParser.getXMLReader
+ contentHandler: DFDL.DaffodilUnparseContentHandler,
+ infosetType: InfosetType.Type): UnparseResult = {
+ val xmlReader = infosetType match {
+ case InfosetType.EXI => {
+ val exiSource = new EXISource()
+ exiSource.getXMLReader
+ }
+ case _ => DaffodilSAXParserFactory().newSAXParser.getXMLReader
Review Comment:
What was the reason for removing SAX support aside from it's a little
unintuitive since it's an API, very similar to JDOM, Scala-XML, W3CDOM, etc,
that we do still support? It looks like the way you've made the changes SAX
should all still work. In fact, right here we still have support for normal SAX.
##########
daffodil-cli/src/main/scala/org/apache/daffodil/Main.scala:
##########
@@ -1400,8 +1408,15 @@ object Main {
private def unparseWithSAX(
is: InputStream,
- contentHandler: DFDL.DaffodilUnparseContentHandler): UnparseResult = {
- val xmlReader = DaffodilSAXParserFactory().newSAXParser.getXMLReader
+ contentHandler: DFDL.DaffodilUnparseContentHandler,
+ infosetType: InfosetType.Type): UnparseResult = {
+ val xmlReader = infosetType match {
+ case InfosetType.EXI => {
+ val exiSource = new EXISource()
+ exiSource.getXMLReader
+ }
+ case _ => DaffodilSAXParserFactory().newSAXParser.getXMLReader
+ }
Review Comment:
In addition to potential performance improvements, one benefit of creating a
custom EXIInfosetInputter/Outputter is that is becomes part of the Daffodil
API, and anyone using Daffodil can easily using it by just choosing the right
class. With this change, EXI support is only added to the CLI--anyone wanting
to using EXI must use code similar to what is in the CLI and do it themselves.
And we don't even really need a new release for that (except for maybe the
reset() change?). Can we already claim we support EXI via our SAX API?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]