stevedlawrence commented on PR #1605: URL: https://github.com/apache/daffodil/pull/1605#issuecomment-3700161979
It looks like there are two tests related to spaces in names that are failing. I don't think they are unrelated failures, but are due to additional changes needed to support spaces. One test is the new spaces_in_names_01 test, which fails with this stack trace: ``` Test org.apache.daffodil.section00.general.TestSpace2.spaces_in_names_01 failed: java.lang.IllegalArgumentException: schemaLocation is not a valid URI: test space/test 2/multi_A_05_nons.dfdl.xsd, took 0.225 sec [error] at org.apache.daffodil.lib.xml.XMLUtils$.resolveSchemaLocation(XMLUtils.scala:1473) [error] at org.apache.daffodil.core.dsom.IIBase.$anonfun$4(IIBase.scala:203) [error] at scala.Option.flatMap(Option.scala:283) [error] at org.apache.daffodil.core.dsom.IIBase.resolvedSchemaLocation$lzyINIT1$$anonfun$1(IIBase.scala:198) [error] at org.apache.daffodil.lib.oolag.OOLAG$OOLAGValueBase.body$lzyINIT1(OOLAG.scala:523) [error] at org.apache.daffodil.lib.oolag.OOLAG$OOLAGValueBase.body(OOLAG.scala:523) [error] at org.apache.daffodil.lib.oolag.OOLAG$OOLAGValueBase.valueAsAny(OOLAG.scala:708) ... Caused by: java.net.URISyntaxException: Illegal character in path at index 4: test space/test 2/multi_A_05_nons.dfdl.xsd [error] at java.net.URI$Parser.fail(URI.java:2995) [error] at java.net.URI$Parser.checkChars(URI.java:3166) [error] at java.net.URI$Parser.parseHierarchical(URI.java:3248) [error] at java.net.URI$Parser.parse(URI.java:3207) [error] at java.net.URI.<init>(URI.java:645) [error] at org.apache.daffodil.lib.xml.XMLUtils$.resolveSchemaLocation(XMLUtils.scala:1468) ``` The "Caused by" part of the exception indicates we probably we have something like `schemaLocation="test space/test 2/multi_A_05_nons.dfdl.xsd"`, which we can find in the test schema files in the "test spaces/" directory. So it seems like we don't currently support spaces in the resolveSchemaLocation function. We may need further investigations to figure out how to correctly fix things and what other functions might need changes. I think the core issue is that some functions expect string paths to be valid URIs where spaces are escaped with %20 (e.g. resolveSchemaLocation, optRelativeFileURI, optRelativeResourceURI), but some expect them to be normal paths without escapes (e.g. getResourceOption, getRequiredResource) and some I'm not sure it's very well defined (e.g. getResourceAbsoluteOrRelativeOption). I'm sure there are other functions not listed that have some built-in assumptions. It feels like the right approach to convert our strings to `URI`s as early as possible and pass the `URI` around instead of strings. That way we know exactly what we are dealing with. And if we ever need to call a function that expects an unescaped string (e.g. `getResourceOption`), then we can call `uri.getPath()` which will do that unescaping and provide us a string that can be used for something like `Class.getResource()`. Though, this could potentially have a pretty big cascading effect. Changing one variable to a `URI` might require other functions to need `URI`s and it might turn into a significant change. Another potential issue is some functions, while not technically part of the public API, are used externally. For example, `getRequiredResource`, `getResourceOption`, etc. are used in a lot of tests, and we don't really want to have to change those all to use URIs--they still really do want to use strings. Or alternatively, instead of changing things to URIs we just kindof have a convention that we expect these string paths to be valid URIs with escaped spaces (and sometimes we'll make escape things for the user), and the functions that don't want escapes can unescape the strings before use. So all this to say, I'm not really sure what the perfect approach is. It might require making some changes and see what gets affect, what tests start failing, etc. and use that to guide us to the best solution. You're welcome to dig into this and we're happy to help, but this might also be veering into an issue that isn't quite as beginner friendly since it might start touching a lot of the codebase and dealing with undocumented assumptions. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
