stevedlawrence commented on code in PR #1335:
URL: https://github.com/apache/daffodil/pull/1335#discussion_r1796950120
##########
daffodil-lib/src/main/scala/org/apache/daffodil/lib/xml/XMLUtils.scala:
##########
@@ -1459,34 +1459,27 @@ Differences were (path, expected, actual):
)
}
- val uriIsJustPathComponent =
- uri.getScheme == null &&
- uri.getAuthority == null &&
- uri.getQuery == null &&
- uri.getFragment == null &&
- uri.getPath != null
-
val optResolved: Option[(URISchemaSource, Boolean)] =
- if (uri.isAbsolute) {
- // an absolute URI is one with a scheme. In this case, we expect to be able to resolve
- // the URI and do not try anything else (e.g. filesystem, classpath). Since this function
+ if (uri.isAbsolute && uri.getScheme.contains("jar")) {
+ // an absolute URI is one with a scheme. In the case that it is a jar uri
+ // we expect to be able to resolve the URI and do not try anything else
+ // (e.g. filesystem, classpath). Since this function
// is for schemaLocation attributes, we may eventually want to disallow this, and only
// allow relative URIs (i.e. URIs without a scheme). We do have some places that use
// absolute URIs in includes/imports and cannot remove this yet.
try {
- uri.toURL.openStream.close
+ uri.toURL.openStream.close()
val uss = URISchemaSource(Misc.uriToDiagnosticFile(uri), uri)
Some(uss, false)
} catch {
case e: IOException => None
}
- } else if (!uriIsJustPathComponent) {
- // this is not an absolute URI so we don't have a scheme. This should just be a path, so
- // throw an IllegalArgumentException if that's not the case
- val msg =
- s"Non-absolute schemaLocation URI can only contain a path component: $schemaLocation"
- throw new IllegalArgumentException(msg)
- } else if (uri.getPath.startsWith("/")) {
+ }
+ // we want to attempt to resolve the URI whether the non-jar uri has a scheme or not,
+ // this is relevant for when we are validating with Xerces, and it calls resolvesEntity
+ // we get URIs that look like file:/path/to/not/absolute/path ex: file:/org/apache/daffodil/xsd/dafext.xsd
+ // that fail to be found in the above case, so we have to look them up
Review Comment:
Just to clarify, if we have something like this:
```
xsi:schemaLocation="urn:ogf:dfdl:2013:imp:daffodil.apache.org:2018:ext /org/apache/daffodil/xsd/dafext.xsd"
```
Then Xerces changes that to `file:/org/apache/daffodil/xsd/dafext.xsd` and
sends that to the EntityResolver? Feels like a bug in Xerces. It doesn't do
that for include/import schemaLocation, not sure why it would do that for
xsi:schemaLocation.
Note this also means that if we ever have something legit like this:
```xml
<xs:include schemaLocation="file:/foo/bar/baz.dfdl.xsd"/>
```
Then we will look on the classpath for `/foo/bar/baz.dfdl.xsd`, which feels
wrong. We should only ever look at the filesystem for that kind of
schemaLocation. Anything else feels like a bug, and I'm not sure I want to
trade a Xerces bug for a Daffodil bug.
I'm wondering if we should say that xsi:schemaLocation won't resolve anything
to a classpath, unless there is some way we can get Xerces to not munge the
xsi:schemaLocation. Or maybe we have a bug somewhere else where we are doing
that munging?
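To make the concern concrete, here is a rough sketch of the behavior I would expect, keyed on the URI scheme. Everything in it (`SchemaLocationSketch`, `lookupFor`, the `Lookup` cases) is a hypothetical name for illustration, not existing Daffodil code. The filesystem-then-classpath fallback for scheme-less paths is only my assumption, based on the comment in the diff.

```scala
import java.net.URI

// Hypothetical sketch: decide where a schemaLocation may be looked up,
// based only on its URI scheme. Not the actual Daffodil implementation.
object SchemaLocationSketch {
  sealed trait Lookup
  // Resolve the URI directly and try nothing else (e.g. jar: URIs).
  case object OpenUrlOnly extends Lookup
  // file: URIs should only ever be looked up on the filesystem.
  case object FilesystemOnly extends Lookup
  // Scheme-less paths may fall back to the classpath (assumption).
  case object FilesystemThenClasspath extends Lookup

  def lookupFor(schemaLocation: String): Lookup = {
    val uri = new URI(schemaLocation)
    // getScheme returns null for relative URIs, so wrap it in Option.
    Option(uri.getScheme) match {
      case Some("file") => FilesystemOnly
      case None => FilesystemThenClasspath
      case Some(_) => OpenUrlOnly
    }
  }
}
```

With this shape, `file:/foo/bar/baz.dfdl.xsd` never reaches the classpath, while a plain `/org/apache/daffodil/xsd/dafext.xsd` still can.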
##########
daffodil-lib/src/main/scala/org/apache/daffodil/lib/xml/XMLUtils.scala:
##########
@@ -1459,34 +1459,27 @@ Differences were (path, expected, actual):
)
}
- val uriIsJustPathComponent =
- uri.getScheme == null &&
- uri.getAuthority == null &&
- uri.getQuery == null &&
- uri.getFragment == null &&
- uri.getPath != null
-
val optResolved: Option[(URISchemaSource, Boolean)] =
- if (uri.isAbsolute) {
- // an absolute URI is one with a scheme. In this case, we expect to be
able to resolve
- // the URI and do not try anything else (e.g. filesystem, classpath).
Since this function
+ if (uri.isAbsolute && uri.getScheme.contains("jar")) {
Review Comment:
This should be `uri.getScheme == "jar"`. Avoids ever accidentally matching
some other scheme like "foojar" if that ever happens.
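A small sketch of the difference, using only `java.net.URI`. The `SchemeCheck` object and the `foojar:` URI are made up for illustration. Note that `==` in Scala is null-safe, so the exact comparison also handles scheme-less URIs, where `getScheme` returns null.

```scala
import java.net.URI

// Hypothetical helper names, for illustration only.
object SchemeCheck {
  // Substring match: true for any scheme that merely contains "jar".
  def looseIsJar(uri: URI): Boolean =
    uri.isAbsolute && uri.getScheme.contains("jar")

  // Exact match: true only for the "jar" scheme itself.
  // Scala's == is null-safe, so a null scheme simply yields false.
  def strictIsJar(uri: URI): Boolean =
    uri.getScheme == "jar"
}
```

Here `looseIsJar(new URI("foojar:/some/path.xsd"))` is true, while `strictIsJar` on the same URI is false.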