stevedlawrence commented on code in PR #1065:
URL: https://github.com/apache/daffodil/pull/1065#discussion_r1284593547
##########
daffodil-lib/src/main/scala/org/apache/daffodil/lib/util/Misc.scala:
##########
@@ -660,4 +660,15 @@ object Misc {
}
Some(res)
}
+
+ /**
+ * Replace HOME env var in Path with ~
+ */
+ def depersonalizePath(path: String): String = {
+ path.replace(
+ System.getenv("HOME"),
Review Comment:
Daffodil never looks at the home directory when resolving schemas unless
it's on the classpath so if feels like we shouldn't need this. Feels like this
is a sign that maybe our approach isn't right.
This also doesn't fully depersonalize schema paths. For example, if my
schemas are in `/home/slawrence/mycompany/confidential/schemas/`, there's sill
going to be personal information in the path. Same issue if schemas are in a
different mount point like `/dev/myschemas/`. I'm wondering if there's a
different approach we can take?
I'm wondering if a different approach could tie into recent discussions
about absolute vs relative schemaLocation paths?
If an import schemaLocation is absolute (i.e. starts with slash), then that
is what we display for whatever it resolves to. For example, if we have
schemaLocation="/foo/bar/baz.dfdl.xsd" and that resolves to
`jar:file:/home/slawrence/.ivy1/pom/jars/foo.jar!/foo/bar/baz.dfdl.xsd`, then
the display schemaLocation is just `/foo/bar/baz.dfdl.xsd`, since that was the
absolute schemaLocation that found it.
If an import schemaLocation is relative (i.e. doesn't start with a slash),
then the display location becomes the schemaLocation resolved relative to the
display URI of including schema. So for example, if the above `baz.dfdl.xsd`
file imported `../qaz.dfdl.xsd`, then display schemaLocation of that file be
calculated like `URI("/foo/bar/baz.dfdl.xsd).resolve("../qaz.dfdl.xsd")` which
would become `URI("/foo/qaz.dfdl.xsd")`. Note that this is very basically what
we do to resolve relative paths, so the logic is very similar. We are just
resolving reslative to the the dispaly schema location of the importing schema
it's actual schema location
So we always end up with an absolute URI as the display schemaLocation. I
imagine this is very similar to the results you get by looking at the
classpath, but should be less finicky. For example, Misc.classpath wont' always
return somethign if it's using a different classloader, which we have talk
about for things like making plugins easier.
##########
daffodil-lib/src/main/scala/org/apache/daffodil/lib/api/Diagnostic.scala:
##########
@@ -263,7 +263,12 @@ trait LocationInSchemaFile {
): String = {
// works for both windows and unix
val splitter = "/"
- val tokens = uriString.split(splitter)
+ val classpathTrimmedUriString =
+ uriString.replace(
+ Misc.classPath.find(cp =>
uriString.startsWith(cp.toString)).getOrElse("").toString,
Review Comment:
Mentioned elsewhere, but Misc.classpath is not very reliable. If a differnet
classloader is used than a URLClassLoader (which we have discussed) this is
going to break. Suggested an alternative in other comments about using
schemaLocation instead.
##########
daffodil-cli/src/test/scala/org/apache/daffodil/cli/cliTest/TestCLISaveParser.scala:
##########
@@ -318,4 +318,18 @@ class TestCLISaveParser {
}
}
+ @Test def test_CLI_Saving_SaveParser_error(): Unit = {
+ val schema = path(
+ "daffodil-sapi/src/test/resources/test/sapi/mySchema6.dfdl.xsd",
+ )
+
+ withTempFile { parser =>
+ runCLI(args"save-parser -s $schema $parser") { cli =>
+ cli.expectErr("[error]")
+ cli.expectErr("systemId: file:~")
+ cli.expectErr("Schema context: file:~")
Review Comment:
Instead of outputting `file:~/...`, it feels like this wants to be something
like:
```
Schema context:
daffodil-sapi/src/test/resources/test/sapi/mySchema6.dfdl.xsd",
```
since that is what the user provided. Similar to my other comment about how
we want to use the impor/include schemaLocation as a hint of what to display, I
think for the root schema we want to display the path that was actually
provided.
As another example, if we had foo.jar on DAFFODIL_CLASSPATH and ran
something like:
```
daffodil save-parser -s /some/file/in/foo/jar/dfdl.xsd
```
And that resolves to a path in foo.jar, then I think the schema context we
output wants to be exactly `/some/file/in/foo/jar/dfdl.xsd`.
So I suggested that based on my other comment, when we resolve a schema, we
either want to use the path directly (in the case of importing absolute
schemaLocations or locations without an importing schema like this one), or
resolve the schemaLocation relative to the importing display location in the
case of relative imports with a context.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]