[ https://issues.apache.org/jira/browse/JENA-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16322713#comment-16322713 ]
ASF GitHub Bot commented on JENA-1462:
--------------------------------------

GitHub user stain opened a pull request:

    https://github.com/apache/jena/pull/341

    JENA-1462: Tests RDF/XML parsing newer URI schemes

    Tests for [JENA-1462](https://issues.apache.org/jira/browse/JENA-1462)

    RIOT parsing of RDF/XML fails when the base URI uses a scheme other
    than http/https/file, such as `ssh://example.com/nested/`.

    Note that JENA-1462 is not fixed by this PR; it only adds the unit
    tests and test files.

    The tests also highlight a bug in parsing URIs like
    `file://example.com/etc/passwd`, as described in
    [JENA-1463](https://issues.apache.org/jira/browse/JENA-1463)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/stain/jena JENA-1462

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/jena/pull/341.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #341

----
commit 6ecd48af6967ca48f985850393ac3b16df31a314
Author: Stian Soiland-Reyes <stain@...>
Date:   2018-01-11T18:12:33Z

    JENA-1462: Tests RDF/XML parsing newer URI schemes

    RIOT parsing RDF/XML with a base URI different from http/https/file,
    such as ssh://, fails.

    Note as JENA-1462 is not fixed, this only adds the unit tests.
----

> RDF/XML parsing fails on newer/provisional/private URI schemes in base URI
> --------------------------------------------------------------------------
>
>                 Key: JENA-1462
>                 URL: https://issues.apache.org/jira/browse/JENA-1462
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: ARQ, RDF/XML
>   Affects Versions: Jena 3.3.0, Jena 3.4.0, Jena 3.5.0, Jena 3.6.0
>         Environment: Apache Maven 3.3.9
> Maven home: /usr/share/maven
> Java version: 1.8.0_151, vendor: Oracle Corporation
> Java home: /usr/lib/jvm/java-8-openjdk-amd64/jre
> Default locale: en_GB, platform encoding: UTF-8
> OS name: "linux", version: "4.10.0-42-generic", arch: "amd64", family: "unix"
> Distributor ID: Ubuntu
> Description: Ubuntu 16.04.3 LTS
> Release: 16.04
> Codename: xenial
>            Reporter: Stian Soiland-Reyes
>
> RIOT parsing of RDF/XML fails when the base URI uses a scheme other than
> http/https/file, such as ssh://.
> See https://github.com/stain/jena-test-unregistered-iana for some tests I
> came up with.
> The tests fail both with xml:base and when the base URI is provided to
> RDFDataMgr, but not if the full URI is given inside the RDF/XML.
> {code}
> org.apache.jena.riot.RiotException: [line: 5, col: 40] {E214} Resolving against bad URI <ssh://example.com/nested/>: <foo.txt>
>       at org.apache.jena.riot.TestParseURISchemeBases.sshBaseRDF(TestParseURISchemeBases.java:336)
> {code}
> This error message comes from ERR_RESOLVING_AGAINST_MALFORMED_BASE - for some
> reason the warning becomes an error, as the IRI Factory used for creating the
> Base IRI within the RDF/XML parser is a bit too strict.
> However, I could not find anything in the specs:
> * https://www.w3.org/TR/2014/REC-rdf-syntax-grammar-20140225/
> * https://www.w3.org/TR/2009/REC-xmlbase-20090128/
> * https://www.ietf.org/rfc/rfc3986
> that says "foreign" URI schemes should not be permitted. In any case, Jena's
> IANA list is probably out of date, as my tests show.
> This was initially detected in TAVERNA-1027, which tries to parse an RDF/XML
> document with the [app:// URI scheme|https://www.w3.org/TR/app-uri/], which
> is *not* registered with IANA https://www.iana.org/assignments/uri-schemes
> according to https://tools.ietf.org/html/bcp35
> However, testing Jena with other permanent and provisional schemes from the
> registry, such as example:// or ssh://, or with a conformant private scheme
> with a domain-based name, org.apache.jena.test://, also gives the same error.
> IMHO they should all be understood in the same way as when parsing the Turtle
> examples, which don't fail.
> I could trace this back to Jena 3.3.0, so I suspect this was introduced with
> JENA-1306. With versions before that, all my tests *) work.
> I'll raise a pull request with the junit tests, but have not been able to
> find a good way to fix it.
> _*) There's a separate issue: hostnames in file://example.com/etc/passwd
> style URIs also seem to be misparsed in RDF/XML into
> file:///example.com/etc/passwd , which I'll report separately; that one goes
> back to 3.0.1._



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
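
For illustration only: the quoted description argues that resolving a relative reference against a base URI should not depend on the scheme. The sketch below shows what generic RFC 3986 style resolution gives for the bases mentioned above. It uses `java.net.URI` from the JDK, not Jena's IRI library or RIOT, so it says nothing about where the Jena check goes wrong. The class name `BaseUriResolutionSketch`, its helper methods, and the `example://example.com/nested/` and `org.apache.jena.test://example.com/nested/` bases are assumptions made up for this example; only `ssh://example.com/nested/`, `foo.txt` and `file://example.com/etc/passwd` come from the message.

```java
import java.net.URI;

// Hypothetical sketch: JDK-only, does not call Jena or RIOT.
public class BaseUriResolutionSketch {

    /** Resolve a relative reference against a hierarchical base URI. */
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    /** Return the host part of the authority, or null if there is none. */
    static String host(String uri) {
        return URI.create(uri).getHost();
    }

    public static void main(String[] args) {
        // The ssh base is from the report; the other bases are illustrative.
        String[] bases = {
            "http://example.com/nested/",
            "ssh://example.com/nested/",
            "example://example.com/nested/",
            "org.apache.jena.test://example.com/nested/"
        };
        for (String base : bases) {
            // The scheme is carried over unchanged; only the path is merged.
            System.out.println(base + " + foo.txt -> " + resolve(base, "foo.txt"));
        }

        // A file URI with a hostname keeps that hostname as its authority.
        String fileUri = "file://example.com/etc/passwd";
        System.out.println("host of " + fileUri + " -> " + host(fileUri));
    }
}
```

A JUnit test along the lines of the ones in the pull request could assert the same expected strings against the triples RIOT produces, but that part is left out here because it depends on the Jena version under test.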