shawnsmith opened a new issue, #3465:
URL: https://github.com/apache/jena/issues/3465

   ### Version
   
   5.5.0
   
   ### What happened?
   
   Testing out `jena-iri3986` to see what might change in Jena 6.0 compared to 
`jena-iri` on 5.5, I ran across an edge case where `IRIx.resolve()` behavior is 
different between the two. The `jena-iri3986` implementation strips a trailing 
`/` where I don't think it should.
   
   Here's what I think is supposed to happen:
   ```java
   package org.apache.jena.rfc3986;
   ...
   public class TestResolve {
       ...
       @Test public void resolve_ref_5() { testResolve("http://example/", "https:subdir/", "https:subdir/"); }
       @Test public void resolve_ref_6() { testResolve("http://example/", "urn:foo/", "urn:foo/"); }
   ```
   
   With the current implementation of `AlgResolveIRI`, the results are:
   
   | Actual | Expected |
   | - | - |
   | `https:subdir` | `https:subdir/` |
   | `urn:foo` | `urn:foo/` |
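
To show why the trailing `/` is expected to survive: RFC 3986 section 5.2.2 says that when the reference has its own scheme, the target path is `remove_dot_segments(R.path)`. Below is a minimal sketch of that routine from section 5.2.4. The class and method names are hypothetical and this is not the Jena code, only an illustration of the RFC steps.

```java
public class RemoveDotSegmentsSketch {
    // Hypothetical helper following RFC 3986 section 5.2.4, steps A to E.
    static String removeDotSegments(String path) {
        String in = path;
        StringBuilder out = new StringBuilder();
        while ( !in.isEmpty() ) {
            if ( in.startsWith("../") )                    // A
                in = in.substring(3);
            else if ( in.startsWith("./") )                // A
                in = in.substring(2);
            else if ( in.startsWith("/./") )               // B
                in = in.substring(2);
            else if ( in.equals("/.") )                    // B
                in = "/";
            else if ( in.startsWith("/../") ) {            // C
                in = in.substring(3);
                removeLastSegment(out);
            } else if ( in.equals("/..") ) {               // C
                in = "/";
                removeLastSegment(out);
            } else if ( in.equals(".") || in.equals("..") ) // D
                in = "";
            else {                                         // E
                // Move the first segment, including any leading "/",
                // up to but not including the next "/".
                int next = in.indexOf('/', 1);
                if ( next < 0 )
                    next = in.length();
                out.append(in, 0, next);
                in = in.substring(next);
            }
        }
        return out.toString();
    }

    // Drop the last output segment together with its preceding "/".
    private static void removeLastSegment(StringBuilder out) {
        int i = out.lastIndexOf("/");
        out.setLength(i < 0 ? 0 : i);
    }

    public static void main(String[] args) {
        System.out.println(removeDotSegments("subdir/"));           // subdir/
        System.out.println(removeDotSegments("foo/"));              // foo/
        System.out.println(removeDotSegments("/a/b/c/./../../g"));  // /a/g
    }
}
```

For a path such as `subdir/`, only step E applies, so the final `/` is copied to the output unchanged.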
   
   The `jena-iri` implementation of resolve does not strip the trailing `/`.
   
   As a sanity check, I tried the same resolve operations in Chrome and Node.js, 
which I think are supposed to follow RFC 3986 (strictly, their `URL` API 
implements the WHATWG URL Standard). The result doesn't strip the trailing 
slash either, though it does add `//` in the https case:
   ```
   $ node
   > new URL('https:subdir/', 'http://example.com/').toString()
   'https://subdir/'
   > new URL('urn:foo/', 'http://example.com/').toString()
   'urn:foo/'
   ```
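
The same check can be made on the JVM with `java.net.URI`, as a rough additional reference point. This is only a sketch: `java.net.URI` follows RFC 2396 (as amended by RFC 2732) rather than RFC 3986, and it treats both references as opaque URIs that it returns unchanged. The class name `UriResolveCheck` is made up for this example.

```java
import java.net.URI;

public class UriResolveCheck {
    // Resolve ref against base using the JDK's URI class.
    static String resolve(String base, String ref) {
        return URI.create(base).resolve(URI.create(ref)).toString();
    }

    public static void main(String[] args) {
        // Both references carry their own scheme, so the JDK returns them as given.
        System.out.println(resolve("http://example/", "https:subdir/")); // https:subdir/
        System.out.println(resolve("http://example/", "urn:foo/"));      // urn:foo/
    }
}
```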
   
   Looking at the implementation of `AlgResolveIRI`, I believe the root cause 
is that `trailingSlash` is computed incorrectly when the path has no leading 
slash: the `N > 1` check assumes `segments[0]` is the empty segment produced 
by a leading `/`. This one-line change passes all the tests, and I think it 
will fix this edge case:
   ```patch
   diff --git a/jena-iri3986/src/main/java/org/apache/jena/rfc3986/AlgResolveIRI.java b/jena-iri3986/src/main/java/org/apache/jena/rfc3986/AlgResolveIRI.java
   index d560f885cc..9c162c0d15 100644
   --- a/jena-iri3986/src/main/java/org/apache/jena/rfc3986/AlgResolveIRI.java
   +++ b/jena-iri3986/src/main/java/org/apache/jena/rfc3986/AlgResolveIRI.java
   @@ -200,7 +200,7 @@ public class AlgResolveIRI {
            boolean initialSlash = segments[0].isEmpty();
            boolean trailingSlash = false;
            // Trailing slash if it isn't the initial "/" and it ends in "/" or "/." or "/.."
   -        if ( N > 1 ) {
   +        if ( N > (initialSlash ? 1 : 0) ) {
                if ( segments[N-1].equals(".") || segments[N-1].equals("..") )
                    trailingSlash = true;
                else if ( path.charAt(path.length()-1) == '/' )
   ```
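
To make the effect of the condition concrete, here is a standalone sketch that compares the original and the proposed check. It rests on two assumptions that the diff does not confirm: that `segments` comes from `String.split("/")` (which drops trailing empty strings), and that the branch cut off at the end of the hunk sets `trailingSlash = true`. The class and method names are hypothetical, not Jena API.

```java
public class TrailingSlashSketch {
    // Original condition from the diff context: N > 1.
    static boolean trailingSlashOriginal(String path) {
        return trailingSlash(path, false);
    }

    // Proposed condition: N > (initialSlash ? 1 : 0).
    static boolean trailingSlashPatched(String path) {
        return trailingSlash(path, true);
    }

    private static boolean trailingSlash(String path, boolean patched) {
        // Assumption: the segments are produced by String.split("/").
        String[] segments = path.split("/");
        int N = segments.length;
        if ( N == 0 )   // path consists only of "/" characters
            return false;
        boolean initialSlash = segments[0].isEmpty();
        int threshold = patched ? (initialSlash ? 1 : 0) : 1;
        if ( N > threshold ) {
            if ( segments[N-1].equals(".") || segments[N-1].equals("..") )
                return true;
            // Assumption: the truncated branch in the diff sets trailingSlash = true.
            return path.charAt(path.length()-1) == '/';
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(trailingSlashOriginal("/subdir/")); // true
        System.out.println(trailingSlashOriginal("subdir/"));  // false: the slash is lost
        System.out.println(trailingSlashPatched("subdir/"));   // true
        System.out.println(trailingSlashPatched("subdir"));    // false
    }
}
```

Under these assumptions, `subdir/` splits into a single segment, so `N > 1` never inspects the final character, while the patched threshold of `0` does.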
   
   ### Are you interested in making a pull request?
   
   Maybe

