Stian Soiland-Reyes created JENA-1161:
-----------------------------------------

             Summary: riot cmdline uses wrong base when parsing Turtle over http
                 Key: JENA-1161
                 URL: https://issues.apache.org/jira/browse/JENA-1161
             Project: Apache Jena
          Issue Type: Bug
          Components: Cmd line tools
    Affects Versions: Jena 3.0.1
            Reporter: Stian Soiland-Reyes


Parsing a Turtle file that is served over http:// or https://, has no @base, 
and uses relative IRI references wrongly takes the current directory (as a 
file:/// IRI) as the base.

The command line
{code}
stain@biggie:/tmp$ riot https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl
{code}

where that URL returns Content-Type: text/turtle;charset=utf-8 with the body:

{code}
@prefix : <#> .
<> :a <#test> .
{code}

is wrongly parsed by the riot command line tool to be relative to the current 
directory:

{code}
<file:///tmp/https:/cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl> <file:///tmp/https:/cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl#a> <file:///tmp/https:/cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl#test> .
{code}
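The shape of the wrong IRI suggests that the argument is being handled like a relative file name. The following Python sketch is only an illustration of that guess, not Jena's actual code path; the helper name naive_file_base is hypothetical. It shows that joining the URL onto the current directory and normalising the path reproduces the "file:///tmp/https:/..." prefix seen above, including the collapsed double slash after "https:".

```python
import posixpath

URL = ("https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/"
       "bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl")


def naive_file_base(cwd: str, argument: str) -> str:
    """Hypothetical helper: treat the argument as a relative file name.

    This is an assumption about how the wrong base could arise; it is
    not taken from the Jena source.
    """
    # Joining onto the working directory keeps the URL text as path segments.
    joined = posixpath.join(cwd, argument)
    # Path normalisation collapses "//" into "/", turning "https://" into "https:/".
    normalised = posixpath.normpath(joined)
    return "file://" + normalised


if __name__ == "__main__":
    print(naive_file_base("/tmp", URL))
```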

The expected output would be the same as supplying the same URI as a --base:

{code}
stain@biggie:/tmp$ riot --base https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl
<https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl> <https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl#a> <https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl#test> .
{code}
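As a cross-check of the expected triple, here is a minimal Python sketch that resolves the three relative references from the Turtle body against the retrieval URL, using the standard library's urljoin as a stand-in for RFC 3986 reference resolution. The function name resolve_all is made up for this example. With "@prefix : <#>", the prefixed name ":a" amounts to the relative reference "#a".

```python
from urllib.parse import urljoin

BASE = ("https://cdn.rawgit.com/stain/3d49908d790c5678faee302ba17f4a43/raw/"
        "bceb24ee6bfcaa60c4508779bc5f09ec367876f1/nobase.ttl")


def resolve_all(base: str, references: list[str]) -> list[str]:
    """Resolve each relative reference against the given base IRI."""
    return [urljoin(base, ref) for ref in references]


if __name__ == "__main__":
    # "" is the subject <>, "#a" is the predicate :a, "#test" is the object <#test>.
    subject, predicate, obj = resolve_all(BASE, ["", "#a", "#test"])
    print(f"<{subject}> <{predicate}> <{obj}> .")
```

With the retrieval URL as base, all three terms stay under https:// and differ only in their fragment.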

(The exception is when a Content-Location header is provided or an HTTP 
redirect has been followed; in that case the resulting URI should be used as 
the base.)
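The base selection suggested here could be sketched roughly as follows. This is an illustration in Python under stated assumptions, not a description of riot's behaviour: the function choose_base and its parameters are hypothetical, final_url stands for the URL after any redirects have been followed, and the order of precedence (explicit --base, then Content-Location, then the final URL) is one reading of the suggestion in this report.

```python
from typing import Optional
from urllib.parse import urljoin


def choose_base(final_url: str,
                content_location: Optional[str] = None,
                explicit_base: Optional[str] = None) -> str:
    """Pick a base IRI for parsing a document fetched over HTTP.

    Hypothetical helper; the precedence is an assumption for illustration.
    """
    if explicit_base:
        # A base given by the user (for example via --base) wins.
        return explicit_base
    if content_location:
        # Content-Location may itself be relative, so resolve it
        # against the URL the response actually came from.
        return urljoin(final_url, content_location)
    # Otherwise fall back to the URL after redirects, never the working directory.
    return final_url


if __name__ == "__main__":
    print(choose_base("https://example.org/data/nobase.ttl"))
    print(choose_base("https://example.org/data/nobase.ttl", "v2/nobase.ttl"))
```

The example.org URLs are placeholders. Whatever the exact precedence, the fallback in this sketch is always an http(s) IRI and never a file:/// IRI derived from the current directory.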

Relevant specs:

https://www.w3.org/TR/turtle/#sec-iri-references
https://www.ietf.org/rfc/rfc3986 sections 5.1 and 5.2



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
