Osma Suominen created JENA-1424:
-----------------------------------

             Summary: LOAD ... INTO GRAPH relies too much on filename extension
                 Key: JENA-1424
                 URL: https://issues.apache.org/jira/browse/JENA-1424
             Project: Apache Jena
          Issue Type: Bug
    Affects Versions: Jena 3.5.0
         Environment: Ubuntu 16.04
            Reporter: Osma Suominen
            Priority: Minor


I tried to perform this SPARQL Update on an empty TDB:

{noformat}
LOAD <http://api.finto.fi/rest/v1/yso/data> INTO GRAPH <http://www.yso.fi/onto/yso/>
{noformat}

but got this error from tdbupdate:

{noformat}
$ tdbupdate --loc tdb --update=load-yso.ru 
org.apache.jena.update.UpdateException: Attempt to load quads into a graph
        at org.apache.jena.sparql.modify.UpdateEngineWorker.visit(UpdateEngineWorker.java:146)
        at org.apache.jena.sparql.modify.request.UpdateLoad.visit(UpdateLoad.java:64)
        at org.apache.jena.sparql.modify.UpdateVisitorSink.send(UpdateVisitorSink.java:46)
        at org.apache.jena.sparql.modify.UpdateVisitorSink.send(UpdateVisitorSink.java:26)
        at org.apache.jena.atlas.iterator.Iter.sendToSink(Iter.java:546)
        at org.apache.jena.atlas.iterator.Iter.sendToSink(Iter.java:553)
        at org.apache.jena.sparql.modify.UpdateProcessorBase.execute(UpdateProcessorBase.java:59)
        at arq.update.execOneFile(update.java:105)
        at arq.update.execUpdate(update.java:81)
        at arq.cmdline.CmdUpdate.exec(CmdUpdate.java:63)
        at jena.cmd.CmdMain.mainMethod(CmdMain.java:93)
        at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
        at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
        at tdb.tdbupdate.main(tdbupdate.java:37)
{noformat}

So basically Jena is complaining that I'm trying to load quads into a graph. 
But that's not true. The URL specified in the LOAD performs content 
negotiation and then redirects (302) to either an RDF/XML or a Turtle 
serialization (with the proper Content-Type headers). Both are graph formats, 
not quad formats.

The problem seems to be that UpdateEngineWorker checks the URL specified in the 
LOAD as if it were a filename and throws an exception if its file extension 
doesn't match one of the known graph formats. In this case the URL has no 
extension, so it can't match.
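
To illustrate the difference, here is a minimal, self-contained Java sketch. It 
is not Jena's actual code: the class LoadFormatCheck, its methods, and the small 
extension and media-type tables are hypothetical and only model the behaviour as 
described above. It contrasts a strict extension-only check with one possible 
alternative that consults the Content-Type header first and rejects the load 
only when something positively indicates a quad format.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Jena.
class LoadFormatCheck {
    enum Kind { TRIPLES, QUADS, UNKNOWN }

    // Illustrative subset of file extensions, not a complete registry.
    private static final Map<String, Kind> BY_EXTENSION = Map.of(
            "ttl", Kind.TRIPLES,
            "rdf", Kind.TRIPLES,
            "nt", Kind.TRIPLES,
            "nq", Kind.QUADS,
            "trig", Kind.QUADS);

    // Illustrative subset of media types, not a complete registry.
    private static final Map<String, Kind> BY_CONTENT_TYPE = Map.of(
            "text/turtle", Kind.TRIPLES,
            "application/rdf+xml", Kind.TRIPLES,
            "application/n-triples", Kind.TRIPLES,
            "application/n-quads", Kind.QUADS,
            "application/trig", Kind.QUADS);

    // Treats the URL as a filename and looks only at its extension.
    static Kind kindFromExtension(String url) {
        String lastSegment = url.substring(url.lastIndexOf('/') + 1);
        int dot = lastSegment.lastIndexOf('.');
        if (dot < 0) {
            return Kind.UNKNOWN; // e.g. ".../yso/data" has no extension
        }
        String ext = lastSegment.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, Kind.UNKNOWN);
    }

    // Uses the media type from a Content-Type header, ignoring parameters
    // such as "; charset=utf-8".
    static Kind kindFromContentType(String header) {
        if (header == null) {
            return Kind.UNKNOWN;
        }
        String mediaType = header.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return BY_CONTENT_TYPE.getOrDefault(mediaType, Kind.UNKNOWN);
    }

    // Models the reported behaviour: accept only if the extension is a
    // known graph format.
    static boolean strictExtensionCheckAccepts(String url) {
        return kindFromExtension(url) == Kind.TRIPLES;
    }

    // One possible alternative: prefer the Content-Type, fall back to the
    // extension, and reject only when the result is positively quads.
    static boolean lenientCheckAccepts(String url, String contentType) {
        Kind fromHeader = kindFromContentType(contentType);
        Kind kind = (fromHeader != Kind.UNKNOWN) ? fromHeader : kindFromExtension(url);
        return kind != Kind.QUADS;
    }

    public static void main(String[] args) {
        String negotiated = "http://api.finto.fi/rest/v1/yso/data";
        String direct = "http://api.finto.fi/download/yso/yso-skos.ttl";

        System.out.println(strictExtensionCheckAccepts(negotiated)); // false
        System.out.println(strictExtensionCheckAccepts(direct));     // true
        System.out.println(lenientCheckAccepts(negotiated, "text/turtle; charset=utf-8")); // true
    }
}
```

In this sketch the extensionless URL is rejected by the strict check even 
though the server would describe the response as Turtle, while the lenient 
variant still refuses a response labelled application/n-quads.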

The check was introduced 6 months ago in this commit:
https://github.com/apache/jena/commit/931a437bb49fecdb1cb70a5e6225e27141dec86c#diff-d0b3b8995c502712dac778f5bb61bc9dR146

If I use the URL that the above URL redirects to, which contains a .ttl file 
extension, loading works fine:

{noformat}
LOAD <http://api.finto.fi/download/yso/yso-skos.ttl> INTO GRAPH 
<http://www.yso.fi/onto/yso/>
{noformat}

But this means that the LOAD ... INTO GRAPH ... command cannot be used with 
arbitrary Linked Data URIs, just ones that happen to contain a file extension 
like .ttl or .rdf or .nt.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
