[ 
https://issues.apache.org/jira/browse/JENA-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14028203#comment-14028203
 ] 

Tim Harsch edited comment on JENA-712 at 6/11/14 6:49 PM:
----------------------------------------------------------

The proposal by ~rvesse seems like a good solution to me.  I don't really 
understand the issues behind the equality contract, so I can't weigh in on 
that.  Making the behavior configurable is the only thing that makes sense 
to me.  For our use cases we can probably work with a default that preserves 
the current behavior, but I think the better default is the one that would 
not trip up users performing remote queries.

It may be worth noting how I ran into the issue, which is a slight 
perturbation of Rob's demonstration case.  With the following pseudo code:
UpdateRequest ur1 = UpdateFactory.create(
    "LOAD <...> INTO GRAPH <file:///private/tmp/mydir/aggregates-all.nq>");
UpdateProcessor processor = UpdateExecutionFactory.createRemote(ur1, ..., ...);
processor.execute();

If I run that code with the working directory set to /private/tmp, the 
SPARQL that gets sent to the remote system is:
LOAD <...> INTO GRAPH <mydir/aggregates-all.nq>
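
The mismatch can be reproduced in isolation.  The following is a minimal 
sketch using only java.net.URI rather than ARQ, so it illustrates the 
relativise-then-resolve problem and does not claim to show how ARQ serialises 
updates internally.  The server base file:///srv/store/ and the class and 
method names are hypothetical.

```java
import java.net.URI;

class RelativeIriDemo {
    // Relativise an absolute IRI against the base the client happens to use
    // (here: the client's working directory).
    static URI relativize(String clientBase, String absolute) {
        return URI.create(clientBase).relativize(URI.create(absolute));
    }

    // Resolve the transmitted relative reference against whatever base the
    // receiving system assumes.
    static URI resolve(String serverBase, URI relative) {
        return URI.create(serverBase).resolve(relative);
    }

    public static void main(String[] args) {
        URI sent = relativize("file:///private/tmp/",
                "file:///private/tmp/mydir/aggregates-all.nq");
        // The relative form that goes over the wire.
        System.out.println(sent);
        // Hypothetical server base: the graph name now points somewhere else.
        System.out.println(resolve("file:///srv/store/", sent).getPath());
    }
}
```

Without a BASE declaration travelling with the request, the receiver has no 
way to recover the /private/tmp/ prefix the client started from.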

So the implicit base URI is a problem for the remote system, as it is not 
likely to have the same implicit base URI as the one chosen by the client (as 
Rob mentioned earlier).  What I don't get is what Andy was trying to describe 
on the mailing list: "For a SPARQL Service request there is a base URI 
... it is almost certainly not what you want :-) as it's the whole request URL".  
What I see here is the current working directory, not the request URL.  Maybe 
I misunderstood.
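
Until the serialisation behaviour is settled, one possible client-side 
workaround is to prepend an explicit BASE to the serialised text before it is 
sent.  This is only a sketch: withBase is a hypothetical helper, not a Jena 
API, the example.org source IRI is a placeholder, and the check only detects a 
BASE at the very start of the string (a BASE that follows a PREFIX 
declaration would be missed).

```java
class ExplicitBase {
    // Hypothetical helper, not part of Jena: prefix a BASE declaration
    // unless the text already starts with one.
    static String withBase(String sparql, String baseIri) {
        // Naive check: optional whitespace, then BASE followed by an IRI.
        if (sparql.matches("(?is)\\s*BASE\\s*<.*")) {
            return sparql;
        }
        return "BASE <" + baseIri + ">\n" + sparql;
    }

    public static void main(String[] args) {
        // Placeholder source IRI; the graph name is the relative form
        // that would otherwise be sent as-is.
        String sent = "LOAD <http://example.org/data> INTO GRAPH <mydir/aggregates-all.nq>";
        System.out.println(withBase(sent, "file:///private/tmp/"));
    }
}
```

With the client's base attached, a receiver that honours BASE resolves the 
relative graph name back to the absolute IRI the client originally wrote.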




> ARQ serialises queries and updates using relative URIs but does not include a 
> BASE clause
> -----------------------------------------------------------------------------------------
>
>                 Key: JENA-712
>                 URL: https://issues.apache.org/jira/browse/JENA-712
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: ARQ
>    Affects Versions: Jena 2.11.2
>            Reporter: Rob Vesse
>         Attachments: JENA-712-AlwaysWriteBase.patch, 
> JENA-712-ConfigurableOutputImplictBase.patch, 
> JENA-712-ConfigurableOutputImplictBaseOnByDefault.patch, 
> SparqlRelativeUriTreatment.java
>
>
> An internal discussion with [~harschware] has raised what we think is a bug 
> in ARQ's behaviour, though it is somewhat open to interpretation and so may 
> be controversial.
> The code I will attach demonstrates the issue.
> The problem arises as follows:
> 1 - When given a query/update with a relative URI ARQ resolves it against an 
> implicit Base URI of the current working directory
> 2 - When applying {{toString()}} on the parsed {{Query}} or {{UpdateRequest}} 
> the implicit Base URI is used and relative URIs are output but no {{BASE}} 
> clause is output
> 3 - The query is transmitted to a different system which has a different 
> working directory and so interprets it differently resulting in unexpected 
> behaviour/errors
> This causes us issues because the relative URIs are valid relative to the 
> working directory of the client but not relative to the working directory of 
> the server so we want absolute URIs to be transmitted to the server.
> For example given the following query string:
> {noformat}
> SELECT * WHERE { <path/to/thing> a ?type }
> {noformat}
> Calling {{toString()}} on the resulting {{Query}} object gives the following:
> {noformat}
> SELECT  *
> WHERE
>   { <path/to/thing> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type }
> {noformat}
> This does not include the {{BASE}} declaration.  If, however, we force the 
> {{Query}} object to have a null base via {{setBaseURI((String)null)}}, ARQ 
> prints the following when {{toString()}} is called:
> {noformat}
> BASE    <file:///Users/rvesse/Documents/Work/Code/jena-playground/>
> SELECT  *
> WHERE
>   { <path/to/thing> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> ?type }
> {noformat}
> More generally, it seems that a {{BASE}} declaration is never written 
> whenever an implicit Base URI is used, or where a Base URI is passed only to 
> the {{QueryFactory.create()}} or {{UpdateFactory.create()}} call.  That is, 
> when there is an {{IRIResolver}} set but not a specific Base URI, no {{BASE}} 
> declaration is written but URIs are still serialised in relative form.
> We can appreciate that other people may have use cases where leaving relative 
> URIs as-is and not including a {{BASE}} is desirable, but our feeling is that 
> in the more general case this does more harm than good and lets users 
> unwittingly shoot themselves in the foot, as we have done in this example.
> We would like to propose that the default behaviour should be for a {{BASE}} 
> declaration to always be written if relative URIs are being output, or at 
> the very least that the behaviour be configurable.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
