On Tuesday 13 October 2009 22:20:16 Patrick van Kleef wrote:
> Hi Sebastian,
>
> > I am having trouble with some queries when URIs contain special
> > characters. In
> > the example below it is ( and ).
> > The question is: how do I encode them? Do I need to use percent
> > encoding or is
> > there another way?
> > And which are the characters that I need to encode?
> >
> > Thanks a lot
> > Sebastian Trueg
> >
> > SQLExecDirect failed on query 'sparql select ?r where { ?r
> > <http://www.semanticdesktop.org/ontologies/2007/11/01/
> > pimo#groundingOccurrence>
> > <file:///home/freedom/media/lossless%20recoded/Avril%20Lavigne%20-%
> > 20Under%20My%20Skin(APE)/Avril%20Lavigne%20-%20Nobody's%20Home%20
> > (LIVE%20Acoustic).flac>
> > . }'
> >
> > iODBC Error: [OpenLink][Virtuoso iODBC Driver][Virtuoso Server]SQ074:
> > Line 1: Unterminated SPARQL short single-quoted string at ''
>
> I suggest you read the following article:
>
> http://en.wikipedia.org/wiki/Percent-encoding
>
OK, as always I was not clear enough. Let me try to improve on that.
In Soprano we use QUrl to represent URIs. QUrl provides a method called
toEncoded, which returns an ASCII string in which non-ASCII characters and
reserved characters that are not used in their special meaning are
percent-encoded.
This works nicely for backends like redland and sesame2: URIs can be encoded
and decoded without loss.
However, QUrl::toEncoded does not percent-encode some reserved characters,
for example ( or '.
This is no problem as far as the standard is concerned, since those have no
special meaning in the schemes we use, but it makes Virtuoso choke.
Now this can sort of be solved by percent-encoding every character that is
neither in the unreserved set nor #. (Encoding the latter could turn the
fragment into part of the path when converting back.)
But this is not backwards compatible with old data from Soprano, which uses the
"simple" encoded form. This is a problem since I need to merge data from the
old backends into Virtuoso.
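
To make the difference between the two forms concrete, here is a rough sketch
of the stricter encoding described above. It is written in Python rather than
against QUrl to keep it self-contained, and it is not Soprano code. The
function name encode_strictly is made up for illustration. Besides # it also
leaves : and / alone so the scheme and path structure survive, and it leaves %
alone so existing escapes like %20 are not encoded twice; both of those are my
assumptions and go beyond what is stated above.

```python
import re

# Characters copied through unchanged:
#  - the RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~")
#  - "#", so the fragment stays a fragment
#  - ":" and "/" (assumption: keep scheme and path structure intact)
#  - "%" (assumption: input is already in the "simple" encoded form,
#    so existing escapes must not be encoded a second time)
_KEEP = re.compile(r"[A-Za-z0-9\-._~:/#%]")


def encode_strictly(encoded_uri: str) -> str:
    """Percent-encode every byte that is not in the keep set above."""
    out = []
    for byte in encoded_uri.encode("utf-8"):
        ch = chr(byte)
        if byte < 128 and _KEEP.match(ch):
            out.append(ch)
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)


simple = "file:///home/a%20b(APE)/Nobody's%20Home.flac"
strict = encode_strictly(simple)
print(strict)
```

The "simple" and strict strings differ, so a URI stored in redland in the
first form would not match the same URI stored in Virtuoso in the second.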
I could create a tool that converts all URIs but I would rather have a more
generic solution.
That is why I asked for a way to encode those characters in the query only, so
that the URIs stored in Virtuoso are the same as the ones stored in redland.
I hope this makes it a bit clearer.
Cheers,
Sebastian