I see. The only drawback is that the extracted RDF graph would not really
match the loaded one, and this may be an issue in some cases (at least in
theory). It would be perfect if the original datatype could be used (only)
at serialization time, but I see the point of the cost at query time.
In my case it is not really a problem, since queries will match both, and I
am happy to benefit from it considering the added value that you mentioned.

Thank you for the insight :)

Enrico
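
For readers following the thread, the roll-up described in the quoted
messages below can be sketched in a few lines of Python. This is only an
illustration of the idea, not TDB's actual code: the names XSD,
INTEGER_FAMILY and canonicalize are made up for this sketch, and it does not
check that a value fits the range of its declared type.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# xsd:integer plus the 12 types derived from it (13 in total, as counted
# in the thread). Hypothetical constant, not a Jena/TDB identifier.
INTEGER_FAMILY = frozenset(XSD + name for name in (
    "integer", "nonPositiveInteger", "negativeInteger", "long", "int",
    "short", "byte", "nonNegativeInteger", "unsignedLong", "unsignedInt",
    "unsignedShort", "unsignedByte", "positiveInteger",
))


def canonicalize(lexical, datatype):
    """Roll an integer-family literal up to xsd:integer.

    The value is parsed and the lexical form is rebuilt from it, so
    "0028" and "+28" both come back as "28". Other datatypes are
    returned unchanged.
    """
    if datatype not in INTEGER_FAMILY:
        return lexical, datatype
    value = int(lexical)  # accepts leading zeros and an explicit sign
    return str(value), XSD + "integer"


if __name__ == "__main__":
    for lex, dt in [("28", "int"), ("0028", "integer"), ("+28", "long")]:
        print((lex, dt), "->", canonicalize(lex, XSD + dt))
```

Under this sketch, "28"^^xsd:int, "0028"^^xsd:integer and "+28"^^xsd:long
all end up as the same stored term, which is why a pattern written with any
of them matches. It also shows why the original datatype cannot be recovered
afterwards.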


On 9 July 2013 09:40, Andy Seaborne <[email protected]> wrote:

> Enrico,
>
> Yes - two things are happening - derived types get rolled up to
> xsd:integer and also formats are canonicalized (well, the value is stored
> and the lexical form is remade if needed).
>
> {  ?a ?b 28 } matches {  ?a ?b 0028 } and {  ?a ?b +28 }
>
> The original datatype could have been kept (needs 4 bits of encoding -
> there are 13 derived types of xsd:decimal, from xsd:integer on down) but
> "28"^^xsd:int  would not match "28"^^xsd:integer without some cost.
>
> Inlining the value really speeds up numerical filters like
>
>    FILTER ( ?x < 56 )
>    FILTER ( ?x > 4 && ?x < 56 )
>
>         Andy
>
>
> On 09/07/13 09:24, Enrico Daga wrote:
>
>> Yes, I am using TDB, your pointers clarified all, thank you!
>>
>> In the end, any SPARQL expression asking for xsd:int also matches values
>> with xsd:integer, and this applies to any canonicalized datatype.
>> Returning to my example, the following queries actually work:
>>
>>   select * from <http://example/int/integer> where {  ?a ?b 28 }
>>
>> or
>>
>>   select * from <http://example/int/integer> where {  ?a ?b "28"^^<http://www.w3.org/2001/XMLSchema#int> }
>>
>> Thank you very much!
>>
>> Enrico
>>
>>
>>
>> On 8 July 2013 17:45, Rob Vesse <[email protected]> wrote:
>>
>>  Is this using TDB as a backend?
>>>
>>> This is by design in TDB - see Value Canonicalization and TDB Design
>>> (http://jena.apache.org/documentation/tdb/value_canonicalization.html and
>>> http://jena.apache.org/documentation/tdb/architecture.html) - and not a
>>> Fuseki issue but rather a feature of TDB.
>>>
>>> Since TDB inlines certain datatypes into the Node IDs in order to speed
>>> up common datatype computations it needs to normalize derived datatypes
>>> to the appropriate base type.  So as in your example anything derived
>>> from xsd:integer will be canonicalized to the xsd:integer form.
>>>
>>> Rob
>>>
>>>
>>> On 7/8/13 8:50 AM, "Enrico Daga" <[email protected]> wrote:
>>>
>>>  Hi,
>>>>
>>>> I loaded some data in Fuseki and found some differences in an xsd
>>>> datatype. A test case follows:
>>>>
>>>> insert data {
>>>> graph <http://example/int/integer> {
>>>> _:ex <http://example.org/property/size> "28"^^<http://www.w3.org/2001/XMLSchema#int>
>>>> }}
>>>>
>>>> Selecting data from the graph <http://example/int/integer> will show
>>>> the value as <http://www.w3.org/2001/XMLSchema#integer> instead.
>>>>
>>>> While this is not a big issue and I could live with that in principle,
>>>> in my specific situation (back-end migration to Fuseki), clients relying
>>>> on the xsd:int datatype will break (and I want the data to be consistent
>>>> with the legacy back-end).
>>>>
>>>> Any advice? Should I open a bug towards 0.2.8? ;)
>>>>
>>>> Thank you all,
>>>>
>>>> Enrico
>>>>
>>>>
>>>> --
>>>> Enrico Daga
>>>>
>>>> --
>>>> http://www.enridaga.net
>>>> skype: enri-pan
>>>>
>>>
>>>
>>>
>>
>>
>


-- 
Enrico Daga

--
http://www.enridaga.net
skype: enri-pan
