[ 
https://issues.apache.org/jira/browse/JENA-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886215#comment-17886215
 ] 

Andy Seaborne commented on JENA-2361:
-------------------------------------

This is related to how TDB2 stores decimals and also to how decimals are 
displayed in Turtle. The TDB2 contract has never been to preserve the original 
terms for certain XSD datatypes: not just decimals, but also floats and doubles.

TDB2 stores their value, encoded into the NodeId. The disk layout has not 
changed. It's the materialization that has changed.
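
As an illustration only (this is not Jena code), the following sketch shows why 
storing the value rather than the term loses the lexical form. The class and 
the encode/materialize helpers are hypothetical stand-ins; the real NodeId 
encoding is different. The output form is decided entirely by the 
materialization step, which is why the same stored data can print differently 
across versions.

```java
import java.math.BigDecimal;

// Hypothetical stand-in for value-based storage; NOT the TDB2 NodeId encoding.
final class ValueStorageSketch {

    // "Store" only the numeric value: trailing zeros are dropped, so the
    // lexical forms "2", "2.0" and "2.00" all become the same stored value.
    static BigDecimal encode(String lexical) {
        return new BigDecimal(lexical).stripTrailingZeros();
    }

    // Rebuild a lexical form from the stored value. This policy (assumed here)
    // always writes a decimal point, so integer values come back as "2.0".
    static String materialize(BigDecimal stored) {
        BigDecimal v = stored.scale() <= 0 ? stored.setScale(1) : stored;
        return v.toPlainString();
    }

    public static void main(String[] args) {
        // Different lexical forms, identical stored value.
        System.out.println(encode("2").equals(encode("2.00")));   // true
        // The original "2" cannot be recovered from the stored value.
        System.out.println(materialize(encode("2")));             // 2.0
    }
}
```

Changing only materialize() changes what is printed, without touching what was 
stored.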

There were different ways to normalize decimals in the various places this 
happens in the API code, as well as the effect of TDB value storage. 
Amongst other things, this led to inconsistent Turtle. The situation for floats 
and doubles was worse: the value could change.

https://github.com/apache/jena/issues/2557 put numeric normalization on a 
consistent footing as far as possible. "Perfect" is not possible because there 
are conflicting situations. Certain in-memory graphs and datasets store the 
node as given, which is efficient and so keeps the original lexical form.

XML Schema has changed its position on normalization. Only XML Schema 1.0 
looks to be compatible with Turtle.
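
A rough sketch of the conflict, for illustration only. It assumes (my reading 
of the specs, not something taken from the Jena code) that the XML Schema 1.0 
canonical form of xsd:decimal always has a decimal point, that XML Schema 1.1 
drops it for integer values, and that the Turtle shorthand for a decimal needs 
a point followed by at least one digit. All names below are hypothetical.

```java
import java.math.BigDecimal;

// Illustrative only: two normalization policies and a Turtle shorthand check.
final class DecimalCanonicalForms {

    // Assumed XML Schema 1.0 style: always a decimal point, e.g. "2" -> "2.0".
    static String withPoint(String lexical) {
        BigDecimal v = new BigDecimal(lexical).stripTrailingZeros();
        if (v.scale() <= 0) {
            v = v.setScale(1);   // add exactly one fraction digit
        }
        return v.toPlainString();
    }

    // Assumed XML Schema 1.1 style: no point for integer values, "2.0" -> "2".
    static String withoutPoint(String lexical) {
        return new BigDecimal(lexical).stripTrailingZeros().toPlainString();
    }

    // Turtle DECIMAL token: optional sign, optional digits, '.', then digits.
    // A bare "2" in Turtle is read as an xsd:integer, not an xsd:decimal.
    static boolean isTurtleDecimalShorthand(String s) {
        return s.matches("[+-]?[0-9]*\\.[0-9]+");
    }

    public static void main(String[] args) {
        System.out.println(isTurtleDecimalShorthand(withPoint("2")));      // true
        System.out.println(isTurtleDecimalShorthand(withoutPoint("2.0"))); // false
    }
}
```

Under these assumptions, only the form with the decimal point can be written 
as a bare number in Turtle and still be read back as an xsd:decimal.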



> TDB doesn't preserve lexical form of integer-valued decimals
> ------------------------------------------------------------
>
>                 Key: JENA-2361
>                 URL: https://issues.apache.org/jira/browse/JENA-2361
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB2
>    Affects Versions: Jena 5.1.0
>            Reporter: Damien Obrist
>            Priority: Minor
>
> The following sample code reproduces the issue:
>  
> {code:java}
> public static void main(String[] arguments) {
>     try {
>         FileUtils.deleteDirectory(new File("sample-data"));
>     } catch (IOException e) {
>         throw new RuntimeException(e);
>     }
>     Dataset dataset = TDB2Factory.connectDataset("sample-data");
>     Txn.executeWrite(dataset, () -> {
>         // TDB: code wrongly prints out "2.0"
>         Model model = dataset.getDefaultModel();
>         // in-memory: code correctly prints out "2"
>         //Model model = ModelFactory.createDefaultModel();
>         model.add(
>             ResourceFactory.createResource("http://www.test.com/my-graph"),
>             RDF.type,
>             ResourceFactory.createTypedLiteral("2", XSDDatatype.XSDdecimal)
>         );
>         System.out.println(model.listStatements().next().getObject().asLiteral().getLexicalForm());
>     });
> }  {code}
> The behaviour is as follows:
>  * When running the code with Jena 5.1.0, it wrongly prints out "2.0"
>  * When running the code with Jena 5.0.0, it correctly prints out "2"
>  * When running the code using the in-memory model instead of the TDB one, 
> then with both Jena 5.0.0 and 5.1.0, it correctly prints out "2"
> It seems that the problem happens when writing the triple, not when reading 
> it: if the above code is modified and run with Jena 5.0.0 to store, and with 
> Jena 5.1.0 to read the triple, then it correctly prints out "2".
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
