Re: XML Model read and datatypes

Bögershausen, Merlin Michael Wed, 27 Nov 2019 02:44:37 -0800

Hi,
since you use RDFDataMgr, I assume you are working in Java.
You could apply the datatypes while reading the graph, along these lines:
1) parse the schema into a map property -> datatype
2) implement org.apache.jena.riot.system.StreamRDF
    a) pass the datatype map and a graph to the constructor
    b) override void triple(Triple triple) to look up the datatype and add it to
the triple's literal before adding the triple to the graph
    c) provide a getter for the graph


Then you can do something like:
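// parseDatatypeMap() is step 1, DataTypedStreamRDF is the class from step 2 (both to be written)
Map<Node, RDFDatatype> typeMap = parseDatatypeMap();  // property -> datatype
Graph graph = GraphFactory.createDefaultGraph();
StreamRDF sink = new DataTypedStreamRDF(graph, typeMap);
// is: the InputStream, base: the base URI, lang: e.g. Lang.RDFXML
RDFDataMgr.parse(sink, is, base, lang);
Model model = ModelFactory.createModelForGraph(graph);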
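
As a rough illustration of step 2, the sketch below runs the lookup logic on its own, without Jena on the classpath. SimpleTriple, DataTypedSink and the example.org IRIs are hypothetical stand-ins, not Jena classes; in a real implementation the sink would implement org.apache.jena.riot.system.StreamRDF and work on Jena Triple objects instead. Retyping only literals that arrive without a datatype is an assumption about the desired behaviour.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in types only: these are NOT Jena classes. They mimic the shape of a
// triple and of a streaming sink so the lookup logic can run without Jena.
final class SimpleTriple {
    final String subject;
    final String predicate;
    final String lexical;      // literal value as written in the file
    final String datatypeIri;  // null means "no datatype given"

    SimpleTriple(String subject, String predicate, String lexical, String datatypeIri) {
        this.subject = subject;
        this.predicate = predicate;
        this.lexical = lexical;
        this.datatypeIri = datatypeIri;
    }
}

// Hypothetical counterpart of the DataTypedStreamRDF idea: receives the
// datatype map at creation, retypes each incoming triple, and collects it.
final class DataTypedSink {
    private final Map<String, String> typeMap;                  // property IRI -> datatype IRI
    private final List<SimpleTriple> graph = new ArrayList<>(); // stands in for the Jena graph

    DataTypedSink(Map<String, String> typeMap) {
        this.typeMap = typeMap;
    }

    // Step 2b: look up the property and attach the datatype before storing.
    void triple(SimpleTriple t) {
        String dt = typeMap.get(t.predicate);
        if (dt != null && t.datatypeIri == null) {
            t = new SimpleTriple(t.subject, t.predicate, t.lexical, dt);
        }
        graph.add(t);
    }

    // Step 2c: getter for the collected triples.
    List<SimpleTriple> getGraph() {
        return graph;
    }
}

final class DataTypedSinkDemo {
    static final String XSD_FLOAT = "http://www.w3.org/2001/XMLSchema#float";

    public static void main(String[] args) {
        Map<String, String> typeMap = new HashMap<>();
        // Hypothetical property IRI; the real namespace comes from the schema.
        typeMap.put("http://example.org/cp#class.attribute", XSD_FLOAT);

        DataTypedSink sink = new DataTypedSink(typeMap);
        sink.triple(new SimpleTriple("http://example.org/s1",
                "http://example.org/cp#class.attribute", "650.000000", null));
        sink.triple(new SimpleTriple("http://example.org/s1",
                "http://example.org/cp#name", "pump", null));

        for (SimpleTriple t : sink.getGraph()) {
            System.out.println(t.predicate + " -> \"" + t.lexical + "\""
                    + (t.datatypeIri == null ? "" : "^^<" + t.datatypeIri + ">"));
        }
    }
}
```

Running DataTypedSinkDemo prints each stored predicate with its literal; the class.attribute value carries the xsd:float datatype, while the property missing from the map stays untyped.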


Best, Merlin
-----Original Message-----
From: Dr. Chavdar Ivanov <[email protected]> 
Sent: Wednesday, November 27, 2019 09:38
To: [email protected]
Subject: Re: XML Model read and datatypes

Ok. I will try a few things :)

________________________________
From: Martynas Jusevičius <[email protected]>
Sent: Tuesday, November 26, 2019 10:43:27 PM
To: jena-users-ml <[email protected]>
Subject: Re: XML Model read and datatypes

Well if the instance data does not match the schema, then it's a problem :)

What you could do is add a step with a SPARQL CONSTRUCT query that transforms
the datatype-less data into valid data. Something like this (untested):

PREFIX cp:     <http://...>
PREFIX xsd:    <http://www.w3.org/2001/XMLSchema#>

CONSTRUCT
{
  ?s ?p ?o .
  ?s cp:class.attribute ?attrWithDT .
}
{
  # copy over unchanged triples (identity transform)
  {
    ?s ?p ?o .
    FILTER (?p NOT IN (cp:class.attribute))
  }
  UNION
  # fix values of certain properties, e.g. add datatype
  {
    ?s cp:class.attribute ?o .
    BIND (xsd:float(?o) AS ?attrWithDT)
  }
}

You can keep adding more UNION branches and FILTER IN values.

On Tue, Nov 26, 2019 at 10:33 PM Dr. Chavdar Ivanov <[email protected]> 
wrote:
>
> Thanks,
> Do I understand correctly that when the XML syntax is
> <cp:class.attribute>650.000000</cp:class.attribute> …
>
> there is no way to get anything other than a string, even if the datatype
> information is available in an RDF schema? I was thinking that the RDF
> schema could somehow be taken into account during the read, or as a second
> step.
>
>
>
> -----Original Message-----
> From: Martynas Jusevičius <[email protected]>
> Sent: Tuesday, November 26, 2019 10:21 PM
> To: jena-users-ml <[email protected]>
> Subject: Re: XML Model read and datatypes
>
>     @cp:class.attribute "650.000000";
>
> is also a string.
>
> In Turtle, typed literals need a datatype:
>
> @cp:class.attribute "650.000000"^^xsd:float;
>
> The spec: https://www.w3.org/TR/turtle/#abbrev
>
> In RDF/XML, that would be
>
> <cp:class.attribute
> rdf:datatype="http://www.w3.org/2001/XMLSchema#float">650.000000</cp:class.attribute>
>
> On Tue, Nov 26, 2019 at 9:55 PM Dr. Chavdar Ivanov <[email protected]> 
> wrote:
> >
> > Dear all,
> >
> > When I read an XML file, I see that all values are read as strings. It
> > doesn't matter whether they are strings, integers, floats, etc.
> >
> > I read the XML file using this:
> > RDFDataMgr.read(model, new FileInputStream(file), "http://myNs1#",
> > Lang.RDFXML);
> >
> > And in the XML I have
> > …
> > <cp:class.attribute>650.000000</cp:class.attribute> …
> >
> > In the model, when I browse it, I see
> > … @cp:class.attribute "650.000000";…
> >
> >
> > Is there a way to read the xml and parse the values in their proper form, 
> > e.g. float, string, …?
> >
> > Regards
> > Chavdar
