On Wed, 2011-10-26 at 15:27 +0100, Paolo Castagna wrote: 
> Dave Reynolds wrote:
> > Hi Paolo,
> > 
> > On Wed, 2011-10-26 at 14:34 +0100, Paolo Castagna wrote: 
> >> Dave Reynolds wrote:
> >>> On Wed, 2011-10-26 at 13:38 +0100, Paolo Castagna wrote:
> >>>
> >>>> I am not sure if these two triples in my data are both "correct", are 
> >>>> they?
> >>>>
> >>>> ----
> >>>> <foo:bar1> <foo:p> "6.0"^^<http://www.w3.org/2001/XMLSchema#int> .
> >>>> <foo:bar2> <foo:p> "6.0"^^<http://www.w3.org/2001/XMLSchema#integer> .
> >>>> ----
> >>> No. The lexical forms for int and integer do not allow ".". See:
> >>> http://www.w3.org/TR/xmlschema-2/#integer etc and
> >>> http://www.w3.org/TR/xmlschema11-2/#integer etc
> >>>
> >>> Perhaps that's why they aren't canonicalized by TDB.
> >> Thank you Dave.
> >>
> >> I still do not understand why I do not see errors or warnings when
> >> I validate my data with http://sparql.org/data-validator.html [1]
> > 
> > Humm. When I go to [1] and type in:
> > 
> >     <foo:bar1> <foo:p> "6.0"^^<http://www.w3.org/2001/XMLSchema#int> .
> > 
> > Select "Turtle" (default) and press the validate button then I see the
> > error:
> > 
> > """
> > [line: 10, col: 20] Lexical form '6.0' not valid for datatype
> > http://www.w3.org/2001/XMLSchema#int
> > <foo:bar1>  <foo:p>  "6.0"^^<http://www.w3.org/2001/XMLSchema#int> .
> > """
> > 
> > Not sure what might be different in your case.
> 
> Interesting...
> 
> Sorry, I assumed I would get the same answer regardless of the input format.
> But the validator (in Joseki) gives different answers for the same data.
> In particular, when N-Triples is used as the input format, there is no error.

N-Triples was originally (and technically still is) just a format for
writing down RDF/XML test cases so it has to be able to represent all
syntactically well-formed data even if it isn't legal by other criteria.
So it would be incorrect for an N-Triples reader to raise an error on
that data.

In fact, in RDF an ill-formed typed literal is not only not a syntax
error, it's not even a semantic inconsistency; it "just" represents a
value which is not in the space of literals. Also, of course, the set of
datatypes (other than rdf:XMLLiteral) is open ended in RDF, so there is
no guarantee that a given processor recognizes xsd:int.

Though in practice it's best to do eager checking of such things :)
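
As a rough illustration of what such eager checking could look like for
the two datatypes in this thread, here is a minimal Python sketch. It is
not how the validator, TDB or Eyeball work; the names check_literal and
check_lines are made up for the example. It assumes one triple per line,
pulls typed literals out with a regular expression rather than a real
N-Triples parser, skips any datatype it does not recognize, and does not
apply XML Schema whitespace normalization.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Lexical form shared by xsd:integer and xsd:int: optional sign, digits only.
INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")

# A typed literal as written in N-Triples: "lexical"^^<datatype IRI>
# (assumption: a regex is good enough here; a real parser would be stricter).
TYPED_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"\^\^<([^>]*)>')


def check_literal(lexical, datatype):
    """Return an error message, or None if no problem is detected."""
    if datatype not in (XSD + "integer", XSD + "int"):
        # The set of datatypes is open ended; unknown ones are not checked.
        return None
    if INTEGER_LEXICAL.fullmatch(lexical) is None:
        return f"Lexical form '{lexical}' not valid for datatype {datatype}"
    # xsd:int is additionally restricted to the signed 32-bit range.
    if datatype == XSD + "int" and not -2**31 <= int(lexical) <= 2**31 - 1:
        return f"Value '{lexical}' out of range for datatype {datatype}"
    return None


def check_lines(lines):
    """Yield (line number, message) for each suspicious typed literal."""
    for number, line in enumerate(lines, start=1):
        for lexical, datatype in TYPED_LITERAL.findall(line):
            message = check_literal(lexical, datatype)
            if message is not None:
                yield number, message


if __name__ == "__main__":
    data = [
        '<foo:bar1> <foo:p> "6.0"^^<http://www.w3.org/2001/XMLSchema#int> .',
        '<foo:bar2> <foo:p> "6"^^<http://www.w3.org/2001/XMLSchema#integer> .',
    ]
    for number, message in check_lines(data):
        print(f"[line: {number}] {message}")
```

Run over the two triples in the example, this flags only the first one,
because "6.0" contains a "." and the second literal is a plain "6".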

> I have a large N-Triples or N-Quads file, what's the best way (i.e. the more
> strict the better for me) to validate the data in it, before ingestion?

Use Eyeball?

Dave

