> Last time I checked (which was quite a while ago though), loading
> DBpedia in a normal triple store such as Jena TDB didn't work very well
> due to many issues with the DBpedia RDF (e.g., problems with the URIs of
> external links scraped from Wikipedia).
Agreed. Common errors in LOD data are:
-- single-quoted and double-quoted strings containing newlines;
-- blank nodes as predicates (though a SPARQL processor may ignore them!);
-- variables, though triples with variables are ignored;
-- literal subjects, though triples with them are ignored;
-- '/', '#', '%' and '+' in the local part of a QName ("QName with path");
-- invalid characters between '<' and '>', i.e. in relative IRIs.
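
A minimal sketch of how a few of the checks listed above might look in Python. The function name `check_term` and the error labels are made up for illustration and are not taken from any real parser; the set of characters rejected inside '<' and '>' is assumed to follow the IRIREF production of the W3C Turtle grammar.

```python
import re

# Characters assumed not to be allowed unescaped between '<' and '>'.
_BAD_IRI_CHARS = re.compile(r'[\x00-\x20<>"{}|^`\\]')
# Numeric escapes (\uXXXX, \UXXXXXXXX) are legal, so strip them before checking.
_UCHAR = re.compile(r'\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}')
# Characters from the list above that show up in the local part of a QName.
_BAD_LOCAL_CHARS = re.compile(r'[/#%+]')


def check_term(term, position):
    """Return labels for the problems found in one already-tokenized term.

    position is 'subject', 'predicate' or 'object'.
    All labels are hypothetical names chosen for this sketch.
    """
    problems = []
    if term.startswith('<') and term.endswith('>'):
        body = _UCHAR.sub('', term[1:-1])
        if _BAD_IRI_CHARS.search(body):
            problems.append('invalid-iri-char')
    elif term.startswith('_:'):
        if position == 'predicate':
            problems.append('bnode-predicate')
    elif term.startswith(('?', '$')):
        problems.append('variable')
    elif term.startswith(('"', "'")):
        if position == 'subject':
            problems.append('literal-subject')
        # Only triple-quoted ("long") strings may contain raw newlines.
        if not term.startswith(('"""', "'''")) and '\n' in term:
            problems.append('newline-in-short-string')
    elif ':' in term:
        local = term.split(':', 1)[1]
        if _BAD_LOCAL_CHARS.search(local):
            problems.append('qname-with-path')
    return problems
```

For example, `check_term('dbpedia:AC/DC', 'subject')` flags the "QName with path" case, while `check_term('dbpedia:Berlin', 'subject')` returns an empty list.
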
That's why my own TURTLE parser can be configured to selectively report or
ignore these errors. In addition, I can relax the TURTLE syntax to accept
popular violations such as redundant delimiters, and/or try to recover from
lexical errors as far as possible, even if that means losing some
ill-formed triples together with a limited number of valid triples around them
("GIGO mode", for "Garbage In, Garbage Out").
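
To illustrate the idea of a per-error report/ignore policy combined with a recovery mode, here is a rough Python sketch. It is not the parser described above: it assumes a simplified, line-oriented N-Triples-like input, and the names `parse_lenient`, `detect_error`, `ParseResult`, the `policy` dictionary and the `gigo` flag are all hypothetical. Because it works line by line, it drops only the offending line, whereas recovery in a real TURTLE parser may also lose neighbouring triples.

```python
import re
from dataclasses import dataclass, field

# Simplified statement shape: subject, predicate, object, final dot.
_TRIPLE = re.compile(r'^\s*(\S+)\s+(\S+)\s+(.+?)\s*\.\s*$')


@dataclass
class ParseResult:
    triples: list = field(default_factory=list)
    reports: list = field(default_factory=list)  # (line number, error label)
    dropped: int = 0                             # lines that produced no triple


def detect_error(s, p, o):
    """Return a (hypothetical) error label for a triple, or None if it looks fine."""
    if s.startswith(('"', "'")):
        return 'literal-subject'
    if p.startswith('_:'):
        return 'bnode-predicate'
    if any(t.startswith('?') for t in (s, p, o)):
        return 'variable'
    return None


def parse_lenient(text, policy, gigo=False):
    """Parse text line by line.

    policy maps an error label to 'report' or 'ignore' (default: 'report').
    Bad triples are dropped either way; the policy only controls reporting.
    With gigo=True, lines that cannot be tokenized are skipped instead of
    aborting the whole load.
    """
    result = ParseResult()
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.lstrip().startswith('#'):
            continue  # blank line or comment
        match = _TRIPLE.match(line)
        if match is None:
            if not gigo:
                raise ValueError(f'line {line_no}: lexical error')
            result.dropped += 1
            continue
        error = detect_error(*match.groups())
        if error is None:
            result.triples.append(match.groups())
            continue
        if policy.get(error, 'report') == 'report':
            result.reports.append((line_no, error))
        result.dropped += 1
    return result
```

Calling `parse_lenient(data, {'literal-subject': 'ignore'}, gigo=True)` would silently drop triples with literal subjects, report the other listed problems, and keep going past lines it cannot tokenize.
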
Best Regards,
Ivan Mikhailov
OpenLink Software
http://virtuoso.openlinksw.com
_______________________________________________
Dbpedia-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion