Hi Roman,
> I have a lot of errors when I want to load DBpedia dataset using isql, the
> command:
> ld_dir('/workingDir/btc2014_unzipped/01', 'data.nq-*', 'http://fake.org');
>
> Example error:
>
> 22007 XM003: XML parser detected an error: ERROR : Tag nesting
> error: name 'img' of end tag does not match the name 'p' of start tag
> at line 4 column 432 at line 4 column 438 of source text
> 04/02/skos/core#" xmlns:xsd="http://www.w3.org/2001/XMLSchema#"></img></p>
> ----------------------------------------------------------------------^
>
> OK, let's find the line where the error occurred (I added line breaks so it is
> easier to read):
>
> <http://core-project.kmi.open.ac.uk/data-description>
> <http://purl.org/rss/1.0/modules/content/encoded> "<h2
> xmlns=\"http://www.w3.org/1999/xhtml\"
> xmlns:content=\"http://purl.org/rss/1.0/modules/content/\"
> xmlns:dc=\"http://purl.org/dc/terms/\"
> xmlns:foaf=\"http://xmlns.com/foaf/0.1/\" xmlns:og=\"http://ogp.me/ns#\"
> xmlns:rdfs=\"http://www.w3.org/2000/01/rdf-schema#\"
> xmlns:sioc=\"http://rdfs.org/sioc/ns#\"
> xmlns:sioct=\"http://rdfs.org/sioc/types#\"
> xmlns:skos=\"http://www.w3.org/2004/02/skos/core#\"
> xmlns:xsd=\"http://www.w3.org/2001/XMLSchema#\">What data are
> exposed</h2>\n<p xmlns=\"http://www.w3.org/1999/xhtml\"
> xmlns:content=\"http://purl.org/rss/1.0/modules/content/\"
> xmlns:dc=\"http://purl.org/dc/terms/\"
> xmlns:foaf=\"http://xmlns.com/foaf/0.1/\" xmlns:og=\"http://ogp.me/ns#\"
> xmlns:rdfs=\"http://www.w3.org/2000/01/rdf-schema#\"
> xmlns:sioc=\"http://rdfs.org/sioc/ns#\"
> xmlns:sioct=\"http://rdfs.org/sioc/types#\"
> xmlns:skos=\"http://www.w3.org/2004/02/skos/core#\"
> xmlns:xsd=\"http://www.w3.org/2001/XMLSchema#\">The CORE project exposes data about the
> aggregated content. The following schema shows the kind of metadata CORE holds
> about each resource. </p>\n<h2 xmlns=\"http://www.w3.org/1999/xhtml\"
> xmlns:content=\"http://purl.org/rss/1.0/modules/content/\"
> xmlns:dc=\"http://purl.org/dc/terms/\"
> xmlns:foaf=\"http://xmlns.com/foaf/0.1/\" xmlns:og=\"http://ogp.me/ns#\"
> xmlns:rdfs=\"http://www.w3.org/2000/01/rdf-schema#\"
> xmlns:sioc=\"http://rdfs.org/sioc/ns#\"
> xmlns:sioct=\"http://rdfs.org/sioc/types#\"
> xmlns:skos=\"http://www.w3.org/2004/02/skos/core#\"
> xmlns:xsd=\"http://www.w3.org/2001/XMLSchema#\">Data Schema</h2>\n<p
> xmlns=\"http://www.w3.org/1999/xhtml\"
> xmlns:content=\"http://purl.org/rss/1.0/modules/content/\"
> xmlns:dc=\"http://purl.org/dc/terms/\"
> xmlns:foaf=\"http://xmlns.com/foaf/0.1/\" xmlns:og=\"http://ogp.me/ns#\"
> xmlns:rdfs=\"http://www.w3.org/2000/01/rdf-schema#\"
> xmlns:sioc=\"http://rdfs.org/sioc/ns#\"
> xmlns:sioct=\"http://rdfs.org/sioc/types#\"
> xmlns:skos=\"http://www.w3.org/2004/02/skos/core#\"
> xmlns:xsd=\"http://www.w3.org/2001/XMLSchema#\"></img></p>
> \n<h2 xmlns=\"http://www.w3.org/1999/xhtml\"
> xmlns:content=\"http://purl.org/rss/1.0/modules/content/\"
> xmlns:dc=\"http://purl.org/dc/terms/\"
> xmlns:foaf=\"http://xmlns.com/foaf/0.1/\" xmlns:og=\"http://ogp.me/ns#\"
> xmlns:rdfs=\"http://www.w3.org/2000/01/rdf-schema#\"
> xmlns:sioc=\"http://rdfs.org/sioc/ns#\"
> xmlns:sioct=\"http://rdfs.org/sioc/types#\"
> xmlns:skos=\"http://www.w3.org/2004/02/skos/core#\"
> xmlns:xsd=\"http://www.w3.org/2001/XMLSchema#\">Data License</h2>\n<p
> xmlns=\"http://www.w3.org/1999/xhtml\"
> xmlns:content=\"http://purl.org/rss/1.0/modules/content/\"
> xmlns:dc=\"http://purl.org/dc/terms/\"
> xmlns:foaf=\"http://xmlns.com/foaf/0.1/\" xmlns:og=\"http://ogp.me/ns#\"
> xmlns:rdfs=\"http://www.w3.org/2000/01/rdf-schema#\"
> xmlns:sioc=\"http://rdfs.org/sioc/ns#\"
> xmlns:sioct=\"http://rdfs.org/sioc/types#\"
> xmlns:skos=\"http://www.w3.org/2004/02/skos/core#\"
> xmlns:xsd=\"http://www.w3.org/2001/XMLSchema#\">All data from CORE (unless
> otherwise specified) are available under the a Creative Commons Attribution 3.0 Unported License.
> </p>\n"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral> .
>
> I also tried loading with different error bits, with the same result:
> DB.DBA.TTLP_MT (file_to_string_output
> ('/workingDir/btc2014_unzipped/01/data.nq-9'), '', 'http://fake.org', 512)
>
> Why does Virtuoso try to check HTML/XML tag consistency inside the literals?!
> Is it possible to turn it off? I have too many errors in the dataset; it is a
> waste of time trying to find all lines with errors and remove them by hand.
> I can't find anything related to this in the documentation.
I have reproduced the problem in-house and am currently talking to
development about a solution. I will advise as soon as a patch is
available.
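Until a patch is available, one possible workaround is to pre-filter the
N-Quads files and set aside the lines whose rdf:XMLLiteral value is not
well-formed XML. The following is only a rough sketch in Python, not part of
Virtuoso: the names split_quads, unescape and is_well_formed are made up for
illustration, the regular expression is a simplification of the N-Quads
grammar, and Python's XML parser may not accept or reject exactly the same
fragments as Virtuoso's.

```python
import re
import xml.etree.ElementTree as ET

# Simplified pattern (an assumption, not a full N-Quads parser): a quoted
# literal immediately followed by the rdf:XMLLiteral datatype.
XML_LITERAL = re.compile(
    r'"((?:[^"\\]|\\.)*)"'
    r'\^\^<http://www\.w3\.org/1999/02/22-rdf-syntax-ns#XMLLiteral>'
)
ESCAPE = re.compile(r'\\(u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8}|.)')
SIMPLE = {"t": "\t", "n": "\n", "r": "\r", "b": "\b", "f": "\f",
          '"': '"', "'": "'", "\\": "\\"}


def unescape(text):
    """Undo N-Quads string escapes such as \\" and \\n."""
    def repl(match):
        esc = match.group(1)
        if esc[0] in "uU" and len(esc) > 1:
            return chr(int(esc[1:], 16))
        return SIMPLE.get(esc, esc)
    return ESCAPE.sub(repl, text)


def is_well_formed(fragment):
    """Wrap the fragment in a dummy root element and try to parse it."""
    try:
        ET.fromstring("<root>" + fragment + "</root>")
        return True
    except ET.ParseError:
        return False


def split_quads(lines):
    """Return (good, bad): bad holds lines with a malformed XMLLiteral."""
    good, bad = [], []
    for line in lines:
        match = XML_LITERAL.search(line)
        if match and not is_well_formed(unescape(match.group(1))):
            bad.append(line)
        else:
            good.append(line)
    return good, bad
```

The good lines could then be written to a new file and loaded with ld_dir as
before, while the bad lines are kept aside for inspection.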
Note that this is NOT the DBpedia dataset itself you are trying to load, but
part of the Billion Triple Challenge 2014 (btc-2014) which is in a different
format.
If you really meant to load the DBpedia datasets, check out this page:
http://wiki.dbpedia.org/Downloads2015-04
Patrick
---
Patrick van Kleef
Program Manager
OpenLink Software
http://www.openlinksw.com/
http://twitter.com/openlink/