Spaces in URIs are particularly problematic; even if you can get them
into the data, using the data will likely break.
When ingesting data from somewhere else, it is good practice to check it
first, then fix it as needed before loading.
riot --check file....
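As a rough illustration of the "check first" step, here is a minimal Python sketch that scans N-Quads lines for whitespace inside <...> IRIs and separates clean lines from suspect ones. It is not a substitute for riot: it only looks for this one kind of error, and it assumes one quad per line with no comment lines. The names TOKEN, find_bad_iris and split_quads are made up for this example.

```python
import re

# Matches either a quoted literal (skipped) or an IRI in angle brackets
# (captured in group 1). Assumption: one quad per line, no comments.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|<([^<>]*)>')


def find_bad_iris(line):
    """Return the IRIs on this line that contain whitespace."""
    bad = []
    for match in TOKEN.finditer(line):
        iri = match.group(1)
        # group(1) is None when the match was a quoted literal.
        if iri is not None and re.search(r"\s", iri):
            bad.append(iri)
    return bad


def split_quads(lines):
    """Split lines into (good, bad); bad holds (line_number, text) pairs."""
    good, bad = [], []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if find_bad_iris(text):
            bad.append((number, text))
        else:
            good.append(text)
    return good, bad
```

The bad list keeps the line numbers, so the suspect quads can be reviewed and fixed rather than silently dropped.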
Andy
http://lov.okfn.org/lov.nq.gz is only 749810 quads. tdbloader2 is
overkill. Use tdbloader. tdbloader2 is an advantage for much larger
data (100 million+, and even then it is not always faster).
On 07/04/17 13:17, Martynas Jusevičius wrote:
This question comes up regularly:
http://markmail.org/message/seqiw74hhdx2u64j
On Fri, Apr 7, 2017 at 2:10 PM, Laura Morales wrote:
I'm trying to import the LOV dump [1] into Fuseki using tdbloader2.
Unfortunately some quads are "broken" in the sense that they're not
well-formed. For example, this one:
ERROR [line: 203556, col: 152] Bad character in IRI (space):
<http://securitytoolbox.appspot.com/MASO#Objectif[space]...>
org.apache.jena.riot.RiotException: [line: 203556, col: 152] Bad
character in IRI (space):
<http://securitytoolbox.appspot.com/MASO#Objectif[space]...>
Is there an option to tell tdbloader2 to simply ignore these nquads
(or show a warning) and keep going instead of raising an exception and
halting?
-----------------
The problem is much more the 'Spaces'.
Last but not least, I think a utility that builds a database for Fuseki
should not 'encourage' users to throw away this or that triple/quad line
just because they want the run to reach the end. It is clear where this
ends: then there is no logic left in what you do...
I have usually had this problem with downloaded DBpedia files:
long_abstracts_en.nt
long_abstracts_en_uris_de.nt
I repaired each line in an editor, the way our utility likes it, and if I
couldn't guess where the problem was in an object string, I wrote 'not
readable' for it...
Yes, I did it that way; maybe I was an idiot...
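For the specific case of spaces in IRIs, the hand repair could in principle be scripted. A minimal Python sketch follows, assuming the intended fix is to percent-encode each space as %20. That is a guess about what the publisher meant, and it yields a different IRI from the one in the file, so the result should still be reviewed. The function name encode_spaces_in_iris is made up for this example; quoted literals are left untouched.

```python
import re

# Matches either a quoted literal (left alone) or an IRI in angle brackets
# (captured in group 1). Assumption: one triple/quad per line, no comments.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|<([^<>]*)>')


def encode_spaces_in_iris(line):
    """Replace spaces inside <...> IRIs with %20; leave literals as they are."""
    def fix(match):
        iri = match.group(1)
        if iri is None:
            # The match was a quoted literal: return it unchanged.
            return match.group(0)
        return "<" + iri.replace(" ", "%20") + ">"

    return TOKEN.sub(fix, line)
```

Running the repaired file through riot --check afterwards would confirm whether anything else is still wrong.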
baran