Re: importing ntriples into tdb without stop at an error

Stefan Scheffler Wed, 13 Jun 2012 07:14:23 -0700


On 13.06.2012 15:55, Andy Seaborne wrote:

On 13/06/12 14:19, Damian Steer wrote:
On 13/06/12 14:03, Stefan Scheffler wrote:
Hello, I need to import large N-Triples files (DBpedia) into a TDB store.
The problem is that many of the triples are not valid (e.g. a
missing '<' or invalid characters), which leads to an exception that
aborts the import. I just want to skip them and continue, so that
all valid triples are in the TDB store at the end.

Is there an easy way to do that? I tried to rewrite
ARQ, but this is very complex. With friendly regards, Stefan
Scheffler
You'd be much better off finding an n-triple parser that kept going
and also spat out (working) n-triples for piping to TDB. I can't see
an option like that in the riot command line.
There isn't such an option - there could be (if someone wants to contribute a patch).
This is a typical ETL situation - you're going to have to clean those triples (which were presumably not written by an RDF tool). Do you want to lose them or fix them?
Checking before loading is always a good idea, especially for data from outside and other tools. When I receive TTL or RDF/XML, I parse it to NT, which means it's then checked. Then I load the data.
    Andy


  Hi Andy,

At the moment I just want to skip the invalid triples (later they should be stored and maybe fixed, if that's possible). The main goal is to have an import process that runs automatically and doesn't stop on every failure it finds. When the checking happens doesn't matter (atm ;)). It can be before or during the import (but I used the second strategy on Sesame).
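
To make the "skip now, fix later" idea concrete, here is a rough sketch of a pre-filter that could run before the load. It is not part of Jena or riot. The function name filter_ntriples and the regular expressions are my own assumptions, and the patterns only approximate the N-Triples grammar (they catch things like a missing '<' or an unterminated literal, but they are not a full validator).

```python
import re

# Assumption: simplified approximations of the N-Triples terms,
# good enough to catch gross syntax errors, not a complete grammar.
IRI = r'<(?:[^\x00-\x20<>"{}|^`\\]|\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8})*>'
BNODE = r'_:[A-Za-z0-9][A-Za-z0-9_.-]*'
LITERAL = (r'"(?:[^"\\\n\r]|\\.)*"'
           r'(?:\^\^' + IRI + r'|@[A-Za-z]+(?:-[A-Za-z0-9]+)*)?')

# subject predicate object "." on a single line
TRIPLE = re.compile(
    r'(?:' + IRI + r'|' + BNODE + r')\s+'
    + IRI + r'\s+'
    + r'(?:' + IRI + r'|' + BNODE + r'|' + LITERAL + r')\s*\.'
)


def filter_ntriples(src, good, bad):
    """Stream lines from src; write well-formed triples to good and
    everything else to bad. Blank lines and '#' comments are dropped.
    Returns (kept, skipped)."""
    kept = skipped = 0
    for line in src:
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue
        if TRIPLE.fullmatch(stripped):
            good.write(stripped + '\n')
            kept += 1
        else:
            # Keep the rejects so they can be inspected and fixed later.
            bad.write(stripped + '\n')
            skipped += 1
    return kept, skipped
```

The three arguments are ordinary text file objects, so the dump can be opened for reading and two output files for writing. The "good" file is then what gets loaded into TDB, and the "bad" file is kept for later repair. Because it works line by line, it never holds the whole dump in memory.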


Thanks Stefan

--
Stefan Scheffler
Avantgarde Labs GbR
Löbauer Straße 19, 01099 Dresden
Telefon: + 49 (0) 351 21590834
Email: [email protected]
