On 13/06/12 14:19, Damian Steer wrote:

On 13/06/12 14:03, Stefan Scheffler wrote:
Hello, I need to import large N-Triples files (DBpedia) into a TDB store.
The problem is that many of the triples are not valid (for example, a
missing '<' or invalid characters), and they lead to an exception that
aborts the import. I just want to skip them and continue, so that
all valid triples are in the TDB store at the end.

Is there an easy way to do that? I tried to rewrite
ARQ, but this is very complex.

With friendly regards,
Stefan Scheffler


You'd be much better off finding an n-triple parser that kept going
and also spat out (working) n-triples for piping to TDB. I can't see
an option like that in the riot command line.

There isn't such an option - there could be (if someone wants to contribute a patch).

This is a typical ETL situation - you're going to have to clean those triples (which presumably were not written by an RDF tool). Do you want to lose them or fix them?
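
If losing them is acceptable, the cleaning step can be a small line filter run before the load. The following is only a rough sketch in Python, not anything that ships with Jena: the names `TRIPLE`, `filter_ntriples` and `main` are made up for illustration, and the pattern is a simplified approximation of the N-Triples grammar (an assumption), so it may accept or reject a few edge cases that a real parser would treat differently.

```python
import re
import sys

# Simplified N-Triples terms (an assumption, not the full W3C grammar).
IRI = r'<(?:[^\x00-\x20<>"{}|^`\\]|\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8})*>'
BNODE = r'_:[A-Za-z0-9][A-Za-z0-9_.-]*'
LITERAL = (
    r'"(?:[^"\\\n\r]|\\.)*"'                          # quoted string with escapes
    r'(?:\^\^' + IRI + r'|@[A-Za-z]+(?:-[A-Za-z0-9]+)*)?'  # optional datatype or language tag
)

# subject (IRI or blank node), predicate (IRI), object (IRI, blank node or literal), then '.'
TRIPLE = re.compile(
    rf'^\s*(?:{IRI}|{BNODE})\s+{IRI}\s+(?:{IRI}|{BNODE}|{LITERAL})\s*\.\s*(?:#.*)?$'
)


def filter_ntriples(lines, on_reject=None):
    """Yield lines that look like valid triples; pass the rest to on_reject."""
    for lineno, line in enumerate(lines, start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue  # blank lines and comments carry no triples
        if TRIPLE.match(stripped):
            yield stripped
        elif on_reject is not None:
            on_reject(lineno, stripped)


def main():
    """Read N-Triples from stdin, write the accepted lines to stdout."""
    rejected = 0

    def report(lineno, text):
        nonlocal rejected
        rejected += 1
        print(f"skipped line {lineno}: {text[:80]}", file=sys.stderr)

    for triple in filter_ntriples(sys.stdin, report):
        print(triple)
    print(f"{rejected} line(s) skipped", file=sys.stderr)
```

Calling `main()` from a script reads the dump on standard input and writes the surviving lines to standard output, with the skipped lines reported on standard error so they can be inspected and fixed later if they turn out to matter. This is a purely syntactic check on each line; it does not validate IRIs or datatypes any further.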

Checking before loading is always a good idea, especially for data from outside and from other tools. When I receive TTL or RDF/XML, I parse it to NT, which means it has then been checked. Then I load the data.

        Andy


Damian
