On 04/07/14 18:27, Andy Seaborne wrote:
On 04/07/14 17:20, Guido Zuccarelli wrote:
Hello,

I have a directory with 200,000+ ttl files that I want to
load into a TDB database. The command help only specifies the syntax
for loading one file.

tdbloader2 --help
==>
Usage: tdbloader2 --loc location datafile ...

"..." indicates as many files as you like.

I tried with the following command:

cat ../listaExtraidos.txt | tdbloader2 --loc /home/guidoz/workspace/rdfMaven/database

If it's reading from stdin, then the input must be N-Quads (or N-Triples).


where listaExtraidos.txt is a space-separated list of ttl files
obtained by the ls command.
It gives me this exception:

  12:35:17 -- TDB Bulk Loader Start
  12:35:17 Data phase
File does not exist: -

A minor bug - just now fixed.


Is there any way to do this, or will I need to join the files?

PS

Better to put them all on the tdbloader2 command line, if you can get 200K files there, else ...

Do not join files if they have any blank nodes.

_:a is the same blank node within a file.

If a file uses a labelled blank node, then after concatenation that label will denote the same blank node across all the files.


for each file:
  riotcmd.riot file.ttl >> data.nt

then tdbloader --loc whatever "data.nt" (or tdbloader2)

The parser command "riot" will generate stable identifiers that don't clash.
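As a rough sketch of that loop, assuming a "riot" wrapper script is on the PATH (the function name convert_all and the RIOT override variable are made up for illustration, they are not part of Jena):

```shell
#!/bin/sh
# Sketch only: convert every .ttl file under a directory to N-Triples,
# one parser run per file, appending all output to a single file.
# Assumption: Jena's "riot" wrapper script is on PATH. Set RIOT to use
# another command (RIOT is a hypothetical variable for this sketch).

convert_all() {
    src_dir=$1    # directory holding the .ttl files
    out_file=$2   # combined N-Triples output

    # Start from an empty output file, then append per input file.
    : > "$out_file"

    # find avoids the argument-length limit that a shell glob over
    # 200K files could hit; sort gives a stable processing order.
    find "$src_dir" -type f -name '*.ttl' | sort |
    while IFS= read -r f; do
        # Each file is parsed in its own run, as in the loop above.
        "${RIOT:-riot}" "$f" >> "$out_file" || echo "failed: $f" >&2
    done
}

# Example use (paths are placeholders):
#   convert_all /path/to/ttl-files data.nt
#   tdbloader2 --loc /path/to/database data.nt
```

This starts one parser process per file, so it will be slow for 200K files, but each file is parsed separately before anything is joined.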


        Andy


Guido.


