Thank you. Your link shows why the solution of calling s-put for each
individual file is so slow.
In practice, I will just wait out the 10 hours and then extract the
triples from the store.
Do you understand why somebody would choose this format? What is the
advantage?
andrew
On 10/07/2017 10:52 AM, zPlus wrote:
Hello Andrew,
if I understand this correctly, I think I stumbled on the same problem
before. Concatenating XML files will indeed not work. My solution was
to convert all XML files to N-Triples, then concatenate all those
triples into a single file, and finally load only this file.
Ultimately, what I ended up with is this loop [1]. The idea is to call
RIOT with a list of files as input, instead of calling RIOT on every
file.
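As a rough illustration of that idea (not the script from [1]), the
sketch below passes many files to a single riot invocation and appends
the resulting N-Triples to one output file. It assumes that riot is on
the PATH and accepts --output=ntriples, that the catalog files end in
.rdf, and that batching by 500 files keeps the command line short
enough. The names batches, riot_command and convert_all, and the batch
size, are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def riot_command(files: Sequence[str]) -> List[str]:
    """Build one riot call that converts all given files to N-Triples.

    Assumption: `riot` is on PATH and supports --output=ntriples.
    """
    return ["riot", "--output=ntriples", *files]


def convert_all(src_dir: str, out_file: str, batch_size: int = 500) -> None:
    """Convert every *.rdf file under src_dir into one N-Triples file."""
    # Assumption: the RDF/XML records use the .rdf extension.
    files = sorted(str(p) for p in Path(src_dir).rglob("*.rdf"))
    with open(out_file, "wb") as out:
        for batch in batches(files, batch_size):
            # One JVM start per batch instead of one per file.
            # N-Triples is line-based, so the outputs can simply be
            # written one after another into the same file.
            subprocess.run(riot_command(batch), stdout=out, check=True)
```

A call such as convert_all("catalog", "catalog.nt"), with both paths
being placeholders, would then leave a single file to load into the
store in one step.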
I hope this helps.
----
[1] https://notabug.org/metadb/pipeline/src/master/build.sh#L54
----- Original Message -----
From: [email protected]
To:"[email protected]" <[email protected]>
Cc:
Sent:Sat, 7 Oct 2017 10:17:18 -0400
Subject:loading many small rdf/xml files
I have to load the Project Gutenberg catalog in RDF/XML format. This
is a collection of about 50,000 files, each containing a single record,
as attached.
If I try to concatenate these files into a single one, the result is
not legal RDF/XML: there are XML document headers such as
<rdf:RDF xml:base="http://www.gutenberg.org/">
and similar, which can only occur once per file.
I found a way to load each file individually with s-put in a loop, but
this runs extremely slowly: it has already been running for more than
10 hours, and each file takes half a second to load (Fuseki running on
localhost).
I am sure there is a better way?
Thank you for the help!
andrew
--
em.o.Univ.Prof. Dr. sc.techn. Dr. h.c. Andrew U. Frank
+43 1 58801 12710 direct
Geoinformation, TU Wien +43 1 58801 12700 office
Gusshausstr. 27-29 +43 1 55801 12799 fax
1040 Wien Austria +43 676 419 25 72 mobil