i have to load the Project Gutenberg catalog in rdf/xml format. this is a collection of about 50,000 files, each containing a single record like the one attached.

if i try to concatenate these files into a single one, the result is not legal rdf/xml, because every file brings its own xml document headers:

<rdf:RDF xml:base="http://www.gutenberg.org/">

and similar, which can only occur once per file.
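
to make the problem concrete, here is a minimal sketch using only python's standard library. the record below is a made-up stand-in for a catalog file (the `ebooks/1` resource is hypothetical, not copied from the attachment); the point is only that one record parses while two records glued together do not.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for one catalog file: XML declaration plus one rdf:RDF root.
RECORD = '''<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xml:base="http://www.gutenberg.org/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="ebooks/1"/>
</rdf:RDF>
'''


def is_well_formed(text):
    """Return True if text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


single_ok = is_well_formed(RECORD)                 # one file on its own
concatenated_ok = is_well_formed(RECORD + RECORD)  # two files glued together
```

the second call fails because an xml document may have only one declaration and one root element, which is exactly what plain concatenation violates.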

i found a way to load each file individually with s-put in a loop, but this runs extremely slowly: it has already been running for more than 10 hours, and each file takes half a second to load (fuseki running on localhost).

i am sure there is a better way, but which one?
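
one direction that might be worth trying, sketched here under assumptions and not tested against the real catalog: instead of concatenating the raw text, parse each file and move the children of its rdf:RDF root under one shared root, so that the merged file has a single header and could be loaded in one request. the function name `merge_rdfxml` is made up for this illustration. the sketch assumes that all files use the same xml:base, that the root elements carry no other attributes that matter (such as xml:lang), that there are no DOCTYPE entity declarations, and that no rdf:nodeID values are reused across files (otherwise distinct blank nodes would be merged).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_ROOT = "{%s}RDF" % RDF_NS
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


def merge_rdfxml(documents, base):
    """Merge several RDF/XML documents (as strings) into one document.

    Assumption: every input shares the same xml:base, passed in as `base`.
    """
    # Keep the conventional "rdf" prefix in the serialized output.
    ET.register_namespace("rdf", RDF_NS)
    merged = ET.Element(RDF_ROOT, {XML_BASE: base})
    for text in documents:
        root = ET.fromstring(text.encode("utf-8"))
        if root.tag != RDF_ROOT:
            raise ValueError("expected an rdf:RDF root element")
        # Move the record elements under the single shared root.
        merged.extend(list(root))
    return ET.tostring(merged, encoding="unicode")
```

in practice the documents would be read from the catalog files in batches, since holding all 50,000 records in one tree may or may not be comfortable in memory; whether a single large upload is actually faster with fuseki is something i cannot confirm from here.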

thank you for the help!

andrew



--
em.o.Univ.Prof. Dr. sc.techn. Dr. h.c. Andrew U. Frank
                                 +43 1 58801 12710 direct
Geoinformation, TU Wien          +43 1 58801 12700 office
Gusshausstr. 27-29               +43 1 55801 12799 fax
1040 Wien Austria                +43 676 419 25 72 mobil

Attachment: pg9630.rdf
Description: application/rdf
