Re: How to design large volume data ingest with Jena?

Claude Warren Mon, 23 Sep 2013 03:15:42 -0700

It seems to me it would be faster to load the data through the Jena API
rather than through Fuseki.  Since duplicate triples are ignored, you
could just open a connection to the model and start adding triples as you
read them from the JSON file.
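
To illustrate why no look-up query is needed before each insert, here is a
minimal sketch in plain Java. It does not use Jena: a Set of N-Triples
lines stands in for the model, on the assumption that the model behaves as
a set of triples (with Jena you would call Model.add on the real model
instead). The class name, the example.org namespace and the hasAuthor
property are all made up for the example and are not VIVO terms.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class TripleSetSketch {
    // Hypothetical namespace and property URI, for illustration only.
    static final String NS = "http://example.org/catalogue/";
    static final String HAS_AUTHOR = NS + "hasAuthor";

    // An RDF graph is a set of triples; a Set of N-Triples lines stands in for it.
    private final Set<String> triples = new LinkedHashSet<>();

    // Adds one triple whose subject, predicate and object are all URIs.
    // Returns false when the triple was already present (the add is a no-op).
    public boolean add(String subject, String predicate, String object) {
        return triples.add("<" + subject + "> <" + predicate + "> <" + object + "> .");
    }

    public int size() {
        return triples.size();
    }

    public Set<String> lines() {
        return triples;
    }

    public static void main(String[] args) {
        TripleSetSketch graph = new TripleSetSketch();

        // The same book/author pair appears twice in the input records.
        String[][] records = {
            {"book/1", "author/10"},
            {"book/2", "author/10"},
            {"book/1", "author/10"},
        };

        // No existence check: just add every triple as it is read.
        for (String[] r : records) {
            graph.add(NS + r[0], HAS_AUTHOR, NS + r[1]);
        }

        System.out.println(graph.size() + " distinct triples"); // prints "2 distinct triples"
        for (String line : graph.lines()) {
            System.out.println(line);
        }
    }
}
```

The lines printed at the end are valid N-Triples, so the same loop could
write a file for a separate bulk load instead of adding to a live model.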


If you don't want to take your system offline to do this, you could load
into a new database and then use MySQL replication or export/import to
move the data.

I don't know whether the Jena/MySQL implementation allows parallel access
to the MySQL database without conflicts.

Claude


On Mon, Sep 23, 2013 at 10:39 AM, Michel de Lange <
michel_de_la...@yahoo.co.uk> wrote:

> Dear all,
>
> My aim is to take an existing library catalogue (which I have as a large
> JSON file), and put this into a triple store. It is about books and
> authors, and I have about 50,000 records. My naive strategy is to go
> through my input, and every time I come to a new book, I do a SPARQL query
> to look up whether this already exists in the triple store. If it does
> not, I will do some kind of update query to put this book, and its
> authors, into the triple store. Apart from the fact that I don't know how
> to do such an update (I posted a separate question about that), I wonder
> if this is a good strategy at all. Should I be doing update queries, or
> should the output of my program be an RDF file, which I then ingest
> separately? Or should I be doing something else completely?
>
> The ontology is VIVO, the triple store is MySQL, which I access through
> Fuseki, with a Java program and the Jena libraries.
>
> My question is: what would be a good way, at a high level, to accomplish
> this?
>
> Many thanks for your attention, any help is greatly appreciated.
>
>
>
> Michel
>



-- 
I like: Like Like - The likeliest place on the web<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
