Thanks for your answer!

2015-08-07 17:14 GMT+03:00 Andy Seaborne <[email protected]>:
>
> Your example is small. So I assume you are not actually describing your real
> use case.  Your example is already keeping everything in memory. TDB
> in-memory, which is for testing mainly, uses a RAM disk for exact semantics
> with the disk version, and has multiple copies of data.

Yes, the example I showed uses small data, but in my real case I use data
from DBpedia, which is pretty big.
>
> There are several choices:
>
> 1/ Increase the heap size.
>
> 2/ (if we're really talking about a large disk database).  Do a scan through
> the results of graph.find and keep the subjects in a separate datastructure.
> Do a second loop over the retained subjects to do the updates.  This works
> well when the number of places to update is significantly smaller than the
> total number of results that find() locates.
>
> 3/ If the updates are going to be huge, and the database is really a
> larger-than-RAM persistent one (total number of updates is comparable to or
> larger than RAM) then you are asking for an operation that is fundamentally
> expensive.  Write the updates to a file; in the loop then add the file into
> the database.
>
> 4/ If it's a one-off maintenance task, dump to N-triples and use perl/ruby/...
> to fix the backup and reload.
>
> See also large transaction support.
> http://mail-archives.apache.org/mod_mbox/jena-users/201507.mbox/%3CCAPTxtVOZRzyPxN1njh3WVggsJEUNxeXDJhNvx%2BG4WcRtExxPxg%40mail.gmail.com%3E
>
>         Andy
>

So if I understood correctly, in my case I need to set up a temporary
buffer, read all the "abstracts" from DBpedia into that buffer, do the
calculations on the buffered data, and then save the results to TDB.
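Something like this, roughly? A minimal sketch of the scan-then-update idea
in plain Java: here a List of String[] triples stands in for what
graph.find() would return, and the predicate "abstract" and the computed
"abstractLength" triple are made-up placeholders just for illustration.
With real Jena/TDB the second pass would of course run inside a write
transaction.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class TwoPassUpdate {

    // First pass: scan all triples once and remember only the subjects
    // whose predicate matches -- this Set is the small "temporary buffer",
    // much smaller than the full result stream.
    static Set<String> collectSubjects(List<String[]> triples, String predicate) {
        Set<String> subjects = new LinkedHashSet<>();
        for (String[] t : triples) {          // t = {subject, predicate, object}
            if (t[1].equals(predicate)) {
                subjects.add(t[0]);
            }
        }
        return subjects;
    }

    // Second pass: iterate over the buffered subjects and build the new
    // triples to add back to the store (placeholder computation here).
    static List<String[]> buildUpdates(Set<String> subjects) {
        List<String[]> updates = new ArrayList<>();
        for (String s : subjects) {
            updates.add(new String[] { s, "abstractLength", "42" });
        }
        return updates;
    }

    public static void main(String[] args) {
        List<String[]> data = List.of(
            new String[] { "dbpedia:Berlin", "abstract", "Berlin is ..." },
            new String[] { "dbpedia:Paris",  "abstract", "Paris is ..."  },
            new String[] { "dbpedia:Paris",  "label",    "Paris"        });

        Set<String> subjects = collectSubjects(data, "abstract");
        List<String[]> updates = buildUpdates(subjects);
        System.out.println(subjects.size() + " subjects, "
                           + updates.size() + " updates");
    }
}
```

The point is that only the subjects are held in memory between the two
passes, not the full triples from the first scan.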

OK, I was thinking about something like that, but I haven't worked with
Jena before, so I thought Jena might already have a mechanism for this case.
