Just so you know: the TDB bulkloader can load all the data offline -
it's faster than loading the data online through Fuseki.
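For reference, a bulk load is a single command (tdbloader is the script
shipped with the TDB distribution; the paths here are placeholders):

    tdbloader --loc=/path/to/tdb-database dbpedia-dump.nt

The resulting database directory can then be served by Fuseki or opened
directly from the Java API.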
Andy
On 15/03/11 11:22, Anuj Kumar wrote:
Hi Andy,
Thanks for the info. I have loaded a few GBs using the Fuseki server, but I
didn't try RiotReader or the Java APIs for TDB. I will try that.
Thanks for the response.
Regards,
Anuj
On Tue, Mar 15, 2011 at 4:12 PM, Andy Seaborne<
[email protected]> wrote:
1/ Have you considered reading the DBpedia data into TDB? This would keep
the triples on disk (with cached in-memory versions of a subset). See the
first sketch below.
2/ A file can be read sequentially by using the parser directly (see
RiotReader and pass in a Sink<Triple> that processes the stream of triples).
See the second sketch below.
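For point 1/, here is a minimal sketch of opening a TDB-backed dataset.
Package names are those of current Apache Jena (older releases used
com.hp.hpl.jena.tdb), and the directory is a placeholder:

    import org.apache.jena.query.Dataset;
    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.tdb.TDBFactory;

    // Connect to (or create) an on-disk TDB database; the triples stay on disk
    Dataset dataset = TDBFactory.createDataset("/path/to/tdb-database");
    Model model = dataset.getDefaultModel();  // behaves like any other Jena Model
    // ... query or update as usual ...
    dataset.close();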
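For point 2/, a minimal sketch of the streaming parse. RiotReader and
Sink<Triple> were the API of that era; current Apache Jena exposes the same
pattern through RDFDataMgr and StreamRDF, which is what is shown here (the
file name is a placeholder):

    import org.apache.jena.graph.Triple;
    import org.apache.jena.riot.RDFDataMgr;
    import org.apache.jena.riot.system.StreamRDFBase;

    // StreamRDFBase is a no-op implementation; override triple() to handle
    // each triple as it is parsed - nothing accumulates in memory
    StreamRDFBase sink = new StreamRDFBase() {
        @Override
        public void triple(Triple triple) {
            System.out.println(triple);  // replace with real processing
        }
    };
    RDFDataMgr.parse(sink, "dbpedia-dump.nt");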
Andy
On 14/03/11 18:42, Anuj Kumar wrote:
Hi All,
I am new to Jena and am exploring it for working with large numbers of
N-Triples. The requirement is to read a large number of N-Triples - for
example, an .nt file from a DBpedia dump that may run into GBs. I have to
read these triples, pick specific ones, and link them to resources in
another set of triples. The goal is to link some of the entities following
the Linked Data approach. Once the mapping is done, I have to query the
model from that point onwards. I don't want to load both the source and
target datasets in memory.
To achieve this, I have first created a file model maker and then a named
model for the specific dataset being mapped. Now I need to read the triples
and add the mapping to this new model. What would be the right approach?
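For illustration, the setup described above looks roughly like this (the
directory and model name are placeholders):

    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.ModelFactory;
    import org.apache.jena.rdf.model.ModelMaker;

    // A file-backed ModelMaker persists each named model under the given directory
    ModelMaker maker = ModelFactory.createFileModelMaker("/path/to/models");
    Model mapped = maker.createModel("dbpedia-mapping");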
One way is to load the model using FileManager, iterate through the
statements, map them into the named model (i.e. our mapped model), and close
it at the end (sketched below). This will work, but it loads all of the
triples into memory. Is this the right way to proceed, or is there a way to
read the model sequentially at mapping time?
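The in-memory approach described above would look something like this (the
file name and the isRelevant() selection predicate are placeholders;
'mapped' is the named model from the earlier sketch; FileManager is
deprecated in recent Jena in favour of RDFDataMgr):

    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.Statement;
    import org.apache.jena.rdf.model.StmtIterator;
    import org.apache.jena.util.FileManager;

    // Loads the entire file into an in-memory model - workable for small
    // files, but a multi-GB DBpedia dump will exhaust the heap
    Model source = FileManager.get().loadModel("dbpedia-dump.nt");
    StmtIterator it = source.listStatements();
    while (it.hasNext()) {
        Statement stmt = it.nextStatement();
        if (isRelevant(stmt))   // hypothetical selection predicate
            mapped.add(stmt);
    }
    it.close();
    source.close();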
Just trying to understand an efficient way to map a large set of N-Triples.
I need your suggestions.
Thanks,
Anuj