Hello Jena users and developers, please help me.

I am trying to load the BTC dataset on low-end hardware (Core 2 Quad Q6600,
2.40 GHz, 4 GB RAM, 2x250 GB SATA Barracuda 7200.10 in a RAID0 stripe).

At first I used TDB, but the hardware is too weak for this task: I can load
only ~100 million quads. So I decided to switch to SDB.
However, the sdbload utility is unable to import .nq files:

egor@egorov:~/semsearch/sdb$ sdbload -v sdb.ttl ../dataset/btc-2009-chunk-115-urified.nq
Start load: sdb.ttl
Start load: ../dataset/btc-2009-chunk-115-urified.nq
WARN  Only triples or default graph data expected : named graph data ignored

So I am using the following Java code to import the N-Quads:

Store store = SDBFactory.connectStore("/home/egor/semsearch/sdb/sdb.ttl");
Dataset dataset = SDBFactory.connectDataset(store);
RDFDataMgr.read(dataset, "/home/egor/semsearch/dataset/btc-2009-chunk-115-urified.nq");

I have the following questions:
1. What are the approximate hardware requirements for loading ~1.3 billion
quads into a TDB or SDB backend?
2. Is it realistic to load the BTC dataset with my computer and SDB?
3. Why is the sdbload utility unable to load N-Quads, while RDFDataMgr.read
accepts .nq files? I think this would be a very useful feature for sdbload;
could it be implemented in a future version of Jena?

Thank you!

Egor Egorov
