On 21/03/13 04:42, Егор Егоров wrote:
Thank you, Andy, for your reply.

tdbloader will do the job but are you running a 32 bit JVM?

I am using 64-bit Ubuntu 12.04.

TDB is stagnating after 24 hours of work -- its throughput has slowed down from
80k tps to ~500 tps and I think it will never finish.

It will finish unless the glaciers get to you first.
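
For a rough sense of scale, the figures quoted in this thread can be turned into a back-of-envelope estimate. This is only a sketch under loud assumptions: that "tps" means triples per second, that the rate stays constant, and that all of the roughly 1.3 billion triples are still to be loaded (after 24 hours some already are). The function name is made up for illustration.

```python
SECONDS_PER_DAY = 86_400


def estimated_load_days(remaining_triples: int, triples_per_second: float) -> float:
    """Days needed to load `remaining_triples` at a constant rate."""
    return remaining_triples / triples_per_second / SECONDS_PER_DAY


# Assumption: the whole ~1.3 billion triples are still outstanding.
total_triples = 1_300_000_000

print(f"at 80k tps: {estimated_load_days(total_triples, 80_000):.2f} days")
print(f"at 500 tps: {estimated_load_days(total_triples, 500):.1f} days")
```

At the degraded rate this comes out at about a month, so slow but finite.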

My PC is totally
frozen; I can't even open a new terminal tab.

Yes - it's disk I/O bound at that size and it makes the machine unusable for interactive work.

1.3 billion  ...  what sort of queries do you want to ask of the data
once loaded?  Only simple queries are going to stand much chance of running
at a tolerable speed.

A lot of simple SELECT queries are enough for me; I understand that overall
performance will be low.

If you can borrow a large machine to load the database you'll get on
better.  Databases are portable - you can copy the database directory
around.

I am thinking about Amazon EC2 or a new standalone server. How much RAM
is enough to load the entire BTC dataset via TDB?

As much as you can get - I saw a report that used 48GB but it was a Power7 system.


And I have ~30 identical computers like mine right now. Is it possible to
configure a cluster and load the entire dataset via TDB? Or is it better to
use another store that supports the Jena API?

Not for TDB - it's a single-machine system.

It does not need to be the Jena API - SPARQL is a standard, so you can use that. 4Store is the system that springs to mind here: http://4store.org/ (GPLv3 license)
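
To illustrate the point about SPARQL being store-independent, here is a minimal sketch of sending a simple SELECT over the standard SPARQL protocol (query via HTTP GET) using only the Python standard library. The endpoint URL is a placeholder assumption, not the address of any particular store, and the helper names are invented for this example; check your store's documentation for where it actually serves queries.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint URL (an assumption); substitute whatever your store exposes.
ENDPOINT = "http://localhost:8000/sparql/"

# A deliberately simple query, in the spirit of "a lot of simple SELECT queries".
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL protocol 'query via GET' request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_select(endpoint: str, query: str, timeout: float = 60.0) -> list:
    """Send the query and return the list of result bindings."""
    with urllib.request.urlopen(build_request(endpoint, query), timeout=timeout) as resp:
        results = json.load(resp)
    return results["results"]["bindings"]
```

With a store listening at that address, `run_select(ENDPOINT, QUERY)` would return one dictionary per result row. Because only the protocol is assumed, the same client code should work unchanged against any store that implements the SPARQL protocol.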

        Andy


Thank you!
