Re: Report on loading wikidata

Laura Morales Thu, 14 Dec 2017 17:01:17 -0800

> The loaders work on empty databases.

Yes, my test is on a new, empty dataset. The command that I use is
`tdbloader2 --loc wikidata wikidata.ttl`


> If you are splitting files, and doing partial loads, things are rather 
> different.

No, I'm using the whole file. I'd only consider splitting it if there were a
way to use "FROM <wikidata>" as an alias for "FROM <wd-store1> FROM <wd-store2>
FROM <wd-store3> ..."
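
To make the alias idea concrete, here is a minimal sketch that expands the
alias on the client side, by rewriting the query text before it is sent. It is
not a Jena feature; the function name `expand_alias` is hypothetical, the graph
names are the ones from the example above, and it assumes the split graphs are
all reachable from the same endpoint.

```python
from typing import List


def expand_alias(query: str, alias: str, graphs: List[str]) -> str:
    """Replace 'FROM <alias>' with one FROM clause per real graph.

    Plain text substitution: it does not parse SPARQL, so it only works
    when the alias clause appears literally as 'FROM <alias>'.
    """
    alias_clause = f"FROM <{alias}>"
    expanded = " ".join(f"FROM <{g}>" for g in graphs)
    return query.replace(alias_clause, expanded)


# Hypothetical split of the data into three named graphs.
stores = ["wd-store1", "wd-store2", "wd-store3"]
query = "SELECT * FROM <wikidata> WHERE { ?s ?p ?o } LIMIT 10"

rewritten = expand_alias(query, "wikidata", stores)
print(rewritten)
```

With several FROM clauses, SPARQL evaluates the query against the merge of the
listed graphs, which is the behaviour the alias would need.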

> Maybe swappiness is set to keep a %-age of RAM free.

My swappiness is set to 10.
Disk read speed: 2-3MB/s | Disk write speed: 40-50MB/s (slowing down over
time). I think what Dick said is correct; that is, as the index and stored
data grow, the disk can't keep up. I think a single HDD just doesn't cut it.
Perhaps an SSD can do it; I don't know, because I don't have one. Maybe I
should try with many hard disks... one to host the 200GB source, one to handle
data-triples.tmp, one for node2id.net, one for nodes.dat, and so forth...
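
As a rough sanity check on those figures, the sketch below estimates how long
it would take just to read the 200GB source once at the observed 2-3MB/s. It
assumes decimal units (1GB = 1000MB) and that the observed read rate applies to
the source file alone, which may not hold since the loader also reads its own
temporary and index files.

```python
SOURCE_MB = 200 * 1000  # 200GB source file, assuming 1GB = 1000MB


def hours_to_read(size_mb: float, rate_mb_per_s: float) -> float:
    """Time in hours to read size_mb at a constant rate_mb_per_s."""
    return size_mb / rate_mb_per_s / 3600


slow = hours_to_read(SOURCE_MB, 2)  # at 2MB/s
fast = hours_to_read(SOURCE_MB, 3)  # at 3MB/s
print(f"{fast:.1f} to {slow:.1f} hours")  # prints "18.5 to 27.8 hours"
```

Even under these assumptions, a single pass over the source takes the better
part of a day at that read rate, before any index writing is counted.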
