Re: JENA Loader Benchmarks

2019-06-14 Thread Marco Neumann
absolutely it does, preferably NVMe SSD. tdbloaders are almost a showcase themselves for good up-to-date hardware. if possible I'd like to load the wikidata dataset* at some point to see where 57GB fits in terms of tdb. The wikidata team is currently looking at new solutions that can go beyond

Re: JENA Loader Benchmarks

2019-06-14 Thread Martynas Jusevičius
What about SSD disks, don't they make a difference?

On Sat, Jun 15, 2019 at 12:36 AM Marco Neumann wrote:
>
> that did the trick Andy, very good might be a good idea to add this to the
> distribution in jena-log4j.properties
>
> I am getting these numbers for a midsize dedicated server, very nice

Re: JENA Loader Benchmarks

2019-06-14 Thread Marco Neumann
that did the trick Andy, very good. might be a good idea to add this to the distribution in jena-log4j.properties

I am getting these numbers for a midsize dedicated server, very nice numbers indeed Andy. well done!

00:24:53 INFO loader :: Loader = LoaderPhased
00:24:53 INFO loader

Re: JENA Loader Benchmarks

2019-06-14 Thread Andy Seaborne
These messages are logged (to logger "org.apache.jena.tdb2.loader") - do you have log4j.properties in the current working directory? Do you get any output?

INFO Loader = LoaderParallel
INFO Start: /home/afs/Datasets/BSBM/bsbm-5m.nt.gz
INFO Add: 500,000 bsbm-5m.nt.gz (Batch: 134,770 / Avg: 134
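For anyone collecting these numbers from the log rather than reading them by eye, here is a minimal Python sketch that pulls the counts out of an "Add:" progress line like the one above. The excerpt is cut off, so the closing ")" and a complete "Avg" value are assumptions about the line format. `parse_add_line` and the dictionary keys are names made up for illustration, not part of Jena.

```python
import re

# Assumed shape of a loader progress line, based on the truncated excerpt:
#   INFO Add: <total> <file> (Batch: <rate> / Avg: <rate>)
# The trailing ")" and the full "Avg" figure are assumptions.
ADD_LINE = re.compile(
    r"Add:\s+(?P<total>[\d,]+)\s+(?P<file>\S+)\s+"
    r"\(Batch:\s+(?P<batch>[\d,]+)\s+/\s+Avg:\s+(?P<avg>[\d,]+)\)"
)


def _to_int(text):
    """Convert a comma-grouped number such as '500,000' to an int."""
    return int(text.replace(",", ""))


def parse_add_line(line):
    """Return the counts from one 'Add:' line, or None if the line does not match."""
    match = ADD_LINE.search(line)
    if match is None:
        return None
    return {
        "total": _to_int(match["total"]),      # triples added so far
        "file": match["file"],                 # input file being loaded
        "batch_rate": _to_int(match["batch"]), # rate reported for the last batch
        "avg_rate": _to_int(match["avg"]),     # running average rate
    }
```

Lines that are not progress lines (for example the "Loader = ..." line) return None, so the function can be applied to every line of the log and the None results dropped.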

Re: JENA Loader Benchmarks

2019-06-14 Thread Marco Neumann
let me fire up one of the big machines to see what I will get there. currently I have no info display during load with tdb2.tdbloader. if -v is specified I get some extra info but no load info.

On Fri, Jun 14, 2019 at 8:03 PM Andy Seaborne wrote:
>
> On 14/06/2019 18:13, Marco Neumann wrote:

Re: JENA Loader Benchmarks

2019-06-14 Thread Marco Neumann
nice, so basically for a read-only instance tdbloader2 is the way to go in terms of disk space. Is there a trade-off for the fully packed B+Trees in terms of performance?

On Fri, Jun 14, 2019 at 7:52 PM Andy Seaborne wrote:
>
> On 14/06/2019 18:13, Marco Neumann wrote:
> > I am collecting jena

Re: JENA Loader Benchmarks

2019-06-14 Thread Andy Seaborne
On 14/06/2019 18:13, Marco Neumann wrote:
> I am collecting jena loader benchmarks. if you have results please post them directly.
> http://www.lotico.com/index.php/JENA_Loader_Benchmarks

tdb2.tdbloader has variations controlled by --loader.

--loader= Loader to use: 'basic', 'phased' (default)

Re: JENA Loader Benchmarks

2019-06-14 Thread Andy Seaborne
On 14/06/2019 18:13, Marco Neumann wrote:
> I am collecting jena loader benchmarks. if you have results please post them directly.
> http://www.lotico.com/index.php/JENA_Loader_Benchmarks
>
> On a linux machine I am using "time" to collect data. Is there a flag on tdb2.tdbloader to report time and

JENA Loader Benchmarks

2019-06-14 Thread Marco Neumann
I am collecting jena loader benchmarks. if you have results please post them directly. http://www.lotico.com/index.php/JENA_Loader_Benchmarks On a linux machine I am using "time" to collect data. Is there a flag on tdb2.tdbloader to report time and triples per second? I have noticed that storag
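
As a rough alternative to wrapping a load in "time", the sketch below measures wall-clock time for any command in Python and turns a known triple count into triples per second. It is not a tdb2.tdbloader feature. `time_command` and `triples_per_second` are hypothetical helper names, and the database directory and data file in the commented usage are placeholders.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # raises CalledProcessError if the command fails
    return time.perf_counter() - start


def triples_per_second(triple_count, elapsed_seconds):
    """Average load rate, given the triple count and the elapsed time in seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return triple_count / elapsed_seconds


# Hypothetical usage; "DB" and "data.nt.gz" are placeholder names, and the
# triple count must be known in advance or taken from the loader output:
# elapsed = time_command(["tdb2.tdbloader", "--loc", "DB", "data.nt.gz"])
# print(triples_per_second(5_000_000, elapsed))
```

This counts the whole process, including JVM start-up and index building, so for small files it will understate the steady-state rate.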