On Tue, 2011-10-25 at 15:18 -0500, Craig, R Bruce, JR (Bruce) wrote:
> Our team has downloaded the latest release of TDB for some preliminary
> efforts to migrate to triple stores from SMW mysql etc.
>
> We ran into performance issues with MySQL so the options around doing some
> benchmarks are potentially critical for us.
> We have been simply doing tdbload operations on various small OWL and RDF
> sets.
> We'd been able to query pretty well as we might have expected with tdbquery.
>
> We've loaded a large/huge dataset from the Social Intelligence Benchmark
> (SIB) and while the load seemed to complete, efforts to run tdbquery as well
> as tdbdump all blow up with Heap limits. We've got 4GB ram allocated on our
> Ubuntu 9.10 VM system but haven't gotten things to run even after adjusting
> JVM_ARGS to -Xmx2400m

Is that a 32-bit or 64-bit VM and Java? Which Java: OpenJDK? Oracle? Is it
up to date? (Ubuntu 9.10 is pretty old and I guess it's possible it's using
an obsolete OpenJDK. I've had significant problems with OpenJDK in the past,
though at present it seems usable.)

Do your queries require sorting/distinct?

For non-sorted queries I would certainly expect to be able to run tdbquery
in less than 1G of heap on very large datasets. For sorted queries there
have been various scaling limits, but if you are running off the latest
snapshots then you'll have the spill-to-disk fixes for that included.

Dave
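
In case it helps answer the Java questions above, here is a minimal sketch that prints the vendor, version, architecture and maximum heap of whatever JVM runs it. The class name JvmInfo is just an illustration, and sun.arch.data.model is a HotSpot/OpenJDK-specific property, so it may print "null" on other JVMs.

```java
// Hypothetical helper, not part of TDB: reports which JVM is running
// and how much heap it is allowed to use.
class JvmInfo {
    static String describe() {
        // Maximum heap the JVM will attempt to use, in megabytes.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024L * 1024L);
        return "java.vendor=" + System.getProperty("java.vendor") + "\n"
             + "java.version=" + System.getProperty("java.version") + "\n"
             + "os.arch=" + System.getProperty("os.arch") + "\n"
             // Assumption: only HotSpot-derived JVMs set this property (32 or 64).
             + "sun.arch.data.model=" + System.getProperty("sun.arch.data.model") + "\n"
             + "max.heap.mb=" + maxHeapMb;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it with the same heap option you pass via JVM_ARGS (for example `java -Xmx2400m JvmInfo`) should show whether that JVM accepts the setting and what heap limit it actually reports.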
