On 06/03/18 20:50, Adrian Gschwend wrote:
On 03.03.18 17:11, Andy Seaborne wrote:

Hi Andy,

Hi Adrian,

Executes in 2m 20s (Java 8) for me, and 1m 49s with Java 9. Default heap, which IIRC is 25% of RAM, or 8G here. Cold JVM, cold file cache.

If you have an 8G machine, an 8G heap may cause problems (swapping).

Does the CPU load go up very high, on all cores? That's a sign of a full GC trying to reclaim space before an OOME.
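As a sketch of the heap advice above: the standalone fuseki-server script picks up JVM options from the JVM_ARGS environment variable, so the heap can be capped below physical RAM. The 4G value, the TDB location and the dataset name /ds are placeholders, not values from this thread.

```shell
# Cap the Fuseki heap below physical RAM to leave room for the OS file
# cache and avoid swapping. 4G, /path/to/tdb and /ds are placeholders.
JVM_ARGS="-Xmx4G" ./fuseki-server --loc=/path/to/tdb /ds
```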

If you get the same with TDB2, then the space isn't going into TDB1 transactions.


Do you have an example of such an update?

Yes, I can deliver two use cases, with data and query. The first one is this
dataset:

http://ktk.netlabs.org/misc/rdf/fuseki-lock.nq.gz

Query:

https://pastebin.com/7TbsiAii
This returns reliably in Stardog, in less than one minute. The UNION is
most probably necessary due to blank-node issues, so I don't think I can
split them.

Blank nodes are coming from the database, so they will be the same each time. I think you can execute each part of the union separately.

The WHERE parts as separate queries were (one-time timings):
1170524 results, 40s
905 results, 1s
697471 results, 27s
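For illustration only (the actual query is in the pastebin link above; the patterns and ex: predicates below are invented), splitting a UNION into per-branch queries looks like this, letting the engine pick a join order for each branch independently:

```sparql
PREFIX ex: <http://example.org/>

# Hypothetical shape of the original query:
SELECT ?s WHERE {
  { ?s ex:p1 ?o . ?o ex:p2 ?x }   # branch 1: plain BGP
  UNION
  { ?s ex:p3/ex:p4 ?x }           # branch 2: property path + BGP
}

# Each branch run as its own query:
SELECT ?s WHERE { ?s ex:p1 ?o . ?o ex:p2 ?x }
SELECT ?s WHERE { ?s ex:p3/ex:p4 ?x }
```

Since the blank nodes come from the store, results from the separate queries can be combined client-side.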

The ARQ optimizer does not try to spot common sub patterns across UNIONs nor does it do a good job on paths combined with BGPs. It ends up with some less than ideal join orders.
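To see what join order ARQ actually produces, the algebra for a query can be printed with the qparse tool that ships with Jena (query.rq is a placeholder filename):

```shell
# Print the SPARQL algebra ARQ will evaluate, including the effect of
# the optimizer's BGP reordering. query.rq is a placeholder.
qparse --print=op --query query.rq
```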

    Andy


In Fuseki it runs out of memory with

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit
exceeded

And I once allocated almost all I had on my system (>8GB).

Some cases can't stream, but it is possible some cases aren't streaming
when they could.

ok

Or the whole transaction is quite large, which is where TDB2 comes in.

I did try that on TDB2 recently as well, same issue.

Will post the other sample later.

regards

Adrian
