On 28/03/2011 09:17, Andy Seaborne wrote:
[...]
If the TDB default event handler then did something (like sync to disk?)
that the memory model does not, this could explain the difference in
performance. I have put a profiler on the test program and it reports
that the test program is spending a lot more time in
BlockManagerFile.force() when it is reading directly into TDB than when
it is going via a memory model. So there is some evidence that this is
what is happening.
I haven't been able to track down the block manager code actually in use
as I'm having trouble checking ARQ out of SVN, but Andy likely knows off
the top of his head whether this is plausible.
> s/block manager/event manager/
Could be - the model.add(model) will go via the BulkUpdateHandler (I
think). TDB's BulkUpdateHandler inherits from SimpleBulkUpdateHandler
for insertion.
Yes. The event is not issued by the update handler but by ARP.
Could you try putting a break point in dataset.sync and see what the
call stack is when it gets hit? That'll tell you who is causing the
sync.
Done. ARP issues the finishRead event. This leads to
com.hp.hpl.jena.tdb.graph.GraphSyncListener.finishRead(), which does a
sync. Something is (or was) attaching a GraphSyncListener to the event
manager for TDB graphs.
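The mechanism traced above can be sketched generically: a listener registered on a graph's event manager forces a disk sync whenever the parser signals the end of a read. This is a minimal illustration of the pattern, not Jena's actual API; all class and method names here (EventManager, SyncingDataset, fireFinishRead) are hypothetical stand-ins, except GraphSyncListener, which echoes the class named in the stack trace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a graph listener; only finishRead matters here.
interface GraphListener {
    void finishRead();   // fired by the parser when a read completes
}

// Hypothetical stand-in for the graph's event manager.
class EventManager {
    private final List<GraphListener> listeners = new ArrayList<>();
    void register(GraphListener l) { listeners.add(l); }
    // The parser (ARP in the thread above) would call this when parsing ends.
    void fireFinishRead() {
        for (GraphListener l : listeners) l.finishRead();
    }
}

// Hypothetical dataset; counts syncs instead of flushing blocks to disk.
class SyncingDataset {
    int syncCount = 0;
    void sync() { syncCount++; }
}

// The listener that turns a parser event into a (costly) sync.
class GraphSyncListener implements GraphListener {
    private final SyncingDataset dataset;
    GraphSyncListener(SyncingDataset d) { this.dataset = d; }
    @Override public void finishRead() { dataset.sync(); }
}

public class SyncOnFinishReadDemo {
    public static void main(String[] args) {
        SyncingDataset ds = new SyncingDataset();
        EventManager em = new EventManager();
        em.register(new GraphSyncListener(ds));   // the implicit hook
        em.fireFinishRead();                      // parser finishes a read
        System.out.println("syncs = " + ds.syncCount);   // syncs = 1
    }
}
```

The point of the sketch is that the cost is invisible at the call site: whoever registered the listener decided that every completed parse pays for a sync, which matches the extra time seen in BlockManagerFile.force() above.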
There used to be (up to v 0.8.9? not in the last snapshot build) a
sync wrapper that called sync() every n'000 triples added.
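A wrapper of that kind, syncing after every N additions, is easy to sketch. Again this is an illustrative sketch, not the removed TDB code; the class name and the Runnable-based sync hook are assumptions.

```java
// Hypothetical sketch of a periodic-sync wrapper: runs a sync action on the
// underlying store after every `interval` added triples.
class CountingSyncWrapper {
    private final Runnable syncAction;   // e.g. a call to dataset sync
    private final int interval;          // sync every `interval` additions
    private long added = 0;
    long syncs = 0;                      // exposed for the demo below

    CountingSyncWrapper(Runnable syncAction, int interval) {
        this.syncAction = syncAction;
        this.interval = interval;
    }

    void add(Object triple) {
        // ...the real wrapper would store the triple in the graph here...
        added++;
        if (added % interval == 0) {
            syncAction.run();
            syncs++;
        }
    }
}

public class PeriodicSyncDemo {
    public static void main(String[] args) {
        CountingSyncWrapper w =
            new CountingSyncWrapper(() -> { /* flush to disk */ }, 1000);
        for (int i = 0; i < 2500; i++) w.add(new Object());
        System.out.println("syncs = " + w.syncs);   // syncs = 2
    }
}
```

Such a wrapper trades insertion throughput for bounded data loss on a crash, which is exactly the kind of hidden implicit sync the later message says was removed.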
I think it is ARP issuing the finishRead event that is the trigger for
the sync.
It's not in the development codebase. All hidden implicit syncs
should now be removed. They were causing problems for a user who was
tracking whether the DB on disk was dirty or not.
Brian, Frank, which versions are you running?
I've been using the latest from the main Maven repository:
tdb: 0.8.9
arq: 2.8.7
jena: 2.6.4
I've checked with the latest from CVS/SVN taken today. That does not
do the sync call and is faster when the parser is reading directly into
the TDB dataset. So this issue is already fixed in the head version.
Brian