Quoting Rickard Öberg <[email protected]>:

Hi,

I did some tests, and found a number of ways to fix this issue.

First of all, whenever you do benchmarking, make sure you run the test repeatedly. Running it once will not allow the JIT to kick in. Run the test about 10 times instead, in the same JVM run, to get more stable numbers.
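To make the warm-up point concrete, here is a minimal, hypothetical sketch (not from the original mail) of running the same workload several times in one JVM so the JIT has compiled the hot paths before the runs you actually report; runTest() is just a placeholder for the indexing test:

public class BenchmarkSketch
{
    static final int RUNS = 10;

    public static void main( String[] args )
    {
        for( int i = 0; i < RUNS; i++ )
        {
            long start = System.nanoTime();
            runTest(); // placeholder for the workload being measured
            long elapsedMs = ( System.nanoTime() - start ) / 1000000;
            System.out.println( "run " + i + ": " + elapsedMs + "ms" );
        }
        // Only the later runs are representative; the first ones also pay
        // for class loading and JIT compilation.
    }

    private static void runTest()
    {
        // placeholder: exercise the entity indexing/removal code here
    }
}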

Second, I changed the notifyChanges() indexing in RdfEntityIndexerMixin to do the removes in one call to Sesame, instead of one call per object.
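For anyone wondering what "one call instead of one call per object" looks like against the Sesame 2.x API, here is a rough, hypothetical sketch; it is not the actual RdfEntityIndexerMixin code, and removeEntities() plus the assumption that each entity's triples hang off the entity URI as subject are mine:

import java.util.ArrayList;
import java.util.List;

import org.openrdf.model.Resource;
import org.openrdf.model.Statement;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.RepositoryException;
import org.openrdf.repository.RepositoryResult;

public class BatchedRemoveSketch
{
    // Collect every statement to be dropped, then hand them all to Sesame
    // in a single remove() call rather than one remove() per entity.
    static void removeEntities( RepositoryConnection conn, Iterable<Resource> entitySubjects )
        throws RepositoryException
    {
        List<Statement> toRemove = new ArrayList<Statement>();
        for( Resource subject : entitySubjects )
        {
            RepositoryResult<Statement> statements = conn.getStatements( subject, null, null, false );
            try
            {
                while( statements.hasNext() )
                {
                    toRemove.add( statements.next() );
                }
            }
            finally
            {
                statements.close();
            }
        }
        conn.remove( toRemove ); // one round trip to the store instead of one per entity
    }
}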

Third (and most important), you had forgotten to add indexes. I added this:
prefModule.forMixin( NativeConfiguration.class ).declareDefaults().tripleIndexes().set( "cspo,spoc" );
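(As far as I understand the Sesame NativeStore configuration, each comma-separated entry in that string is one index ordering over subject, predicate, object and context (s, p, o, c), so "cspo,spoc" tells the store to maintain two native indexes, one led by context and one led by subject.)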

And the performance became muuuuuch better. Along with the other fixes, my result (two-year-old MacBook Pro, 2.4 GHz) was 286ms.

286ms compared to 97secs is a big difference. My guess is that the lack of indexing was the biggest problem. In any case, I've checked in the updated RDF indexer version which removes all entities in one call, which should help regardless.


Thanks for the fixes and advice, Rickard. Adding those indexes seems to have made a good improvement in our project. However, I'm still not convinced by the OpenRDF code related to this problem at all; in some places it looks like someone's first programming assignment. I need to find time to put together a proper solution to this problem, though.

Btw, those 97secs were measured with performance instrumentation enabled, which slows the program down significantly. Without instrumentation, I got around 1-2secs (without indexes) for the full test.

Oh, a "funny" bit, when removing entity, which contained around 12000 aggregated entities (as a test), I got out-of-memory error somewhere in Qi4j, MapEntityStore or somewhere like that (need to check log on my work place, if you want more info about that).

