On 2010-01-04 06.38, Niclas Hedhman wrote:
I have just noticed that Sesame/OpenRDF has a general performance
problem when it comes to removing so-called "connections", i.e. when
we remove an Entity from the entity store, the index needs to be
updated. On my machine this takes 360 ms per entity if I do a batch of
1000 removals per connection.commit() (~6 minutes). The total graph
consists of ~20,000 entities.

Interestingly enough, the call to connection.clear() is slow, but the
commit() is relatively fast (5 sec of the total).


I am getting more and more annoyed by OpenRDF/Sesame, and am seriously
considering implementing an alternate indexing engine. The problem is
that it will take more time than I have for the 1.0 release.

There are a couple of considerations:
1) This is mainly for tests, right? During normal operation this shouldn't be a problem.
2) Are you sure the latest Sesame is used? The latest release has improved performance quite a lot, so make sure an older version is not being used.

In StreamFlow, we have created a FileConfiguration service which calculates all the base directories to be used by services (for data, configuration, caches, logs, tmp files, etc.). We then enforce that, when it is shut down and the app is in test mode, all of these directories are cleared. Because of this there is no need for individual cleanup of services; their files are deleted automatically at the file system level.

One option is to merge this service into qi4j-extensions. It would simplify file system integration quite a lot, since so many services need to decide "where to put stuff", and it would also remove these test cleanup issues entirely.
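
To make the idea concrete, here is a rough sketch in plain Java of what such a service could look like. It is not the actual StreamFlow code and does not use the Qi4j API; the class name, the Kind enum, the one-subdirectory-per-kind layout and the testMode flag are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.EnumMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: one place that decides "where to put stuff".
class FileConfigurationSketch {

    // Assumed directory kinds, taken from the list in the mail.
    enum Kind { DATA, CONFIGURATION, CACHE, LOG, TEMP }

    private final boolean testMode;
    private final Map<Kind, Path> directories = new EnumMap<>(Kind.class);

    // Assumption: every kind gets its own subdirectory under a single root.
    FileConfigurationSketch(Path root, boolean testMode) {
        this.testMode = testMode;
        for (Kind kind : Kind.values()) {
            Path dir = root.resolve(kind.name().toLowerCase(Locale.ROOT));
            try {
                Files.createDirectories(dir);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            directories.put(kind, dir);
        }
    }

    // Services ask here instead of choosing their own paths.
    Path directory(Kind kind) {
        return directories.get(kind);
    }

    // On shutdown in test mode, wipe everything at the file system level,
    // so no individual service needs its own cleanup code.
    void shutdown() {
        if (!testMode) {
            return;
        }
        for (Path dir : directories.values()) {
            deleteRecursively(dir);
        }
    }

    private static void deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            List<Path> paths = walk.sorted(Comparator.reverseOrder())
                                   .collect(Collectors.toList());
            for (Path path : paths) {
                Files.delete(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With something like this, an indexing service would put its files under directory(Kind.DATA), and a test run would never have to remove entities from the index one by one; the whole directory simply disappears at shutdown.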

/Rickard

_______________________________________________
qi4j-dev mailing list
[email protected]
http://lists.ops4j.org/mailman/listinfo/qi4j-dev