On 2010-01-04 06.38, Niclas Hedhman wrote:
I have just noticed that Sesame/OpenRDF has a general performance
problem when it comes to removing so-called "connections", i.e. when
we remove an Entity from the entity store, the index needs to be
updated. On my machine this takes 360 ms per entity if I do a batch of
1000 removals per connection.commit() (~6 minutes). The total graph
consists of ~20,000 entities.
Interestingly enough, the call to connection.clear() is slow, but the
commit() is relatively fast (5 sec of the total).
I am getting more and more annoyed by OpenRDF/Sesame, and am seriously
considering implementing an alternate indexing engine. The problem is
that it will take more time than I have for the 1.0 release.
There are a couple of considerations:
1) This is mainly for tests, right? During normal operation this
shouldn't be a problem.
2) Are you sure the latest Sesame is used? The latest release has
improved performance quite a lot, so make sure that an older
version is not being used.
In StreamFlow, what we have done is to create a FileConfiguration
service which calculates all the base directories that are to be used by
services (for data, configuration, caches, logs, tmp files, etc.). We
have then enforced that all these directories are cleared when the
service is shut down and the app is in test mode. Because of this there's
no need to do individual cleanup of services; their files are deleted
automatically, at the file system level. One option is to merge this
service into qi4j-extensions. It would simplify file system integration
quite a lot, since so many services need to decide "where to put stuff",
and it would also remove these test cleanup issues entirely.
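
To make the idea concrete, here is a minimal sketch of such a service in plain Java. It is not the StreamFlow code: the class name FileConfigurationSketch, the Kind enum, the directory layout (root/application/kind) and the passivate() method name are all assumptions made for illustration, and it uses java.nio.file rather than any Qi4j API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of a central "where to put stuff" service. */
final class FileConfigurationSketch {

    /** The kinds of base directory that services may ask for (assumed set). */
    enum Kind { DATA, CONFIG, CACHE, LOG, TEMP }

    private final Map<Kind, Path> dirs = new EnumMap<>(Kind.class);
    private final boolean testMode;

    FileConfigurationSketch(Path root, String application, boolean testMode) {
        this.testMode = testMode;
        // Assumed layout: <root>/<application>/<kind>, e.g. /var/app/demo/data
        for (Kind kind : Kind.values()) {
            String name = kind.name().toLowerCase(Locale.ROOT);
            dirs.put(kind, root.resolve(application).resolve(name));
        }
    }

    /** Returns the base directory for the given kind, creating it if needed. */
    Path directory(Kind kind) {
        try {
            return Files.createDirectories(dirs.get(kind));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** On shutdown: in test mode, delete every managed directory; otherwise do nothing. */
    void passivate() {
        if (!testMode) {
            return;
        }
        for (Path dir : dirs.values()) {
            deleteRecursively(dir);
        }
    }

    private static void deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try {
            Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                        throws IOException {
                    Files.delete(file);
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult postVisitDirectory(Path d, IOException exc)
                        throws IOException {
                    if (exc != null) {
                        throw exc;
                    }
                    // Children are already gone, so the directory can be removed.
                    Files.delete(d);
                    return FileVisitResult.CONTINUE;
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An indexing service would then ask for directory(Kind.DATA) instead of choosing its own path, and a test run would only need to shut this one service down to get rid of everything the other services wrote.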
/Rickard