On 02/09/11 13:58, David Jordan wrote:
In terms of requirements, we will have several relatively large
biomedical ontologies that will not be changed at all between
releases of our software; they are basically reference data. The
application user will not be changing this data. There will be other,
mostly smaller models holding users' data that will be changing
daily, and these must support concurrent updates. Users will also
be making associations and inferencing relative to the large
read-only reference ontologies, but this can be stored in the user's
dataset. So it seems like it would make sense to have the large
reference ontologies use TDB and the user data use SDB.
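[To make the split concrete, here is a minimal sketch of wiring the two stores together with the Jena APIs of that era; the directory path and the "sdb.ttl" store-description file are hypothetical, and the Jena and SDB jars must be on the classpath.]

```java
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.sdb.SDBFactory;
import com.hp.hpl.jena.sdb.Store;
import com.hp.hpl.jena.tdb.TDBFactory;

public class SplitStorage {
    public static void main(String[] args) {
        // Large, read-only reference ontologies live in TDB
        // (directory path is hypothetical)
        Dataset refDataset = TDBFactory.createDataset("/data/tdb-reference");
        Model reference = refDataset.getDefaultModel();

        // Frequently updated user data lives in SDB
        // ("sdb.ttl" store description is hypothetical)
        Store store = SDBFactory.connectStore("sdb.ttl");
        Model userData = SDBFactory.connectDefaultModel(store);

        // Query both together as a union; updates are applied to
        // userData only, so the reference model stays untouched
        Model combined = ModelFactory.createUnion(reference, userData);
    }
}
```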
Yes, that makes sense. If the two types of data have different
characteristics, then using two different storage subsystems is a
reasonable choice.
Fuseki does add the right locking to allow multiple updates.
Transactional TDB supports one writer-transaction and many concurrent
read-transactions, and gives "serializable" isolation: a reader doing
COUNT() will get an answer consistent with the database at the start
of its transaction, unlike "read-committed". Internally, it manages
multiple write requests by locking.
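[The one-writer/many-readers pattern looks roughly like this with TDB's transaction API; a sketch only, with a hypothetical dataset directory, and it assumes the transactional TDB (0.9-era) jars.]

```java
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.ReadWrite;
import com.hp.hpl.jena.tdb.TDBFactory;

public class TdbTransactions {
    public static void main(String[] args) {
        // Directory path is hypothetical
        Dataset dataset = TDBFactory.createDataset("/data/tdb-user");

        // Reader: sees a snapshot consistent with the start of
        // its transaction, even while a writer commits elsewhere
        dataset.begin(ReadWrite.READ);
        try {
            long count = dataset.getDefaultModel().size();
        } finally {
            dataset.end();
        }

        // Writer: only one is active at a time; further write
        // requests queue on TDB's internal lock
        dataset.begin(ReadWrite.WRITE);
        try {
            // ... add or remove statements here ...
            dataset.commit();
        } finally {
            dataset.end();
        }
    }
}
```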
True multi-writer operation always runs the risk of deadlock, and in
the RDF case it can be mysterious because the storage model is not
related to the data model. The app would have a hard time predicting
or understanding deadlocks.
We would also have the reference model pre-inferenced (if that is the
right term to use), so that there is not a HUGE wait time when the
model is first read.
Again, good plan.
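[Pre-inferencing can be done by materializing the entailments once at build time and storing the closure in TDB; a sketch, with hypothetical file and directory names, using Jena's built-in RDFS reasoner purely as an illustration — the right reasoner depends on the ontologies.]

```java
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.tdb.TDBFactory;
import com.hp.hpl.jena.util.FileManager;

public class PreInference {
    public static void main(String[] args) {
        // Load the reference ontology (file name is hypothetical)
        Model base = FileManager.get().loadModel("reference-ontology.owl");

        // Run the reasoner once, offline
        InfModel inf = ModelFactory.createRDFSModel(base);

        // Store base plus derived triples in TDB; at run time the
        // application reads the closure with no reasoner attached
        Dataset refDataset = TDBFactory.createDataset("/data/tdb-reference");
        refDataset.getDefaultModel().add(inf);
    }
}
```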
Andy