On 12/08/14 14:21, Claude Warren wrote:
I just realized that we don't have any distinction between read and write
transactions at the graph level. Perhaps we should add that in V3.
There are no native ACID implementations of graph transactions (that was
in the RDB code). Graph transaction != TDB transactions which are on
datasets.
For V2 the transaction must imply a write lock and my example above can not
be done at the graph level.
Write locks are one way to implement transactions but not the only way.
Table level locks have a large potential concurrency issue but are
simple and efficient (in total costs). And there are only two tables so
table locking is coarse.
Normally, 2PC is finer grained, and needs lock management (e.g. abort
deadlocks) using e.g. two-phase locking.
Normal JDBC starts as a read transaction at begin(), taking read locks,
and becomes a write transaction (i.e. starts taking write locks) when an
update happens.
In RDF, the issue is choosing the unit to be locked. A triple is the
moral equivalent of row-level locking, except the "row" is really the
data cell and is finer grained than row-level locking of entities in a
relational database.
TDB uses write-ahead logging (WAL) and not transaction locks. It has
only one true active writer; it has many true concurrent readers.
There are also techniques for allowing a single writer that do not use
WAL and instead mark triples. They need short-lived locks to update the
triple, but the lock is only there for datastructure consistency. I view
that as morally WAL, just encoding the log into the datastructures, which
makes recovery more interesting; it isn't serializable but has to be
read-committed.
c.f. DatasetGraphWithLock is ACID with respect to the JVM for datasets.
Fuseki uses it if it can't find a better choice.
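To make "ACID with respect to the JVM" concrete, here is a minimal
sketch of the idea, assuming a single JVM-local multiple-reader,
single-writer lock guarding the whole dataset. MrswGuard and its
methods are hypothetical names for illustration; this is not the code
of DatasetGraphWithLock.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

/** Hypothetical guard: one coarse MRSW lock around all dataset access. */
final class MrswGuard {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Run a read-only action; many readers may run at the same time. */
    <T> T read(Supplier<T> body) {
        return run(lock.readLock(), body);
    }

    /** Run an updating action; it excludes all readers and other writers. */
    <T> T write(Supplier<T> body) {
        return run(lock.writeLock(), body);
    }

    private static <T> T run(Lock l, Supplier<T> body) {
        l.lock();
        try {
            return body.get();
        } finally {
            l.unlock(); // always released, even if the action throws
        }
    }
}
```

Note what this does not give you: there is no undo of a partial write,
and nothing survives outside the JVM. It is the coarse, simple end of
the design space.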
It can be executed at a level that provides a view type graph.
So do view level graphs support transactions?
On the dataset, not via TransactionHandler. That might be fixable to
some extent.
If so, does TransactionHandler.begin() start a write transaction?
Implementation detail!
How would it know to be a read transaction? I think it needs to start
"read" and engage in lock promotion and all the attendant complexity and
application-visible system aborts.
Andy
On Tue, Aug 12, 2014 at 9:18 AM, Andy Seaborne wrote:
On 12/08/14 08:58, Claude Warren wrote:
On Mon, Aug 11, 2014 at 6:19 PM, Andy Seaborne wrote:
Transactions:
The text around transactions does not distinguish being inside or outside
a transaction.
There are 2 base kinds of graphs - ones in datasets (views) and
standalone ones, then things like InfGraph and other added
functionality. Transactions on view graphs need to be defined in the
context of the dataset because transactions are connected.
The point here is that several graphs may be in one transaction but other
graphs (other datasets) may not.
Using databases as a transaction example. There are multiple types of
transactions -- I don't know if we want to get into supporting or
identifying the type of transaction supported by a graph, but...
In the current case (Jena 2.12.x) with 2 threads T1 and T2.
T1 begins a write transaction
T1 add to graph
T2 begins a read transaction (is this possible -- I think so)
Can T2 find the triples written by T1 (note that the transaction is not
yet committed)?
No
T1 commits the transaction
Can T2 now find the triples written by T1?
No
The isolation level in TDB transactions is serializable, not read
committed. TDB transactions are not lock based nor do they ever cause
system aborts due to unresolvable contention [*].
T2's view of the database is all commits up until then (so not T1) and
does not change. That would be read committed.
T2 ends transaction (just for completeness)
And if T3 starts, it sees T1's updates and T2 still does not see them.
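The T1/T2/T3 sequence can be modelled with a toy store. This is only a
sketch of the visibility rule, assuming copy-on-write snapshots rather
than TDB's write-ahead log; SnapshotStore and WriteTxn are hypothetical
names and triples are plain strings.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical toy store: one active writer, readers pinned to a snapshot. */
final class SnapshotStore {
    // The latest committed state; each snapshot is immutable.
    private final AtomicReference<Set<String>> committed =
            new AtomicReference<>(Set.of());
    // Serializes writers so there is only one active writer.
    private final ReentrantLock writerLock = new ReentrantLock();

    /** A reader sees the state committed when it began, and nothing later. */
    Set<String> beginRead() {
        return committed.get();
    }

    /** Blocks until no other writer is active. */
    WriteTxn beginWrite() {
        return new WriteTxn();
    }

    final class WriteTxn {
        private final Set<String> working;

        private WriteTxn() {
            writerLock.lock();
            working = new HashSet<>(committed.get()); // private working copy
        }

        void add(String triple) {
            working.add(triple);
        }

        /** Publish the working copy as the new committed snapshot. */
        void commit() {
            try {
                committed.set(Set.copyOf(working));
            } finally {
                writerLock.unlock();
            }
        }

        /** Discard the working copy; the committed state is untouched. */
        void abort() {
            writerLock.unlock();
        }
    }
}
```

With this model, a reader that calls beginRead() while T1 is still open
never sees T1's triples, before or after T1 commits. A reader that
begins after the commit sees them. Readers take no locks at all.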
Given the nature of storing triples/quads and not entities (meaningful
rows in the data abstraction which is what SQL tends to do), the lower
isolation levels can have rather strange effects. And phantom reads are
horrendous.
Andy
PS 4.2 does not differentiate between inside and outside the transaction
so the isolation level is not relevant.
[*] That would change if we have begin() with no read/write indicator:
at the point of transaction promotion from read to write, an abort would
be possible if the DB is inconsistent with that happening.
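A rough sketch of why promotion brings application-visible aborts,
assuming a version counter that is bumped on every committed write and
checked at the point of promotion. PromotableStore, Txn and
TxnAbortedException are hypothetical names; nothing here is existing
Jena or TDB API.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical store whose transactions start as "read" and may promote. */
final class PromotableStore {
    /** The system abort the application has to be ready to handle. */
    static final class TxnAbortedException extends RuntimeException {
        TxnAbortedException(String message) {
            super(message);
        }
    }

    private final ReentrantLock writerLock = new ReentrantLock();
    private final AtomicLong version = new AtomicLong(); // committed writes so far

    /** begin() with no read/write indicator: every transaction starts as a reader. */
    Txn begin() {
        return new Txn();
    }

    final class Txn {
        private final long versionAtBegin = version.get();
        private boolean writing = false;

        /** Called on the first update; aborts if another writer committed meanwhile. */
        void promote() {
            if (writing) {
                return;
            }
            writerLock.lock();
            if (version.get() != versionAtBegin) {
                writerLock.unlock();
                throw new TxnAbortedException("store changed since begin()");
            }
            writing = true;
        }

        void commit() {
            if (writing) {
                version.incrementAndGet();
                writing = false;
                writerLock.unlock();
            }
            // A transaction that never promoted holds nothing to release.
        }
    }
}
```

If two transactions begin together and one promotes and commits first,
the other fails at promote() and has to be retried by the application.
Declaring read or write at begin() avoids this.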
Claude