On 23/09/15 17:44, A. Soroka wrote:
Following up on this conversation, I have now have a branch available
at:

https://github.com/ajs6f/jena/tree/jena-624

with a six-way-map-based version of this, advancing from (but not
directly using) the journaling branch already discussed. (Of course I
can separate these if so desired.)

There is an Index type which hopefully is a start towards abstracting
out index behavior, as Paul Houle suggested. There have already been
interesting suggestions made by Andy and Claude about possible
implementations that are more sophisticated than my simple approach,
so I just hope that this branch will get the ball rolling.

Comments/advice/criticism eagerly desired!

--- A. Soroka The University of Virginia Library

Hi - I pulled the code down and had a partial look.

And it looks very good.

(As you probably know by now, having me (test/demo/use/) anywhere near new code is very risky but I broke very little ... including TDB.)

0/ Basic read/write of files into the dataset worked as expected. :-)

1/
I tried with existing test harnesses:
(see below for code)

AbstractDatasetGraphTests
  Green line!

AbstractTestTransaction
  Red line.

This has tests for various error conditions.

The begin-begin cases: your code throws JenaException. There is a sub-exception JenaTransactionException and that what the tests look for.

The rest are testing that e.g. begin-abort-commit is an error. It looks like the transaction lifecycle is not being tracked. It needs finer granularity than DatasetGraphInMemory.isInTransaction. May be as simple as NOT_TXN -> ACTIVE_TXN -> FINISHING_TXN (-> NOT_TXN). Just in/out isn't enough for, e.g. begin-commit-add_quad->end , begin-commit-commit. I managed to add after commit.

There are some presumable related things here: A writer that .end() without .commit or .abort doesn't indicate an error. It probably should. And be added to the AbstractTestTransaction tests.

And plain commit (no begin) isn't caught as wrong.

  Dataset ds = DatasetFactory.create(new DatasetGraphInMemory()) ;
  ds.commit() ;

Ditto .abort.

Stray .end()s are probably reasonable - "when in doubt call end()" - so multiple ends() on a transaction, which in effect is end on a non-transaction, is good to allow.

(I'll add some tests to AbstractTestTransaction for these cases)

2/
I found the use of the name "Index" a odd. Usually an index (in database speak) is a specific lookup pattern. S->P->O->G and needing a prefix of that for partials.

Hence, in SQL and NoSQL, having multiple indexes per table, one primary, and 0-N secondary. The use in your code is more like the whole "table" (in individual components are all covering solike all RDF subsystems no need to have a distinguished "table"

org.apache.jena.tdb.index.Index
org.apache.jena.tdb.index.RangeIndex

Is this really something like "QuadTable"? "QuadStorage"?

I am encountering a similar split between the storage and provision of the interface in TDB2. There, I want to be able to swap the storage on the fly to give parallel storage a compaction option to a running database. Being on-disk, there isn't a GC to manage the multi-version datastructures.

3/
No prefixes on the dataset? I only got them to work with getDefaultModel etc.

TDB uses DatasetPrefixStorage for managing prefixes and then GraphTDB
Some of DatasetPrefixesTDB could be extracted for common use.

leading to...
4/
We should look for commonality between TDB and InMem and pull out a separate framework. That's long term - getting something working and released should not have "architectural internal reorg" on the critical path.

    Andy


public class TestDatasetGraphInMem_AFS
             extends AbstractDatasetGraphTests {
    @Override
    protected DatasetGraph emptyDataset() {
        return new DatasetGraphInMemory() ;
    }
}

public class TestDatasetGraphTxn_AFS
             extends AbstractTestTransaction {
    @Override
    protected Dataset create() {
        return DatasetFactory.create(new DatasetGraphInMemory()) ;
    }
}

----------------------

And if all that begin-commit stuff is boring code ... WIP ...

https://github.com/afs/mantis/blob/master/dboe-transaction/src/main/java/org/seaborne/dboe/transaction/Txn.java

    Txn.executeRead(dsg, ()->{
    ... query ...
        }) ;


This includes ThreadTxn - starting a transaction on another thread and executing the body at some later date. Great for isolation testing.

This "Transactional" is a different interface to Jena's (slight change - backwards compatible) but the code should work for Jena.

Reply via email to