Sorry for insisting too much on this issue, but I don't fully understand
the behaviour. Let's try some examples using a static reference to
the dataset object retrieved from TDBFactory.createDataset(path).

# Case 1) Static reference to dataset for read

- Server receives an HTTP request and an instance of the REST endpoint
management class is created
- Server class instance 1 (SCI1) uses static dataset reference
   dataset.begin(ReadWrite.READ);
- SCI1 starts a fairly time-consuming read operation (Operation1)
- Server receives another HTTP request and an instance of the REST
endpoint management class is created.

I don't know what an "instance of the REST endpoint management class" is.

- Server class instance 2 (SCI2) uses static dataset reference
   dataset.begin(ReadWrite.READ);

Fine.

- SCI2 read operation finishes and ends transaction.
   dataset.end();

Fine.

- SCI1 read operation finishes and ends transaction.
   dataset.end();

Fine.

In this case, are both operations fulfilled and you get the results, or
does SCI2 block because the dataset is in a transaction?

Same dataset means same underlying managed storage - multiple readers - this is fine.

Different datasets with the same underlying managed storage (same location) are also fine.

Static is more natural IMO.

How does the dataset know which transaction has ended, as both calls are
made on the same object?

ThreadLocal variables.

There is a Transaction object created at each begin().

A ThreadLocal holds a handle to a low-level DatasetGraphTxn, and that holds the Transaction object.
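
As an illustration of the ThreadLocal point, here is a minimal sketch using only the Java standard library. It is a toy model and not Jena's code: ToyDataset, its methods, and the integer transaction id are all made-up names. It only shows how one shared static object can track a separate transaction for each calling thread, so that end() on one thread does not affect another.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ThreadLocalDemo {
    // Toy stand-in for a dataset; NOT Jena's implementation.
    static class ToyDataset {
        private final AtomicInteger nextId = new AtomicInteger(1);

        // Each thread sees its own slot; null means "no transaction on this thread".
        private final ThreadLocal<Integer> current = new ThreadLocal<>();

        void begin() {
            if (current.get() != null) {
                throw new IllegalStateException("already in a transaction");
            }
            current.set(nextId.getAndIncrement());
        }

        boolean isInTransaction() {
            return current.get() != null;
        }

        // Clears only the calling thread's transaction.
        void end() {
            current.remove();
        }
    }

    // One shared (static) instance, as in the cases discussed in this thread.
    static final ToyDataset DATASET = new ToyDataset();

    // Reports what a second thread observes on the same shared object.
    static boolean otherThreadSeesTransaction() {
        boolean[] seen = new boolean[1];
        Thread t = new Thread(() -> seen[0] = DATASET.isInTransaction());
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        DATASET.begin();
        System.out.println("this thread in transaction: " + DATASET.isInTransaction());
        System.out.println("other thread in transaction: " + otherThreadSeesTransaction());
        DATASET.end();
        System.out.println("after end(): " + DATASET.isInTransaction());
    }
}
```

In a server, each request is typically handled on its own thread, which is why the begin()/end() pairs of SCI1 and SCI2 do not get mixed up in this model.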

# Case 2) Static reference to dataset for read/write

- Server receives an HTTP request and an instance of the REST endpoint
management class is created
- Server class instance 1 (SCI1) uses static dataset reference
   dataset.begin(ReadWrite.READ);
- SCI1 starts a fairly time-consuming read operation
- Server receives another HTTP request, in this case for writing, and an
instance of the REST endpoint management class is created.
- Server class instance 2 (SCI2) uses static dataset reference
   dataset.begin(ReadWrite.WRITE);
- SCI2 write operation finishes and ends transaction.
   dataset.commit();
   dataset.end();
- SCI1 read operation finishes and ends transaction.
   dataset.end();

That all works.  It happens in Fuseki all the time.

No blocking.


In this case, are both operations fulfilled, or is SCI2 not going to
work? Does SCI1 only get the data present until SCI2 commits? Is there
any blocking in execution?


# Case 3) Static reference to dataset for write

- Server receives an HTTP request and an instance of the REST endpoint
management class is created
- Server class instance 1 (SCI1) uses static dataset reference
   dataset.begin(ReadWrite.WRITE);
- SCI1 starts a fairly time-consuming write operation
- Server receives another HTTP request and an instance of the REST
endpoint management class is created.
- Server class instance 2 (SCI2) uses static dataset reference
   dataset.begin(ReadWrite.WRITE);

This blocks that request.

MR+SW : Multiple Readers and a Single Writer at any one time.

- SCI2 write operation finishes and ends transaction.
   dataset.commit();
   dataset.end();

Happens later.

- SCI1 write operation finishes and ends transaction.
   dataset.commit();
   dataset.end();

SCI2 enters its write transaction.


In this case, are both operations fulfilled, or is SCI2 blocked till
SCI1 ends?

SCI2 is blocked till SCI1 ends.
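
The "multiple readers, single writer" rule can be sketched with a toy model in plain Java. This is not how Jena implements it: ToyStore and its methods are hypothetical, and a timeout replaces the indefinite blocking described above so that the demo terminates. Readers never wait here, while writers compete for a single slot.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

class MrSwDemo {
    // Toy model of "multiple readers, single writer"; NOT Jena's implementation.
    static class ToyStore {
        // Exactly one permit: at most one writer at any one time.
        private final Semaphore writerSlot = new Semaphore(1);

        // Readers never wait in this model, whether or not a writer is active.
        boolean beginRead() {
            return true;
        }

        // Waits up to timeoutMillis for the writer slot; false means "still blocked".
        boolean beginWrite(long timeoutMillis) {
            try {
                return writerSlot.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }

        // Frees the writer slot so that a waiting writer can proceed.
        void endWrite() {
            writerSlot.release();
        }
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        System.out.println("SCI1 write begins: " + store.beginWrite(0));
        System.out.println("a reader begins: " + store.beginRead());
        System.out.println("SCI2 write begins while SCI1 is active: " + store.beginWrite(50));
        store.endWrite(); // SCI1 commits and ends
        System.out.println("SCI2 write begins after SCI1 ends: " + store.beginWrite(0));
        store.endWrite();
    }
}
```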

Once this is understood I guess we can go on with the case of the
non-static reference ;)

The same happens.

The non-static references are sharing the same underlying subsystem.

One location - one database - one transaction regime.
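
To make "one location - one database" concrete, here is a hedged sketch of a registry keyed by location, again in plain Java with made-up names (ToyStore, ToyHandle, connect). It does not reproduce Jena's internals; it only shows how two distinct handle objects obtained for the same location can share a single underlying store, and therefore a single transaction regime.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class RegistryDemo {
    // Stand-in for the storage area and its single transaction system.
    static class ToyStore {
        final String location;

        ToyStore(String location) {
            this.location = location;
        }
    }

    // A lightweight handle; many handles may point at one store.
    static class ToyHandle {
        final ToyStore store;

        ToyHandle(ToyStore store) {
            this.store = store;
        }
    }

    // One store per location, created on first use and then reused.
    private static final Map<String, ToyStore> STORES = new ConcurrentHashMap<>();

    // Hypothetical factory: cheap to call, because the store itself is cached.
    static ToyHandle connect(String location) {
        ToyStore store = STORES.computeIfAbsent(location, ToyStore::new);
        return new ToyHandle(store);
    }

    public static void main(String[] args) {
        ToyHandle a = connect("DB");
        ToyHandle b = connect("DB");
        System.out.println("different handle objects: " + (a != b));
        System.out.println("same underlying store: " + (a.store == b.store));
    }
}
```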

Static is more natural IMO.

    Andy






Rob

On 12/09/2017 10:24, "George News" <[email protected]> wrote:

On 2017-09-12 10:43, Andy Seaborne wrote:
They are per storage area.

This blocks and never prints "DONE"

Location loc = Location.create("DB");
Dataset dataset1 = TDBFactory.createDataset(loc);
Dataset dataset2 = TDBFactory.createDataset(loc);
dataset1.begin(ReadWrite.WRITE);
dataset2.begin(ReadWrite.WRITE);
System.out.println("DONE");

but if either is a READ, it will work - there can be many readers and
one writer at a time.  The readers will not see the updates made by the
writer, even after the writer commits.

I understand that, but I still think they are not linked to the
storage area. If you put

Location loc = Location.create("DB");
Dataset dataset1 = TDBFactory.createDataset(loc);
Dataset dataset2 = TDBFactory.createDataset(loc);
dataset1.begin(ReadWrite.READ);
System.out.println(dataset1.isInTransaction());
System.out.println(dataset2.isInTransaction());
dataset2.begin(ReadWrite.READ);
System.out.println("DONE");

It will print true for dataset1 and false for dataset2. This
means that the transaction is linked to the object Dataset and not
the real location. Or at least this is what is happening to me.

Therefore I think this is a bug :( as the READ transaction is opened
over the same location. I haven't checked for WRITE, but I guess
it should be the same. If you write on a dataset and you have
several transactions open, this means you will have a kind of
counter (semaphore), and when you call .end() you finish them.


Creating the datasets is quite cheap. It is not really creating
everything every time. But a static works as well; Fuseki uses a
static registry of datasets.

(It's called "connect", not "create", in TDB2 to make that
clearer.)

Andy

I think static will be the way to go for me for the cleanness of
the code, as otherwise it will be more complex to handle.


On 11/09/17 15:57, George News wrote:
Hi all,

I'm facing an issue that I guess was implemented that way for
some reason. I thought that transactions were dataset-based: tied
not to the object but to the TDB store or whatever database
you use.

However while developing my service I have noticed that if you
open 2 datasets on the same TDB

Dataset dataset1 = TDBFactory.createDataset(tripleStorePath);
Dataset dataset2 = TDBFactory.createDataset(tripleStorePath);

then each dataset has its own transaction pointer, that is,
read/write operations are blocked per object. Is that the expected
behaviour? Why is it like this and not blocked per triple store?

Therefore my question now is which approach is better. I'm
developing a web service that works against the same triple
store path. The Dataset object I create on each call is linked
to the instance of the class (not static). Then, how
should I proceed? Should I create the Dataset variable as static,
so that I only have one object for all?

Thanks Regards,








