Hi all,

Thanks for your tips. I think I now understand what I need to do.
Although all the graphs come from the same dataset, merging them is not
the same as using the dataset itself.

Therefore, the dataset lock seems not to apply to a bunch of graphs
merged from the same dataset. I guess I have to start using FROM
clauses in my SPARQL queries. But how is that handled internally? If I
create a transaction on the dataset and there are several FROM clauses,
then internally the engine has to merge them and execute the query over
the list of graphs, doesn't it?
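To illustrate what I mean, here is a minimal sketch (the graph URIs and
helper method are made-up placeholders, not real code from my service):
instead of building a MultiUnion in Java, the union is declared in the
query text itself with one FROM clause per graph, and the engine then
evaluates the query against the merge of those graphs.

```java
// Sketch only: splice FROM clauses into a SELECT query's text so the
// merge of named graphs is declared in SPARQL rather than built with
// MultiUnion in Java. Graph URIs below are hypothetical placeholders.
public class FromClauseSketch {
    static String withFromClauses(String query, String... graphUris) {
        int whereIdx = query.indexOf("WHERE");  // assumes an explicit WHERE keyword
        StringBuilder sb = new StringBuilder(query.substring(0, whereIdx));
        for (String g : graphUris) {
            sb.append("FROM <").append(g).append(">\n");  // one FROM per graph
        }
        sb.append(query.substring(whereIdx));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withFromClauses(
                "SELECT ?s ?p ?o\nWHERE { ?s ?p ?o }",
                "http://example/G1", "http://example/G2", "http://example/G4"));
    }
}
```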

Besides that, after exploring Fuseki a bit, I might change the backend
to Fuseki and modify my service to act as a proxy to it. In any case I
have to dig deeper into Fuseki to see whether I can easily apply
updates, etc. on the dataset. In this sense:
- For SPARQL queries, do you recommend creating an HTTP client and
sending the request to the appropriate endpoint? Or is it better to use
a QueryExecution built from the service endpoint? Which one offers
better control over HTTP errors?
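For reference, with a hand-rolled client I would see the raw status
codes myself; here is a minimal sketch (the endpoint URL is a made-up
placeholder) of forming the GET request URL per the SPARQL 1.1
Protocol, with the query URL-encoded into the "query" parameter:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch only: build the GET URL for a SPARQL Protocol request.
// The endpoint URL is a hypothetical placeholder; an HTTP client can
// then GET this URL and inspect the status code directly.
public class SparqlUrlSketch {
    static String queryUrl(String endpoint, String sparql)
            throws UnsupportedEncodingException {
        // The query text must be URL-encoded into the "query" parameter.
        return endpoint + "?query=" + URLEncoder.encode(sparql, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(queryUrl("http://localhost:3030/ds/sparql",
                "SELECT * WHERE { ?s ?p ?o }"));
    }
}
```

By contrast, QueryExecutionFactory.sparqlService handles the HTTP
exchange for me and surfaces failures as exceptions, which is why I am
unsure which gives better control.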

- For the Fuseki GSP endpoint, to upload data to a named graph I have
to use the "graph" query parameter as described in
https://www.w3.org/TR/sparql11-http-rdf-update/, don't I?
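In other words, assuming the "graph" parameter works as that spec
describes, I would sketch the target URL like this (endpoint and graph
URI are made-up placeholders), with the RDF payload then POSTed or PUT
to that URL:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch only: build the Graph Store Protocol URL for a named graph.
// Endpoint and graph URI are hypothetical placeholders. POSTing RDF to
// this URL (with an RDF Content-Type) should add the data to that
// graph, assuming the "graph" parameter behaves as the GSP spec says.
public class GspUrlSketch {
    static String gspUrl(String gspEndpoint, String graphUri)
            throws UnsupportedEncodingException {
        return gspEndpoint + "?graph=" + URLEncoder.encode(graphUri, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(gspUrl("http://localhost:3030/ds/data",
                "http://example/G2"));
    }
}
```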


Thanks a lot for your help. It seems there is a lot to do in 2018.

Wish you all the best for the new year.

Regards,
Jorge

On 2017-12-29 18:29, Andy Seaborne wrote:
> I can't see whether variable "dataset" is a TDB dataset or a bunch of
> graphs pulled together at the start of the request.  If it's a composite
> wrapper containing TDB graphs, and if it is different on each request,
> that would be consistent with what has been presented.  Not the only
> possibility though.
> 
>     Andy
> 
> On 29/12/17 14:28, ajs6f wrote:
>> Just to add a bit of context to Andy's point:
>>
>>> It needs a dataset so it makes one and puts the model in as the
>>> default graph.
>>
>> This is (more or less) because SPARQL is defined as acting against
>> datasets, not individual graphs.
>>
>> https://www.w3.org/TR/sparql11-query/#rdfDataset:
>>
>> "A SPARQL query is executed against an RDF Dataset which represents a
>> collection of graphs."
>>
>> And if (as Andy pointed out is possible) the individual graph in
>> question is a view over several datasets, that doesn't work either.
>> There has to be a single dataset that is the basic context for SPARQL.
>> (Even while using federation/SERVICE, there is still a single dataset
>> against which the query is being executed.)
>>
>> ajs6f
>>
>>> On Dec 29, 2017, at 9:21 AM, Andy Seaborne <[email protected]> wrote:
>>>
>>>
>>>
>>> On 29/12/17 13:58, George News wrote:
>>>> On 2017-12-29 14:46, Andy Seaborne wrote:
>>>>>
>>>>>
>>>>> On 29/12/17 09:51, George News wrote:
>>>>>> On 2017-12-28 18:02, dandh988 wrote:
>>>>>>> Can you give a complete example? How are you calling MultiUnion? Are
>>>>>>> you calling txn read on the dataset then building the union, then
>>>>>>> calling txn write to update the graph?
>>>>>>
>>>>>> Let's describe a use case.
>>>>>
>>>>> The stack trace does not agree with the details here (and details
>>>>> matter!) especially what dataset is because if that is a general
>>>>> dataset
>>>>> built out of TDB graphs, this whole thing will not work.  A complete,
>>>>> minimal example is needed.
>>>>>
>>>>> A few general observations:
>>>>>
>>>>>>
>>>>>> 1) I have a dataset that includes 4 named graphs: G1, G2, G3, G4
>>>>>> 2.a) Someone initiates a sparql request and the internal procedure
>>>>>> followed is:
>>>>>>     return Txn.calculateRead(dataset, () -> {
>>>>>>       // Create multiunion of 3 namegraphs
>>>>>>       MultiUnion union = new MultiUnion();
>>>>>>       union.addGraph(dataset.getNamedModel("G1").getGraph());
>>>>>>       union.addGraph(dataset.getNamedModel("G2").getGraph());
>>>>>>       union.addGraph(dataset.getNamedModel("G4").getGraph());
>>>>>
>>>>> This can be done in SPARQL:
>>>>>
>>>>> either UNION or
>>>>>
>>>>> SELECT
>>>>> FROM <G1>
>>>>> FROM <G2>
>>>>> FROM <G4>
>>>>> WHERE { ... }
>>>>>
>>>>> (copy the query object and modify it).
>>>>>
>>>>> and TDB will handle it better.
>>>>>
>>>>>>
>>>>>>       Model m = ModelFactory.createModelForGraph(union);
>>>>>
>>>>> You are not querying the TDB dataset if you pass a model to
>>>>> QueryExecutionFactory.create.  It has to create a dataset (a general
>>>>> purpose one) for the query.
>>>> But if the model is created from graphs/models on a dataset, isn't it
>>>> the same? In the end the models are references to the ones in the dataset.
>>>
>>> Not the same and the model is not a reference into a dataset.
>>>
>>> QueryExecutionFactory.create is given a model, here a composite one;
>>> it could be a model without a dataset, or a model over many
>>> datasets.  It can't know what the structure is; Model is an interface.
>>>
>>> It needs a dataset so it makes one and puts the model in as the
>>> default graph.
>>>
>>>>>>       // Launch Sparql query on it
>>>>>>       try (QueryExecution qExec = QueryExecutionFactory.create(query,
>>>>>> m)) {
>>>>>>         return ResultSetFactory.copyResults(qExec.execSelect());
>>>>>>       }
>>>>>>     });
>>>>>>
>>>>>> 2.b) Someone initiates a sparql request and the internal procedure
>>>>>> followed is:
>>>>>>     return Txn.calculateRead(dataset, () -> {
>>>>>>       // Create multiunion of 2 namegraphs
>>>>>>       // This code is in a function
>>>>>>       MultiUnion union = new MultiUnion();
>>>>>>       union.addGraph(dataset.getNamedModel("G1").getGraph());
>>>>>>       union.addGraph(dataset.getNamedModel("G2").getGraph());
>>>>>>       Model m1 = ModelFactory.createModelForGraph(union);
>>>>>>
>>>>>>       // Retrieve namegraph G3
>>>>>>       // This code is in a function
>>>>>>       Model m2 = dataset.getNamedModel("G3");
>>>>>>
>>>>>>       Model m = ModelFactory.createUnion(m2, m1);
>>>>>>
>>>>>>       // Launch Sparql query on it
>>>>>>       try (QueryExecution qExec = QueryExecutionFactory.create(query,
>>>>>> m)) {
>>>>>>         return ResultSetFactory.copyResults(qExec.execSelect());
>>>>>>       }
>>>>>>     });
>>>>>>
>>>>>>
>>>>>> 3) Someone initiates a write in G2
>>>>>>     // m stores the new entity model
>>>>>>     Model m;
>>>>>>     Txn.executeWrite(dataset, () -> {
>>>>>>       dataset.getNamedModel("G2").add(m);
>>>>>>     });
>>>>>>
>>>>>>
>>>>>>
>>>>>> If either version of 2) and 3) are not run in parallel, there is no
>>>>>> problem and everything executes correctly.
>>>>>>
>>>>>> The problem arises when 2.a) or 2.b) is running and, before it ends,
>>>>>> someone tries to perform 3). Then I get the FileException("In the
>>>>>> middle of an alloc-write").
>>>>>>
>>>>>> Do you have an idea on how to avoid this? How can I handle
>>>>>> transactions in this model?
>>>>>>
>>>>>>
>>>>>> I was thinking of creating a global mutex so that if any action is
>>>>>> being performed over the dataset, the rest would be blocked. The
>>>>>> problem here is that the code is part of a web service, so if the
>>>>>> read/write operation takes long, I will get a timeout that closes
>>>>>> the connection.
>>>>>>
>>>>>> The other option is to disable writing while someone is reading. The
>>>>>> main problem there is how to properly reschedule the writes so as not
>>>>>> to build up a big queue.
>>>>>>
>>>>>> Any help is more than welcome. I don't know what else to do to solve
>>>>>> this issue, and the problem is making the service unusable :(
>>>>>>
>>>>>> Thanks a lot for the great help you are all offering.
>>>>>> Jorge
>>>>>
>>
> 
