Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-07-14 Thread Reto Bachmann-Gmür
On Fri, Jul 12, 2013 at 9:18 PM, Andy Seaborne  wrote:
> On 10/07/13 16:59, Reto Bachmann-Gmür wrote:
>>
>> Could it be that Jena methods return before Jena has actually
>> finished writing, and that the Jena built-in locks take this into
>> account?
>
>
> No - they don't do that.  Iterators (obviously) resume on the next .hasNext
> etc so iterator - operation - iterator is a potential problem area.
>
> Could there be separate accesses to two graphs in the same dataset?

That's exactly the point. This test
(http://svn.apache.org/viewvc/clerezza/trunk/rdf.jena.tdb.storage/src/test/java/org/apache/clerezza/rdf/jena/tdb/storage/MultiThreadedTest.java)
runs on a single graph. Locks (standard Java locks) ensure that no two
threads write to the graph concurrently and that no thread reads
from the graph while another thread has write access.
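
As a minimal sketch of the locking discipline described above (not the Clerezza code itself), a single graph can be guarded with a standard java.util.concurrent ReentrantReadWriteLock. The class GuardedGraph and its methods are hypothetical names chosen for illustration, and triples are plain strings here.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical guarded graph; the names are illustrative, not Clerezza's API.
class GuardedGraph {
    private final Set<String> triples = new HashSet<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    boolean add(String triple) {
        lock.writeLock().lock();      // exclusive: no readers, no other writers
        try {
            return triples.add(triple);
        } finally {
            lock.writeLock().unlock();
        }
    }

    int size() {
        lock.readLock().lock();       // shared: many readers may hold it at once
        try {
            return triples.size();
        } finally {
            lock.readLock().unlock();
        }
    }

    boolean contains(String triple) {
        lock.readLock().lock();
        try {
            return triples.contains(triple);
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With this discipline the backing set is never read while a write is in progress, which is the guarantee the test relies on.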

Even worse, the first exception described in CLEREZZA-792 and the one
quoted in this thread occur after all threads that perform some write
operation have ended.

> Can you use transactions?  Then you don't need locking.

Maybe. But it should still work when the only concurrent operations
are reads and nothing else runs during a write.

Cheers,
Reto


Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-07-12 Thread Andy Seaborne

On 10/07/13 16:59, Reto Bachmann-Gmür wrote:

Could it be that Jena methods return before Jena has actually
finished writing, and that the Jena built-in locks take this into
account?


No - they don't do that.  Iterators (obviously) resume on the next
.hasNext etc so iterator - operation - iterator is a potential problem area.
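
To illustrate the iterator - operation - iterator pattern, here is a small sketch, assuming a store that bumps a counter on every write and an iterator that compares that counter on each step. EpochStore is a hypothetical stand-in, not Jena's DatasetControlMRSW; it only mimics the shape of the "started at N, now M" message in the stack traces. Note that a single thread is enough to trigger the exception.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a store that counts writes; not Jena code.
class EpochStore {
    private final List<String> triples = new ArrayList<>();
    private final AtomicLong epoch = new AtomicLong(); // bumped on every write

    void add(String triple) {
        triples.add(triple);
        epoch.incrementAndGet();
    }

    // The iterator remembers the epoch at creation and re-checks it on each step.
    Iterator<String> find() {
        final long startedAt = epoch.get();
        // Iterate over a copy so only the epoch check can fail in this sketch.
        final Iterator<String> it = new ArrayList<>(triples).iterator();
        return new Iterator<String>() {
            private void check() {
                long now = epoch.get();
                if (now != startedAt) {
                    throw new ConcurrentModificationException(
                            "Iterator: started at " + startedAt + ", now " + now);
                }
            }

            @Override
            public boolean hasNext() {
                check();
                return it.hasNext();
            }

            @Override
            public String next() {
                check();
                return it.next();
            }
        };
    }
}
```

Calling find(), then add(), then hasNext() on the old iterator fails even without a second thread, so a lock that is released between iterator steps does not protect against this.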


Could there be separate accesses to two graphs in the same dataset?

Can you use transactions?  Then you don't need locking.

Andy



Another exception I'm getting is:
java.util.ConcurrentModificationException: Reader = 1, Writer = 1
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:152)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.checkConcurrency(DatasetControlMRSW.java:79)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.startRead(DatasetControlMRSW.java:46)
    at com.hp.hpl.jena.tdb.nodetable.NodeTupleTableConcrete.startRead(NodeTupleTableConcrete.java:68)
    at com.hp.hpl.jena.tdb.nodetable.NodeTupleTableConcrete.findAsNodeIds(NodeTupleTableConcrete.java:139)
    at com.hp.hpl.jena.tdb.store.TripleTable.find(TripleTable.java:76)
    at com.hp.hpl.jena.tdb.store.DatasetGraphTDB.findInDftGraph(DatasetGraphTDB.java:100)
    at com.hp.hpl.jena.sparql.core.DatasetGraphBaseFind.find(DatasetGraphBaseFind.java:46)
    at com.hp.hpl.jena.tdb.store.GraphTDBBase.graphBaseFindDft(GraphTDBBase.java:114)
    at com.hp.hpl.jena.tdb.store.GraphTriplesTDB.graphBaseFind(GraphTriplesTDB.java:71)
    at com.hp.hpl.jena.graph.impl.GraphBase.find(GraphBase.java:268)
    at com.hp.hpl.jena.graph.impl.GraphBase.graphBaseFind(GraphBase.java:290)
    at com.hp.hpl.jena.graph.impl.GraphBase.find(GraphBase.java:287)
    at org.apache.clerezza.rdf.jena.storage.JenaGraphAdaptor.performFilter(JenaGraphAdaptor.java:94)
    at org.apache.clerezza.rdf.core.impl.AbstractTripleCollection.filter(AbstractTripleCollection.java:71)
    at org.apache.clerezza.rdf.core.impl.AbstractTripleCollection.contains(AbstractTripleCollection.java:65)
    at org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper$4.run(PrivilegedTripleCollectionWrapper.java:88)
    at org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper$4.run(PrivilegedTripleCollectionWrapper.java:84)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper.contains(PrivilegedTripleCollectionWrapper.java:84)
    at org.apache.clerezza.rdf.core.access.LockableMGraphWrapper.contains(LockableMGraphWrapper.java:118)
    at org.apache.clerezza.rdf.jena.tdb.storage.MultiThreadedTest.perform(MultiThreadedTest.java:131)


Cheers,
Reto

On Wed, Jul 10, 2013 at 5:51 PM, Reto Bachmann-Gmür  wrote:

Hi Andy

Running into concurrency issues again. I'm getting the following exception:

java.util.ConcurrentModificationException: Iterator: started at 99459, now 99460
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.access$000(DatasetControlMRSW.java:32)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.checkCourrentModification(DatasetControlMRSW.java:110)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.next(DatasetControlMRSW.java:128)
    at org.apache.jena.atlas.iterator.Iter.count(Iter.java:478)
    at com.hp.hpl.jena.tdb.store.GraphTDBBase.graphBaseSize(GraphTDBBase.java:159)
    at com.hp.hpl.jena.graph.impl.GraphBase.size(GraphBase.java:344)
    at org.apache.clerezza.rdf.jena.storage.JenaGraphAdaptor.size(JenaGraphAdaptor.java:70)
    at org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper$2.run(PrivilegedTripleCollectionWrapper.java:66)
    at org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper$2.run(PrivilegedTripleCollectionWrapper.java:62)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper.size(PrivilegedTripleCollectionWrapper.java:62)
    at org.apache.clerezza.rdf.core.access.LockableMGraphWrapper.size(LockableMGraphWrapper.java:97)
    at org.apache.clerezza.rdf.jena.tdb.storage.MultiThreadedTest.perform(MultiThreadedTest.java:129)


I've checked the clerezza code and it seems that it ensures that no
write happens while size() is executed. Also only a single graph is
used so it cannot be the issue about the clerezza lock not being broad
enough (afaik this is an issue but not the cause of this exception).
Do you have an idea what could cause the problem?

Cheers,
Reto

On Thu, Mar 14, 2013 at 2:43 PM, Andy Seaborne  wrote:

On 14/03/13 09:39, Minto van der Sluis wrote:


Rupert,

Thanks for the additional explanation.

Regards,

Minto

Op 14-3-2013 

Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-07-10 Thread Reto Bachmann-Gmür
Could it be that Jena methods return before Jena has actually
finished writing, and that the Jena built-in locks take this into
account?

Another exception I'm getting is:
java.util.ConcurrentModificationException: Reader = 1, Writer = 1
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:152)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.checkConcurrency(DatasetControlMRSW.java:79)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.startRead(DatasetControlMRSW.java:46)
    at com.hp.hpl.jena.tdb.nodetable.NodeTupleTableConcrete.startRead(NodeTupleTableConcrete.java:68)
    at com.hp.hpl.jena.tdb.nodetable.NodeTupleTableConcrete.findAsNodeIds(NodeTupleTableConcrete.java:139)
    at com.hp.hpl.jena.tdb.store.TripleTable.find(TripleTable.java:76)
    at com.hp.hpl.jena.tdb.store.DatasetGraphTDB.findInDftGraph(DatasetGraphTDB.java:100)
    at com.hp.hpl.jena.sparql.core.DatasetGraphBaseFind.find(DatasetGraphBaseFind.java:46)
    at com.hp.hpl.jena.tdb.store.GraphTDBBase.graphBaseFindDft(GraphTDBBase.java:114)
    at com.hp.hpl.jena.tdb.store.GraphTriplesTDB.graphBaseFind(GraphTriplesTDB.java:71)
    at com.hp.hpl.jena.graph.impl.GraphBase.find(GraphBase.java:268)
    at com.hp.hpl.jena.graph.impl.GraphBase.graphBaseFind(GraphBase.java:290)
    at com.hp.hpl.jena.graph.impl.GraphBase.find(GraphBase.java:287)
    at org.apache.clerezza.rdf.jena.storage.JenaGraphAdaptor.performFilter(JenaGraphAdaptor.java:94)
    at org.apache.clerezza.rdf.core.impl.AbstractTripleCollection.filter(AbstractTripleCollection.java:71)
    at org.apache.clerezza.rdf.core.impl.AbstractTripleCollection.contains(AbstractTripleCollection.java:65)
    at org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper$4.run(PrivilegedTripleCollectionWrapper.java:88)
    at org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper$4.run(PrivilegedTripleCollectionWrapper.java:84)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper.contains(PrivilegedTripleCollectionWrapper.java:84)
    at org.apache.clerezza.rdf.core.access.LockableMGraphWrapper.contains(LockableMGraphWrapper.java:118)
    at org.apache.clerezza.rdf.jena.tdb.storage.MultiThreadedTest.perform(MultiThreadedTest.java:131)


Cheers,
Reto

On Wed, Jul 10, 2013 at 5:51 PM, Reto Bachmann-Gmür  wrote:
> Hi Andy
>
> Running into concurrency issues again. I'm getting the following exception:
>
> java.util.ConcurrentModificationException: Iterator: started at 99459, now 
> 99460
> at 
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
> at 
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.access$000(DatasetControlMRSW.java:32)
> at 
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.checkCourrentModification(DatasetControlMRSW.java:110)
> at 
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.next(DatasetControlMRSW.java:128)
> at org.apache.jena.atlas.iterator.Iter.count(Iter.java:478)
> at 
> com.hp.hpl.jena.tdb.store.GraphTDBBase.graphBaseSize(GraphTDBBase.java:159)
> at com.hp.hpl.jena.graph.impl.GraphBase.size(GraphBase.java:344)
> at 
> org.apache.clerezza.rdf.jena.storage.JenaGraphAdaptor.size(JenaGraphAdaptor.java:70)
> at 
> org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper$2.run(PrivilegedTripleCollectionWrapper.java:66)
> at 
> org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper$2.run(PrivilegedTripleCollectionWrapper.java:62)
> at java.security.AccessController.doPrivileged(Native Method)
> at 
> org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper.size(PrivilegedTripleCollectionWrapper.java:62)
> at 
> org.apache.clerezza.rdf.core.access.LockableMGraphWrapper.size(LockableMGraphWrapper.java:97)
> at 
> org.apache.clerezza.rdf.jena.tdb.storage.MultiThreadedTest.perform(MultiThreadedTest.java:129)
>
>
> I've checked the clerezza code and it seems that it ensures that no
> write happens while size() is executed. Also only a single graph is
> used so it cannot be the issue about the clerezza lock not being broad
> enough (afaik this is an issue but not the cause of this exception).
> Do you have an idea what could cause the problem?
>
> Cheers,
> Reto
>
> On Thu, Mar 14, 2013 at 2:43 PM, Andy Seaborne  wrote:
>> On 14/03/13 09:39, Minto van der Sluis wrote:
>>>
>>> Rupert,
>>>
>>> Thanks for the additional explanation.
>>>
>>> Regards,
>>>
>>> Minto
>>>
>>> Op 14-3-2013 10:31, Rupert Westenthaler schreef:

 Hi Minto

 I am traveling this week and do not have time to work on this until
 the weekend but I will have a look into this.

 Let me try to explain my concern again and make it more clear:
>

Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-07-10 Thread Reto Bachmann-Gmür
Hi Andy

Running into concurrency issues again. I'm getting the following exception:

java.util.ConcurrentModificationException: Iterator: started at 99459, now 99460
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.access$000(DatasetControlMRSW.java:32)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.checkCourrentModification(DatasetControlMRSW.java:110)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.next(DatasetControlMRSW.java:128)
    at org.apache.jena.atlas.iterator.Iter.count(Iter.java:478)
    at com.hp.hpl.jena.tdb.store.GraphTDBBase.graphBaseSize(GraphTDBBase.java:159)
    at com.hp.hpl.jena.graph.impl.GraphBase.size(GraphBase.java:344)
    at org.apache.clerezza.rdf.jena.storage.JenaGraphAdaptor.size(JenaGraphAdaptor.java:70)
    at org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper$2.run(PrivilegedTripleCollectionWrapper.java:66)
    at org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper$2.run(PrivilegedTripleCollectionWrapper.java:62)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.clerezza.rdf.core.impl.util.PrivilegedTripleCollectionWrapper.size(PrivilegedTripleCollectionWrapper.java:62)
    at org.apache.clerezza.rdf.core.access.LockableMGraphWrapper.size(LockableMGraphWrapper.java:97)
    at org.apache.clerezza.rdf.jena.tdb.storage.MultiThreadedTest.perform(MultiThreadedTest.java:129)


I've checked the clerezza code and it seems that it ensures that no
write happens while size() is executed. Also only a single graph is
used so it cannot be the issue about the clerezza lock not being broad
enough (afaik this is an issue but not the cause of this exception).
Do you have an idea what could cause the problem?

Cheers,
Reto

On Thu, Mar 14, 2013 at 2:43 PM, Andy Seaborne  wrote:
> On 14/03/13 09:39, Minto van der Sluis wrote:
>>
>> Rupert,
>>
>> Thanks for the additional explanation.
>>
>> Regards,
>>
>> Minto
>>
>> Op 14-3-2013 10:31, Rupert Westenthaler schreef:
>>>
>>> Hi Minto
>>>
>>> I am traveling this week and do not have time to work on this until
>>> the weekend but I will have a look into this.
>>>
>>> Let me try to explain my concern again and make it more clear:
>>>
>>> The Jena TDB named graphs are held in a single quad store table (SPOC
>>> - Subject Predicate Object Context). On the Clerezza side you have
>>> TripleCollections (SPO), each with a name (C). This means that all
>>> Clerezza TripleCollections provided by the same
>>> SingleTdbDatasetTcProvider share the same SPOC table, so
>>> a change to any of those TripleCollections will cause a modification
>>> in the Jena TDB backend. This means that Iterators of all
>>> TripleCollections need to take a ReadLock on the SPOC table (and not
>>> only on the SPO section represented by the TripleCollection).
>>>
>>> While Clerezza allows building a LockableMGraphWrapper over an MGraph,
>>> this is not sufficient for the SingleTdbDatasetTcProvider as this will
>>> only protect the SPO section and not the SPOC table used by the
>>> backend. So changes in other graphs - or the creation of a new graph -
>>> are still possible and will cause ConcurrentModificationExceptions as
>>> reported.
>>>
>>> To solve this issue one needs to ensure that a single ReadWrite lock
>>> is used for all TripleCollections provided by the
>>> SingleTdbDatasetTcProvider as this will allow users to lock the whole
>>> SPOC table of the backend when they perform operations on the Clerezza
>>> TripleCollections.
>
>
> A TDB dataset provides a single Lock you can reuse/wrap so all the graph
> locks are related when needed.  The GraphTDB.getLock() is the dataset lock.
>
> Transactions would be better.  Better concurrency (concurrent writer and
> multiple readers).
>
> Andy
>
>
>>>
>>> best
>>> Rupert
>>>
>>>
>>> On Thu, Mar 14, 2013 at 9:50 AM, Minto van der Sluis 
>>> wrote:

 Hi,

 Half of what the 2 of you write is not very clear to me. Probably due to
 being a novice when it comes to Clerezza internals.

 Maybe I will start with giving CLEREZZA-726 another try and then check
 if I still get these exceptions.

 Regard,

 Minto

 Op 13-3-2013 18:35, Reto Bachmann-Gmür schreef:
>
> On Wed, Mar 13, 2013 at 6:04 PM, Rupert Westenthaler <
> [email protected]> wrote:
>
>> Hi,
>>
>> I think that this is caused by the fact that if you create a
>> LockableMGraph over MGraphs provided by the SingleTdbDatasetTcProvider
>> you end up in a situation where you have multiple ReadWrite Locks on
>> the same quad store (the Jena TDB dataset). This means that acquiring
>> a write lock on one MGraph will not prohibit changes in other graphs -
>> or the creation of new graphs. Because of that you will end up 

Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-03-18 Thread Minto van der Sluis
Hi Rupert,

Thanks for the modifications. I will port the changes to my "scalable"
version.

How long does it take for the changes to arrive in
https://github.com/apache/clerezza?

Regards,

Minto

Op 18-3-2013 8:38, Rupert Westenthaler schreef:
> Hi Minto
>
> Created CLEREZZA-745 [1] describing this issue and provided a fix with
> revision 1457660 [2]. You will need to port the fix to your "scalable"
> version (CLEREZZA-736). While I have adapted the
> MultiThreadedSingleTdbDatasetTest to explicitly validate your usage
> scenario, it would still be nice if you could validate the fix against
> your usage scenario.
>
> best
> Rupert
>
>
> [1] https://issues.apache.org/jira/browse/CLEREZZA-745
> [2] http://svn.apache.org/r1457660
>
> On Thu, Mar 14, 2013 at 2:43 PM, Andy Seaborne  wrote:
>> On 14/03/13 09:39, Minto van der Sluis wrote:
>>> Rupert,
>>>
>>> Thanks for the additional explanation.
>>>
>>> Regards,
>>>
>>> Minto
>>>
>>> Op 14-3-2013 10:31, Rupert Westenthaler schreef:
 Hi Minto

 I am traveling this week and do not have time to work on this until
 the weekend but I will have a look into this.

 Let me try to explain my concern again and make it more clear:

 The Jena TDB named graphs are held in a single quad store table (SPOC
 - Subject Predicate Object Context). On the Clerezza side you have
 TripleCollections (SPO), each with a name (C). This means that all
 Clerezza TripleCollections provided by the same
 SingleTdbDatasetTcProvider share the same SPOC table, so
 a change to any of those TripleCollections will cause a modification
 in the Jena TDB backend. This means that Iterators of all
 TripleCollections need to take a ReadLock on the SPOC table (and not
 only on the SPO section represented by the TripleCollection).

 While Clerezza allows building a LockableMGraphWrapper over an MGraph,
 this is not sufficient for the SingleTdbDatasetTcProvider as this will
 only protect the SPO section and not the SPOC table used by the
 backend. So changes in other graphs - or the creation of a new graph -
 are still possible and will cause ConcurrentModificationExceptions as
 reported.

 To solve this issue one needs to ensure that a single ReadWrite lock
 is used for all TripleCollections provided by the
 SingleTdbDatasetTcProvider as this will allow users to lock the whole
 SPOC table of the backend when they perform operations on the Clerezza
 TripleCollections.
>>
>> A TDB dataset provides a single Lock you can reuse/wrap so all the graph
>> locks are related when needed.  The GraphTDB.getLock() is the dataset lock.
>>
>> Transactions would be better.  Better concurrency (concurrent writer and
>> multiple readers).
>>
>> Andy
>>
>>
 best
 Rupert


 On Thu, Mar 14, 2013 at 9:50 AM, Minto van der Sluis 
 wrote:
> Hi,
>
> Half of what the 2 of you write is not very clear to me. Probably due to
> being a novice when it comes to Clerezza internals.
>
> Maybe I will start with giving CLEREZZA-726 another try and then check
> if I still get these exceptions.
>
> Regard,
>
> Minto
>
> Op 13-3-2013 18:35, Reto Bachmann-Gmür schreef:
>> On Wed, Mar 13, 2013 at 6:04 PM, Rupert Westenthaler <
>> [email protected]> wrote:
>>
>>> Hi,
>>>
>>> I think that this is caused by the fact that if you create a
>>> LockableMGraph over MGraphs provided by the SingleTdbDatasetTcProvider
>>> you end up in a situation where you have multiple ReadWrite Locks on
>>> the same quad store (the Jena TDB dataset). This means that acquiring
>>> a write lock on one MGraph will not prohibit changes in other graphs -
>>> or the creation of new graphs. Because of that you will end up with
>>> ConcurrentModificationException when using iterators over triples
>>> (such as going over SPARQL results).
>>>
>> True. But where is the graph locked in the first place? It should
>> acquire a lock before iterating through the graph; does this happen?
>>
>> cheers,
>> reto
>>
>>> The solution would be to
>>>
>>> * create a single ReadWrite lock for the SingleTdbDatasetTcProvider
>>> * replace all synchronized(dataset){..} blocks with read/write locks
>>> * all methods returning MGraphs need to return LockableMGraph
>>> instances that use the ReadWrite lock used by the
>>> SingleTdbDatasetTcProvider
>>> * users would then need to use the LockableMGraph instance provided by
>>> the provider and NOT wrap it in another LockableMGraph instance
>>> (e.g. the LockableMGraphWrapper).
>>>
>>> best
>>> Rupert
>>>
>>>
>>> On Wed, Mar 13, 2013 at 5:31 PM, Minto van der Sluis 
>>> wrote:
 Hi Folks,

 I ran into an issue is both the existing SingleTdbDatasetTcPro

Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-03-18 Thread Rupert Westenthaler
Hi Minto

Created CLEREZZA-745 [1] describing this issue and provided a fix with
revision 1457660 [2]. You will need to port the fix to your "scalable"
version (CLEREZZA-736). While I have adapted the
MultiThreadedSingleTdbDatasetTest to explicitly validate your usage
scenario, it would still be nice if you could validate the fix against
your usage scenario.

best
Rupert


[1] https://issues.apache.org/jira/browse/CLEREZZA-745
[2] http://svn.apache.org/r1457660

On Thu, Mar 14, 2013 at 2:43 PM, Andy Seaborne  wrote:
> On 14/03/13 09:39, Minto van der Sluis wrote:
>>
>> Rupert,
>>
>> Thanks for the additional explanation.
>>
>> Regards,
>>
>> Minto
>>
>> Op 14-3-2013 10:31, Rupert Westenthaler schreef:
>>>
>>> Hi Minto
>>>
>>> I am traveling this week and do not have time to work on this until
>>> the weekend but I will have a look into this.
>>>
>>> Let me try to explain my concern again and make it more clear:
>>>
>>> The Jena TDB named graphs are held in a single quad store table (SPOC
>>> - Subject Predicate Object Context). On the Clerezza side you have
>>> TripleCollections (SPO), each with a name (C). This means that all
>>> Clerezza TripleCollections provided by the same
>>> SingleTdbDatasetTcProvider share the same SPOC table, so
>>> a change to any of those TripleCollections will cause a modification
>>> in the Jena TDB backend. This means that Iterators of all
>>> TripleCollections need to take a ReadLock on the SPOC table (and not
>>> only on the SPO section represented by the TripleCollection).
>>>
>>> While Clerezza allows building a LockableMGraphWrapper over an MGraph,
>>> this is not sufficient for the SingleTdbDatasetTcProvider as this will
>>> only protect the SPO section and not the SPOC table used by the
>>> backend. So changes in other graphs - or the creation of a new graph -
>>> are still possible and will cause ConcurrentModificationExceptions as
>>> reported.
>>>
>>> To solve this issue one needs to ensure that a single ReadWrite lock
>>> is used for all TripleCollections provided by the
>>> SingleTdbDatasetTcProvider as this will allow users to lock the whole
>>> SPOC table of the backend when they perform operations on the Clerezza
>>> TripleCollections.
>
>
> A TDB dataset provides a single Lock you can reuse/wrap so all the graph
> locks are related when needed.  The GraphTDB.getLock() is the dataset lock.
>
> Transactions would be better.  Better concurrency (concurrent writer and
> multiple readers).
>
> Andy
>
>
>>>
>>> best
>>> Rupert
>>>
>>>
>>> On Thu, Mar 14, 2013 at 9:50 AM, Minto van der Sluis 
>>> wrote:

 Hi,

 Half of what the 2 of you write is not very clear to me. Probably due to
 being a novice when it comes to Clerezza internals.

 Maybe I will start with giving CLEREZZA-726 another try and then check
 if I still get these exceptions.

 Regard,

 Minto

 Op 13-3-2013 18:35, Reto Bachmann-Gmür schreef:
>
> On Wed, Mar 13, 2013 at 6:04 PM, Rupert Westenthaler <
> [email protected]> wrote:
>
>> Hi,
>>
>> I think that this is caused by the fact that if you create a
>> LockableMGraph over MGraphs provided by the SingleTdbDatasetTcProvider
>> you end up in a situation where you have multiple ReadWrite Locks on
>> the same quad store (the Jena TDB dataset). This means that acquiring
>> a write lock on one MGraph will not prohibit changes in other graphs -
>> or the creation of new graphs. Because of that you will end up with
>> ConcurrentModificationException when using iterators over triples
>> (such as going over SPARQL results).
>>
> True. But where is the graph locked in the first place? It should
> acquire a lock before iterating through the graph; does this happen?
>
> cheers,
> reto
>
>> The solution would be to
>>
>> * create a single ReadWrite lock for the SingleTdbDatasetTcProvider
>> * replace all synchronized(dataset){..} blocks with read/write locks
>> * all methods returning MGraphs need to return LockableMGraph
>> instances that use the ReadWrite lock used by the
>> SingleTdbDatasetTcProvider
>> * users would then need to use the LockableMGraph instance provided by
>> the provider and NOT wrap it in another LockableMGraph instance
>> (e.g. the LockableMGraphWrapper).
>>
>> best
>> Rupert
>>
>>
>> On Wed, Mar 13, 2013 at 5:31 PM, Minto van der Sluis 
>> wrote:
>>>
>>> Hi Folks,
>>>
>>> I ran into an issue in both the existing SingleTdbDatasetTcProvider
>>> and my customized version (see CLEREZZA-736).
>>>
>>> How to reproduce:
>>> 1) Have some process constantly inject new named graphs (I had a
>>> process
>>> injecting 1000 named graphs)
>>> 2) perform a query while 1 is still running. I used the following
>>> query:
>>>
>>>  SELECT ?graph

Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-03-14 Thread Andy Seaborne

On 14/03/13 09:39, Minto van der Sluis wrote:

Rupert,

Thanks for the additional explanation.

Regards,

Minto

Op 14-3-2013 10:31, Rupert Westenthaler schreef:

Hi Minto

I am traveling this week and do not have time to work on this until
the weekend but I will have a look into this.

Let me try to explain my concern again and make it more clear:

The Jena TDB named graphs are held in a single quad store table (SPOC
- Subject Predicate Object Context). On the Clerezza side you have
TripleCollections (SPO), each with a name (C). This means that all
Clerezza TripleCollections provided by the same
SingleTdbDatasetTcProvider share the same SPOC table, so
a change to any of those TripleCollections will cause a modification
in the Jena TDB backend. This means that Iterators of all
TripleCollections need to take a ReadLock on the SPOC table (and not
only on the SPO section represented by the TripleCollection).

While Clerezza allows building a LockableMGraphWrapper over an MGraph,
this is not sufficient for the SingleTdbDatasetTcProvider as this will
only protect the SPO section and not the SPOC table used by the
backend. So changes in other graphs - or the creation of a new graph -
are still possible and will cause ConcurrentModificationExceptions as
reported.

To solve this issue one needs to ensure that a single ReadWrite lock
is used for all TripleCollections provided by the
SingleTdbDatasetTcProvider as this will allow users to lock the whole
SPOC table of the backend when they perform operations on the Clerezza
TripleCollections.


A TDB dataset provides a single Lock you can reuse/wrap so all the graph 
locks are related when needed.  The GraphTDB.getLock() is the dataset lock.


Transactions would be better.  Better concurrency (concurrent writer and 
multiple readers).
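
A rough sketch of the "one lock for the whole dataset" idea, assuming a provider that hands the same ReadWriteLock to every graph view it creates. SharedLockProvider and GraphView are hypothetical names; in TDB the corresponding lock would come from GraphTDB.getLock() as mentioned above, which this sketch does not call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical provider: every graph view it returns shares one dataset-wide lock.
class SharedLockProvider {
    private final ReadWriteLock datasetLock = new ReentrantReadWriteLock();
    private final Map<String, GraphView> graphs = new HashMap<>();

    // Creating a graph is guarded too, since it also changes the shared store.
    synchronized GraphView getGraph(String name) {
        return graphs.computeIfAbsent(name, n -> new GraphView(n, datasetLock));
    }
}

// Hypothetical named view onto the shared store (the "C" in SPOC).
class GraphView {
    final String name;
    private final ReadWriteLock lock;

    GraphView(String name, ReadWriteLock lock) {
        this.name = name;
        this.lock = lock;
    }

    // Always the provider's lock, never a per-graph one.
    ReadWriteLock getLock() {
        return lock;
    }
}
```

Because both views return the same lock object, a write lock taken through one graph also keeps readers of every other graph out, which per-graph wrappers cannot do.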


Andy



best
Rupert


On Thu, Mar 14, 2013 at 9:50 AM, Minto van der Sluis  wrote:

Hi,

Half of what the 2 of you write is not very clear to me. Probably due to
being a novice when it comes to Clerezza internals.

Maybe I will start with giving CLEREZZA-726 another try and then check
if I still get these exceptions.

Regard,

Minto

Op 13-3-2013 18:35, Reto Bachmann-Gmür schreef:

On Wed, Mar 13, 2013 at 6:04 PM, Rupert Westenthaler <
[email protected]> wrote:


Hi,

I think that this is caused by the fact that if you create a
LockableMGraph over MGraphs provided by the SingleTdbDatasetTcProvider
you end up in a situation where you have multiple ReadWrite Locks on
the same quad store (the Jena TDB dataset). This means that acquiring
a write lock on one MGraph will not prohibit changes in other graphs -
or the creation of new graphs. Because of that you will end up with
ConcurrentModificationException when using iterators over triples
(such as going over SPARQL results).


True. But where is the graph locked in the first place? It should acquire a
lock before iterating through the graph; does this happen?

cheers,
reto


The solution would be to

* create a single ReadWrite lock for the SingleTdbDatasetTcProvider
* replace all synchronized(dataset){..} blocks with read/write locks
* all methods returning MGraphs need to return LockableMGraph
instances that use the ReadWrite lock used by the
SingleTdbDatasetTcProvider
* users would then need to use the LockableMGraph instance provided by
the provider and NOT wrap it in another LockableMGraph instance
(e.g. the LockableMGraphWrapper).

best
Rupert


On Wed, Mar 13, 2013 at 5:31 PM, Minto van der Sluis  wrote:

Hi Folks,

I ran into an issue in both the existing SingleTdbDatasetTcProvider and
my customized version (see CLEREZZA-736).

How to reproduce:
1) Have some process constantly inject new named graphs (I had a process
injecting 1000 named graphs)
2) perform a query while 1 is still running. I used the following query:

 SELECT ?graphName WHERE {   GRAPH ?graphName {} } LIMIT 10 OFFSET 0

3) repeat step 2 a number of times (since the error does not always
occur)

This results in a ConcurrentModificationException (see stacktrace
below). I am not sure whether this is a Clerezza or Jena issue.

Does anyone have an idea what is causing this? Or, more importantly, how to fix it?

Should I create a Jira issue for this?

Regards,

--
ir. ing. Minto van der Sluis
Software innovator / renovator
Xup BV


Stacktrace:
java.util.ConcurrentModificationException: Iterator: started at 7103, now 7105
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.access$000(DatasetControlMRSW.java:32)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.checkCourrentModification(DatasetControlMRSW.java:110)
    at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.hasNext(DatasetControlMRSW.java:118)
    at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
    at com.hp.hpl.jena.tdb.store.GraphTDBBa

Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-03-14 Thread Minto van der Sluis
Rupert,

Thanks for the additional explanation.

Regards,

Minto

Op 14-3-2013 10:31, Rupert Westenthaler schreef:
> Hi Minto
>
> I am traveling this week and do not have time to work on this until
> the weekend but I will have a look into this.
>
> Let me try to explain my concern again and make it more clear:
>
> The Jena TDB named graphs are held in a single quad store table (SPOC
> - Subject Predicate Object Context). On the Clerezza side you have
> TripleCollections (SPO), each with a name (C). This means that all
> Clerezza TripleCollections provided by the same
> SingleTdbDatasetTcProvider share the same SPOC table, so
> a change to any of those TripleCollections will cause a modification
> in the Jena TDB backend. This means that Iterators of all
> TripleCollections need to take a ReadLock on the SPOC table (and not
> only on the SPO section represented by the TripleCollection).
>
> While Clerezza allows building a LockableMGraphWrapper over an MGraph,
> this is not sufficient for the SingleTdbDatasetTcProvider as this will
> only protect the SPO section and not the SPOC table used by the
> backend. So changes in other graphs - or the creation of a new graph -
> are still possible and will cause ConcurrentModificationExceptions as
> reported.
>
> To solve this issue one needs to ensure that a single ReadWrite lock
> is used for all TripleCollections provided by the
> SingleTdbDatasetTcProvider as this will allow users to lock the whole
> SPOC table of the backend when they perform operations on the Clerezza
> TripleCollections.
>
> best
> Rupert
>
>
> On Thu, Mar 14, 2013 at 9:50 AM, Minto van der Sluis  wrote:
>> Hi,
>>
>> Half of what the 2 of you write is not very clear to me. Probably due to
>> being a novice when it comes to Clerezza internals.
>>
>> Maybe I will start with giving CLEREZZA-726 another try and then check
>> if I still get these exceptions.
>>
>> Regard,
>>
>> Minto
>>
>> On 13-3-2013 18:35, Reto Bachmann-Gmür wrote:
>>> On Wed, Mar 13, 2013 at 6:04 PM, Rupert Westenthaler <
>>> [email protected]> wrote:
>>>
>>>> Hi,
>>>>
>>>> I think that this is cased by the fact that if you create a
>>>> LockableMGraph over MGraphs provided by the SingleTdbDatasetTcProvider
>>>> you end up in a situation where you have multiple ReadWrite Locks on
>>>> the same quad store (the Jena TDB dataset). This means that acquiring
>>>> a write lock on one MGraph will not prohibit changes in other graphs -
>>>> or the creation of new graphs. Because of that you will end up with
>>>> ConcurrentModificationException when using iterators over triples
>>>> (such as going over SPARQL results).
>>>>
>>> True. But where is the graph locked in the first place? It should aquire a
>>> lock  before iterating though the graph, does this happen?
>>>
>>> cheers,
>>> reto
>>>
>>>> The solution would be to
>>>>
>>>> * create a single ReadWirte lock for the SingleTdbDatasetTcProvider
>>>> * replace all synchronized(dataset){..} block with read/wirte locks
>>>> * all methods returning MGraphs need to return LockableMGraph
>>>> instances that do use the ReadWrite lock used by the
>>>> SingleTdbDatasetTcProvider
>>>> * users would than need to use the LockableMGraph instance provided by
>>>> the provider and NOT wrap those with an other LockableMGraph instance
>>>> (e.g. the LockableMGraphWrapper).
>>>>
>>>> best
>>>> Rupert
>>>>
>>>>
>>>> On Wed, Mar 13, 2013 at 5:31 PM, Minto van der Sluis  wrote:
>>>>> Hi Folks,
>>>>>
>>>>> I ran into an issue is both the existing SingleTdbDatasetTcProvider and
>>>>> my customized version (see CLEREZZA-736).
>>>>>
>>>>> How to reproduce:
>>>>> 1) Have some process constantly inject new named graphs (I had a process
>>>>> injecting 1000 named graphs)
>>>>> 2) perform a query while 1 is still running. I used the following query:
>>>>>
>>>>> SELECT ?graphName WHERE {   GRAPH ?graphName {} } LIMIT 10 OFFSET 0
>>>>>
>>>>> 3) repeat step 2 a number of times (since the error does not always occur)
>>>>> This results in a ConcurrentModificationException (see stacktrace
>>>>> below). I am not sure whether this is a Clerezza or Jena issue.
>>>>>
>>>>> Anyone an idea what is causing this? Or more importantly how to fix it?
>>>>>
>>>>> Should I create a Jira issue for this?
>>>>>
>>>>> Regards,
>>>>>
>>>>> --
>>>>> ir. ing. Minto van der Sluis
>>>>> Software innovator / renovator
>>>>> Xup BV
>>>>>
>>>>>
>>>>> Stacktrace:
>>>>> java.util.ConcurrentModificationException: Iterator: started at 7103, now 7105
>>>>> at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
>>>>> at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.access$000(DatasetControlMRSW.java:32)
>>>>> at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.checkCourrentModification(DatasetControlMRSW.java:110)
>>>>> at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.hasNext(Da

Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-03-14 Thread Rupert Westenthaler
Hi Minto

I am traveling this week and do not have time to work on this until
the weekend but I will have a look into this.

Let me try to explain my concern again and make it clearer:

The Jena TDB named graphs are held in a single quad store table (SPOC
- Subject Predicate Object Context). On the Clerezza side you have a
TripleCollection (SPO) with a name (C). What that means is that all
Clerezza TripleCollections provided by the same
SingleTdbDatasetTcProvider share the same SPOC table, meaning that
a change to any of those TripleCollections will cause a modification
in the Jena TDB backend. This means that Iterators of all
TripleCollections need to acquire a read lock on the SPOC table (and not
only on the SPO section represented by the TripleCollection).

While Clerezza allows building a LockableMGraphWrapper over an MGraph,
this is not sufficient for the SingleTdbDatasetTcProvider as this will
only protect the SPO section and not the SPOC table used by the
backend. So changes in other graphs - or the creation of a new graph -
are still possible and will cause ConcurrentModificationExceptions as
reported.

To solve this issue one needs to ensure that a single ReadWrite lock
is used for all TripleCollections provided by the
SingleTdbDatasetTcProvider, as this will allow users to lock the whole
SPOC table of the backend when they perform operations on the Clerezza
TripleCollections.
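
To illustrate why a lock on one graph is not enough, here is a toy model
of a quad store with a store-wide modification counter. This is only a
sketch under assumptions: ToyQuadStore and its methods are hypothetical
and are NOT Jena's implementation. The "started at ..., now ..." message
merely suggests that TDB does a check of this kind. No threads are
needed to see the effect: a write to graphB invalidates an iterator
over graphA.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

// Toy model only: a hypothetical stand-in, not Jena TDB code.
class ToyQuadStore {
    // Each quad is {subject, predicate, object, graphName}; all graphs share this table.
    private final List<String[]> quads = new ArrayList<>();
    // Bumped on every write, no matter which named graph is written to.
    private long epoch = 0;

    void add(String s, String p, String o, String graph) {
        quads.add(new String[] {s, p, o, graph});
        epoch++;
    }

    // Iterates over the triples of one graph and fails fast if the store changed.
    Iterator<String[]> find(String graph) {
        final long startEpoch = epoch;
        final List<String[]> matches = new ArrayList<>();
        for (String[] q : quads) {
            if (q[3].equals(graph)) {
                matches.add(q);
            }
        }
        final Iterator<String[]> it = matches.iterator();
        return new Iterator<String[]>() {
            @Override
            public boolean hasNext() {
                if (epoch != startEpoch) {
                    throw new ConcurrentModificationException(
                            "Iterator: started at " + startEpoch + ", now " + epoch);
                }
                return it.hasNext();
            }

            @Override
            public String[] next() {
                return it.next();
            }
        };
    }

    public static void main(String[] args) {
        ToyQuadStore store = new ToyQuadStore();
        store.add("s1", "p", "o", "graphA");
        store.add("s2", "p", "o", "graphA");

        Iterator<String[]> it = store.find("graphA");
        it.next();
        // A write to a DIFFERENT graph still changes the shared table.
        store.add("s3", "p", "o", "graphB");
        try {
            it.hasNext();
        } catch (ConcurrentModificationException e) {
            System.out.println(e.getMessage()); // prints "Iterator: started at 2, now 3"
        }
    }
}
```

A per-graph lock on graphA would not have prevented the write to graphB,
which is why the lock has to cover the whole table.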

best
Rupert


On Thu, Mar 14, 2013 at 9:50 AM, Minto van der Sluis  wrote:
> Hi,
>
> Half of what the 2 of you write is not very clear to me. Probably due to
> being a novice when it comes to Clerezza internals.
>
> Maybe I will start with giving CLEREZZA-726 another try and then check
> if I still get these exceptions.
>
> Regard,
>
> Minto
>
> On 13-3-2013 18:35, Reto Bachmann-Gmür wrote:
>> On Wed, Mar 13, 2013 at 6:04 PM, Rupert Westenthaler <
>> [email protected]> wrote:
>>
>>> Hi,
>>>
>>> I think that this is cased by the fact that if you create a
>>> LockableMGraph over MGraphs provided by the SingleTdbDatasetTcProvider
>>> you end up in a situation where you have multiple ReadWrite Locks on
>>> the same quad store (the Jena TDB dataset). This means that acquiring
>>> a write lock on one MGraph will not prohibit changes in other graphs -
>>> or the creation of new graphs. Because of that you will end up with
>>> ConcurrentModificationException when using iterators over triples
>>> (such as going over SPARQL results).
>>>
>> True. But where is the graph locked in the first place? It should aquire a
>> lock  before iterating though the graph, does this happen?
>>
>> cheers,
>> reto
>>
>>> The solution would be to
>>>
>>> * create a single ReadWirte lock for the SingleTdbDatasetTcProvider
>>> * replace all synchronized(dataset){..} block with read/wirte locks
>>> * all methods returning MGraphs need to return LockableMGraph
>>> instances that do use the ReadWrite lock used by the
>>> SingleTdbDatasetTcProvider
>>> * users would than need to use the LockableMGraph instance provided by
>>> the provider and NOT wrap those with an other LockableMGraph instance
>>> (e.g. the LockableMGraphWrapper).
>>>
>>> best
>>> Rupert
>>>
>>>
>>> On Wed, Mar 13, 2013 at 5:31 PM, Minto van der Sluis  wrote:
>>>> Hi Folks,
>>>>
>>>> I ran into an issue is both the existing SingleTdbDatasetTcProvider and
>>>> my customized version (see CLEREZZA-736).
>>>>
>>>> How to reproduce:
>>>> 1) Have some process constantly inject new named graphs (I had a process
>>>> injecting 1000 named graphs)
>>>> 2) perform a query while 1 is still running. I used the following query:
>>>>
>>>> SELECT ?graphName WHERE {   GRAPH ?graphName {} } LIMIT 10 OFFSET 0
>>>>
>>>> 3) repeat step 2 a number of times (since the error does not always occur)
>>>> This results in a ConcurrentModificationException (see stacktrace
>>>> below). I am not sure whether this is a Clerezza or Jena issue.
>>>>
>>>> Anyone an idea what is causing this? Or more importantly how to fix it?
>>>>
>>>> Should I create a Jira issue for this?
>>>>
>>>> Regards,
>>>>
>>>> --
>>>> ir. ing. Minto van der Sluis
>>>> Software innovator / renovator
>>>> Xup BV
>>>>
>>>>
>>>> Stacktrace:
>>>> java.util.ConcurrentModificationException: Iterator: started at 7103, now 7105
>>>> at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
>>>> at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.access$000(DatasetControlMRSW.java:32)
>>>> at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.checkCourrentModification(DatasetControlMRSW.java:110)
>>>> at com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.hasNext(DatasetControlMRSW.java:118)
>>>> at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>>>> at com.hp.hpl.jena.tdb.store.GraphTDBBase$ProjectQuadsToTriples.hasNext(GraphTDBBase.java:173)
>>>> at com.hp.hpl.jena.util.iterator.Wra

Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-03-14 Thread Minto van der Sluis
Hi,

Half of what the two of you write is not very clear to me, probably
because I am a novice when it comes to Clerezza internals.

Maybe I will start with giving CLEREZZA-726 another try and then check
if I still get these exceptions.

Regards,

Minto

On 13-3-2013 18:35, Reto Bachmann-Gmür wrote:
> On Wed, Mar 13, 2013 at 6:04 PM, Rupert Westenthaler <
> [email protected]> wrote:
>
>> Hi,
>>
>> I think that this is cased by the fact that if you create a
>> LockableMGraph over MGraphs provided by the SingleTdbDatasetTcProvider
>> you end up in a situation where you have multiple ReadWrite Locks on
>> the same quad store (the Jena TDB dataset). This means that acquiring
>> a write lock on one MGraph will not prohibit changes in other graphs -
>> or the creation of new graphs. Because of that you will end up with
>> ConcurrentModificationException when using iterators over triples
>> (such as going over SPARQL results).
>>
> True. But where is the graph locked in the first place? It should aquire a
> lock  before iterating though the graph, does this happen?
>
> cheers,
> reto
>
>> The solution would be to
>>
>> * create a single ReadWirte lock for the SingleTdbDatasetTcProvider
>> * replace all synchronized(dataset){..} block with read/wirte locks
>> * all methods returning MGraphs need to return LockableMGraph
>> instances that do use the ReadWrite lock used by the
>> SingleTdbDatasetTcProvider
>> * users would than need to use the LockableMGraph instance provided by
>> the provider and NOT wrap those with an other LockableMGraph instance
>> (e.g. the LockableMGraphWrapper).
>>
>> best
>> Rupert
>>
>>
>> On Wed, Mar 13, 2013 at 5:31 PM, Minto van der Sluis  wrote:
>>> Hi Folks,
>>>
>>> I ran into an issue is both the existing SingleTdbDatasetTcProvider and
>>> my customized version (see CLEREZZA-736).
>>>
>>> How to reproduce:
>>> 1) Have some process constantly inject new named graphs (I had a process
>>> injecting 1000 named graphs)
>>> 2) perform a query while 1 is still running. I used the following query:
>>>
>>> SELECT ?graphName WHERE {   GRAPH ?graphName {} } LIMIT 10 OFFSET 0
>>>
>>> 3) repeat step 2 a number of times (since the error does not always
>> occur)
>>> This results in a ConcurrentModificationException (see stacktrace
>>> below). I am not sure whether this is a Clerezza or Jena issue.
>>>
>>> Anyone an idea what is causing this? Or more importantly how to fix it?
>>>
>>> Should I create a Jira issue for this?
>>>
>>> Regards,
>>>
>>> --
>>> ir. ing. Minto van der Sluis
>>> Software innovator / renovator
>>> Xup BV
>>>
>>>
>>> Stacktrace:
>>> java.util.ConcurrentModificationException: Iterator: started at 7103,
>> now 7105
>>> at
>> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
>>> at
>> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.access$000(DatasetControlMRSW.java:32)
>>> at
>> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.checkCourrentModification(DatasetControlMRSW.java:110)
>>> at
>> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.hasNext(DatasetControlMRSW.java:118)
>>> at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
>>> at
>> com.hp.hpl.jena.tdb.store.GraphTDBBase$ProjectQuadsToTriples.hasNext(GraphTDBBase.java:173)
>>> at
>> com.hp.hpl.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:76)
>>> at
>> org.apache.clerezza.rdf.jena.storage.JenaGraphAdaptor$1.hasNext(JenaGraphAdaptor.java:106)
>>> at
>> org.apache.clerezza.rdf.core.impl.AbstractTripleCollection$1.hasNext(AbstractTripleCollection.java:78)
>>> at
>> org.apache.clerezza.rdf.core.access.LockingIterator.hasNext(LockingIterator.java:47)
>>> at
>> org.apache.clerezza.rdf.jena.facade.JenaGraph$1.hasNext(JenaGraph.java:95)
>>> at
>> com.hp.hpl.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:76)
>>> at
>> com.hp.hpl.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:151)
>>> at
>> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
>>> at
>> com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:79)
>>> at
>> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
>>> at
>> com.hp.hpl.jena.sparql.engine.iterator.QueryIterBlockTriples.hasNextBinding(QueryIterBlockTriples.java:64)
>>> at
>> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
>>> at
>> com.hp.hpl.jena.sparql.engine.main.iterator.QueryIterGraph$QueryIterGraphInner.hasNextBinding(QueryIterGraph.java:123)
>>> at
>> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
>>> at
>> com.hp.hpl.jena.sparql.engine.i

Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-03-13 Thread Reto Bachmann-Gmür
On Wed, Mar 13, 2013 at 6:04 PM, Rupert Westenthaler <
[email protected]> wrote:

> Hi,
>
> I think that this is cased by the fact that if you create a
> LockableMGraph over MGraphs provided by the SingleTdbDatasetTcProvider
> you end up in a situation where you have multiple ReadWrite Locks on
> the same quad store (the Jena TDB dataset). This means that acquiring
> a write lock on one MGraph will not prohibit changes in other graphs -
> or the creation of new graphs. Because of that you will end up with
> ConcurrentModificationException when using iterators over triples
> (such as going over SPARQL results).
>

True. But where is the graph locked in the first place? It should acquire a
lock before iterating through the graph; does this happen?

cheers,
reto

>
> The solution would be to
>
> * create a single ReadWirte lock for the SingleTdbDatasetTcProvider
> * replace all synchronized(dataset){..} block with read/wirte locks
> * all methods returning MGraphs need to return LockableMGraph
> instances that do use the ReadWrite lock used by the
> SingleTdbDatasetTcProvider
> * users would than need to use the LockableMGraph instance provided by
> the provider and NOT wrap those with an other LockableMGraph instance
> (e.g. the LockableMGraphWrapper).
>
> best
> Rupert
>
>
> On Wed, Mar 13, 2013 at 5:31 PM, Minto van der Sluis  wrote:
> > Hi Folks,
> >
> > I ran into an issue is both the existing SingleTdbDatasetTcProvider and
> > my customized version (see CLEREZZA-736).
> >
> > How to reproduce:
> > 1) Have some process constantly inject new named graphs (I had a process
> > injecting 1000 named graphs)
> > 2) perform a query while 1 is still running. I used the following query:
> >
> > SELECT ?graphName WHERE {   GRAPH ?graphName {} } LIMIT 10 OFFSET 0
> >
> > 3) repeat step 2 a number of times (since the error does not always
> occur)
> >
> > This results in a ConcurrentModificationException (see stacktrace
> > below). I am not sure whether this is a Clerezza or Jena issue.
> >
> > Anyone an idea what is causing this? Or more importantly how to fix it?
> >
> > Should I create a Jira issue for this?
> >
> > Regards,
> >
> > --
> > ir. ing. Minto van der Sluis
> > Software innovator / renovator
> > Xup BV
> >
> >
> > Stacktrace:
> > java.util.ConcurrentModificationException: Iterator: started at 7103,
> now 7105
> > at
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
> > at
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.access$000(DatasetControlMRSW.java:32)
> > at
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.checkCourrentModification(DatasetControlMRSW.java:110)
> > at
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.hasNext(DatasetControlMRSW.java:118)
> > at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
> > at
> com.hp.hpl.jena.tdb.store.GraphTDBBase$ProjectQuadsToTriples.hasNext(GraphTDBBase.java:173)
> > at
> com.hp.hpl.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:76)
> > at
> org.apache.clerezza.rdf.jena.storage.JenaGraphAdaptor$1.hasNext(JenaGraphAdaptor.java:106)
> > at
> org.apache.clerezza.rdf.core.impl.AbstractTripleCollection$1.hasNext(AbstractTripleCollection.java:78)
> > at
> org.apache.clerezza.rdf.core.access.LockingIterator.hasNext(LockingIterator.java:47)
> > at
> org.apache.clerezza.rdf.jena.facade.JenaGraph$1.hasNext(JenaGraph.java:95)
> > at
> com.hp.hpl.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:76)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:151)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:79)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterBlockTriples.hasNextBinding(QueryIterBlockTriples.java:64)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> > at
> com.hp.hpl.jena.sparql.engine.main.iterator.QueryIterGraph$QueryIterGraphInner.hasNextBinding(QueryIterGraph.java:123)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:79)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:59)
> > at
> com.hp.hpl.jena.sparql.engine.iterator.Que

Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-03-13 Thread Rupert Westenthaler
Hi,

I think that this is caused by the fact that if you create a
LockableMGraph over MGraphs provided by the SingleTdbDatasetTcProvider
you end up in a situation where you have multiple ReadWrite locks on
the same quad store (the Jena TDB dataset). This means that acquiring
a write lock on one MGraph will not prohibit changes in other graphs -
or the creation of new graphs. Because of that you will end up with a
ConcurrentModificationException when using iterators over triples
(such as going over SPARQL results).

The solution would be to

* create a single ReadWrite lock for the SingleTdbDatasetTcProvider
* replace all synchronized(dataset){..} blocks with read/write locks
* all methods returning MGraphs need to return LockableMGraph
instances that use the ReadWrite lock used by the
SingleTdbDatasetTcProvider
* users would then need to use the LockableMGraph instance provided by
the provider and NOT wrap it with another LockableMGraph instance
(e.g. the LockableMGraphWrapper).
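
The steps above could look roughly like the following. This is only a
sketch under assumptions: SharedLockProvider and GraphView are
hypothetical stand-ins for the provider and the LockableMGraph
instances it returns, not Clerezza classes, and a plain map stands in
for the TDB dataset. The point is that every graph view hands out the
same java.util.concurrent ReadWriteLock.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a provider whose graphs all live in one store.
class SharedLockProvider {
    // One lock guards the whole store, not just one graph.
    private final ReadWriteLock datasetLock = new ReentrantReadWriteLock();
    // Graph name -> triples; stands in for the shared quad table.
    private final Map<String, List<String>> quadStore = new HashMap<>();

    // Creating a graph modifies the store, so it takes the write lock.
    GraphView getGraph(String name) {
        datasetLock.writeLock().lock();
        try {
            quadStore.computeIfAbsent(name, k -> new ArrayList<>());
        } finally {
            datasetLock.writeLock().unlock();
        }
        return new GraphView(name);
    }

    // Every view shares the provider's lock instead of owning one.
    class GraphView {
        private final String name;

        GraphView(String name) {
            this.name = name;
        }

        ReadWriteLock getLock() {
            return datasetLock;
        }

        void add(String triple) {
            datasetLock.writeLock().lock();
            try {
                quadStore.get(name).add(triple);
            } finally {
                datasetLock.writeLock().unlock();
            }
        }

        // Copies under the read lock so callers never iterate the live list.
        List<String> snapshot() {
            datasetLock.readLock().lock();
            try {
                return new ArrayList<>(quadStore.get(name));
            } finally {
                datasetLock.readLock().unlock();
            }
        }
    }

    public static void main(String[] args) {
        SharedLockProvider provider = new SharedLockProvider();
        GraphView a = provider.getGraph("urn:x-example:a");
        GraphView b = provider.getGraph("urn:x-example:b");
        a.add("s p o");
        System.out.println(a.getLock() == b.getLock()); // prints "true"
        System.out.println(a.snapshot());               // prints "[s p o]"
    }
}
```

Wrapping such a view in a second lockable wrapper would reintroduce a
separate lock, which is what the last bullet warns against.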

best
Rupert


On Wed, Mar 13, 2013 at 5:31 PM, Minto van der Sluis  wrote:
> Hi Folks,
>
> I ran into an issue is both the existing SingleTdbDatasetTcProvider and
> my customized version (see CLEREZZA-736).
>
> How to reproduce:
> 1) Have some process constantly inject new named graphs (I had a process
> injecting 1000 named graphs)
> 2) perform a query while 1 is still running. I used the following query:
>
> SELECT ?graphName WHERE {   GRAPH ?graphName {} } LIMIT 10 OFFSET 0
>
> 3) repeat step 2 a number of times (since the error does not always occur)
>
> This results in a ConcurrentModificationException (see stacktrace
> below). I am not sure whether this is a Clerezza or Jena issue.
>
> Anyone an idea what is causing this? Or more importantly how to fix it?
>
> Should I create a Jira issue for this?
>
> Regards,
>
> --
> ir. ing. Minto van der Sluis
> Software innovator / renovator
> Xup BV
>
>
> Stacktrace:
> java.util.ConcurrentModificationException: Iterator: started at 7103, now 7105
> at 
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
> at 
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.access$000(DatasetControlMRSW.java:32)
> at 
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.checkCourrentModification(DatasetControlMRSW.java:110)
> at 
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.hasNext(DatasetControlMRSW.java:118)
> at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
> at 
> com.hp.hpl.jena.tdb.store.GraphTDBBase$ProjectQuadsToTriples.hasNext(GraphTDBBase.java:173)
> at 
> com.hp.hpl.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:76)
> at 
> org.apache.clerezza.rdf.jena.storage.JenaGraphAdaptor$1.hasNext(JenaGraphAdaptor.java:106)
> at 
> org.apache.clerezza.rdf.core.impl.AbstractTripleCollection$1.hasNext(AbstractTripleCollection.java:78)
> at 
> org.apache.clerezza.rdf.core.access.LockingIterator.hasNext(LockingIterator.java:47)
> at 
> org.apache.clerezza.rdf.jena.facade.JenaGraph$1.hasNext(JenaGraph.java:95)
> at 
> com.hp.hpl.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:76)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:151)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:79)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterBlockTriples.hasNextBinding(QueryIterBlockTriples.java:64)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at 
> com.hp.hpl.jena.sparql.engine.main.iterator.QueryIterGraph$QueryIterGraphInner.hasNextBinding(QueryIterGraph.java:123)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:79)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:59)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at 
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java

Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-03-13 Thread Reto Bachmann-Gmür
Hi Minto

Interesting problem

What I'm wondering is which graph ARQ (com.hp.hpl.jena.sparql.engine) is
iterating over here. It might be worth checking whether
com.hp.hpl.jena.sparql.engine.iterator.QueryIterTriplePattern does anything
that could be mapped to entering a read lock on the graph while iterating
over it.

In principle I think that for non-fastlane queries (i.e. the only thing there
is now) all the graphs against which a query is directed should be
locked. For queries with a wildcard for the graph name (like yours) the
TcManager should be locked as a whole.

Such a query would not be suitable for the fastlane anyway, unless all
graphs in TcManager come from the same TcProvider. But this is hardly ever the
case, as in a typical instance there are also virtual graphs.
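
Locking all graphs a query is directed against could be sketched as
follows. This is an illustration under assumptions, not TcManager code:
MultiGraphReadLock, lockFor and readAll are hypothetical names. The read
locks are taken in sorted name order so that code which locks several
graphs always does so in the same order.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Deque;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical helper: one lock per graph name, acquired together for a query.
class MultiGraphReadLock {
    private final Map<String, ReadWriteLock> locks = new TreeMap<>();

    // Returns the lock for a graph, creating it on first use.
    synchronized ReadWriteLock lockFor(String graphName) {
        return locks.computeIfAbsent(graphName, k -> new ReentrantReadWriteLock());
    }

    // Runs the query while holding the read lock of every named graph.
    <T> T readAll(Collection<String> graphNames, Supplier<T> query) {
        Deque<ReadWriteLock> held = new ArrayDeque<>();
        try {
            // TreeSet removes duplicates and fixes the acquisition order.
            for (String name : new TreeSet<>(graphNames)) {
                ReadWriteLock lock = lockFor(name);
                lock.readLock().lock();
                held.push(lock);
            }
            return query.get();
        } finally {
            // Release in reverse order of acquisition.
            while (!held.isEmpty()) {
                held.pop().readLock().unlock();
            }
        }
    }

    public static void main(String[] args) {
        MultiGraphReadLock graphs = new MultiGraphReadLock();
        // While the query runs, a writer cannot obtain the write lock of graph "a".
        boolean writable = graphs.readAll(Arrays.asList("b", "a"),
                () -> graphs.lockFor("a").writeLock().tryLock());
        System.out.println(writable); // prints "false"
    }
}
```

For a wildcard graph name the set of graphs is not known up front, so a
single lock on the whole manager, as suggested above, is the simpler
option.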

Cheers,
Reto


On Wed, Mar 13, 2013 at 5:31 PM, Minto van der Sluis  wrote:

> Hi Folks,
>
> I ran into an issue is both the existing SingleTdbDatasetTcProvider and
> my customized version (see CLEREZZA-736).
>
> How to reproduce:
> 1) Have some process constantly inject new named graphs (I had a process
> injecting 1000 named graphs)
> 2) perform a query while 1 is still running. I used the following query:
>
> SELECT ?graphName WHERE {   GRAPH ?graphName {} } LIMIT 10 OFFSET 0
>
> 3) repeat step 2 a number of times (since the error does not always occur)
>
> This results in a ConcurrentModificationException (see stacktrace
> below). I am not sure whether this is a Clerezza or Jena issue.
>
> Anyone an idea what is causing this? Or more importantly how to fix it?
>
> Should I create a Jira issue for this?
>
> Regards,
>
> --
> ir. ing. Minto van der Sluis
> Software innovator / renovator
> Xup BV
>
>
> Stacktrace:
> java.util.ConcurrentModificationException: Iterator: started at 7103, now
> 7105
> at
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.policyError(DatasetControlMRSW.java:157)
> at
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW.access$000(DatasetControlMRSW.java:32)
> at
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.checkCourrentModification(DatasetControlMRSW.java:110)
> at
> com.hp.hpl.jena.tdb.sys.DatasetControlMRSW$IteratorCheckNotConcurrent.hasNext(DatasetControlMRSW.java:118)
> at org.openjena.atlas.iterator.Iter$4.hasNext(Iter.java:295)
> at
> com.hp.hpl.jena.tdb.store.GraphTDBBase$ProjectQuadsToTriples.hasNext(GraphTDBBase.java:173)
> at
> com.hp.hpl.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:76)
> at
> org.apache.clerezza.rdf.jena.storage.JenaGraphAdaptor$1.hasNext(JenaGraphAdaptor.java:106)
> at
> org.apache.clerezza.rdf.core.impl.AbstractTripleCollection$1.hasNext(AbstractTripleCollection.java:78)
> at
> org.apache.clerezza.rdf.core.access.LockingIterator.hasNext(LockingIterator.java:47)
> at
> org.apache.clerezza.rdf.jena.facade.JenaGraph$1.hasNext(JenaGraph.java:95)
> at
> com.hp.hpl.jena.util.iterator.WrappedIterator.hasNext(WrappedIterator.java:76)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterTriplePattern$TripleMapper.hasNextBinding(QueryIterTriplePattern.java:151)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:79)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterBlockTriples.hasNextBinding(QueryIterBlockTriples.java:64)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at
> com.hp.hpl.jena.sparql.engine.main.iterator.QueryIterGraph$QueryIterGraphInner.hasNextBinding(QueryIterGraph.java:123)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:79)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterConvert.hasNextBinding(QueryIterConvert.java:59)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIterSlice.hasNextBinding(QueryIterSlice.java:76)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:112)
> at
> com.hp.hpl.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:40)

Re: ConcurrentModicationException on TDB storage provider (SingleDataset)

2013-03-13 Thread Minto van der Sluis
This is most probably related to CLEREZZA-726 (see [1]). Which by the
way was also created by me :-(

[1] https://issues.apache.org/jira/browse/CLEREZZA-726

On 13-3-2013 17:31, Minto van der Sluis wrote:
> Hi Folks,
>
> I ran into an issue is both the existing SingleTdbDatasetTcProvider and
> my customized version (see CLEREZZA-736).
>
> How to reproduce:
> 1) Have some process constantly inject new named graphs (I had a process
> injecting 1000 named graphs)
> 2) perform a query while 1 is still running. I used the following query:
>
> SELECT ?graphName WHERE {   GRAPH ?graphName {} } LIMIT 10 OFFSET 0
>
> 3) repeat step 2 a number of times (since the error does not always occur)
>
> This results in a ConcurrentModificationException (see stacktrace
> below). I am not sure whether this is a Clerezza or Jena issue.
>
> Anyone an idea what is causing this? Or more importantly how to fix it?
>
> Should I create a Jira issue for this?
>
> Regards,
>


-- 
ir. ing. Minto van der Sluis
Software innovator / renovator
Xup BV

Mobile: +31 (0) 626 014541