Aklakan opened a new issue, #2076:
URL: https://github.com/apache/jena/issues/2076

   ### Version
   
   4.9.0, 4.10.0
   
   ### What happened?
   
   GraphMem's `Iterator.remove()` appears to be broken: it is possible to 
create datasets where removal unexpectedly fails. In my case it occurs 
randomly when running a skolemization algorithm, which relies on `remove()`, 
on a large number of graphs.
   
   The test below fails with Jena 4.9.0 and 4.10.0 (and probably earlier 
versions as well) on the provided data.
   These versions also return a GraphMem instance from 
`GraphFactory.createDefaultGraph()`.
   
   The issue seems to be that in `HashCommon.removeFrom` a key may be moved 
multiple times, causing it to be added multiple times to the list 
`HashCommon.BasicKeyIterator.movedKeys`. Maybe the fix is to make `movedKeys` a 
Set (LinkedHashSet for determinism) instead? I don't understand the hash table 
code well enough to judge whether multiple moves are valid or not.
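   
   To make the List-versus-Set distinction concrete, here is a minimal 
standalone sketch. It is not Jena's code: the class `MovedKeysSketch`, its 
methods, and the move sequence are hypothetical and only show how a key that 
is recorded twice behaves in each collection type. Whether deduplication is 
actually the correct fix for `HashCommon` remains open.
   
   ```java
   import java.util.ArrayList;
   import java.util.LinkedHashSet;
   import java.util.List;
   import java.util.Set;
   
   public class MovedKeysSketch {
       /** Records moved keys in a list: a key moved twice is recorded twice. */
       static List<String> recordInList(String... movedKeys) {
           List<String> moved = new ArrayList<>();
           for (String key : movedKeys) {
               moved.add(key);
           }
           return moved;
       }
   
       /** Records moved keys in a LinkedHashSet: duplicates are dropped, first-insertion order is kept. */
       static Set<String> recordInSet(String... movedKeys) {
           Set<String> moved = new LinkedHashSet<>();
           for (String key : movedKeys) {
               moved.add(key);
           }
           return moved;
       }
   
       public static void main(String[] args) {
           // Hypothetical sequence in which "k1" is relocated twice during removals.
           String[] moves = {"k1", "k2", "k1"};
           System.out.println(recordInList(moves)); // prints [k1, k2, k1]
           System.out.println(recordInSet(moves));  // prints [k1, k2]
       }
   }
   ```
   
   An iterator that later walks the list variant visits `k1` twice, which 
would be consistent with a second `remove()` arriving for an element that is 
already gone.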
   
   With GraphMem of Jena 5.0.0-SNAPSHOT the test passes, but this appears to 
be due to hash codes having changed rather than the underlying issue having 
been solved: whereas the content of `HashCommon.keys` is the same in 4.9.0 and 
4.10.0, it differs in 5.0.0-SNAPSHOT.
   Also, GraphMem is no longer the instance returned by 
`GraphFactory.createDefaultGraph()`.
   Furthermore, the default graph's iterator in jena 5.0.0 no longer supports 
remove() - so I wonder whether use of remove() on a Graph's iterator is 
generally considered discouraged.
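   
   For comparison, the usual way to avoid `Iterator.remove()` altogether is to 
take a snapshot first and then delete through the container. The sketch below 
uses a plain `java.util.Set` as a stand-in for a graph; the class 
`SnapshotDeleteSketch` and the method `deleteAll` are hypothetical names. With 
Jena the analogous steps would be materializing `graph.find()` into a list and 
calling `graph.delete(t)` for each triple, which I have not verified against 
the failing data.
   
   ```java
   import java.util.ArrayList;
   import java.util.LinkedHashSet;
   import java.util.List;
   import java.util.Set;
   
   public class SnapshotDeleteSketch {
       /** Deletes every element through the container itself and returns how many were removed. */
       static <T> int deleteAll(Set<T> store) {
           // Copy first, so the loop never iterates over the collection it mutates.
           List<T> snapshot = new ArrayList<>(store);
           int removed = 0;
           for (T item : snapshot) {
               if (store.remove(item)) {
                   removed++;
               }
           }
           return removed;
       }
   
       public static void main(String[] args) {
           Set<String> store = new LinkedHashSet<>(List.of("t1", "t2", "t3"));
           System.out.println(deleteAll(store)); // prints 3
           System.out.println(store.isEmpty());  // prints true
       }
   }
   ```
   
   The cost is one extra copy of the matching elements, which may matter for 
very large graphs.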
   
   I also see that GraphMem is about to be deprecated, but I nonetheless think 
it is worth reporting this issue, because it affects legacy versions and is 
quite hard to reproduce.
    
   
   ```java
   import java.util.stream.Stream;
   
   import org.apache.jena.graph.Graph;
   import org.apache.jena.graph.Triple;
   import org.apache.jena.mem.GraphMem;
   import org.apache.jena.riot.system.AsyncParser;
   import org.apache.jena.util.iterator.ExtendedIterator;
   import org.junit.Test;
   
   public class TestGraphMemDelete {
       @Test
       public void test() {
           Graph graph = new GraphMem();
           try (Stream<Triple> stream = AsyncParser.of("graph-mem-broken-iterator-delete-01.ttl").streamTriples()) {
               // Unsuccessful attempt to mimic the Jena 4 triples (and their hash codes) on the Jena 5 branch:
               // .map(t -> new LegacyTriple(t.getSubject(), t.getPredicate(), t.getObject()))
               stream.forEach(graph::add);
           }
   
           // RDFDataMgr.write(System.out, graph, RDFFormat.NTRIPLES);
           ExtendedIterator<Triple> it = graph.find();
           try {
               while (it.hasNext()) {
                   Triple t = it.next();
                   System.err.println("Removing: " + t);
                   it.remove();
               }
           } finally {
               it.close();
           }
       }
   }
   ```
   
   
   ```ttl
   # graph-mem-broken-iterator-delete-01.ttl
   # This dataset is an ordered dump of HashCommon.keys with jena-4.9.0 and jena-4.10.0.
   # It creates a situation where clearing all data using iterator.remove() on
   # the iterator returned by graph.find() fails.
   
   <_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> 
<http://lsq.aksw.org/vocab#statusCode> 
"200"^^<http://www.w3.org/2001/XMLSchema#int> .
   <_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> <http://lsq.aksw.org/vocab#user> 
"-" .
   <_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> <http://www.w3.org/ns/prov#atTime> 
"2023-06-02T08:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
   <_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> <http://lsq.aksw.org/vocab#logRecord> "8955735dbec9f8a013df85d57bac725b - - [02/Jun/2023 10:00:00 +0200] \"GET /sparql?&timeout=60000&query=CONSTRUCT+%7B+%3Fsubject+%3Fproperty+%3Fvalue%7D+WHERE+%7B++%3Fsubject+%3Fproperty+%3Fvalue.+%3Fsubject+a+%3Ftype+.+%3Fsubject+rdfs%3Alabel+%3Flabel++FILTER+%28%28+lang%28%3Flabel%29+%3D+%22%22+%7C%7C+langMatches%28lang%28%3Flabel%29%2C+%22en%22%29%29+%26%26+%3Fsubject+%3D+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FLobbyist%3E%29+%7D+ORDER+BY+ASC%28%3Fsubject+%3Fproperty+%3Fvalue%29 HTTP/1.1\" 200 394 \"-\" \"-\" \"-\"" .
   # null, null, null, null, null, null, null, null, null, null, null, null,
   # null, null, null, null, null, null, null,
   <_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> <http://lsq.aksw.org/vocab#query> "CONSTRUCT { ?subject ?property ?value} WHERE {  ?subject ?property ?value. ?subject a ?type . ?subject rdfs:label ?label  FILTER (( lang(?label) = \"\" || langMatches(lang(?label), \"en\")) && ?subject = <http://dbpedia.org/resource/Lobbyist>) } ORDER BY ASC(?subject ?property ?value)" .
   #, null, null, null, null, null, null, null, null, null, null, null, null,
   <_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> 
<http://lsq.aksw.org/vocab#sequenceId> 
"110723"^^<http://www.w3.org/2001/XMLSchema#long> .
   #, null, null, null, null, null, null, null, null, null,
   <_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> <http://lsq.aksw.org/vocab#uri> 
"/sparql?&timeout=60000&query=CONSTRUCT+%7B+%3Fsubject+%3Fproperty+%3Fvalue%7D+WHERE+%7B++%3Fsubject+%3Fproperty+%3Fvalue.+%3Fsubject+a+%3Ftype+.+%3Fsubject+rdfs%3Alabel+%3Flabel++FILTER+%28%28+lang%28%3Flabel%29+%3D+%22%22+%7C%7C+langMatches%28lang%28%3Flabel%29%2C+%22en%22%29%29+%26%26+%3Fsubject+%3D+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FLobbyist%3E%29+%7D+ORDER+BY+ASC%28%3Fsubject+%3Fproperty+%3Fvalue%29"
 .
   <_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> <http://lsq.aksw.org/vocab#verb> 
"GET" .
   # , null, null, null, null, null,
   <_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> 
<http://lsq.aksw.org/vocab#protocol> "HTTP/1.1" .
   # null, null,
   <_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> 
<http://lsq.aksw.org/vocab#numResponseBytes> 
"394"^^<http://www.w3.org/2001/XMLSchema#int> .
   # , null, null, null, null, null, null, null, null, null, null, null, null,
   # null, null, null, null, null, null, null, null,
   <_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> 
<http://lsq.aksw.org/vocab#hostHash> 
"RNxI06DRVty3nU8fTEvds_BV31N6xtLq0JclkZQMNLM" .
   <_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> 
<http://lsq.aksw.org/vocab#endpoint> <http://dbpedia.org/sparql> .
   ```
   
   ### Relevant output and stacktrace
   
   ```shell
   java.lang.IllegalStateException: no calls to next() since last call to remove()
        at org.apache.jena.util.iterator.NiceIterator$1.remove(NiceIterator.java:156)
        at org.apache.jena.mem.NodeToTriplesMapBase$1$NotifyMe.emptied(NodeToTriplesMapBase.java:129)
        at org.apache.jena.mem.HashCommon$MovedKeysIterator.remove(HashCommon.java:341)
        at org.apache.jena.util.iterator.NiceIterator$1.remove(NiceIterator.java:157)
        at org.apache.jena.mem.NodeToTriplesMapBase$1.remove(NodeToTriplesMapBase.java:153)
        at org.apache.jena.util.iterator.WrappedIterator.remove(WrappedIterator.java:118)
        at org.apache.jena.mem.StoreTripleIterator.remove(StoreTripleIterator.java:56)
        at org.apache.jena.sparql.graph.TestGraphMemDelete.test(TestGraphMemDelete.java:57)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
        at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
        at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
        at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
        at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
        at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
        at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
        at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
        at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
        at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
        at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
        at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:93)
        at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:40)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:529)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:756)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:452)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:210)
   ```
   
   
   ### Are you interested in making a pull request?
   
   None



