Aklakan opened a new issue, #2076:
URL: https://github.com/apache/jena/issues/2076
### Version
4.9.0, 4.10.0
### What happened?
It seems that GraphMem's Iterator.remove() is broken, because it's possible to
create datasets where removal unexpectedly fails. In my case it occurs
randomly when running a skolemization algorithm, which relies on remove(), on
a large number of graphs.
The test below fails with jena 4.9.0 and 4.10.0 (and probably earlier
versions as well) on the provided data.
These versions also return a GraphMem instance when using
`GraphFactory.createDefaultGraph()`.
The issue seems to be that in `HashCommon.removeFrom` a key may be moved
multiple times, causing it to be added multiple times to the
`HashCommon.BasicKeyIterator.movedKeys` list. Maybe the fix is to make
movedKeys a Set (LinkedHashSet for determinism) instead? I don't understand
the hash table code well enough to judge whether multiple moves are valid or not.
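To illustrate the suspicion above, here is a minimal sketch in plain Java. It is not Jena's code: `MovedKeysSketch`, `failedRemovals`, and the key names are hypothetical, and a `HashSet` stands in for the hash table. It only shows that a key recorded twice in a list is processed twice, so the second removal finds nothing to remove, while a `LinkedHashSet` keeps insertion order and drops the duplicate.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the movedKeys bookkeeping; not Jena's implementation.
class MovedKeysSketch {

    /** Removes every recorded key from the table and counts removals that found nothing. */
    static int failedRemovals(Collection<String> movedKeys, Set<String> table) {
        int failures = 0;
        for (String key : movedKeys) {
            if (!table.remove(key)) {
                failures++; // the key was already removed by an earlier occurrence
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // "k1" was moved twice, so a list records it twice.
        List<String> asList = List.of("k1", "k2", "k1", "k3");
        // An insertion-ordered set records each moved key once.
        Set<String> asSet = new LinkedHashSet<>(asList);

        int listFailures = failedRemovals(asList, new HashSet<>(List.of("k1", "k2", "k3")));
        int setFailures = failedRemovals(asSet, new HashSet<>(List.of("k1", "k2", "k3")));

        System.out.println("list failures: " + listFailures); // prints "list failures: 1"
        System.out.println("set failures: " + setFailures);   // prints "set failures: 0"
    }
}
```

Whether deduplicating is actually the right fix depends on whether a key moved twice must be visited once or not at all, which this sketch does not answer.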
With GraphMem in jena 5.0.0-SNAPSHOT it works, but this appears to be due to
hash codes having changed rather than the underlying issue having been solved:
whereas HashCommon.keys is the same in 4.9.0 and 4.10.0, it differs in
5.0.0-SNAPSHOT.
Also, GraphMem is no longer the instance returned by
`GraphFactory.createDefaultGraph()`.
Furthermore, the default graph's iterator in jena 5.0.0 no longer supports
remove(), so I wonder whether using remove() on a Graph's iterator is
generally discouraged.
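If iterator-based removal is indeed discouraged, one common alternative is to snapshot the matches first and delete them afterwards. The sketch below shows the pattern on a plain Java collection; `SnapshotDeleteSketch` and `deleteMatching` are hypothetical names. With Jena the analogous steps would presumably be `graph.find().toList()` followed by `graph.delete(t)` for each triple, but that mapping is an assumption and is not tested here.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical helper showing "collect first, delete afterwards" on a generic collection.
class SnapshotDeleteSketch {

    /** Deletes all matching items without calling Iterator.remove(); returns the number deleted. */
    static <T> int deleteMatching(Collection<T> store, Predicate<? super T> match) {
        // Step 1: copy the matches so that iteration over the store has finished.
        List<T> snapshot = new ArrayList<>();
        for (T item : store) {
            if (match.test(item)) {
                snapshot.add(item);
            }
        }
        // Step 2: delete through the store's own remove method.
        for (T item : snapshot) {
            store.remove(item);
        }
        return snapshot.size();
    }

    public static void main(String[] args) {
        Set<String> store = new HashSet<>(List.of("a", "b", "c"));
        int removed = deleteMatching(store, s -> true);
        System.out.println(removed + " removed, empty=" + store.isEmpty()); // prints "3 removed, empty=true"
    }
}
```

The cost is the extra memory for the snapshot, which may matter when processing a large number of graphs.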
I also see that GraphMem is about to be deprecated, but nonetheless I think
it is worth reporting this issue, since it affects legacy versions and is
quite hard to reproduce.
```java
import java.util.stream.Stream;
import org.apache.jena.graph.Graph;
import org.apache.jena.graph.Triple;
import org.apache.jena.mem.GraphMem;
import org.apache.jena.riot.system.AsyncParser;
import org.apache.jena.util.iterator.ExtendedIterator;
import org.junit.Test;

public class TestGraphMemDelete {
    @Test
    public void test() {
        Graph graph = new GraphMem();

        // Load the test data in file order.
        try (Stream<Triple> stream =
                AsyncParser.of("graph-mem-broken-iterator-delete-01.ttl").streamTriples()) {
            // Unsuccessful attempt to mimic the jena 4 triples (and their
            // hashcodes) on the jena 5 branch:
            // .map(t -> new LegacyTriple(t.getSubject(), t.getPredicate(), t.getObject()))
            stream.forEach(graph::add);
        }

        // RDFDataMgr.write(System.out, graph, RDFFormat.NTRIPLES);

        // Clear the graph via Iterator.remove(); this fails part-way through.
        ExtendedIterator<Triple> it = graph.find();
        try {
            while (it.hasNext()) {
                Triple t = it.next();
                System.err.println("Removing: " + t);
                it.remove();
            }
        } finally {
            it.close();
        }
    }
}
```
```ttl
# graph-mem-broken-iterator-delete-01.ttl
# This dataset is an ordered dump of HashCommon.keys with jena-4.9.0 and jena-4.10.0
# It creates a situation where clearing all data using iterator.remove() on the iterator returned by graph.find() fails.
<_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7>
<http://lsq.aksw.org/vocab#statusCode>
"200"^^<http://www.w3.org/2001/XMLSchema#int> .
<_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> <http://lsq.aksw.org/vocab#user>
"-" .
<_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> <http://www.w3.org/ns/prov#atTime>
"2023-06-02T08:00:00Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
<_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7>
<http://lsq.aksw.org/vocab#logRecord> "8955735dbec9f8a013df85d57bac725b - - [02/Jun/2023 10:00:00 +0200] \"GET /sparql?&timeout=60000&query=CONSTRUCT+%7B+%3Fsubject+%3Fproperty+%3Fvalue%7D+WHERE+%7B++%3Fsubject+%3Fproperty+%3Fvalue.+%3Fsubject+a+%3Ftype+.+%3Fsubject+rdfs%3Alabel+%3Flabel++FILTER+%28%28+lang%28%3Flabel%29+%3D+%22%22+%7C%7C+langMatches%28lang%28%3Flabel%29%2C+%22en%22%29%29+%26%26+%3Fsubject+%3D+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FLobbyist%3E%29+%7D+ORDER+BY+ASC%28%3Fsubject+%3Fproperty+%3Fvalue%29 HTTP/1.1\" 200 394 \"-\" \"-\" \"-\"" .
# null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null,
<_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> <http://lsq.aksw.org/vocab#query>
"CONSTRUCT { ?subject ?property ?value} WHERE { ?subject ?property ?value. ?subject a ?type . ?subject rdfs:label ?label FILTER (( lang(?label) = \"\" || langMatches(lang(?label), \"en\")) && ?subject = <http://dbpedia.org/resource/Lobbyist>) } ORDER BY ASC(?subject ?property ?value)" .
#, null, null, null, null, null, null, null, null, null, null, null, null,
<_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7>
<http://lsq.aksw.org/vocab#sequenceId>
"110723"^^<http://www.w3.org/2001/XMLSchema#long> .
#, null, null, null, null, null, null, null, null, null,
<_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> <http://lsq.aksw.org/vocab#uri>
"/sparql?&timeout=60000&query=CONSTRUCT+%7B+%3Fsubject+%3Fproperty+%3Fvalue%7D+WHERE+%7B++%3Fsubject+%3Fproperty+%3Fvalue.+%3Fsubject+a+%3Ftype+.+%3Fsubject+rdfs%3Alabel+%3Flabel++FILTER+%28%28+lang%28%3Flabel%29+%3D+%22%22+%7C%7C+langMatches%28lang%28%3Flabel%29%2C+%22en%22%29%29+%26%26+%3Fsubject+%3D+%3Chttp%3A%2F%2Fdbpedia.org%2Fresource%2FLobbyist%3E%29+%7D+ORDER+BY+ASC%28%3Fsubject+%3Fproperty+%3Fvalue%29"
.
<_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7> <http://lsq.aksw.org/vocab#verb>
"GET" .
# , null, null, null, null, null,
<_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7>
<http://lsq.aksw.org/vocab#protocol> "HTTP/1.1" .
# null, null,
<_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7>
<http://lsq.aksw.org/vocab#numResponseBytes>
"394"^^<http://www.w3.org/2001/XMLSchema#int> .
# , null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null, null,
<_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7>
<http://lsq.aksw.org/vocab#hostHash>
"RNxI06DRVty3nU8fTEvds_BV31N6xtLq0JclkZQMNLM" .
<_:50470aa7-9d2b-4695-be38-a96f6fc1d7c7>
<http://lsq.aksw.org/vocab#endpoint> <http://dbpedia.org/sparql> .
```
### Relevant output and stacktrace
```shell
java.lang.IllegalStateException: no calls to next() since last call to remove()
    at org.apache.jena.util.iterator.NiceIterator$1.remove(NiceIterator.java:156)
    at org.apache.jena.mem.NodeToTriplesMapBase$1$NotifyMe.emptied(NodeToTriplesMapBase.java:129)
    at org.apache.jena.mem.HashCommon$MovedKeysIterator.remove(HashCommon.java:341)
    at org.apache.jena.util.iterator.NiceIterator$1.remove(NiceIterator.java:157)
    at org.apache.jena.mem.NodeToTriplesMapBase$1.remove(NodeToTriplesMapBase.java:153)
    at org.apache.jena.util.iterator.WrappedIterator.remove(WrappedIterator.java:118)
    at org.apache.jena.mem.StoreTripleIterator.remove(StoreTripleIterator.java:56)
    at org.apache.jena.sparql.graph.TestGraphMemDelete.test(TestGraphMemDelete.java:57)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
    at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
    at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
    at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:93)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:40)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:529)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:756)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:452)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:210)
```
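The exception message in the trace refers to the general `java.util.Iterator` contract: `remove()` may be called at most once per call to `next()`. The following sketch reproduces that contract violation on a plain `ArrayList` iterator; `IteratorContractSketch` and `secondRemoveThrows` are hypothetical names, and it says nothing about where exactly Jena issues the second call.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo of the Iterator.remove() contract using only the JDK.
class IteratorContractSketch {

    /** Returns true if a second remove() without an intervening next() throws. */
    static boolean secondRemoveThrows() {
        List<String> items = new ArrayList<>(List.of("a", "b"));
        Iterator<String> it = items.iterator();
        it.next();
        it.remove(); // valid: removes "a"
        try {
            it.remove(); // invalid: no next() since the last remove()
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("second remove() throws: " + secondRemoveThrows());
    }
}
```

In the trace above, `NiceIterator$1.remove` appears twice on the same call path, which is consistent with one element being removed twice.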
### Are you interested in making a pull request?
None