On 17/03/2022 09:22, Andy Seaborne wrote:


On 17/03/2022 08:13, Élie Roux wrote:
Dear all,

I have some code that writes hundreds of thousands of TriG files
(converting some XML data to RDF). I recently introduced a relatively
minor change: some literals are now in a custom datatype, defined
here:

https://github.com/buda-base/xmltoldmigration/blob/9cc469de4c99c5f6f8f554ef1318d45de916534e/src/main/java/io/bdrc/xmltoldmigration/xml2files/CommonMigration.java#L121

(Is there a simpler way to just add a datatype to a Literal? I
really don't do anything special with these and treat them like
strings)

But now, I'm getting this exception:

Exception in thread "main" org.apache.jena.shared.BrokenException: oh dear, already have a slot for io.bdrc.xmltoldmigration.xml2files.CommonMigration$EDTFStr@3568ea59, viz 54
     at org.apache.jena.mem.HashedBunchMap.grow(HashedBunchMap.java:109)
     at org.apache.jena.mem.HashedBunchMap.put$(HashedBunchMap.java:90)
     at org.apache.jena.mem.HashedBunchMap.put(HashedBunchMap.java:70)
     at org.apache.jena.mem.NodeToTriplesMapMem.add(NodeToTriplesMapMem.java:51)
     at org.apache.jena.mem.GraphTripleStoreBase.add(GraphTripleStoreBase.java:63)
     at org.apache.jena.mem.GraphMem.performAdd(GraphMem.java:37)
     at org.apache.jena.graph.impl.GraphBase.add(GraphBase.java:184)
     at org.apache.jena.sparql.graph.GraphWrapper.add(GraphWrapper.java:39)
     at java.base/java.util.ArrayList$Itr.forEachRemaining(ArrayList.java:1032)
     at org.apache.jena.graph.GraphUtil.addIteratorWorkerDirect(GraphUtil.java:153)
     at org.apache.jena.graph.GraphUtil.addIteratorWorker(GraphUtil.java:145)
     at org.apache.jena.graph.GraphUtil.addInto(GraphUtil.java:139)
     at org.apache.jena.sparql.core.DatasetGraphTriplesQuads.addGraph(DatasetGraphTriplesQuads.java:80)
     at io.bdrc.xmltoldmigration.MigrationHelpers.modelToOutputStream(MigrationHelpers.java:584)

Which I don't understand... is it a hash collision?

What puzzles me is that this is triggered on an operation that doesn't
create new data:

https://github.com/buda-base/xmltoldmigration/blob/9cc469de4c99c5f6f8f554ef1318d45de916534e/src/main/java/io/bdrc/xmltoldmigration/MigrationHelpers.java#L584

addGraph is copying data from the m.getGraph into the new dataset.

"createGeneral" is the version that does not copy, and makes a link to the original.

By the way : DatasetGraphFactory


The value of Model m at this point is attached in TTL,

Don't understand that -
What is the actual type of the DatasetGraphTriplesQuads?
Is this inside a transaction?

notice that it contains the same literal twice:

"12XX"^^<http://id.loc.gov/datatypes/edtf>

Could this be the issue? If so, what's the best way to deal with it?

Having a term in the data several times is quite common. The integer 1,
for example.
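
To illustrate the general point in plain Java (a minimal sketch, not Jena's HashedBunchMap or Node classes): the hypothetical Term class below stands in for a literal with a lexical form and a datatype URI. With value-based equals() and hashCode(), adding the same literal twice to a hash-based set leaves one entry, and two unequal keys that happen to share a hash code are both kept. The class and method names are assumptions made for this example only.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class DuplicateTermSketch {
    // Hypothetical stand-in for an RDF literal; this is NOT Jena's Node class.
    static final class Term {
        final String lexical;
        final String datatypeUri;

        Term(String lexical, String datatypeUri) {
            this.lexical = lexical;
            this.datatypeUri = datatypeUri;
        }

        // Value-based equality: same lexical form and same datatype URI.
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Term)) return false;
            Term other = (Term) o;
            return lexical.equals(other.lexical)
                    && datatypeUri.equals(other.datatypeUri);
        }

        // Must agree with equals(): equal terms produce equal hash codes.
        @Override
        public int hashCode() {
            return Objects.hash(lexical, datatypeUri);
        }
    }

    // Adds the same literal twice; returns the number of distinct entries.
    public static int distinctAfterAddingTwice() {
        Set<Term> terms = new HashSet<>();
        terms.add(new Term("12XX", "http://id.loc.gov/datatypes/edtf"));
        terms.add(new Term("12XX", "http://id.loc.gov/datatypes/edtf"));
        return terms.size();
    }

    // "Aa" and "BB" have the same String.hashCode() but are not equal,
    // so a hash-based set keeps both.
    public static int distinctAfterHashCollision() {
        Set<String> keys = new HashSet<>();
        keys.add("Aa");
        keys.add("BB");
        return keys.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctAfterAddingTwice());
        System.out.println(distinctAfterHashCollision());
    }
}
```

Neither a repeated term nor a plain hash collision is a problem for a hash-based container on its own, provided equals() and hashCode() are consistent with each other.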


Thanks in advance,
