Hello all,

I have encountered a situation whereby the same cache key is generated for different triple statements. The current key generation is based on hash codes (see IntArray.createSPOCKey(s, p, o, c)), which unfortunately are not guaranteed to be unique across different node types.
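
For illustration, the collision can be reproduced outside Marmotta with a minimal sketch. Everything in it is a hypothetical stand-in: UriNode and LiteralNode are not the real KiWi classes, and hashOnlyKey and typeAwareKey are not the actual IntArray code. The only assumption taken from the description above is that the key is built from the four hashCode() values.

```java
import java.util.Arrays;

public class SpocKeyCollisionSketch {

    // Hypothetical stand-in for a URI resource: hash depends only on the URI string.
    static final class UriNode {
        final String uri;
        UriNode(String uri) { this.uri = uri; }
        @Override public int hashCode() { return uri.hashCode(); }
        @Override public boolean equals(Object o) {
            return o instanceof UriNode && ((UriNode) o).uri.equals(uri);
        }
    }

    // Hypothetical stand-in for a literal: hash depends only on the content string.
    static final class LiteralNode {
        final String content;
        LiteralNode(String content) { this.content = content; }
        @Override public int hashCode() { return content.hashCode(); }
        @Override public boolean equals(Object o) {
            return o instanceof LiteralNode && ((LiteralNode) o).content.equals(content);
        }
    }

    // Assumed shape of the current key: one plain hash code per position.
    static int[] hashOnlyKey(Object s, Object p, Object o, Object c) {
        return new int[] { hash(s), hash(p), hash(o), hash(c) };
    }

    static int hash(Object node) {
        return node == null ? Integer.MIN_VALUE : node.hashCode();
    }

    // One possible variant: mix a small per-type tag into each hash code.
    static int[] typeAwareKey(Object s, Object p, Object o, Object c) {
        return new int[] { typedHash(s), typedHash(p), typedHash(o), typedHash(c) };
    }

    static int typedHash(Object node) {
        if (node == null) {
            return Integer.MIN_VALUE;
        }
        int tag = (node instanceof UriNode) ? 1 : (node instanceof LiteralNode) ? 2 : 0;
        return 31 * tag + node.hashCode();
    }

    public static void main(String[] args) {
        UriNode s = new UriNode("http://vocabulary.curriculum.edu.au/access/10");
        UriNode p = new UriNode("http://www.w3.org/2004/02/skos/core#topConceptOf");
        Object asUri = new UriNode("http://vocabulary.curriculum.edu.au/access");
        Object asLiteral = new LiteralNode("http://vocabulary.curriculum.edu.au/access");

        // The two objects are not equal, yet the hash-only keys are identical.
        System.out.println(asUri.equals(asLiteral));                            // false
        System.out.println(Arrays.equals(hashOnlyKey(s, p, asUri, null),
                                         hashOnlyKey(s, p, asLiteral, null)));  // true
        System.out.println(Arrays.equals(typeAwareKey(s, p, asUri, null),
                                         typeAwareKey(s, p, asLiteral, null))); // false
    }
}
```

Note that the type tag only separates nodes of different types. The key is still made of hash codes, so two distinct nodes of the same type can still collide, and a lookup would still need to compare the returned triple against the requested s, p, o, c.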

In my particular example I have a KiWiLiteral node with a URI string as its value. I also have a KiWiUriResource node with the same value for its URI (messy, I know :( ).

    <rdf:Description rdf:about="http://vocabulary.curriculum.edu.au/access/10">
      <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
      <skos:prefLabel xml:lang="en">Visual independence</skos:prefLabel>
      <skos:topConceptOf rdf:resource="http://vocabulary.curriculum.edu.au/access"/>
      <skos:topConceptOf>http://vocabulary.curriculum.edu.au/access</skos:topConceptOf>

KiWiLiteral.hashCode() is based solely on its label/content and KiWiUriResource.hashCode() is based solely on its URI, hence the two different nodes generate the same hashCode. That in itself is fine, but IntArray.createSPOCKey(s, p, o, c) then generates the same key for both statements, and thus KiWiValueFactory.tripleRegistry may return the wrong KiWiTriple (see KiWiValueFactory.createStatement(s, p, o, c, con)).

Proposed solutions:

1) Update hashCode() in the KiWiNode types so that it also distinguishes the node type?
2) Adjust the IntArray.createSPOCKey implementation?
3) Other?

Happy to raise a Jira ticket if the dev community feels this needs attention.

Kind regards,
Al
