[ https://issues.apache.org/jira/browse/MARMOTTA-401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jakob Frank updated MARMOTTA-401:
---------------------------------

    Description: 
From Allan Melville on the development mailing list:

I have encountered a situation whereby the same cache key is generated for
different triple statements. The current key generation is based on hash
codes (see IntArray.createSPOCKey(s, p, o, c)), which unfortunately are not
guaranteed to be unique across different node types.

In my particular example I have a KiWiLiteral node with a URI string value.
I also have a KiWiUriResource node with the same value for its URI
(messy I know :( ).

{code|xml}
<rdf:Description rdf:about="http://vocabulary.curriculum.edu.au/access/10">
        <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
        <skos:prefLabel xml:lang="en">Visual independence</skos:prefLabel>
        <skos:topConceptOf rdf:resource="http://vocabulary.curriculum.edu.au/access"/>
        <skos:topConceptOf>http://vocabulary.curriculum.edu.au/access</skos:topConceptOf>
</rdf:Description>
{code}

KiWiLiteral.hashCode() is based solely on its label/content and
KiWiUriResource.hashCode() is based solely on its URI, hence the two
different nodes generate the same hash code. That in itself is fine, but
IntArray.createSPOCKey(s, p, o, c) then generates the same key for both
statements, and thus KiWiValueFactory.tripleRegistry may return the wrong
KiWiTriple (see KiWiValueFactory.createStatement(s, p, o, c, con)).
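
The collision described above can be reproduced in isolation. The following is a minimal sketch, not the Marmotta code: SketchUri, SketchLiteral and hashOnlyKey are hypothetical stand-ins, and the sketch assumes, as the report states, that the key keeps nothing beyond the hash codes of the four components.

```java
import java.util.Arrays;
import java.util.Objects;

final class CollisionSketch {
    // Hypothetical stand-in for a URI node: hash code derived from the URI only.
    static final class SketchUri {
        final String uri;
        SketchUri(String uri) { this.uri = uri; }
        @Override public int hashCode() { return uri.hashCode(); }
        @Override public boolean equals(Object other) {
            return other instanceof SketchUri && ((SketchUri) other).uri.equals(uri);
        }
    }

    // Hypothetical stand-in for a literal node: hash code derived from the content only.
    static final class SketchLiteral {
        final String content;
        SketchLiteral(String content) { this.content = content; }
        @Override public int hashCode() { return content.hashCode(); }
        @Override public boolean equals(Object other) {
            return other instanceof SketchLiteral && ((SketchLiteral) other).content.equals(content);
        }
    }

    // Assumed key scheme: only the four hash codes survive in the key.
    static int[] hashOnlyKey(Object s, Object p, Object o, Object c) {
        return new int[] {
            Objects.hashCode(s), Objects.hashCode(p),
            Objects.hashCode(o), Objects.hashCode(c)
        };
    }

    public static void main(String[] args) {
        String value = "http://vocabulary.curriculum.edu.au/access";
        SketchUri subject = new SketchUri("http://vocabulary.curriculum.edu.au/access/10");
        SketchUri predicate = new SketchUri("http://www.w3.org/2004/02/skos/core#topConceptOf");

        int[] uriKey = hashOnlyKey(subject, predicate, new SketchUri(value), null);
        int[] literalKey = hashOnlyKey(subject, predicate, new SketchLiteral(value), null);

        // The two object nodes are not equal, yet the keys cannot be told apart.
        System.out.println(new SketchUri(value).equals(new SketchLiteral(value))); // false
        System.out.println(Arrays.equals(uriKey, literalKey));                     // true
    }
}
```

A registry keyed on such an array would hand back whichever of the two statements was stored first.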


Proposed solutions:

1) Update hashCode() in the KiWiNode types so that it also depends on the node type?
2) Adjust the IntArray.createSPOCKey implementation?
3) Other?
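
One way to read options 1 and 2 is to carry a node-type tag into the key alongside each hash code. The sketch below only illustrates that idea under stated assumptions: Kind, Node and typedKey are hypothetical names, not part of Marmotta, and the real KiWi node hierarchy may need more kinds.

```java
import java.util.Arrays;

final class TypedKeySketch {
    // Hypothetical node kinds; NONE marks a missing component such as a null context.
    enum Kind { NONE, URI, LITERAL }

    // Hypothetical minimal node: a kind plus its string value.
    static final class Node {
        final Kind kind;
        final String value;
        Node(Kind kind, String value) { this.kind = kind; this.value = value; }
    }

    // Two ints per component: a type tag first, then the hash of the value.
    static int[] typedKey(Node s, Node p, Node o, Node c) {
        Node[] parts = { s, p, o, c };
        int[] key = new int[2 * parts.length];
        for (int i = 0; i < parts.length; i++) {
            Node n = parts[i];
            key[2 * i] = (n == null ? Kind.NONE : n.kind).ordinal();
            key[2 * i + 1] = (n == null) ? 0 : n.value.hashCode();
        }
        return key;
    }

    public static void main(String[] args) {
        String value = "http://vocabulary.curriculum.edu.au/access";
        Node subject = new Node(Kind.URI, "http://vocabulary.curriculum.edu.au/access/10");
        Node predicate = new Node(Kind.URI, "http://www.w3.org/2004/02/skos/core#topConceptOf");

        int[] uriKey = typedKey(subject, predicate, new Node(Kind.URI, value), null);
        int[] literalKey = typedKey(subject, predicate, new Node(Kind.LITERAL, value), null);

        // The type tag at the object position differs, so the keys differ.
        System.out.println(Arrays.equals(uriKey, literalKey)); // false
    }
}
```

This separates nodes of different types only. Two different values of the same type can still share a hash code, so a registry built on such keys would probably still need to compare the actual nodes when a key matches.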


  was:
From Allan Melville on the development mailing list:

I have encountered a situation whereby the same cache key is generated for
different triple statements. The current key generation is based on hash
codes (see IntArray.createSPOCKey(s, p, o, c)), which unfortunately are not
guaranteed to be unique across different node types.

In my particular example I have a KiWiLiteral node with a URI string value.
I also have a KiWiUriResource node with the same value for its URI
(messy I know :( ).

<rdf:Description rdf:about="http://vocabulary.curriculum.edu.au/access/10">
        <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
        <skos:prefLabel xml:lang="en">Visual independence</skos:prefLabel>
        <skos:topConceptOf rdf:resource="http://vocabulary.curriculum.edu.au/access"/>
        <skos:topConceptOf>http://vocabulary.curriculum.edu.au/access</skos:topConceptOf>
</rdf:Description>


KiWiLiteral.hashCode() is based solely on its label/content and
KiWiUriResource.hashCode() is based solely on its URI, hence the two
different nodes generate the same hash code. That in itself is fine, but
IntArray.createSPOCKey(s, p, o, c) then generates the same key for both
statements, and thus KiWiValueFactory.tripleRegistry may return the wrong
KiWiTriple (see KiWiValueFactory.createStatement(s, p, o, c, con)).


Proposed solutions:

1) Update hashCode() in the KiWiNode types so that it also depends on the node type?
2) Adjust the IntArray.createSPOCKey implementation?
3) Other?



> Cache keys for triples with literal nodes are not created correctly
> -------------------------------------------------------------------
>
>                 Key: MARMOTTA-401
>                 URL: https://issues.apache.org/jira/browse/MARMOTTA-401
>             Project: Marmotta
>          Issue Type: Bug
>          Components: KiWi Triple Store
>    Affects Versions: 3.1-incubating
>            Reporter: Alan
>            Assignee: Sebastian Schaffert
>            Priority: Critical
>             Fix For: 3.1.1, 3.2
>
>
> From Allan Melville on the development mailing list:
> I have encountered a situation whereby the same cache key is generated for
> different triple statements. The current key generation is based on hash
> codes (see IntArray.createSPOCKey(s, p, o, c)), which unfortunately are not
> guaranteed to be unique across different node types.
> In my particular example I have a KiWiLiteral node with a URI string value.
> I also have a KiWiUriResource node with the same value for its URI
> (messy I know :( ).
> {code|xml}
> <rdf:Description rdf:about="http://vocabulary.curriculum.edu.au/access/10">
>         <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
>         <skos:prefLabel xml:lang="en">Visual independence</skos:prefLabel>
>         <skos:topConceptOf rdf:resource="http://vocabulary.curriculum.edu.au/access"/>
>         <skos:topConceptOf>http://vocabulary.curriculum.edu.au/access</skos:topConceptOf>
> </rdf:Description>
> {code}
> KiWiLiteral.hashCode() is based solely on its label/content and
> KiWiUriResource.hashCode() is based solely on its URI, hence the two
> different nodes generate the same hash code. That in itself is fine, but
> IntArray.createSPOCKey(s, p, o, c) then generates the same key for both
> statements, and thus KiWiValueFactory.tripleRegistry may return the wrong
> KiWiTriple (see KiWiValueFactory.createStatement(s, p, o, c, con)).
> Proposed solutions:
> 1) Update hashCode() in the KiWiNode types so that it also depends on the node type?
> 2) Adjust the IntArray.createSPOCKey implementation?
> 3) Other?



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
