On 08/05/13 11:38, "Dr. André Lanka" wrote:
Hello Andy,

sorry for the long silence. I was tied up for a few weeks with
production updates and didn't have any time to spend on Jena.

Below you'll find the test we used to point out the problem. I solved it
by changing AbstractDateTime.equals and AbstractDateTime.hashCode (a
diff is also attached).

It's up to you whether you like the fix and want to integrate it :-)

The fix is to make the value data structure produced from a calendar match the value data structure produced from the equivalent lexical form.

See

https://issues.apache.org/jira/browse/JENA-437

If you could try out the current development build, that would be great.

        Andy


Perhaps caching the hashCode could also improve performance.
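
A minimal sketch of what such caching might look like, assuming the value is immutable once constructed. The class name CachedHashDateTime and its int[] field are hypothetical stand-ins for illustration, not the actual AbstractDateTime code:

```java
import java.util.Arrays;

// Hypothetical illustration of lazy hash caching; not Jena's AbstractDateTime.
final class CachedHashDateTime {
    private final int[] data;  // copied on construction and never modified
    private int cachedHash;    // 0 means "not computed yet"

    CachedHashDateTime(int[] data) {
        this.data = data.clone();
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof CachedHashDateTime
            && Arrays.equals(data, ((CachedHashDateTime) obj).data);
    }

    @Override
    public int hashCode() {
        int h = cachedHash;
        if (h == 0) {
            // Same shift-and-xor mixing as in the diff below.
            for (int value : data) {
                h = (h << 1) ^ value;
            }
            // A genuine hash of 0 is simply recomputed on every call.
            cachedHash = h;
        }
        return h;
    }
}
```

This only pays off if the fields feeding the hash never change after construction; otherwise the cached value would go stale.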


Greetings
André



Index: AbstractDateTime.java
===================================================================
--- AbstractDateTime.java       (revision 2)
+++ AbstractDateTime.java       (revision 4)
@@ -118,9 +118,13 @@
          if (obj instanceof AbstractDateTime) {
              AbstractDateTime adt = (AbstractDateTime) obj;
              for (int i = 0; i < data.length; i++) {
-                if (data[i] != adt.data[i]) return false;
+              if(i==msscale || i==ms)
+                continue;
+              else if (data[i] != adt.data[i])
+                return false;
              }
-            return true;
+
+            return fractionalSeconds==adt.fractionalSeconds;
          }
          return false;
      }
@@ -131,7 +135,18 @@
      @Override
      public int hashCode() {
          int hash = 0;
+        int scale=data[msscale];
+        int scaledMs=data[ms];
+        while(scale<3) {
+          scale++;
+          scaledMs*=10;
+        }
          for (int i = 0; i < data.length; i++) {
+          if(i==msscale)
+            hash=(hash<<1)^scale;
+          else if(i==ms)
+            hash=(hash<<1)^scaledMs;
+          else
              hash = (hash << 1) ^ data[i];
          }
          return hash;








package com.hojoki.tdb;

import java.util.Calendar;
import java.util.GregorianCalendar;

import junit.framework.TestCase;

import org.junit.Test;

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.ReadWrite;
import com.hp.hpl.jena.rdf.listeners.StatementListener;
import com.hp.hpl.jena.rdf.model.Literal;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.RDFNode;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.rdf.model.ResourceFactory;
import com.hp.hpl.jena.rdf.model.Statement;
import com.hp.hpl.jena.rdf.model.StmtIterator;
import com.hp.hpl.jena.tdb.TDBFactory;

public class Iteratortest extends TestCase {

   public Iteratortest(String testName) {
     super(testName);
   }

   Resource s0 = ResourceFactory.createResource("s://r0");
   Property p0 = ResourceFactory.createProperty("p://r0");

   Resource s1 = ResourceFactory.createResource("s://r1");
   Property p1 = ResourceFactory.createProperty("p://r1");
   Resource o1 = ResourceFactory.createResource("o://r1");

   Resource s2 = ResourceFactory.createResource("s://r2");
   Property p2 = ResourceFactory.createProperty("p://r2");
   Resource o2 = ResourceFactory.createResource("o://r2");

   @Test
   public void testGraphMemIterator() {

     ModelChangedListenerImpl listener = new ModelChangedListenerImpl();

     Dataset set = TDBFactory.createDataset("/tmp/test");
     set.begin(ReadWrite.WRITE);

     Model model = set.getDefaultModel();

     model.register(listener);

     Model model2 = ModelFactory.createDefaultModel();

     Calendar cal=GregorianCalendar.getInstance();
     cal.setTimeInMillis(System.currentTimeMillis()/100*100);
     Literal literal = model.createTypedLiteral(cal);
     model.add(s1, p1, model.createTypedLiteral(cal));
     Statement statement = model.listStatements(s1, p1, (RDFNode) null).next();
     Literal value = statement.getLiteral();

     assertTrue(literal.equals(value));
     assertTrue(literal.hashCode()==value.hashCode());

     model.add(s1, p1, o1);

     model2.add(s1,p1,literal);

     model2.add(s2, p2, o2);
     model2.add(s1, p1, model.createTypedLiteral(cal));

     model.add(model2);

     model.unregister(listener);

     Model added=listener.getInsertModel();

     final StmtIterator objectStmtIter =
         added.listStatements(s0, p0, (RDFNode) null);
     if (objectStmtIter != null) {
       while (objectStmtIter.hasNext()) {

         final Resource objectResource =
             objectStmtIter.next().getObject().asResource();
         final StmtIterator updatedStmtIter =
             objectResource.listProperties(p1);
         if (updatedStmtIter != null && updatedStmtIter.hasNext()) {
           Statement next = updatedStmtIter.next();
           if (updatedStmtIter.hasNext()) {
             Statement next2 = updatedStmtIter.next();
             if (next.toString().equals(next2.toString())) {
               // JenaUtil.printModel(added);
               throw new RuntimeException(
                   "object has more than one IDENTICAL atom:updated, uri: '"
                   + objectResource.getURI() + "' statement " + next);
             }
           }
         }
       }
     }

     set.abort();

   }

}



class ModelChangedListenerImpl extends StatementListener {

   private Model insertModel = ModelFactory.createDefaultModel();
   private Model deleteModel = ModelFactory.createDefaultModel();

   public void addedStatement(final Statement statement) {

     insertModel.add(statement);
     deleteModel.remove(statement);
   }

   public void removedStatement(final Statement statement) {

     deleteModel.add(statement);
     insertModel.remove(statement);
   }

   public Model getInsertModel() {
     return this.insertModel;
   }

   public Model getDeleteModel() {
     return this.deleteModel;
   }
}




On 16.04.2013 19:31, Andy Seaborne wrote:
On 12/04/13 17:15, Andy Seaborne wrote:
On 12/04/13 15:06, "Dr. André Lanka" wrote:
Hello to all,

Hi there,

Could you put this on JIRA please? Ideally with a complete test case to
make sure we agree on the details.

https://issues.apache.org/jira/browse/JENA-437


Is it TDB specific only?

No, although TDB is more likely to bump into it.


      Thanks,
      Andy


we've got duplicated statements within the same model (stored in a
GraphTripleStoreMem). Duplicated means that the three components
s, p and o are pairwise equal between the statements.

The reason is that the literals have differing hashCodes so that they
are added twice to the model. This is because the hashCode method for
XSDDateTime doesn't respect the scale of the milliseconds (field 8 in
the data array). When you call Model.createTypedLiteral(Calendar) the
scale is either zero or three, whereas TDB formats it (while reading
from the triple store) to 0, 1, 2 or 3 digits depending on the number of
trailing zeros (DateTimeNode.unpack). So you can put an xsd:dateTime
into TDB and get back a literal that equals the given one but has
a different hashCode.
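
The mismatch can be illustrated without Jena. The sketch below is only a simplified model: the class NaiveFraction and its two fields are hypothetical stand-ins for the milliseconds and scale entries of the data array, and the hash mixes the raw fields the way a field-by-field hash would:

```java
// Hypothetical model of the ms / msscale pair; not Jena code.
final class NaiveFraction {
    final int ms;       // digits of the fractional second as stored
    final int msscale;  // number of digits held in ms

    NaiveFraction(int ms, int msscale) {
        this.ms = ms;
        this.msscale = msscale;
    }

    // 500 at scale 3 and 5 at scale 1 both mean 0.5 seconds.
    double fractionalSeconds() {
        int divisor = 1;
        for (int i = 0; i < msscale; i++) {
            divisor *= 10;
        }
        return (double) ms / divisor;
    }

    // Equality is based on the value the fields represent.
    @Override
    public boolean equals(Object obj) {
        return obj instanceof NaiveFraction
            && fractionalSeconds() == ((NaiveFraction) obj).fractionalSeconds();
    }

    // The hash mixes the raw fields, so equal values can hash differently.
    @Override
    public int hashCode() {
        int hash = 0;
        hash = (hash << 1) ^ ms;
        hash = (hash << 1) ^ msscale;
        return hash;
    }
}
```

Here new NaiveFraction(500, 3) and new NaiveFraction(5, 1) are equal but have different hash codes, which is the situation that lets a hash-based store keep both copies.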

You can reproduce it by using a TDB backed model and do:

      Calendar cal=GregorianCalendar.getInstance();
      cal.setTimeInMillis(System.currentTimeMillis()/100*100);

      Literal literal = model.createTypedLiteral(cal);
      model.add(s1, p1, model.createTypedLiteral(cal));

      Statement statement = model.listStatements(s1, p1, (RDFNode) null).next();
      Literal value = statement.getLiteral();

      assertTrue(literal.equals(value));
      assertTrue(literal.hashCode()==value.hashCode());


The last line fails.

In order to respect the general contract between equals and hashCode,
XSDDateTime should get a special getHashCode(LiteralLabel) method instead
of using the one from BaseDatatype. For instance, this method could leave
out array indices 7 and 8 and use the fractional seconds (xor with the
double value) instead.
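
As a rough sketch of that idea, the hash can be derived from the fractional-seconds value rather than from the raw digits and scale. The class NormalizedFraction below is a hypothetical, Jena-free illustration, not the proposed getHashCode(LiteralLabel) implementation itself:

```java
// Hypothetical illustration of hashing the value instead of the raw fields.
final class NormalizedFraction {
    final int ms;       // digits of the fractional second as stored
    final int msscale;  // number of digits held in ms

    NormalizedFraction(int ms, int msscale) {
        this.ms = ms;
        this.msscale = msscale;
    }

    double fractionalSeconds() {
        int divisor = 1;
        for (int i = 0; i < msscale; i++) {
            divisor *= 10;
        }
        return (double) ms / divisor;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof NormalizedFraction
            && fractionalSeconds() == ((NormalizedFraction) obj).fractionalSeconds();
    }

    // Skip ms and msscale; hash only the value they encode, so that
    // equal objects are guaranteed to produce equal hash codes.
    @Override
    public int hashCode() {
        return Double.valueOf(fractionalSeconds()).hashCode();
    }
}
```

In a full date/time class the remaining fields (year, month and so on) would still be mixed in as before; only the two scale-dependent entries are replaced by the value they represent.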

Cheers
André




