On 12/04/13 17:15, Andy Seaborne wrote:
On 12/04/13 15:06, "Dr. André Lanka" wrote:
Hello to all,
Hi there,
Could you put this on JIRA please? Ideally with a complete test case to
make sure we agree on the details.
https://issues.apache.org/jira/browse/JENA-437
Is it TDB specific only?
No, although TDB is more likely to bump into it.
Thanks,
Andy
we've got duplicate statements within the same model (stored in a
GraphTripleStoreMem). Duplicate means that each of the three components
s, p and o is equal between the two statements.
The reason is that the literals have differing hashCodes, so they
are added twice to the model. This is because the hashCode method for
XSDDateTime doesn't normalize the scale of the milliseconds (field 8 in
the data array). When you call Model.createTypedLiteral(Calendar) the
scale is either zero or three, whereas TDB formats the value (while
reading from the triple store) with 0, 1, 2 or 3 digits depending on the
number of trailing zeros (DateTimeNode.unpack). So you can put an
xsd:dateTime into TDB and get back a literal that equals the given one
but has another hashCode.
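To make the mismatch concrete, here is a minimal sketch in plain Java.
It is not Jena's code: the class ScaleHashDemo, its helper methods and
the simplified nine-slot array are assumptions for illustration, with
only the use of index 7 for the fraction digits and index 8 for their
scale taken from the description above. It shows two arrays that stand
for the same instant (".500" and ".5") comparing equal by value while a
hash over the raw array differs.

```java
import java.util.Arrays;

// Hypothetical stand-in for the date-time data array; not Jena's classes.
class ScaleHashDemo {
    static final int MS = 7;        // fraction digits stored as an integer
    static final int MS_SCALE = 8;  // number of fraction digits (the scale)

    // Builds a sample array for 2013-04-12T15:06:00 plus a fraction.
    static int[] dateTime(int fraction, int scale) {
        int[] data = new int[9];
        data[0] = 2013; data[1] = 4; data[2] = 12;
        data[3] = 15;   data[4] = 6; data[5] = 0;
        data[MS] = fraction;
        data[MS_SCALE] = scale;
        return data;
    }

    static double fractionalSeconds(int[] data) {
        return data[MS] / Math.pow(10, data[MS_SCALE]);
    }

    // Value-based equality: fields 0..6 plus the fractional seconds.
    static boolean sameValue(int[] a, int[] b) {
        for (int i = 0; i < MS; i++) {
            if (a[i] != b[i]) return false;
        }
        return fractionalSeconds(a) == fractionalSeconds(b);
    }

    // Hash over the raw array, so fraction digits and scale both feed in.
    static int rawHash(int[] data) {
        return Arrays.hashCode(data);
    }

    public static void main(String[] args) {
        int[] created = dateTime(500, 3);  // ".500", scale three
        int[] readBack = dateTime(5, 1);   // ".5", trailing zeros dropped
        System.out.println(sameValue(created, readBack));           // true
        System.out.println(rawHash(created) == rawHash(readBack));  // false
    }
}
```

A hash-based set would treat these two as distinct entries, which
matches the duplicate statements described above.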
You can reproduce it by using a TDB-backed model and doing:
    // Round to a multiple of 100 ms so the milliseconds end in zeros.
    Calendar cal = GregorianCalendar.getInstance();
    cal.setTimeInMillis(System.currentTimeMillis() / 100 * 100);
    Literal literal = model.createTypedLiteral(cal);
    model.add(s1, p1, model.createTypedLiteral(cal));
    // Read the statement back from the TDB-backed model.
    Statement statement = model.listStatements(s1, p1, (RDFNode) null).next();
    Literal value = statement.getLiteral();
    assertTrue(literal.equals(value));
    assertTrue(literal.hashCode() == value.hashCode());
The last line fails.
In order to respect the general contract of equals and hashCode,
XSDDateTime should get a special getHashCode(LiteralLabel) method
instead of using the one from BaseDatatype. For instance, this method
could leave out array indices 7 and 8 and use the fractional seconds
(xor with the hash of the double value) instead.
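A rough sketch of that idea in plain Java follows. The class
NormalizedDateTimeHash and the method scaleInsensitiveHash are
hypothetical names, and the array layout is a simplified assumption
rather than Jena's actual implementation. The hash skips indices 7 and
8 and xors in the hash of the fractional seconds as a double, so ".5",
".50" and ".500" hash alike.

```java
// Hypothetical sketch of a scale-insensitive hash; not Jena's code.
class NormalizedDateTimeHash {
    static final int MS = 7;        // fraction digits stored as an integer
    static final int MS_SCALE = 8;  // number of fraction digits (the scale)

    static int scaleInsensitiveHash(int[] data) {
        int hash = 0;
        // Hash only the fields before the fraction (indices 0..6).
        for (int i = 0; i < MS; i++) {
            hash = 31 * hash + data[i];
        }
        // Fold in the fraction by value, independent of its scale.
        double fraction = data[MS] / Math.pow(10, data[MS_SCALE]);
        return hash ^ Double.hashCode(fraction);
    }

    // Builds a sample array for 2013-04-12T15:06:00 plus a fraction.
    static int[] dateTime(int fraction, int scale) {
        int[] data = new int[9];
        data[0] = 2013; data[1] = 4; data[2] = 12;
        data[3] = 15;   data[4] = 6; data[5] = 0;
        data[MS] = fraction;
        data[MS_SCALE] = scale;
        return data;
    }

    public static void main(String[] args) {
        int h3 = scaleInsensitiveHash(dateTime(500, 3));
        int h2 = scaleInsensitiveHash(dateTime(50, 2));
        int h1 = scaleInsensitiveHash(dateTime(5, 1));
        System.out.println(h3 == h2 && h2 == h1);  // true
    }
}
```

Going through the double keeps the sketch short. An alternative would
be to strip trailing zeros from the integer fraction before hashing,
which avoids floating point altogether.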
Cheers
André