Hi Francois,

I've been thinking about this, and if we add equality to a small set of objects 
(compounds & compound sets) we can get sequence equality working. It will be 
implemented in the SequenceMixin class, with case-sensitive and 
case-insensitive versions. We can also use length and compound sets to reject 
a pair of sequences without iterating through them. The code will look like:

SequenceMixin.sequenceEquality(dnaOne, dnaTwo);

or

SequenceMixin.sequenceEqualityIgnoreCase(dnaOne, dnaTwo);
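
To make the quick-rejection idea concrete, here is a rough sketch. It is not the SequenceMixin code and does not use the BioJava classes: SequenceEqualitySketch, SimpleSequence and of() are hypothetical stand-ins, and treating a difference in compound sets as grounds for rejection is my assumption about how the check would behave.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in, not the BioJava API: a sequence is modelled as a
// list of compound short names plus the set of names its compound set allows.
class SequenceEqualitySketch {

    static final class SimpleSequence {
        final List<String> compounds;
        final Set<String> compoundSet;

        SimpleSequence(List<String> compounds, Set<String> compoundSet) {
            this.compounds = compounds;
            this.compoundSet = compoundSet;
        }
    }

    // Convenience factory: one compound per character of each string.
    static SimpleSequence of(String residues, String alphabet) {
        return new SimpleSequence(
                Arrays.asList(residues.split("")),
                new HashSet<>(Arrays.asList(alphabet.split(""))));
    }

    static boolean sequenceEquality(SimpleSequence a, SimpleSequence b) {
        return equal(a, b, false);
    }

    static boolean sequenceEqualityIgnoreCase(SimpleSequence a, SimpleSequence b) {
        return equal(a, b, true);
    }

    private static boolean equal(SimpleSequence a, SimpleSequence b, boolean ignoreCase) {
        // Cheap rejections first: neither needs a walk over the sequence.
        if (a.compounds.size() != b.compounds.size()) {
            return false;
        }
        if (!normalise(a.compoundSet, ignoreCase).equals(normalise(b.compoundSet, ignoreCase))) {
            return false;
        }
        // Only now compare the sequences position by position.
        for (int i = 0; i < a.compounds.size(); i++) {
            String x = a.compounds.get(i);
            String y = b.compounds.get(i);
            if (ignoreCase ? !x.equalsIgnoreCase(y) : !x.equals(y)) {
                return false;
            }
        }
        return true;
    }

    // Upper-cases the compound names when case is to be ignored.
    private static Set<String> normalise(Set<String> names, boolean ignoreCase) {
        if (!ignoreCase) {
            return names;
        }
        Set<String> upper = new HashSet<>();
        for (String n : names) {
            upper.add(n.toUpperCase(Locale.ROOT));
        }
        return upper;
    }
}
```

In this sketch, "ACGT" and "acgt" would compare as different under sequenceEquality but as the same under sequenceEqualityIgnoreCase, and two sequences of different length are rejected before any compound is read.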

Don't forget you can also use checksums like MD5 & SHA-1 to calculate a value 
which is very unlikely to clash (projects like InterPro use this technique to 
cache results behind a very quick lookup). You can do this like:

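import java.math.BigInteger;
import java.security.MessageDigest;

// getInstance() throws the checked NoSuchAlgorithmException
MessageDigest m = MessageDigest.getInstance("MD5");
for(Compound c: seq) {
  // feed each compound's short name into the digest
  m.update(c.getShortName().getBytes());
}
// read the digest as an unsigned number, zero-padded to 32 hex digits
BigInteger i = new BigInteger(1,m.digest());
String md5checksum = String.format("%1$032X", i);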
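
As an illustration of the caching idea, here is a minimal self-contained sketch that keys stored results by checksum. ChecksumCacheSketch, md5Hex(), store() and lookup() are names made up for this example, and it works on a plain String rather than a BioJava Sequence; I am not claiming this is how InterPro does it.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class: caches a result per sequence, keyed by MD5.
class ChecksumCacheSketch {

    private final Map<String, String> resultsByChecksum = new HashMap<>();

    // Upper-case, 32-character hex MD5 of the sequence string.
    static String md5Hex(String sequence) {
        try {
            MessageDigest m = MessageDigest.getInstance("MD5");
            // Fixed charset so the checksum does not depend on the platform default.
            m.update(sequence.getBytes(StandardCharsets.UTF_8));
            return String.format("%032X", new BigInteger(1, m.digest()));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required of every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    // Remember a result under the checksum of its sequence.
    void store(String sequence, String result) {
        resultsByChecksum.put(md5Hex(sequence), result);
    }

    // Returns the cached result, or null if this sequence has not been seen.
    String lookup(String sequence) {
        return resultsByChecksum.get(md5Hex(sequence));
    }
}
```

The map only ever holds fixed-size keys, so a lookup costs one digest plus one hash lookup however long the sequence is. If you want a case-insensitive match, upper-case the sequence before hashing it.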

HTH

Andy

On 10 Mar 2011, at 12:47, Andy Yates wrote:

> This is where the subject becomes murky & will probably mean that any code 
> written for equals() & hashCode() will have to take them into account where 
> present. However, Sequence compound identity would still be available from 
> another method, but this will require an extension of the Sequence interface.
> 
> Andy
> 
> On 10 Mar 2011, at 12:22, Francois Le Fevre wrote:
> 
>> This could be great. But for me, equals means only sequence identity and not 
>> features. 
>> 
>> 
>>> On 10 March 2011 10:17, "Andy Yates" <[email protected]> wrote:
>>> 
>>> I cannot remember the reason why we decided to not include equality for 
>>> these objects. It's not an unreasonable thing to want though. Assuming I 
>>> have some time soon I can have a look into implementing it on 
>>> AbstractCompound, AbstractSequence & the backing stores but it will be some 
>>> time away. If anyone else wants to give it a shot ... :)
>>> 
>>> Andy
>>> 
>>> On 10 Mar 2011, at 01:04, Andreas Prlic wrote:
>>> 
>>>> Hi François,
>>>> 
>>>> you could try to compare the st...
>>> 
>>> --
>>> Andrew Yates Ensembl Genomes Engineer
>>> EMBL-EBI Tel: +44-(0)1...
>>> 
>> 
> 
> -- 
> Andrew Yates                   Ensembl Genomes Engineer
> EMBL-EBI                       Tel: +44-(0)1223-492538
> Wellcome Trust Genome Campus   Fax: +44-(0)1223-494468
> Cambridge CB10 1SD, UK         http://www.ensemblgenomes.org/
> 
> _______________________________________________
> Biojava-l mailing list  -  [email protected]
> http://lists.open-bio.org/mailman/listinfo/biojava-l
