I am having trouble with the Comparator. If I compare Object, there is no issue:
If I compare Annotation, it doesn't recognize the method.
Kameron Arthur Cole
Watson Content Analytics Applications and Support
email: [email protected] | Tel: 305-389-8512
From: Richard Eckart de Castilho <[email protected]>
To: [email protected]
Date: 11/18/2014 02:34 AM
Subject: Re: can't remove duplicate Annotations with Java Set Collection
On 17.11.2014, at 20:59, Kameron Cole <[email protected]> wrote:
> I am trying to get rid of duplicates in the FSIndex. I thought a very
> clever way to do this would be to just push them into a Set Collection in
> Java, which does not allow duplicates. This is very (very) standard Java:
>
> ArrayList al = new ArrayList();
> // add elements to al, including duplicates
> HashSet hs = new HashSet();
> hs.addAll(al);
> al.clear();
> al.addAll(hs);
There is no universal definition of equality other than object identity,
and that is what Java falls back to unless equals() and hashCode() are
overridden.
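
To make the default behaviour concrete, here is a minimal sketch. The Span class is a hypothetical stand-in I made up for illustration; it is not a UIMA class, and it deliberately leaves equals() and hashCode() alone, so the HashSet can only recognise the very same object as a duplicate:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class IdentityDemo {
    // Hypothetical stand-in for an annotation (NOT a UIMA class).
    // It does not override equals() or hashCode().
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Returns how many elements survive when the list is pushed into a HashSet.
    static int distinctCount(List<Span> spans) {
        Set<Span> set = new HashSet<>(spans);
        return set.size();
    }

    public static void main(String[] args) {
        List<Span> spans = new ArrayList<>();
        Span first = new Span(0, 5);
        spans.add(first);
        spans.add(new Span(0, 5)); // same offsets, but a different object
        spans.add(first);          // the very same object again

        // Only the repeated reference is dropped; the look-alike stays.
        System.out.println(distinctCount(spans)); // prints 2
    }
}
```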
Since each UIMA user might have a different opinion on what is equal, UIMA
defers this decision to its indexing mechanism instead of hard-baking it
into equals()/hashCode() methods.
I suggest you do the following:
- implement a Comparator<FeatureStructure> or Comparator<AnnotationFS>
according to your definition of equality
- create a TreeSet based on your comparator
- drop all your annotations into this TreeSet
- "duplicates" according to your definition are dropped. The rest is sorted
(or not) depending on what your comparator returns in a non-equality case
(return value != 0).
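
The steps above could be sketched roughly as follows. To keep it runnable without the UIMA jars, this uses a hypothetical Mention class instead of a real annotation, and it assumes that "equal" means same begin, same end and same type name; your definition may well differ. With real annotations you would write a Comparator<AnnotationFS> along the same lines, reading getBegin(), getEnd() and getType().getName().

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class TreeSetDedupDemo {
    // Hypothetical stand-in for an annotation (NOT a UIMA class).
    static final class Mention {
        final String type;
        final int begin;
        final int end;

        Mention(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    // Assumed definition of equality: same begin, same end, same type.
    // Returning 0 means "duplicate"; any non-zero value only decides the order.
    static final Comparator<Mention> SAME_SPAN_AND_TYPE = (a, b) -> {
        int c = Integer.compare(a.begin, b.begin);
        if (c != 0) return c;
        c = Integer.compare(a.end, b.end);
        if (c != 0) return c;
        return a.type.compareTo(b.type);
    };

    // Drops everything the comparator considers a duplicate.
    static List<Mention> dedup(List<Mention> input) {
        Set<Mention> unique = new TreeSet<>(SAME_SPAN_AND_TYPE);
        unique.addAll(input); // elements comparing as 0 to an existing one are skipped
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<Mention> mentions = new ArrayList<>();
        mentions.add(new Mention("Person", 10, 15));
        mentions.add(new Mention("Person", 10, 15));   // duplicate by our definition
        mentions.add(new Mention("Location", 10, 15)); // same span, different type
        mentions.add(new Mention("Person", 0, 4));

        System.out.println(dedup(mentions).size()); // prints 3
    }
}
```

Note that the TreeSet keeps the first element it sees and ignores later ones that compare as 0, and that it never consults equals() or hashCode() for this decision.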
Cheers,
-- Richard