Having trouble with the Comparator.  If I compare Object, no issue:


If I compare Annotation, it doesn't recognize the method





                                                                       
                                                                       
                                                                       
Kameron Arthur Cole
Watson Content Analytics Applications and Support
email: [email protected] | Tel: 305-389-8512
                                                                       
                                                                       
                                                                       
                                                                       
                                                                       






From:   Richard Eckart de Castilho <[email protected]>
To:     [email protected]
Date:   11/18/2014 02:34 AM
Subject:        Re: can't remove duplicate Annotations with Java Set Collection



On 17.11.2014, at 20:59, Kameron Cole <[email protected]> wrote:

> I am trying to get rid of duplicates in the FSIndex.  I thought a very
> clever way to do this would be to just push them into a Set Collection in
> Java, which does not allow duplicates. This is very (very) standard Java:
>
> ArrayList al = new ArrayList();
> // add elements to al, including duplicates
> HashSet hs = new HashSet();
> hs.addAll(al);
> al.clear();
> al.addAll(hs);

There is no universal definition of equality other than object identity,
and this is what Java defaults to unless equals() and hashCode() are
overridden.
Since each UIMA user might have a different opinion on what is equal, UIMA
defers this decision to its indexing mechanism instead of hard-coding it
into the equals()/hashCode() methods.
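
To illustrate why the HashSet approach keeps the "duplicates", here is a
minimal sketch. Span is a hypothetical stand-in for an annotation class (it is
not a UIMA type); like the classes discussed above, it does not override
equals() or hashCode(), so the set falls back to object identity.

```java
import java.util.HashSet;
import java.util.Set;

public class IdentityEqualityDemo {
    // Hypothetical stand-in for an annotation; deliberately does NOT
    // override equals()/hashCode(), so Object's identity semantics apply.
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }
    }

    // Returns how many elements survive when the spans are put into a HashSet.
    static int distinctCount(Span... spans) {
        Set<Span> set = new HashSet<>();
        for (Span s : spans) {
            set.add(s);
        }
        return set.size();
    }

    public static void main(String[] args) {
        Span a = new Span(0, 5);
        Span b = new Span(0, 5); // same offsets, but a different object

        System.out.println(distinctCount(a, b)); // 2: both objects are kept
        System.out.println(distinctCount(a, a)); // 1: same object added twice
    }
}
```

Two objects with identical offsets are both kept; only adding the very same
object twice collapses to one entry.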

I suggest you do the following:

- implement a Comparator<FeatureStructure> or Comparator<AnnotationFS>
according to your definition of equality

- create a TreeSet based on your comparator

- drop all your annotations into this TreeSet

- "duplicates" according to your definition are dropped. The rest is sorted
(or not) depending on what your comparator returns in a non-equality case
(return value != 0).
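
The steps above can be sketched roughly as follows. This is only an
illustration under assumptions: Span is a hypothetical stand-in so the example
runs without UIMA on the classpath, and "same type, same begin, same end" is
just one possible definition of a duplicate. With real annotations, the
comparator would be a Comparator<AnnotationFS> reading getBegin(), getEnd()
and getType().getName() instead of the fields used here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

public class TreeSetDedupSketch {
    // Hypothetical stand-in for an annotation (type name plus offsets).
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    // Assumed definition of equality: same begin, same end, same type.
    // compare() returns 0 only when all three match; otherwise the result
    // determines the sort order inside the TreeSet.
    static final Comparator<Span> BY_OFFSETS_AND_TYPE =
            Comparator.comparingInt((Span s) -> s.begin)
                    .thenComparingInt(s -> s.end)
                    .thenComparing((Span s) -> s.type);

    // Drops every element that compares as 0 to one already in the set.
    static List<Span> dedup(List<Span> input) {
        TreeSet<Span> unique = new TreeSet<>(BY_OFFSETS_AND_TYPE);
        unique.addAll(input);
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<Span> spans = Arrays.asList(
                new Span("Token", 0, 5),
                new Span("Token", 0, 5),    // duplicate under the comparator
                new Span("Token", 6, 9),
                new Span("Sentence", 0, 5)); // same offsets, different type

        System.out.println(dedup(spans).size()); // 3
    }
}
```

Note that a TreeSet relies entirely on the comparator and ignores equals(), so
anything the comparator does not look at (for example, other feature values)
plays no part in deciding what counts as a duplicate.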

Cheers,

-- Richard
