This is what I needed, thanks!
- Jonathan
On Sat, 2008-01-05 at 08:53 +0100, Thilo Goetz wrote:
> jonathan doklovic wrote:
> > Hi,
> >
> > I have been looking at Constraints and Filters.
> > I understand how to use them to get an iterator that matches a certain
> > type, but I want to do the opposite....
> >
> > I have annotations for 3 types: City, State, and Location (where
> > location contains a city and a state)
> >
> > Now I want to create a filtered iterator that basically returns any city
> > annotations that are NOT already within a Location annotation.
> >
> > Is there any way to do this?
> >
> > Thanks,
> >
> > - Jonathan
>
> Jonathan,
>
> first, let me make sure I understand what it is that you need. So for
> example, for a sentence "the exhibition will visit New York, NY, and Paris,
> France" you might have city annotations for "New York" and "Paris", a state
> annotation for "NY", and a location annotation for "New York, NY". You would
> want to find the city annotation for Paris, but not the one for New York.
>
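To make the example sentence concrete, here is a small plain-Java sketch with no UIMA involved. It looks up each phrase's offsets in the sentence and reports the cities that no location spans. The class `CoverageExample` and its methods are made up for illustration, and it simply checks every city against every location.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from UIMA.
class CoverageExample {
    /** A text span; begin is inclusive, end is exclusive. */
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    /** Span of the first occurrence of phrase in text. */
    static Span spanOf(String text, String phrase) {
        int begin = text.indexOf(phrase);
        if (begin < 0) {
            throw new IllegalArgumentException("phrase not found: " + phrase);
        }
        return new Span(begin, begin + phrase.length());
    }

    /** True if outer starts at or before inner and ends at or after it. */
    static boolean covers(Span outer, Span inner) {
        return outer.begin <= inner.begin && outer.end >= inner.end;
    }

    /** Cities whose span is not covered by the span of any location. */
    static List<String> uncoveredCities(String text, List<String> cities,
                                        List<String> locations) {
        List<String> result = new ArrayList<>();
        for (String city : cities) {
            Span citySpan = spanOf(text, city);
            boolean covered = false;
            for (String location : locations) {
                if (covers(spanOf(text, location), citySpan)) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                result.add(city);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "the exhibition will visit New York, NY, and Paris, France";
        // prints [Paris]
        System.out.println(uncoveredCities(text,
                Arrays.asList("New York", "Paris"),
                Arrays.asList("New York, NY")));
    }
}
```

"New York" lies inside "New York, NY" and is dropped, while "Paris" has no enclosing location and is kept.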
> If this is what you're trying to do, I don't know of an easy answer. The
> fastest method would involve iterating over locations and cities in parallel,
> but that gets really messy and there are a ton of boundary cases to consider.
> So here's something that's a bit less efficient, but still ok
> performance-wise. Unfortunately, it still involves some relatively advanced
> use of CAS iterators.
>
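For comparison, the parallel iteration mentioned above can be sketched on plain sorted spans rather than CAS iterators. This is only an illustration under stated assumptions: both lists are sorted by begin offset, and the names `ParallelSweep`, `Span` and `orphanedCities` are hypothetical, not UIMA API. One index into the locations only ever moves forward, skipping locations that end before the current city starts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from UIMA.
class ParallelSweep {
    /** A text span; begin is inclusive, end is exclusive. */
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
        @Override public String toString() { return "[" + begin + "," + end + ")"; }
    }

    /**
     * Returns the cities not covered by any location.
     * Assumes both lists are sorted by ascending begin offset.
     */
    static List<Span> orphanedCities(List<Span> cities, List<Span> locations) {
        List<Span> orphans = new ArrayList<>();
        int first = 0; // first location that could still cover a city
        for (Span city : cities) {
            // Locations ending before this city begins cannot cover it,
            // nor any later city, so they are skipped for good.
            while (first < locations.size()
                    && locations.get(first).end < city.begin) {
                first++;
            }
            // Only locations that begin at or before the city can cover it.
            boolean covered = false;
            for (int k = first; k < locations.size()
                    && locations.get(k).begin <= city.begin; k++) {
                if (locations.get(k).end >= city.end) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                orphans.add(city);
            }
        }
        return orphans;
    }

    public static void main(String[] args) {
        List<Span> cities =
                Arrays.asList(new Span(0, 3), new Span(5, 8), new Span(12, 15));
        List<Span> locations = Arrays.asList(new Span(4, 9));
        // Only the middle city lies inside the location.
        System.out.println(orphanedCities(cities, locations));
    }
}
```

The inner scan is short when locations rarely overlap, so the whole pass stays close to linear in the number of annotations. The boundary cases show up in the two loop conditions, for example a location that ends exactly where a city begins.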
> Please note: I just typed this in. It compiles, but has never run. If you
> can't get it to work, I'll need a real example ;-) And if this is not the
> problem you're trying to solve, also let us know. I'll stick the method here
> in the text, and the complete file in an attachment.
>
> HTH,
> Thilo
>
> public List<AnnotationFS> findOrphanedCities(CAS cas) {
>   // Obtain type system info; replace with correct type names
>   Type cityType = cas.getTypeSystem().getType("city");
>   Type locationType = cas.getTypeSystem().getType("location");
>   Feature beginFeat =
>       cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_BEGIN);
>   Feature endFeat =
>       cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_END);
>   // Create an empty location annotation to position the iterator; it is
>   // never added to the indexes, so its offsets can be changed freely
>   AnnotationFS locationSearch = cas.createAnnotation(locationType, 0, 0);
>   // Obtain city and location iterators
>   FSIterator cityIterator = cas.getAnnotationIndex(cityType).iterator();
>   FSIterator locationIterator =
>       cas.getAnnotationIndex(locationType).iterator();
>   // Result list
>   List<AnnotationFS> list = new ArrayList<AnnotationFS>();
>   // Iterate over all cities and collect those that are not covered by a
>   // location
>   for (cityIterator.moveToFirst(); cityIterator.isValid();
>       cityIterator.moveToNext()) {
>     AnnotationFS city = (AnnotationFS) cityIterator.get();
>     // Set the search location to the position of the current city
>     locationSearch.setIntValue(beginFeat, city.getBegin());
>     locationSearch.setIntValue(endFeat, city.getEnd());
>     // moveTo() lands on the first location at or after the city's position.
>     // A covering location that starts earlier sorts before that point, so
>     // check the current location and then the one just before it. This
>     // assumes that locations do not overlap one another.
>     locationIterator.moveTo(locationSearch);
>     boolean covered = false;
>     if (locationIterator.isValid()) {
>       AnnotationFS loc = (AnnotationFS) locationIterator.get();
>       covered = (loc.getBegin() <= city.getBegin())
>           && (loc.getEnd() >= city.getEnd());
>       locationIterator.moveToPrevious();
>     } else {
>       // Past the last location: the last one is the only candidate left
>       locationIterator.moveToLast();
>     }
>     if (!covered && locationIterator.isValid()) {
>       AnnotationFS loc = (AnnotationFS) locationIterator.get();
>       covered = (loc.getBegin() <= city.getBegin())
>           && (loc.getEnd() >= city.getEnd());
>     }
>     // Keep only the cities that no location covers
>     if (!covered) {
>       list.add(city);
>     }
>   }
>   return list;
> }
>
> plain text document attachment (CityFinder.java)
> /*
> * Licensed to the Apache Software Foundation (ASF) under one
> * or more contributor license agreements. See the NOTICE file
> * distributed with this work for additional information
> * regarding copyright ownership. The ASF licenses this file
> * to you under the Apache License, Version 2.0 (the
> * "License"); you may not use this file except in compliance
> * with the License. You may obtain a copy of the License at
> *
> * http://www.apache.org/licenses/LICENSE-2.0
> *
> * Unless required by applicable law or agreed to in writing,
> * software distributed under the License is distributed on an
> * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
> * KIND, either express or implied. See the License for the
> * specific language governing permissions and limitations
> * under the License.
> */
>
>
> package org.apache.uima.test;
>
> import java.util.ArrayList;
> import java.util.List;
>
> import org.apache.uima.cas.CAS;
> import org.apache.uima.cas.FSIterator;
> import org.apache.uima.cas.Feature;
> import org.apache.uima.cas.Type;
> import org.apache.uima.cas.text.AnnotationFS;
>
> /**
> * Finds city annotations that are not covered by a location annotation.
> */
> public class CityFinder {
>
>   public List<AnnotationFS> findOrphanedCities(CAS cas) {
>     // Obtain type system info; replace with correct type names
>     Type cityType = cas.getTypeSystem().getType("city");
>     Type locationType = cas.getTypeSystem().getType("location");
>     Feature beginFeat =
>         cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_BEGIN);
>     Feature endFeat =
>         cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_END);
>     // Create an empty location annotation to position the iterator; it is
>     // never added to the indexes, so its offsets can be changed freely
>     AnnotationFS locationSearch = cas.createAnnotation(locationType, 0, 0);
>     // Obtain city and location iterators
>     FSIterator cityIterator = cas.getAnnotationIndex(cityType).iterator();
>     FSIterator locationIterator =
>         cas.getAnnotationIndex(locationType).iterator();
>     // Result list
>     List<AnnotationFS> list = new ArrayList<AnnotationFS>();
>     // Iterate over all cities and collect those that are not covered by a
>     // location
>     for (cityIterator.moveToFirst(); cityIterator.isValid();
>         cityIterator.moveToNext()) {
>       AnnotationFS city = (AnnotationFS) cityIterator.get();
>       // Set the search location to the position of the current city
>       locationSearch.setIntValue(beginFeat, city.getBegin());
>       locationSearch.setIntValue(endFeat, city.getEnd());
>       // moveTo() lands on the first location at or after the city's
>       // position. A covering location that starts earlier sorts before
>       // that point, so check the current location and then the one just
>       // before it. This assumes that locations do not overlap one another.
>       locationIterator.moveTo(locationSearch);
>       boolean covered = false;
>       if (locationIterator.isValid()) {
>         AnnotationFS loc = (AnnotationFS) locationIterator.get();
>         covered = (loc.getBegin() <= city.getBegin())
>             && (loc.getEnd() >= city.getEnd());
>         locationIterator.moveToPrevious();
>       } else {
>         // Past the last location: the last one is the only candidate left
>         locationIterator.moveToLast();
>       }
>       if (!covered && locationIterator.isValid()) {
>         AnnotationFS loc = (AnnotationFS) locationIterator.get();
>         covered = (loc.getBegin() <= city.getBegin())
>             && (loc.getEnd() >= city.getEnd());
>       }
>       // Keep only the cities that no location covers
>       if (!covered) {
>         list.add(city);
>       }
>     }
>     return list;
>   }
>
> }