jonathan doklovic wrote:
Hi,
I have been looking at Constraints and Filters.
I understand how to use them to get an iterator that matches a certain
type, but I want to do the opposite....
I have annotations for 3 types: City, State, and Location (where
location contains a city and a state)
Now I want to create a filtered iterator that basically returns any city
annotations that are NOT already within a Location annotation.
Is there any way to do this?
Thanks,
- Jonathan
Jonathan,
first, let me make sure I understand what it is that you need. So for example,
for a sentence "the exhibition will visit New York, NY, and Paris, France" you
might have city annotations for "New York" and "Paris", a state annotation
for "NY", and a location annotation for "New York, NY". You would want to find
the city annotation for Paris, but not the one for New York.
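
To make the expected result concrete, here is a minimal plain-Java sketch of the example sentence. It does not use the UIMA API: Span, spanOf and orphanedCities are hypothetical stand-ins for annotations and the lookup, introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class OrphanedCityExample {
    /** Hypothetical stand-in for an annotation: a labelled [begin, end) span. */
    static final class Span {
        final String label;
        final int begin;
        final int end;

        Span(String label, int begin, int end) {
            this.label = label;
            this.begin = begin;
            this.end = end;
        }

        /** True if this span starts at or before other and ends at or after it. */
        boolean covers(Span other) {
            return begin <= other.begin && end >= other.end;
        }
    }

    /** Builds a span over the first occurrence of covered in text. */
    static Span spanOf(String text, String covered) {
        int begin = text.indexOf(covered);
        return new Span(covered, begin, begin + covered.length());
    }

    /** Returns the cities that no location covers (simple nested loop). */
    static List<Span> orphanedCities(List<Span> cities, List<Span> locations) {
        List<Span> result = new ArrayList<>();
        for (Span city : cities) {
            boolean covered = false;
            for (Span location : locations) {
                if (location.covers(city)) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                result.add(city);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "the exhibition will visit New York, NY, and Paris, France";
        List<Span> cities = List.of(spanOf(text, "New York"), spanOf(text, "Paris"));
        List<Span> locations = List.of(spanOf(text, "New York, NY"));
        for (Span city : orphanedCities(cities, locations)) {
            System.out.println(city.label); // prints "Paris"
        }
    }
}
```

The nested loop is quadratic in the worst case, which is fine for illustrating the intended result but is not meant as the efficient solution.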
If this is what you're trying to do, I don't know of an easy answer. The fastest
method would involve iterating over locations and cities in parallel, but that
gets really messy and there are a ton of boundary cases to consider. So here's
something that's a bit less efficient, but still OK performance-wise.
Unfortunately, it still involves some relatively advanced use of CAS iterators.
Please note: I just typed this in. It compiles, but has never run. If you
can't get it to work, I'll need a real example ;-) And if this is not the
problem you're trying to solve, also let us know. I'll stick the method here
in the text, and the complete file in an attachment.
HTH,
Thilo
public List<AnnotationFS> findOrphanedCities(CAS cas) {
  // Obtain type system info; replace with correct type names
  Type cityType = cas.getTypeSystem().getType("city");
  Type locationType = cas.getTypeSystem().getType("location");
  Feature beginFeat =
      cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_BEGIN);
  Feature endFeat =
      cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_END);
  // Create an empty scratch annotation used to position the location iterator
  AnnotationFS locationSearch = cas.createAnnotation(cityType, 0, 0);
  // Obtain city and location iterators
  FSIterator cityIterator = cas.getAnnotationIndex(cityType).iterator();
  FSIterator locationIterator =
      cas.getAnnotationIndex(locationType).iterator();
  // Result list
  List<AnnotationFS> list = new ArrayList<AnnotationFS>();
  // Iterate over all cities and collect those that are not covered by a location
  for (cityIterator.moveToFirst(); cityIterator.isValid();
      cityIterator.moveToNext()) {
    AnnotationFS city = (AnnotationFS) cityIterator.get();
    // Set the search annotation to the position of the current city
    locationSearch.setIntValue(beginFeat, city.getBegin());
    locationSearch.setIntValue(endFeat, city.getEnd());
    // Position the location iterator at the first location that is not less
    // than the search annotation in index order. Note: a covering location
    // that sorts before this position is not examined here.
    locationIterator.moveTo(locationSearch);
    // The city counts as covered only if the iterator is valid and the
    // location it points to spans the city
    boolean covered = false;
    if (locationIterator.isValid()) {
      AnnotationFS loc = (AnnotationFS) locationIterator.get();
      covered = (loc.getBegin() <= city.getBegin())
          && (loc.getEnd() >= city.getEnd());
    }
    // Keep only the cities that are NOT covered by a location
    if (!covered) {
      list.add(city);
    }
  }
  return list;
}
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.apache.uima.test;
import java.util.ArrayList;
import java.util.List;
import org.apache.uima.cas.CAS;
import org.apache.uima.cas.FSIterator;
import org.apache.uima.cas.Feature;
import org.apache.uima.cas.Type;
import org.apache.uima.cas.text.AnnotationFS;
/**
 * Finds city annotations that are not covered by a location annotation.
 */
public class CityFinder {
  public List<AnnotationFS> findOrphanedCities(CAS cas) {
    // Obtain type system info; replace with correct type names
    Type cityType = cas.getTypeSystem().getType("city");
    Type locationType = cas.getTypeSystem().getType("location");
    Feature beginFeat =
        cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_BEGIN);
    Feature endFeat =
        cas.getTypeSystem().getFeatureByFullName(CAS.FEATURE_FULL_NAME_END);
    // Create an empty scratch annotation used to position the location iterator
    AnnotationFS locationSearch = cas.createAnnotation(cityType, 0, 0);
    // Obtain city and location iterators
    FSIterator cityIterator = cas.getAnnotationIndex(cityType).iterator();
    FSIterator locationIterator =
        cas.getAnnotationIndex(locationType).iterator();
    // Result list
    List<AnnotationFS> list = new ArrayList<AnnotationFS>();
    // Iterate over all cities and collect those that are not covered by a location
    for (cityIterator.moveToFirst(); cityIterator.isValid();
        cityIterator.moveToNext()) {
      AnnotationFS city = (AnnotationFS) cityIterator.get();
      // Set the search annotation to the position of the current city
      locationSearch.setIntValue(beginFeat, city.getBegin());
      locationSearch.setIntValue(endFeat, city.getEnd());
      // Position the location iterator at the first location that is not less
      // than the search annotation in index order. Note: a covering location
      // that sorts before this position is not examined here.
      locationIterator.moveTo(locationSearch);
      // The city counts as covered only if the iterator is valid and the
      // location it points to spans the city
      boolean covered = false;
      if (locationIterator.isValid()) {
        AnnotationFS loc = (AnnotationFS) locationIterator.get();
        covered = (loc.getBegin() <= city.getBegin())
            && (loc.getEnd() >= city.getEnd());
      }
      // Keep only the cities that are NOT covered by a location
      if (!covered) {
        list.add(city);
      }
    }
    return list;
  }
}
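
The reply mentions that the fastest method would iterate over locations and cities in parallel. As a rough illustration of that idea only, here is a plain-Java sketch over {begin, end} pairs rather than real CAS iterators. It assumes both lists are already sorted by begin offset; ParallelSweepSketch and orphans are hypothetical names, and this is not the code from the attachment.

```java
import java.util.ArrayList;
import java.util.List;

public class ParallelSweepSketch {
    /**
     * Returns the cities not covered by any location.
     * Each element is a {begin, end} pair; both lists must be sorted by begin.
     */
    static List<int[]> orphans(List<int[]> cities, List<int[]> locations) {
        List<int[]> result = new ArrayList<>();
        int next = 0;    // index of the next location not yet consumed
        int maxEnd = -1; // furthest end among locations that start at or before the current city
        for (int[] city : cities) {
            // Consume every location that starts at or before this city
            while (next < locations.size() && locations.get(next)[0] <= city[0]) {
                maxEnd = Math.max(maxEnd, locations.get(next)[1]);
                next++;
            }
            // The city is covered only if one of those locations reaches its end
            if (maxEnd < city[1]) {
                result.add(city);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<int[]> cities = List.of(new int[] {26, 34}, new int[] {44, 49});
        List<int[]> locations = List.of(new int[] {26, 38});
        for (int[] city : orphans(cities, locations)) {
            System.out.println(city[0] + "-" + city[1]); // prints "44-49"
        }
    }
}
```

Because both cursors only move forward, each list is visited once. Tracking only the furthest end seen so far is what keeps the boundary cases manageable in this simplified model.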