Re: [Kim-discussion] extracting Token.category feature of Object types annotations

Philip Alexiev @ Ontotext Tue, 28 Sep 2010 01:50:30 -0700

Hi Mehnaz,

KIM has an Entity Annotation Set that determines which annotation types are 
of interest to KIM.  The set is defined in nerc.properties by the property:


com.ontotext.kim.KIMConstants.IE_ANN_TYPES

If the property is omitted, the following default is used:

new HashSet<String>(Arrays
        .asList("Entity", "Organization", "Person", "Time", "Date",
                        "Percent", "Location", "Position", "Money", "Abstract",
                        "ContactInfo", "Object", "Event", "Brand", "GeneralTerm",
                        "KeyPhrase", "KeyPerson", "KeyOrganization", "KeyLocation"));


Token annotations are used to form the context around these. At the end of 
the pipeline, all annotations in the final document whose type is not in the 
set are removed.

That is why you will not see Tokens in the stored KIM document. To keep them, 
add Token to the list.
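
To illustrate the stripping step, here is a minimal sketch in plain Java. It 
does not use the KIM or GATE classes: SketchAnnotation and 
AnnotationTypeFilterSketch are hypothetical names made up for this example, 
and the real pipeline may implement the filtering differently. It only shows 
why Token disappears unless its type is in the set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an annotation: only a type name and offsets.
class SketchAnnotation {
    final String type;
    final long start;
    final long end;

    SketchAnnotation(String type, long start, long end) {
        this.type = type;
        this.start = start;
        this.end = end;
    }
}

class AnnotationTypeFilterSketch {
    // Keep only the annotations whose type is in the allowed set.
    static List<SketchAnnotation> keepTypes(List<SketchAnnotation> all,
                                            Set<String> allowed) {
        List<SketchAnnotation> kept = new ArrayList<SketchAnnotation>();
        for (SketchAnnotation a : all) {
            if (allowed.contains(a.type)) {
                kept.add(a);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<SketchAnnotation> doc = Arrays.asList(
                new SketchAnnotation("Token", 0, 10),
                new SketchAnnotation("Object", 0, 10),
                new SketchAnnotation("Token", 11, 13));

        // Shortened version of the default set: Token is not in it.
        Set<String> defaults =
                new HashSet<String>(Arrays.asList("Object", "Person"));
        System.out.println(keepTypes(doc, defaults).size());   // prints 1

        // Adding "Token" to the set keeps the Token annotations too.
        Set<String> withTokens = new HashSet<String>(defaults);
        withTokens.add("Token");
        System.out.println(keepTypes(doc, withTokens).size()); // prints 3
    }
}
```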

The POS category applies only to Tokens.  For a single-word annotation you 
can assign the POS of the underlying Token, but you should test how well this 
works in your case.  It is usually better to use Tokens as context in rules 
and keep only the meaningful annotations at the end.

This is how you can get the category of the underlying Token in the RHS of a 
rule:

Annotation someAnnotation = .... (initialization here) ;
AnnotationSet tokenSet = inputAS.get("Token",
        someAnnotation.getStartNode().getOffset(),
        someAnnotation.getEndNode().getOffset());

if (tokenSet != null && tokenSet.size() > 0) {
        Annotation token = tokenSet.iterator().next();
        // Feature values are stored as Object, so a cast is needed
        String category = (String) token.getFeatures().get("category");
        FeatureMap features = someAnnotation.getFeatures();  // returns a copy of the FeatureMap
        features.put("category", category);
        someAnnotation.setFeatures(features);
}
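
For the single-word caveat, here is a rough sketch in plain Java that can be 
run without GATE. SketchSpan and TokenCategorySketch are hypothetical names 
made up for this example, and the POS values are only illustrative. It 
assumes that "underlying" means the token lies entirely within the 
annotation offsets, and it copies the category only when exactly one such 
token exists, so multi-word annotations are left untouched.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an annotation: offsets plus a feature map.
class SketchSpan {
    final long start;
    final long end;
    final Map<String, Object> features = new HashMap<String, Object>();

    SketchSpan(long start, long end) {
        this.start = start;
        this.end = end;
    }
}

class TokenCategorySketch {
    // Tokens that lie entirely inside the range [start, end].
    static List<SketchSpan> tokensWithin(List<SketchSpan> tokens,
                                         long start, long end) {
        List<SketchSpan> inside = new ArrayList<SketchSpan>();
        for (SketchSpan t : tokens) {
            if (t.start >= start && t.end <= end) {
                inside.add(t);
            }
        }
        return inside;
    }

    // Copies "category" only when exactly one token underlies the
    // annotation and that token has a category. Returns true if copied.
    static boolean copyCategory(SketchSpan annotation,
                                List<SketchSpan> tokens) {
        List<SketchSpan> inside =
                tokensWithin(tokens, annotation.start, annotation.end);
        if (inside.size() != 1) {
            return false;
        }
        Object category = inside.get(0).features.get("category");
        if (category == null) {
            return false;
        }
        annotation.features.put("category", category);
        return true;
    }

    private static SketchSpan token(long start, long end, String pos) {
        SketchSpan t = new SketchSpan(start, end);
        t.features.put("category", pos);
        return t;
    }

    public static void main(String[] args) {
        // "laceration to right forearm" with illustrative POS values.
        List<SketchSpan> tokens = new ArrayList<SketchSpan>();
        tokens.add(token(0, 10, "NN"));   // laceration
        tokens.add(token(11, 13, "TO"));  // to
        tokens.add(token(14, 19, "JJ"));  // right
        tokens.add(token(20, 27, "NN"));  // forearm

        SketchSpan laceration = new SketchSpan(0, 10);
        System.out.println(copyCategory(laceration, tokens));      // prints true
        System.out.println(laceration.features.get("category"));   // prints NN

        SketchSpan rightForearm = new SketchSpan(14, 27);
        System.out.println(copyCategory(rightForearm, tokens));    // prints false
    }
}
```

Whether a multi-word annotation should instead take the category of its 
first or last token is a design choice that depends on your data.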


HTH
Philip Alexiev


On Sep 28, 2010, at 5:31 AM, Mehnaz Adnan wrote:

> Hi,
>  
> I am using KIM 2.3 RC1. I am extracting Object annotations using the KIM API. 
> The code is below. Is there a way I can also extract the POS category of the 
> annotation? I suppose I have to make some changes in JAPE, but I don't know 
> where. For example, I have the sentence
> "laceration to right forearm"
> 
>  
> for which two Object annotations are created, one for "laceration" and the 
> other for "forearm". I also want to extract the Token.category feature, 
> which is normally created by the POS tagger in the GATE pipeline.
> 
>  
>  
> apiSemAnn.execute(kdocFromText);
> 
> KIMAnnotationSet kimASet = kdocFromText.getAnnotations();
> Set typesSet = kimASet.getAllTypes();
> KIMAnnotationSet kimFilteredASet = null;
> 
> Iterator iterator = typesSet.iterator();
> while (iterator.hasNext()) {
>     Object key = iterator.next();
>     kimFilteredASet = kimASet.get(String.valueOf(key));
> }
> 
> List<KIMAnnotation> annotations = new ArrayList<KIMAnnotation>(kimFilteredASet);
> Collections.sort(annotations, new Comparator<KIMAnnotation>() {
>     public int compare(KIMAnnotation o1, KIMAnnotation o2) {
>         if (o1.getStartOffset() > o2.getEndOffset()) return 1;
>         else if (o1.getStartOffset() < o2.getEndOffset()) return -1;
>         else return 0;
>     }
> });
> 
> 
> 
> -- 
> Mehnaz Adnan 
> Ph.D. Candidate,
> Department of Computer Science-Tamaki
> University of Auckland
> Phone: 09-3737599 ext 83274
> email: [email protected]
> _______________________________________________
> Kim-discussion mailing list
> [email protected]
> http://ontotext.com/mailman/listinfo/kim-discussion

