[ 
https://issues.apache.org/jira/browse/UIMA-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905646#action_12905646
 ] 

Marshall Schor commented on UIMA-1861:
--------------------------------------

Thanks for the fixes/patch.  Here are a few suggested changes, to take 
advantage of JCas better. (I've attached this version as patch2 above)

1) Although this annotator is set up as a JCas annotator, it is missing the 
JCas type for TokenAnnotation.  Because of this, it goes to some lengths to not 
make use of this type where it could be useful.  To add the JCas cover types 
for this is easy: open the desc/SnowballAnnotator.xml descriptor in the 
Component Descriptor editor in Eclipse, click the typesystem page, and push the 
JCasGen button.  This will generate the missing classes for the types and add 
them to the project.

If the TokenAnnotation JCas type were available, the lines:

(original)
      // iterate over all token annotations and add stem if available
      FSIterator tokenIterator = aJCas.getCas().getAnnotationIndex(this.tokenAnnotation).iterator();

(with patch)

      // iterate over all token annotations and add stem if available
      FSIterator tokenIterator = aJCas.getAnnotationIndex(this.tokenAnnotation).iterator();
      // note: causes a generics-related warning, which then needs a suppress-warnings annotation

could be written
      // iterate over all token annotations and add stem if available
      FSIterator<TokenAnnotation> tokenIterator =
          (FSIterator<TokenAnnotation>)(FSIterator<?>)  // very ugly "double-fisted cast"
          aJCas.getAnnotationIndex(TokenAnnotation.type).iterator();
 
and the code in the bottom method (typeSystemInit) would not be needed. The 
"double-fisted cast" is described here:
http://markmail.org/message/w5kpympalj6tvqq3
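
For readers unfamiliar with why the intermediate wildcard cast is needed, here is a minimal sketch in plain Java. It does not use UIMA at all: Annotation, TokenAnnotation, annotationIterator() and placeholderStem() are hypothetical stand-ins invented for illustration, and java.util.Iterator plays the role of FSIterator. The placeholder stem function is not the Snowball stemmer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class DoubleCastSketch {
    // Hypothetical stand-in for the base annotation type.
    static class Annotation {
        final String coveredText;
        Annotation(String coveredText) { this.coveredText = coveredText; }
    }

    // Hypothetical stand-in for a generated cover class with a typed setter.
    static class TokenAnnotation extends Annotation {
        private String stem;
        TokenAnnotation(String coveredText) { super(coveredText); }
        void setStem(String stem) { this.stem = stem; }
        String getStem() { return stem; }
    }

    // Stand-in for an index whose iterator is only typed to the base class.
    static Iterator<Annotation> annotationIterator() {
        List<Annotation> index = new ArrayList<>();
        index.add(new TokenAnnotation("running"));
        index.add(new TokenAnnotation("jumps"));
        return index.iterator();
    }

    // Placeholder only: strips one trailing "s". Not a real stemmer.
    static String placeholderStem(String text) {
        return text.endsWith("s") ? text.substring(0, text.length() - 1) : text;
    }

    @SuppressWarnings("unchecked")
    static List<String> stemAll() {
        // A direct cast is rejected by the compiler (inconvertible types):
        //   (Iterator<TokenAnnotation>) annotationIterator()
        // Casting through the wildcard type first is accepted, at the cost
        // of an unchecked warning.
        Iterator<TokenAnnotation> tokens =
            (Iterator<TokenAnnotation>) (Iterator<?>) annotationIterator();

        List<String> stems = new ArrayList<>();
        while (tokens.hasNext()) {
            TokenAnnotation token = tokens.next(); // no per-element cast needed
            token.setStem(placeholderStem(token.coveredText));
            stems.add(token.getStem());
        }
        return stems;
    }

    public static void main(String[] args) {
        System.out.println(stemAll());
    }
}
```

The cast is unchecked, so it is only safe when every element the iterator returns really is a TokenAnnotation; otherwise a ClassCastException shows up later, at the point where next() is assigned.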

Alternatively, to avoid the double cast, the FSIterator could be over the type 
Annotation, and an explicit cast of the next() could be done to TokenAnnotation:
    // iterate over all token annotations and add stem if available
      FSIterator<Annotation> tokenIterator = aJCas.getAnnotationIndex(TokenAnnotation.type).iterator();
      ...
      TokenAnnotation annot = (TokenAnnotation) tokenIterator.next();
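
The per-element cast variant can be sketched the same way in plain Java. Again, nothing here is UIMA code: Annotation and TokenAnnotation are hypothetical stand-ins, and java.util.Iterator stands in for FSIterator. The point is only that the iterator keeps its base-type parameter, so there is no unchecked warning, and each element is checked at the cast.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ElementCastSketch {
    // Hypothetical stand-in for the base annotation type.
    static class Annotation {
        final String coveredText;
        Annotation(String coveredText) { this.coveredText = coveredText; }
    }

    // Hypothetical stand-in for the generated token cover class.
    static class TokenAnnotation extends Annotation {
        TokenAnnotation(String coveredText) { super(coveredText); }
    }

    // Iterates over the base type and casts each element explicitly.
    static List<String> tokenTexts(Iterator<Annotation> annotations) {
        List<String> texts = new ArrayList<>();
        while (annotations.hasNext()) {
            // Checked cast: throws ClassCastException right here if the
            // element is not a TokenAnnotation.
            TokenAnnotation token = (TokenAnnotation) annotations.next();
            texts.add(token.coveredText);
        }
        return texts;
    }

    public static void main(String[] args) {
        List<Annotation> index = new ArrayList<>();
        index.add(new TokenAnnotation("running"));
        index.add(new TokenAnnotation("jumps"));
        System.out.println(tokenTexts(index.iterator()));
    }
}
```

Compared with the double cast, a wrong element type fails at the cast itself rather than at some later assignment, which tends to make the failure easier to locate.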


The line further on down which reads

        // get stemmer result and set annotation feature
        annot.setStringValue(this.tokenAnnotationStemmFeature, stemmer.getCurrent());

would be better written (using JCas style) as:

        // get stemmer result and set annotation feature
        annot.setStem(stemmer.getCurrent());

If the JCas style is used, the typeSystemInit method can be deleted, along with 
all the constants added to support it, because the values it computes are no 
longer used.  In any case, it should not be called from the process method: the 
UIMA framework calls it directly, but only when the type system changes.

> SnowballAnnotator needs refactoring
> -----------------------------------
>
>                 Key: UIMA-1861
>                 URL: https://issues.apache.org/jira/browse/UIMA-1861
>             Project: UIMA
>          Issue Type: Bug
>          Components: Sandbox-SnowballAnnotator
>    Affects Versions: 2.3.1
>            Reporter: Tommaso Teofili
>            Assignee: Tommaso Teofili
>             Fix For: 2.3.1
>
>         Attachments: SnowballAnnotatorPatch2.txt, UIMA1861-patch.txt
>
>
> SnowballAnnotator extends the deprecated JTextAnnotator_ImplBase, has 
> some unused imports, and generics should be enabled.
> Moreover the initialize() method fails due to the AnnotatorContext object 
> being null when run in a 2.3.1-SNAPSHOT distribution.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
