[ 
https://issues.apache.org/jira/browse/OPENNLP-331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133337#comment-13133337
 ] 

Joern Kottmann commented on OPENNLP-331:
----------------------------------------

The POS Tagger can output either the single best tag sequence or an array of 
the k-best sequences. The Parser always requests the k-best sequences from the 
POS Tagger and may then pick one that is not the tagger's top choice. I believe 
that is what happened in your case above.
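
To make the effect concrete, here is a toy sketch in plain Java. It is not 
OpenNLP code: the class, the method names, and the scores are all invented for 
illustration, and the real parser's scoring is more involved. It only shows how 
a consumer that rescores k-best candidates can end up with a different tag 
sequence than the tagger's own first choice, using the tags for "focus zoom 
lens" from the report.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: class and method names are hypothetical, not OpenNLP API,
// and the scores are made-up log-probabilities.
class KBestRescoring {

    static final class Candidate {
        final String[] tags;
        final double taggerScore;     // score from the tagger alone
        final double structureScore;  // score a parser would add for the tree built on top

        Candidate(double taggerScore, double structureScore, String... tags) {
            this.taggerScore = taggerScore;
            this.structureScore = structureScore;
            this.tags = tags;
        }
    }

    // What a standalone tagger reports: the candidate with the highest tagger score.
    static Candidate taggerBest(List<Candidate> kBest) {
        Candidate best = kBest.get(0);
        for (Candidate c : kBest) {
            if (c.taggerScore > best.taggerScore) {
                best = c;
            }
        }
        return best;
    }

    // What a parser ends up with: the candidate with the highest combined score.
    static Candidate parserChoice(List<Candidate> kBest) {
        Candidate best = kBest.get(0);
        for (Candidate c : kBest) {
            double combined = c.taggerScore + c.structureScore;
            if (combined > best.taggerScore + best.structureScore) {
                best = c;
            }
        }
        return best;
    }

    // Two invented candidates for the tokens "focus zoom lens".
    static List<Candidate> example() {
        return Arrays.asList(
            new Candidate(-1.0, -2.0, "NN", "VBN", "NN"),
            new Candidate(-1.2, -0.5, "NN", "NN", "NN"));
    }

    public static void main(String[] args) {
        List<Candidate> kBest = example();
        System.out.println("tagger alone: " + Arrays.toString(taggerBest(kBest).tags));
        System.out.println("parser:       " + Arrays.toString(parserChoice(kBest).tags));
    }
}
```

With these invented numbers the tagger on its own prefers the sequence 
containing VBN, while the combined score favours the all-NN sequence, which is 
the kind of disagreement described in the report.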

When you parse, you should always use the returned Parse object to retrieve 
the chunks and POS tags (instead of running the chunker and tagger yourself).
                
> disagreement between POS of parser and POStagger
> ------------------------------------------------
>
>                 Key: OPENNLP-331
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-331
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Similarity
>            Reporter: Boris Galitsky
>            Assignee: Boris Galitsky
>         Attachments: patch.OPENNLP-331.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> ParserTool.parseLine(sentence, parser, 1) gives:
> How can I get short focus zoom lens for digital camera
> type = S
>       type = WHADVP
>               type = WRB, word = How
>       type = SQ
>               type = MD, word = can
>               type = NP
>                       type = PRP, word = I
>               type = VP
>                       type = VB, word = get
>                       type = NP
>                               type = JJ, word = short
>                               type = NN, word = focus
>                               type = NN, word = zoom   // ZOOM is NOUN: correct
>                               type = NN, word = lens
>                       type = PP
>                               type = IN, word = for
>                               type = NP
>                                       type = JJ, word = digital
>                                       type = NN, word = camera
> BUT
> new POSTaggerME(model).tag(toks);
>  gives
> [WRB, MD, PRP, VB, JJ, NN, VBN, NN, IN, JJ, NN]
>                            ***
>                            VBN is a problem!
> zoom is NOT VBN - Verb, past participle


        
