[ 
https://issues.apache.org/jira/browse/OPENNLP-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13121013#comment-13121013
 ] 

Riccardo Tasso commented on OPENNLP-309:
----------------------------------------

As I said on the mailing list, the major problem with introducing a new 
POSDictionary implementation is its serialization. My solution was to provide 
a public method that lets the user set a POSDictionary (or an extension of it) 
at any time. In my case I train the POS model without any dictionary and 
serialize it. In a second phase I load the model and attach my own 
POSDictionary. It doesn't need to be serialized because it provides its 
services by querying an external database.
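
To make the idea concrete, here is a rough sketch of the two-phase approach. 
It is not the actual patch and does not use the real OpenNLP classes: the 
TagDictionary interface below is a local stand-in, and ExternalTagDictionary, 
SketchTagger, setTagDictionary and allowedTags are hypothetical names chosen 
only for illustration. A Map plays the role of the external database.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Local stand-in for a tag dictionary interface (assumption, not the OpenNLP class).
interface TagDictionary {
    // Returns the tags allowed for the word, or null if the word is unknown.
    String[] getTags(String word);
}

// Hypothetical dictionary that answers lookups from an external store.
// It holds no state worth serializing; the Map stands in for a database.
class ExternalTagDictionary implements TagDictionary {
    private final Map<String, String[]> store;

    ExternalTagDictionary(Map<String, String[]> store) {
        this.store = store;
    }

    @Override
    public String[] getTags(String word) {
        return store.get(word);
    }
}

// Hypothetical tagger wrapper. It starts without a dictionary, as a model
// trained and serialized without one would after loading.
class SketchTagger {
    private TagDictionary dictionary; // null until the user attaches one

    // The proposed public setter: attach a dictionary at any time.
    public void setTagDictionary(TagDictionary dictionary) {
        this.dictionary = dictionary;
    }

    // Candidate tags for a word: restricted by the dictionary when one is
    // attached and knows the word, otherwise the full tag set.
    public String[] allowedTags(String word, String[] allTags) {
        if (dictionary == null) {
            return allTags;
        }
        String[] tags = dictionary.getTags(word);
        return tags != null ? tags : allTags;
    }
}

class AttachDictionaryDemo {
    public static void main(String[] args) {
        String[] allTags = {"DT", "NN", "VB"};

        // Phase 1: the tagger is loaded without any dictionary.
        SketchTagger tagger = new SketchTagger();
        System.out.println(Arrays.toString(tagger.allowedTags("the", allTags)));

        // Phase 2: attach a dictionary backed by the external store.
        Map<String, String[]> store = new HashMap<>();
        store.put("the", new String[] {"DT"});
        tagger.setTagDictionary(new ExternalTagDictionary(store));
        System.out.println(Arrays.toString(tagger.allowedTags("the", allTags)));
    }
}
```

The point of the design is that only the model goes through serialization; 
the dictionary is supplied after loading, so it can be any implementation.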

I am happy to provide a patch with my code, if you agree with this 
kind of solution.

Riccardo
                
> facilitating the specialization of POSDictionary
> ------------------------------------------------
>
>                 Key: OPENNLP-309
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-309
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: POS Tagger
>            Reporter: Riccardo Tasso
>            Priority: Minor
>   Original Estimate: 6h
>  Remaining Estimate: 6h
>
> The train method in POSTaggerME takes a POSDictionary as input. This makes 
> the implementation of custom dictionaries painful.
> I suggest replacing the POSDictionary parameter with a TagDictionary.
> Another improvement would be to declare the POSDictionary fields as 
> protected, to make the class easier to extend.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
