[jira] [Commented] (OPENNLP-725) TokenNameFinderTrainer CLI not loading resources

Joern Kottmann (JIRA) Wed, 22 Oct 2014 15:11:15 -0700

    [ 
https://issues.apache.org/jira/browse/OPENNLP-725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180631#comment-14180631
 ]


Joern Kottmann commented on OPENNLP-725:
----------------------------------------

Originally the serializers were always mapped by resource file extension. 
That design wasn't extensible because the mapping was static. To fix that we 
introduced the concept that the feature generator has to specify which 
serializers should be used on a per-resource basis. That helps to load the 
artifacts from the resource folder initially. After they are loaded, the 
SerializableArtifact interface is used to write them to the model. The model 
also records which serializer classes have to be used to load them again.
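
To make the difference concrete, here is a minimal sketch of the two lookup 
strategies. It is not the OpenNLP code: the Serializer class, the two maps and 
the lookup methods are hypothetical stand-ins. Only the file names and the 
.w2vclasses extension come from this issue.

```java
import java.util.HashMap;
import java.util.Map;

public class SerializerLookupSketch {

    // Hypothetical stand-in for an artifact serializer; only its identity matters here.
    static final class Serializer {
        final String name;
        Serializer(String name) { this.name = name; }
    }

    // Old approach: one static table keyed by file extension.
    static final Map<String, Serializer> BY_EXTENSION = new HashMap<>();

    // New approach: each feature generator declares a serializer per resource name.
    static final Map<String, Serializer> BY_RESOURCE = new HashMap<>();

    static {
        Serializer clusters = new Serializer("word-cluster-serializer");
        BY_EXTENSION.put("w2vclasses", clusters);
        // Assumed declaration for a generator configured with dict="word2vec-test.txt".
        BY_RESOURCE.put("word2vec-test.txt", clusters);
    }

    // Returns null when the extension is not in the static table.
    static Serializer lookupByExtension(String fileName) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1);
        return BY_EXTENSION.get(ext);
    }

    // Returns null when no generator declared this resource.
    static Serializer lookupByResource(String fileName) {
        return BY_RESOURCE.get(fileName);
    }

    public static void main(String[] args) {
        System.out.println(lookupByExtension("word2vec-test.txt") != null);
        System.out.println(lookupByExtension("word2vec-test.w2vclasses") != null);
        System.out.println(lookupByResource("word2vec-test.txt") != null);
    }
}
```

With the extension table, word2vec-test.txt finds nothing because "txt" is not 
a registered extension. With the per-resource table the file name itself is the 
key, so the extension no longer matters.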

The new way of handling the resources - by having them implement 
SerializableArtifact - is easier to use than the approach we used initially. It 
is probably a good idea to adapt all artifacts we have to that.

It looks like the resource for that dict is loaded by extension, so changing 
the file extension to .w2vclasses will probably make it work.
Anyway, +1 to update it to use the new logic which doesn't depend on the file 
ending. I think that is more natural to use anyway.
The new approach should also be able to output a clearer error message. The 
resource file is either missing from the resource directory or it is not valid. 
In both cases a proper error can be given to the user.

I think right now it just says it couldn't map a resource, which doesn't 
really point a user in the right direction to solve the problem.
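
As a rough illustration of the two error cases, a loader could check for the 
file first and validate its content second. Everything in this sketch is 
hypothetical: the class and method names, the message wording, and the assumed 
file format of one "token classId" pair per line.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ResourceErrorSketch {

    // Loads a cluster dictionary and reports "missing" and "not valid" separately.
    static Map<String, String> loadClusters(Path resourceDir, String name) throws IOException {
        Path file = resourceDir.resolve(name);
        if (!Files.isRegularFile(file)) {
            throw new IOException("Resource '" + name
                + "' is missing from resource directory " + resourceDir);
        }
        Map<String, String> clusters = new HashMap<>();
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i).trim();
            if (line.isEmpty()) {
                continue; // blank lines are ignored in this sketch
            }
            String[] parts = line.split("\\s+");
            if (parts.length != 2) {
                throw new IOException("Resource '" + name + "' is not valid: line "
                    + (i + 1) + " should contain a token and a class id");
            }
            clusters.put(parts[0], parts[1]);
        }
        return clusters;
    }

    // Returns a short status string, or the error message when loading fails.
    static String tryLoad(Path resourceDir, String name) {
        try {
            return "loaded " + loadClusters(resourceDir, name).size() + " entries";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    // Creates a temporary resource directory with one valid and one invalid file.
    static Path newDemoDir() {
        try {
            Path dir = Files.createTempDirectory("resources");
            Files.write(dir.resolve("good.txt"),
                "apple 12\nbanana 7\n".getBytes(StandardCharsets.UTF_8));
            Files.write(dir.resolve("bad.txt"),
                "apple\n".getBytes(StandardCharsets.UTF_8));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = newDemoDir();
        System.out.println(tryLoad(dir, "good.txt"));
        System.out.println(tryLoad(dir, "absent.txt"));
        System.out.println(tryLoad(dir, "bad.txt"));
    }
}
```

Either message names the resource and says what is wrong with it, which is 
more than "couldn't map a resource" tells the user.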

> TokenNameFinderTrainer CLI not loading resources
> ------------------------------------------------
>
>                 Key: OPENNLP-725
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-725
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Name Finder
>    Affects Versions: 1.6.0
>            Reporter: Rodrigo Agerri
>            Assignee: Rodrigo Agerri
>             Fix For: 1.6.0
>
>
> Passing an XML featuregen descriptor to the CLI TokenNameFinderTrainer with a 
> line such as 
> <w2vwordcluster dict="word2vec-test.txt" />
>  
> and with the -resource parameter properly set, the loadResources() method 
> does not get the right serializer to create the resource (line 130 of 
> TokenNameFinderTrainerTool class). It looks in the ArtifactSerializers map 
> created at the beginning of the method but does not find a value for the key 
> (which is the file extension of the lexicon?). 
> Proposed solution: get the appropriate serializer from the element class 
> (e.g. w2vwordcluster). 
> Any comments?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
