[
https://issues.apache.org/jira/browse/OPENNLP-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060438#comment-13060438
]
Jörn Kottmann commented on OPENNLP-78:
--------------------------------------
Should we defer the issue, or come up with a few new features in the InSpan
generator? What do you think, James?
Changing the feature generation in a way that maintains backward
compatibility might be difficult; on the other hand, this is rarely used and
doesn't really work right now.
I think we should add features that take the context and other things into
account, like the following features:
    features.add(prefix + ":w=dic=" + tokens[index]);
    features.add(prefix + ":w=dic");
    features.add(prefix + ":w=dic" + "+wc:" +
        FeatureGeneratorUtil.tokenFeature(tokens[index]));
    if (index > 0) {
      features.add(prefix + ":w=dic" + "+po:" + preds[index - 1]);
      features.add(prefix + ":w=dic" + "+pw:" + tokens[index - 1]);
      features.add(prefix + ":w=dic" + "+pwc:" +
          FeatureGeneratorUtil.tokenFeature(tokens[index - 1]));
    }
    if (index + 1 < tokens.length) {
      features.add(prefix + ":w=dic" + "+nw:" + tokens[index + 1]);
      features.add(prefix + ":w=dic" + "+nwc:" +
          FeatureGeneratorUtil.tokenFeature(tokens[index + 1]));
    }
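
To make the proposal easier to try out, here is a rough, self-contained
sketch of the same features outside of OpenNLP. It is only an illustration
under some assumptions: the class name DictionaryContextFeatures, the
generate method and the tokenClass helper are made up for this example,
tokenClass is a much simplified stand-in for
FeatureGeneratorUtil.tokenFeature, and the dictionary is assumed to be a
plain set of single tokens rather than the spans the InSpan generator works
with.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not part of OpenNLP.
class DictionaryContextFeatures {

    // Simplified stand-in for FeatureGeneratorUtil.tokenFeature (assumption):
    // maps a token to a coarse class such as "lc" (lowercase) or "ic" (initial capital).
    static String tokenClass(String token) {
        if (token.isEmpty()) return "other";
        if (token.chars().allMatch(Character::isDigit)) return "num";
        if (token.chars().allMatch(Character::isLowerCase)) return "lc";
        if (token.chars().allMatch(Character::isUpperCase)) return "ac";
        if (Character.isUpperCase(token.charAt(0))) return "ic";
        return "other";
    }

    // Generates the proposed features for tokens[index], but only when the
    // token is found in the dictionary (assumed here to be a set of single tokens).
    static List<String> generate(String prefix, String[] tokens, String[] preds,
                                 int index, Set<String> dictionary) {
        List<String> features = new ArrayList<>();
        if (!dictionary.contains(tokens[index])) {
            return features;
        }

        String base = prefix + ":w=dic";
        features.add(base + "=" + tokens[index]);
        features.add(base);
        features.add(base + "+wc:" + tokenClass(tokens[index]));

        // Previous outcome, previous token and its class.
        if (index > 0) {
            features.add(base + "+po:" + preds[index - 1]);
            features.add(base + "+pw:" + tokens[index - 1]);
            features.add(base + "+pwc:" + tokenClass(tokens[index - 1]));
        }

        // Next token and its class.
        if (index + 1 < tokens.length) {
            features.add(base + "+nw:" + tokens[index + 1]);
            features.add(base + "+nwc:" + tokenClass(tokens[index + 1]));
        }
        return features;
    }
}
```

Calling generate("dict", tokens, preds, 1, dictionary) for the tokens
"Mr", "Smith", "said" with "Smith" in the dictionary would combine the
dictionary hit with its neighbours, so the model can learn from the
surrounding text and not only from the name itself.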
> NameFinder and Dictionary Integration
> -------------------------------------
>
> Key: OPENNLP-78
> URL: https://issues.apache.org/jira/browse/OPENNLP-78
> Project: OpenNLP
> Issue Type: New Feature
> Components: Name Finder
> Environment: Windows 7
> Reporter: James Kosin
> Assignee: James Kosin
> Fix For: tools-1.5.2-incubating
>
>
> Now that we have a NameFinder Dictionary and improved NameFinder tools, it
> would be nice to be able to integrate the dictionary and model to help
> improve the finding of names.
> This way, the name finder could be trained more on the surrounding text
> instead of attempting to memorize common names in the news that occur
> frequently.
> I've already got the name finder corpus and created the dictionaries with
> data from the US Census.
> I just need to implement some method to help train the model, or be able to
> use the dictionaries after model creation to help with the finding of names.