[
https://issues.apache.org/jira/browse/OPENNLP-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Giaconia updated OPENNLP-756:
----------------------------------
Attachment: newCountryContextfile.txt
Just committed changes to this and other tickets. The new attached
countrycontext file should be used from now on.
Other improvements that I hope someone will validate:
- Improved regex handling in the scorers and the country context generator.
- Upgraded the Lucene dependency to 6.0.0.
- Fixed ProvinceProximityScorer and CountryProximityScorer.
- Fixed a bug in the number of rows returned.
- Added regex support to Country and Province in the countrycontext file, and
  added headers so it is easier to edit in spreadsheet tools such as Excel.
- Cleaned up some other code.
All indexes should be rebuilt, because the country context file produced by
the gazetteerIndexer class now has a new format.
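
For anyone validating the regex change, here is a rough sketch of the general
idea: each country entry carries a regex instead of a single literal name, and
every match in the text is recorded as a mention. This is not the committed
code. The class name CountryContextSketch, the Mention type, the findMentions
method, and the example patterns are all hypothetical, and the sketch does not
reproduce the real countrycontext file layout (see the attachment for that).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the OpenNLP addons code base.
class CountryContextSketch {

    /** One regex hit: the country code it belongs to and where it was found. */
    static final class Mention {
        final String countryCode;
        final String text;
        final int start;
        final int end;

        Mention(String countryCode, String text, int start, int end) {
            this.countryCode = countryCode;
            this.text = text;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Scans the document once per entry. Keys are country codes, values are
     * regexes (assumed here to come from a user-edited context file).
     */
    static List<Mention> findMentions(String doc, Map<String, String> countryRegexes) {
        List<Mention> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : countryRegexes.entrySet()) {
            Pattern pattern = Pattern.compile(entry.getValue(), Pattern.CASE_INSENSITIVE);
            Matcher matcher = pattern.matcher(doc);
            while (matcher.find()) {
                hits.add(new Mention(entry.getKey(), matcher.group(),
                        matcher.start(), matcher.end()));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Example patterns are made up for this sketch.
        Map<String, String> regexes = new LinkedHashMap<>();
        regexes.put("US", "\\b(?:United States(?: of America)?|USA)\\b");
        regexes.put("UK", "\\b(?:United Kingdom|Great Britain|UK)\\b");

        String doc = "She flew from the United States of America to Great Britain, "
                + "then back to the USA.";
        for (Mention m : findMentions(doc, regexes)) {
            System.out.println(m.countryCode + " [" + m.start + "," + m.end + ") " + m.text);
        }
    }
}
```

The point of the regex column is that a user can add alternate spellings or
abbreviations for a country without touching the code; the plain literal name
is just the simplest possible pattern.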
> GeoEntityLinker Admin Boundary context generator should allow regex for more
> flexibility and better discovery of location context
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Key: OPENNLP-756
> URL: https://issues.apache.org/jira/browse/OPENNLP-756
> Project: OpenNLP
> Issue Type: Improvement
> Components: Entity Linker
> Affects Versions: addons-1.6.0
> Environment: java 7
> Reporter: Mark Giaconia
> Assignee: Mark Giaconia
> Fix For: addons-1.6.0
>
> Attachments: newCountryContextFile.txt, newCountryContextfile.txt
>
>
> Currently, the way the AdminBoundaryContextGenerator discovers Country,
> Province, and County mentions is inflexible and misses a lot of mentions. The
> GeoEntityLinker should support regexes in the countrycontext file so that
> users can extend it with their own patterns and it finds more mentions. This
> change propagates to several other classes called within the GeoEntityLinker.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)