[
https://issues.apache.org/jira/browse/OPENNLP-579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Giaconia updated OPENNLP-579:
----------------------------------
Attachment: opennlp.geoentitylinker.countrycontext.txt
entitylinker.properties
Properties file and country context file for the GeoEntityLinker.
Gazetteers can be downloaded here:
NGA GeoNames:
http://earth-info.nga.mil/gns/html/geonames_20131101.zip
USGS:
http://geonames.usgs.gov/docs/stategaz/NationalFile_20131020.zip
Once these are downloaded, unzip them to a directory.
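If you would rather script the extraction than unzip by hand, the following is a minimal sketch that uses only the JDK's java.util.zip classes. The class name GazetteerUnzip and the paths in the usage note are illustrative assumptions, not part of the OpenNLP addons.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of OpenNLP: extracts a downloaded gazetteer archive.
public class GazetteerUnzip {

    /** Extracts every entry of zipFile into targetDir and returns the number of files written. */
    public static int unzip(Path zipFile, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        Path root = targetDir.toAbsolutePath().normalize();
        int count = 0;
        try (InputStream in = Files.newInputStream(zipFile);
             ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                Path out = root.resolve(entry.getName()).normalize();
                // Refuse entries that would be written outside the target directory.
                if (!out.startsWith(root)) {
                    throw new IOException("Entry escapes target dir: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    Files.copy(zip, out, StandardCopyOption.REPLACE_EXISTING);
                    count++;
                }
            }
        }
        return count;
    }
}
```

For example, GazetteerUnzip.unzip(Paths.get("NationalFile_20131020.zip"), Paths.get("gazetteers/usgs")) would extract the USGS archive into a directory of your choosing (both paths here are placeholders).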
Then use the GazateerIndexer class in the addons geoentitylinker package to
create the Lucene indexes.
Once they are complete (this takes about an hour in total), enter their paths
in the entitylinker.properties file, along with the path to the attached
opennlp.geoentitylinker.countrycontext.txt file.
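Before running the linker, it may help to confirm that every path entered in the properties file actually exists. The sketch below is one possible check using java.util.Properties. The class name LinkerPropertiesCheck is hypothetical, and the property keys are passed in by the caller because the real key names are defined in the attached entitylinker.properties file, not here.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of OpenNLP: sanity-checks path entries in a properties file.
public class LinkerPropertiesCheck {

    /**
     * Returns the keys from pathKeys whose value is absent, blank,
     * or does not point to an existing file or directory.
     */
    public static List<String> missingPaths(Path propertiesFile, List<String> pathKeys)
            throws IOException {
        Properties props = new Properties();
        try (Reader reader = Files.newBufferedReader(propertiesFile, StandardCharsets.UTF_8)) {
            props.load(reader);
        }
        List<String> missing = new ArrayList<>();
        for (String key : pathKeys) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()
                    || !Files.exists(Paths.get(value.trim()))) {
                missing.add(key);
            }
        }
        return missing;
    }
}
```

Note that java.util.Properties treats a backslash as an escape character, so Windows paths in a properties file need doubled backslashes (C:\\apache\\...) or forward slashes.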
Once this is complete, use the following code to run the GeoEntityLinker:
// Point to the directory that holds your entitylinker.properties file
String modelPath = "C:\\apache\\entitylinker\\";
EntityLinkerProperties properties = new EntityLinkerProperties(
    new File(modelPath + "entitylinker.properties"));
// Do NER with a location model to get some spans,
// then use the factory to get your linker
List<LinkedSpan> consolidatedLinkedData =
    EntityLinkerFactory.getLinker("location", properties)
        .find(document, sentenceSpans, allTokensInDoc, allnamesInDoc);
Better documentation to follow.
> Framework to dynamically link N-best matches from external data to named
> entities by type (EntityLinker framework)
> ------------------------------------------------------------------------------------------------------------------
>
> Key: OPENNLP-579
> URL: https://issues.apache.org/jira/browse/OPENNLP-579
> Project: OpenNLP
> Issue Type: Wish
> Components: Entity Linker
> Affects Versions: 1.6.0
> Environment: Any
> Reporter: Mark Giaconia
> Assignee: Joern Kottmann
> Priority: Minor
> Labels: features
> Fix For: 1.6.0
>
> Attachments: entitylinker.properties,
> opennlp.geoentitylinker.countrycontext.txt
>
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> A framework for integrating/linking external data to named entities. For
> instance, geocoding or georeferencing location entities to GeoNames gazetteers
> can be implemented as an EntityLinker. The ticket was initially created to
> solve the georeferencing problem specifically, but the framework should allow
> linkage of any external data to any entity type. Commercial applications that
> do this are expensive, and there are many free gazetteers one could use to
> create solutions with OpenNLP. The capability should provide a default
> implementation using MySQL or Postgres and the USGS/GeoNames gazetteers.
--
This message was sent by Atlassian JIRA
(v6.1#6144)