[ 
https://issues.apache.org/jira/browse/OPENNLP-579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Giaconia updated OPENNLP-579:
----------------------------------

    Description: 
A framework for integrating/linking external data to named entities. For 
instance, geocoding or georeferencing location entities to GeoNames gazetteers 
can be implemented as an EntityLinker. The ticket was initially created 
specifically to solve the georeferencing/geolocating/geotagging problem, but 
the framework should allow linkage of any external data to any entity type. 
Commercial applications that do this are expensive, and there are many free 
gazetteers one could use to create solutions with OpenNLP. 
UPDATE: The current implementation of the GeoEntityLinker uses Lucene to store 
the gazetteers, and provides utilities for indexing them. The implementation 
returns latitude, longitude (and other gazetteer fields) for toponyms extracted 
with NER.
All extracted toponyms are scored in four ways: fuzzy string matching, binning 
by location, context modeling, and country-mention proximity. These scores 
provide a basis for deciding which gazetteer entries are worth keeping.

  was:
A framework for integrating/linking external data to named entities. For 
instance, geocoding or georeferencing location entities to GeoNames gazetteers 
can be implemented as an EntityLinker. The ticket was initially created 
specifically to solve the georeferencing/geolocating/geotagging problem, but 
the framework should allow linkage of any external data to any entity type. 
Commercial applications that do this are expensive, and there are many free 
gazetteers one could use to create solutions with OpenNLP. 
UPDATE: The current implementation uses Lucene to store the gazetteers, and 
provides utilities for indexing them. The implementation returns latitude, 
longitude (and other gazetteer fields) for toponyms extracted with NER.


> Framework to dynamically link N-best matches from external data to named 
> entities by type (EntityLinker framework)
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-579
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-579
>             Project: OpenNLP
>          Issue Type: Wish
>          Components: Entity Linker
>    Affects Versions: 1.6.0
>         Environment: Any
>            Reporter: Mark Giaconia
>            Assignee: Joern Kottmann
>            Priority: Minor
>              Labels: features
>             Fix For: 1.6.0
>
>         Attachments: entitylinker.properties, 
> opennlp.geoentitylinker.countrycontext.txt
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> A framework for integrating/linking external data to named entities. For 
> instance, geocoding or georeferencing location entities to GeoNames 
> gazetteers can be implemented as an EntityLinker. The ticket was initially 
> created specifically to solve the georeferencing/geolocating/geotagging 
> problem, but the framework should allow linkage of any external data to any 
> entity type. Commercial applications that do this are expensive, and there 
> are many free gazetteers one could use to create solutions with OpenNLP. 
> UPDATE: The current implementation of the GeoEntityLinker uses Lucene to 
> store the gazetteers, and provides utilities for indexing them. The 
> implementation returns latitude, longitude (and other gazetteer fields) for 
> toponyms extracted with NER.
> All extracted toponyms are scored in four ways: fuzzy string matching, 
> binning by location, context modeling, and country-mention proximity. These 
> scores provide a basis for deciding which gazetteer entries are worth keeping.
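
To illustrate the first of the four scores named in the description (fuzzy 
string matching), here is a minimal sketch of how an extracted toponym might be 
compared with a gazetteer name. It is an assumption for illustration only, not 
the GeoEntityLinker's actual code: the class name ToponymScoreSketch, the 
method names, and the choice of a length-normalized Levenshtein distance are 
all hypothetical, and the real implementation may score matches differently.

```java
import java.util.Locale;

// Hypothetical sketch: NOT part of OpenNLP or the GeoEntityLinker.
final class ToponymScoreSketch {

    // Levenshtein edit distance using two rolling rows of the DP table.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows so prev holds the latest row
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Similarity in [0, 1]: 1.0 for names equal after trimming and
    // lower-casing, lower values as more edits are needed.
    static double fuzzyScore(String toponym, String gazetteerName) {
        String a = toponym.trim().toLowerCase(Locale.ROOT);
        String b = gazetteerName.trim().toLowerCase(Locale.ROOT);
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0; // two empty names are trivially identical
        }
        return 1.0 - (double) editDistance(a, b) / maxLen;
    }

    public static void main(String[] args) {
        // Extracted toponym vs. a candidate gazetteer name.
        System.out.println(fuzzyScore("Pittsburg", "Pittsburgh"));
    }
}
```

A score like this would only be one input. Per the description, it is combined 
with location binning, context modeling, and country-mention proximity before 
deciding which gazetteer hits to keep.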



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
