[
https://issues.apache.org/jira/browse/OPENNLP-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13682130#comment-13682130
]
Mark Giaconia commented on OPENNLP-579:
---------------------------------------
Not sure if you saw my post on the dev thread. I implemented the functionality
of an aggregated entity linker, though not as a separate interface, and the
factory is now thread safe. The BaseEntityLinker abstract class takes care of
aggregation by detecting when an input Span[] array contains multiple entity
types (via getType()). A user can also optionally pass in a String[] of entity
types to constrain linker creation to only those types (defined, like any other
linkers, in a properties file).
As a summary, here are the method signatures for the BaseEntityLinker abstract
class:

  // auto-detects whether the Span[] contains more than one type
  protected ArrayList<LinkedSpan<T>> getLinkedSpans(String[] tokens,
      Span[] spans, EntityLinkerProperties properties)

  // types are filtered with the first argument
  protected ArrayList<LinkedSpan<T>> getAggregatedLinkedSpans(
      String[] entitytypes, String[] tokens, Span[] spans,
      EntityLinkerProperties properties)

  protected Document<T> getLinkedSpans(Document<T> document,
      EntityLinkerProperties properties)

  public List<Document<T>> getLinkedSpans(List<Document<T>> documents,
      EntityLinkerProperties properties)

The class declaration looks like this:

  public abstract class BaseEntityLinker<T extends BaseLink> {...}
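
To illustrate the type detection and filtering described above, here is a rough sketch. It is not the code from the attached zip files: SimpleSpan is a hypothetical stand-in for opennlp.tools.util.Span (only getType() is assumed to behave the same), and SpanTypeSketch, distinctTypes and filterByTypes are made-up names for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for opennlp.tools.util.Span, reduced to what the sketch needs.
final class SimpleSpan {
    private final int start;
    private final int end;
    private final String type;

    SimpleSpan(int start, int end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }

    int getStart() { return start; }
    int getEnd() { return end; }
    String getType() { return type; }
}

// Hypothetical helper; the real BaseEntityLinker may do this differently.
final class SpanTypeSketch {

    // Collect the distinct entity types present in the spans, in encounter order.
    // More than one entry means the input mixes entity types.
    static Set<String> distinctTypes(SimpleSpan[] spans) {
        Set<String> types = new LinkedHashSet<>();
        for (SimpleSpan span : spans) {
            if (span.getType() != null) {
                types.add(span.getType());
            }
        }
        return types;
    }

    // Keep only the spans whose type appears in entityTypes,
    // mirroring the "types are filtered with the first argument" idea.
    static List<SimpleSpan> filterByTypes(String[] entityTypes, SimpleSpan[] spans) {
        Set<String> allowed = new LinkedHashSet<>(Arrays.asList(entityTypes));
        List<SimpleSpan> kept = new ArrayList<>();
        for (SimpleSpan span : spans) {
            if (allowed.contains(span.getType())) {
                kept.add(span);
            }
        }
        return kept;
    }
}
```

With spans typed "location", "person", "location", distinctTypes would report two types, and filterByTypes with {"location"} would keep the two location spans.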
So far, the basic steps to use the framework are:
1. Create an implementation of an EntityLinker.
   Optionally, use the Linkable and LinkableFactory framework inside your
   EntityLinker impl (recommended due to its configuration-driven extensibility).
2. Create a props file and add entries using the following format:
   linker.location=opennlp.tools.entitylinker.GeoEntityLinker
   linker.location.linkables=opennlp.tools.entitylinker.PostGISGeoGazImpl,opennlp.tools.entitylinker.MySQLUSGSGazLinkable,opennlp.tools.entitylinker.MySQLGeoNamesGazLinkable
3. Create a class that extends BaseEntityLinker.
4. Use the OpenNLP name finder et al. in your own class, and retrieve LinkedSpans
   via the class that extends BaseEntityLinker.
5. Do something awesome with the LinkedSpans.
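
For the props file format in step 2, a minimal sketch of how the entries could be read with java.util.Properties follows. This is an assumption about usage, not the framework's actual EntityLinkerProperties code; LinkerConfigSketch and its method names are hypothetical, and the classes named in the props file are not loaded or instantiated here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper showing one way to read the "linker.<type>" entries.
final class LinkerConfigSketch {

    // Parse props-file text into a Properties object.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Class name of the linker configured for an entity type, or null if absent.
    static String linkerClassFor(Properties props, String entityType) {
        return props.getProperty("linker." + entityType);
    }

    // Class names of the comma-separated linkables configured for an entity type.
    static List<String> linkableClassesFor(Properties props, String entityType) {
        List<String> names = new ArrayList<>();
        String raw = props.getProperty("linker." + entityType + ".linkables", "");
        for (String name : raw.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }
}
```

Keying both entries on the entity type ("location" here) is what would let a String[] of entity types select which linkers get created.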
I also refined the Document and Sentence objects to make them more useful for
the georeferencing impl I am working on.
Thanks!
> Framework to dynamically link N-best matches from external data to named
> entities by type (EntityLinker framework)
> ------------------------------------------------------------------------------------------------------------------
>
> Key: OPENNLP-579
> URL: https://issues.apache.org/jira/browse/OPENNLP-579
> Project: OpenNLP
> Issue Type: Wish
> Components: Name Finder
> Affects Versions: 1.6.0
> Environment: Any
> Reporter: Mark Giaconia
> Priority: Minor
> Labels: features
> Fix For: 1.6.0
>
> Attachments: EntityLinker_30may2013.zip, entitylinker_8Jun2013.zip,
> entitylinker_9Jun2013.zip, entitylinkerFramework.zip,
> geonamefinder.properties, geonamefind.zip
>
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> A framework for integrating/linking external data to named entities. For
> instance, geocoding or georeferencing location entities to geonames gazetteers
> can be implemented as an EntityLinker. Initially created ticket to
> specifically solve the georeferencing problem, but the framework should allow
> linkage of any external data to any entity type. Commercial applications that
> do this are expensive, and there are many free gazetteers one could use to
> create solutions with OpenNLP. The capability should provide a default
> implementation using MySQL or Postgres and the USGS/Geonames Gazetteers.