[ 
https://issues.apache.org/jira/browse/OPENNLP-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678818#comment-13678818
 ] 

Mark Giaconia edited comment on OPENNLP-579 at 6/8/13 7:09 PM:
---------------------------------------------------------------

Please take a close look at the EntityLinker framework (attached 
entitylinker_8Jun2013 file); it needs scrutiny. 
It consists of two packages and a properties file. 
Drop the folder into the tools project and debug the main method of the 
Example class, which has three example methods. The first example requires no 
dependencies, so you should be able to step through everything.

The other two examples require PostGIS and MySQL, with the USGS and Geonames 
gazetteers "installed" on each. The scripts to do that are in the entitylinker 
package, and you will need to put the correct password in the properties file.

Thoughts:
- The properties object should be passed all the way through to the 
implementing Linkable so the implementation can read whatever properties it 
needs (DB connections, etc.). I think this would be helpful.
- I think the framework would benefit from some base classes that implement 
some of the basics.
- The factory should pool objects, because there is a lot of unnecessary 
instantiation at this point (or the way the factories are called needs to be 
managed better). This becomes difficult when Span arrays can contain multiple 
types of spans.
- The Find method that uses the Document object is purely experimental, but 
let me know what you think.
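
To make the first three points concrete, here is a rough sketch of how the 
Properties pass-through, a shared base class, and per-type pooling might fit 
together. It is not taken from the attached code: Linker, BaseLinker, 
EchoLinker, LinkerFactory, and the "linker.db.url" key are all hypothetical 
names used only for illustration, and it assumes Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.function.Function;

// Hypothetical stand-in for the framework's Linkable; not the real interface.
interface Linker {
    String link(String entityText);
}

// Hypothetical base class: keeps the Properties so that subclasses can read
// whatever settings they need (DB URLs, credentials, and so on).
abstract class BaseLinker implements Linker {
    protected final Properties props;

    protected BaseLinker(Properties props) {
        this.props = props;
    }

    protected String setting(String key, String fallback) {
        return props.getProperty(key, fallback);
    }
}

// Toy implementation that only echoes its configuration; no database needed.
class EchoLinker extends BaseLinker {
    EchoLinker(Properties props) {
        super(props);
    }

    @Override
    public String link(String entityText) {
        return entityText + "@" + setting("linker.db.url", "no-db");
    }
}

// Caches one linker per span type instead of instantiating on every call.
class LinkerFactory {
    private final Properties props;
    private final Map<String, Function<Properties, Linker>> creators = new HashMap<>();
    private final Map<String, Linker> cache = new HashMap<>();

    LinkerFactory(Properties props) {
        this.props = props;
    }

    void register(String spanType, Function<Properties, Linker> creator) {
        creators.put(spanType, creator);
    }

    Linker get(String spanType) {
        Function<Properties, Linker> creator = creators.get(spanType);
        if (creator == null) {
            throw new IllegalArgumentException("No linker for span type: " + spanType);
        }
        // Reuse the cached instance; create it with the shared Properties once.
        return cache.computeIfAbsent(spanType, t -> creator.apply(props));
    }
}

public class LinkerFactoryDemo {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("linker.db.url", "jdbc:example");

        LinkerFactory factory = new LinkerFactory(props);
        factory.register("location", EchoLinker::new);

        Linker first = factory.get("location");
        Linker second = factory.get("location");
        System.out.println(first == second);      // same pooled instance
        System.out.println(first.link("Paris"));  // uses the passed-in property
    }
}
```

With a Span array that mixes types, the caller would look up the linker by 
each span's type, so only one instance per type is ever created. Whether a 
simple per-type cache like this is enough, or a real object pool is needed, 
depends on whether the implementations are thread-safe.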


Thanks!
MG
                
> Framework to support Gazateer search in concert with NER for location 
> entities.
> -------------------------------------------------------------------------------
>
>                 Key: OPENNLP-579
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-579
>             Project: OpenNLP
>          Issue Type: Wish
>          Components: Name Finder
>    Affects Versions: 1.6.0
>         Environment: Any
>            Reporter: Mark Giaconia
>            Priority: Minor
>              Labels: features
>             Fix For: 1.6.0
>
>         Attachments: EntityLinker_30may2013.zip, entitylinker_8Jun2013.zip, 
> entitylinkerFramework.zip, geonamefinder.properties, geonamefind.zip
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> An interface for defining a Gazetteer and the methods to search it, an 
> extended Span object, and a Namefinder that encapsulates a TokenNameFinder 
> for locations. Commercial applications that do this are extremely expensive, 
> and there are many free gazetteers one could use to create a solution with 
> OpenNLP. The capability should provide a simple default implementation using 
> the most popular open source geospatial database, PostGIS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
