[ https://issues.apache.org/jira/browse/TIKA-1106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris A. Mattmann updated TIKA-1106:
------------------------------------
Fix Version/s: 1.14 (was: 1.13)
> CLAVIN Integration
> ------------------
>
> Key: TIKA-1106
> URL: https://issues.apache.org/jira/browse/TIKA-1106
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Affects Versions: 1.3
> Environment: All
> Reporter: Adam Estrada
> Assignee: Chris A. Mattmann
> Priority: Minor
> Labels: entity, geospatial, new-parser
> Fix For: 1.14
>
>
> I've been evaluating CLAVIN as a way to extract location information from
> unstructured text. It seems like meshing it with Tika in some way would make
> a lot of sense. From the CLAVIN website:
> {quote}
> CLAVIN (*Cartographic Location And Vicinity INdexer*) is an open source
> software package for document geotagging and geoparsing that employs
> context-based geographic entity resolution. It combines a variety of open
> source tools with natural language processing techniques to extract location
> names from unstructured text documents and resolve them against gazetteer
> records. Importantly, CLAVIN does not simply "look up" location names;
> rather, it uses intelligent heuristics in an attempt to identify precisely
> which "Springfield" (for example) was intended by the author, based on the
> context of the document. CLAVIN also employs fuzzy search to handle
> incorrectly-spelled location names, and it recognizes alternative names
> (e.g., "Ivory Coast" and "Côte d'Ivoire") as referring to the same geographic
> entity. By enriching text documents with structured geo data, CLAVIN enables
> hierarchical geospatial search and advanced geospatial analytics on
> unstructured data.
> {quote}
> There was only one other mention of the word "clavin" on the ASF JIRA
> site, so I thought it was definitely worth posting here.
> https://github.com/Berico-Technologies/CLAVIN
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)