Here is my opinion on CLAVIN and Stanbol. CLAVIN seems to be a small, single-purpose project, but you grasp its benefits immediately. Look at the demo for one minute, and you *want* to add that feature to your software. It might even motivate you to learn more about Lucene.
By contrast, Stanbol's benefits are not so clear. It is a much cleverer architecture, for sure, but there are too few demos of its added value. Maybe you could organize a webinar or record a YouTube video to help us understand the benefit of Stanbol over something simpler (like CLAVIN, for example).

On Mon, Jun 3, 2013 at 8:21 AM, Rupert Westenthaler <rupert.westentha...@gmail.com> wrote:
> Hi Christian Ledermann, all
>
> This would be great and definitely a major improvement for Apache
> Stanbol! Writing an EnhancementEngine based on this should be
> relatively simple.
>
> The only thing I would suggest is to not use the integrated OpenNLP
> NER based LocationExtractor implementation of CLAVIN. Stanbol provides
> much more option regarding NLP so having an own Stanbol specific
> implementation of the LocationExtractor interface [1] would allow to
> use also different NER implementation as well as custom build NER
> models for OpenNLP.
>
> Such an implementation would need to parses fise:TextAnnotations with
> dc:type dbpedia:Place from the enhancement metadata and returns them
> as "List<LocationOccurrence>".
>
> Christian would to have time to work on that? I would definitely help
> you with this
>
> best
> Rupert
>
> [1]
> https://github.com/Berico-Technologies/CLAVIN/blob/master/src/main/java/com/berico/clavin/extractor/LocationExtractor.java
>
> On Thu, May 30, 2013 at 10:23 AM, Olivier Rossel
> <olivier.ros...@gmail.com> wrote:
> > +1
> > the demo looks great!!!!
> >
> > On Thu, May 30, 2013 at 9:51 AM, Christian Ledermann <
> > christian.lederm...@gmail.com> wrote:
> >
> >> I just stumbled over this:
> >>
> >> "
> >> CLAVIN (Cartographic Location And Vicinity INdexer) is an
> >> award-winning open source
> >> (apache 2 license)
> >> software package for document geotagging and geoparsing that employs
> >> context-based geographic entity resolution.
> >>
> >> It extracts location names from unstructured text and resolves them
> >> against a gazetteer to produce data-rich geographic entities.
> >>
> >> CLAVIN does not simply "look up" location names – it uses intelligent
> >> heuristics to identify exactly which "Springfield" (for example) was
> >> intended by the author, based on the context of the document. CLAVIN
> >> also employs fuzzy search to handle incorrectly-spelled location
> >> names, and it recognizes alternative names (e.g., "Ivory Coast" and
> >> "Côte d'Ivoire") as referring to the same geographic entity.
> >>
> >> By enriching text documents with structured geo data, CLAVIN enables
> >> hierarchical geospatial search and advanced geospatial analytics on
> >> unstructured data.
> >> "
> >>
> >> http://clavin.bericotechnologies.com/
> >>
> >> Maybe this could be used as an enhancer for stanbol?
> >>
> >> --
> >> Best Regards,
> >>
> >> Christian Ledermann
> >>
> >> Nairobi - Kenya
> >> Mobile : +254 702978914
> >>
> >> <*)))>{
> >>
> >> If you save the living environment, the biodiversity that we have left,
> >> you will also automatically save the physical environment, too. But If
> >> you only save the physical environment, you will ultimately lose both.
> >>
> >> 1) Don’t drive species to extinction
> >>
> >> 2) Don’t destroy a habitat that species rely on.
> >>
> >> 3) Don’t change the climate in ways that will result in the above.
> >>
> >> }<(((*>
> >>
>
> --
> | Rupert Westenthaler rupert.westentha...@gmail.com
> | Bodenlehenstraße 11 ++43-699-11108907
> | A-5500 Bischofshofen
>
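
PS: To check that I understand Rupert's suggestion quoted above, here is a rough sketch of the filtering step he describes: keep only the text annotations whose dc:type is dbpedia:Place and hand them on as a list of location occurrences. This is only an illustration under my own assumptions. The TextAnnotation and LocationOccurrence classes below are hypothetical stand-ins that I made up for this sketch, not the real Stanbol or CLAVIN types, and a real engine would read the annotations from the enhancement metadata instead of taking a plain list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PlaceAnnotationSketch {

    // Full URI for the dbpedia:Place type named in Rupert's message.
    static final String DBPEDIA_PLACE = "http://dbpedia.org/ontology/Place";

    // Hypothetical stand-in for one fise:TextAnnotation (not the Stanbol API).
    static final class TextAnnotation {
        final String selectedText; // would come from fise:selected-text
        final int start;           // would come from fise:start
        final String dcType;       // would come from dc:type

        TextAnnotation(String selectedText, int start, String dcType) {
            this.selectedText = selectedText;
            this.start = start;
            this.dcType = dcType;
        }
    }

    // Hypothetical stand-in for CLAVIN's LocationOccurrence (not the real class).
    static final class LocationOccurrence {
        final String text;
        final int position;

        LocationOccurrence(String text, int position) {
            this.text = text;
            this.position = position;
        }
    }

    // Keep only annotations typed as places and convert them.
    static List<LocationOccurrence> toLocationOccurrences(List<TextAnnotation> annotations) {
        List<LocationOccurrence> result = new ArrayList<>();
        for (TextAnnotation annotation : annotations) {
            if (DBPEDIA_PLACE.equals(annotation.dcType)) {
                result.add(new LocationOccurrence(annotation.selectedText, annotation.start));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Homer moved to Springfield last year.";
        List<TextAnnotation> annotations = Arrays.asList(
            new TextAnnotation("Homer", text.indexOf("Homer"),
                "http://dbpedia.org/ontology/Person"),
            new TextAnnotation("Springfield", text.indexOf("Springfield"), DBPEDIA_PLACE));

        // Only the place annotation should survive the filter.
        for (LocationOccurrence occurrence : toLocationOccurrences(annotations)) {
            System.out.println(occurrence.text + " @ " + occurrence.position);
        }
    }
}
```

If that is roughly the idea, CLAVIN's resolver would then only need to disambiguate the names that Stanbol's own NER engines have already found.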