Joern, I am using Lucene inside the GeoEntityLinker impl, so do you think I
should move the entire GeoEntityLinker impl and all its classes to a new
module in the sandbox and leave only the entitylinker framework in opennlp
tools?
A second thought/option is to make the Lucene pom entries optional in the
opennlp-tools pom, so users will have to add Lucene to their own pom to run the
GeoEntityLinker, and the jars will not be included in the tools build.
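
For that second option, the entry in the opennlp-tools pom could look roughly
like the sketch below. This is only an illustration: the `lucene-core`
artifact is what I assume the index code needs, and `${lucene.version}` is a
hypothetical property that would have to be defined elsewhere in the pom.

```xml
<!-- Sketch only: Lucene marked optional so it is not pulled in transitively -->
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-core</artifactId>
  <!-- hypothetical property, not defined in the current pom -->
  <version>${lucene.version}</version>
  <optional>true</optional>
</dependency>
```

With `<optional>true</optional>`, projects that depend on opennlp-tools do not
inherit Lucene; anyone who wants the GeoEntityLinker declares the same
dependency in their own pom.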


On Tue, Nov 5, 2013 at 3:39 AM, Jörn Kottmann <[email protected]> wrote:

> On 11/03/2013 02:22 AM, Mark G wrote:
>
>> I finished with the Lucene indexing of the Gazetteers, just need to get
>> them
>> tied into the gaz lookups, which is fairly simple. Do you all think I
>> should disregard all the MySQL dependency and just have Lucene? The lucene
>> index files are only about 2.5 gigs total, so very manageable to
>> distribute
>> the files across a cluster. I could keep the MySQL classes as an option,
>> but at this point the Lucene based approach is really growing on me.
>> If I don't hear from anyone I am going to remove the MySQL implementation.
>>
>
> +1 I believe a Lucene based solution is easier to handle for most people,
> because it can
> be fully integrated via API (no need to install anything) and therefore
> hides most of
> the complexity.
>
> Please avoid adding a dependency for Lucene to the opennlp-tools project,
> I suggest that we
> add this code to the sandbox, or a new addon area. If people want to use a
> Lucene based dictionary
> they can depend on that module explicitly.
>
> Jörn
>
