Re: OPENNLP-17 Custom feature gen config

Jörn Kottmann Fri, 06 May 2011 07:25:12 -0700

Jason commented on the jira issue itself.

If there is no objection I will now go ahead and finish the work
and check it in.


I believe this feature is very important, because without it it is
very difficult to change the feature generation, or to maintain
different language-dependent feature generation setups.

Jörn

On 5/5/11 2:32 PM, Jörn Kottmann wrote:

Hi all,

https://issues.apache.org/jira/browse/OPENNLP-17

This issue has been open and under discussion for quite some time,
and I would like to finally reach some consensus here.

The issue proposes several ways to define a file which can be
turned into a set of feature generator objects.

These are all solutions which I discussed at length
with Tom Morton during the SourceForge days.

I think the JavaScript solution should not be implemented because
of the security problem it brings: having JavaScript inside the models
makes it easy for someone to do something malicious on your machine,
e.g. delete files. To prevent this we would need some kind of sandboxing,
which brings more complexity and new issues.

The main disadvantage of dependency injection (e.g. Spring) in my eyes
is that it adds a new dependency to the project, which is not nice, since
OpenNLP is a library which should come without dependencies. For example,
in a bigger project it is very annoying when every library brings different
dependencies in. And using OpenNLP should of course always be a positive
experience. Another disadvantage I see here is that the XML needed to describe
the feature generation is rather long compared to a custom XML-based DSL.

The solution I am +1 for is a custom XML format which defines how the
feature generators are put together. The big advantage I see here over
dependency injection is that the XML looks nicer and is also shorter, which
makes it easier to discuss different feature generation setups, e.g. on the
mailing list.

This solution is more or less already implemented, and I would also extend
our documentation to explain how it works. Another concern raised is that it
might need more maintenance than the dependency injection solution, but I do
not think that is true in the long run, since the DI library might also change
and we might need to update to its new APIs. Coding directly against the Java
class library usually produces very stable code, because the underlying APIs
never change and are very well tested.

If there are no objections from the other committers I would like to go ahead
with the custom XML solution.
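
As a rough illustration of the custom XML idea, a descriptor can be mapped to
generator objects using only the XML parser that ships with the JDK. The
following is a minimal sketch, not the implementation attached to the issue:
the FeatureGen interface, the generator classes, and the element and attribute
names (generators, window, token, prevLength, nextLength) are all hypothetical
and chosen only for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class FeatureGenDescriptorSketch {

    /** Hypothetical feature generator interface (an assumption, not the OpenNLP API). */
    interface FeatureGen {
        void createFeatures(List<String> features, String[] tokens, int index);
    }

    /** Emits the lower-cased token at the given index as a feature. */
    static class TokenGen implements FeatureGen {
        public void createFeatures(List<String> features, String[] tokens, int index) {
            features.add("w=" + tokens[index].toLowerCase());
        }
    }

    /** Applies a wrapped generator to the tokens in a window around the index. */
    static class WindowGen implements FeatureGen {
        private final FeatureGen inner;
        private final int prev, next;
        WindowGen(FeatureGen inner, int prev, int next) {
            this.inner = inner; this.prev = prev; this.next = next;
        }
        public void createFeatures(List<String> features, String[] tokens, int index) {
            for (int off = -prev; off <= next; off++) {
                int i = index + off;
                if (i < 0 || i >= tokens.length) continue; // outside the sentence
                List<String> local = new ArrayList<>();
                inner.createFeatures(local, tokens, i);
                // Prefix features of neighbouring tokens with their relative position.
                String prefix = off == 0 ? "" : (off < 0 ? "p" + (-off) : "n" + off);
                for (String f : local) features.add(prefix + f);
            }
        }
    }

    /** Runs all child generators in document order. */
    static class AggregatedGen implements FeatureGen {
        private final List<FeatureGen> children;
        AggregatedGen(List<FeatureGen> children) { this.children = children; }
        public void createFeatures(List<String> features, String[] tokens, int index) {
            for (FeatureGen g : children) g.createFeatures(features, tokens, index);
        }
    }

    /** Collects the direct child elements, skipping text and comment nodes. */
    static List<Element> childElements(Element e) {
        List<Element> result = new ArrayList<>();
        NodeList nodes = e.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (nodes.item(i) instanceof Element) result.add((Element) nodes.item(i));
        }
        return result;
    }

    /** Recursively turns one descriptor element into a generator object. */
    static FeatureGen build(Element e) {
        List<Element> kids = childElements(e);
        switch (e.getTagName()) {
            case "generators":
                List<FeatureGen> gens = new ArrayList<>();
                for (Element kid : kids) gens.add(build(kid));
                return new AggregatedGen(gens);
            case "window":
                if (kids.size() != 1) throw new IllegalArgumentException("window needs one child");
                return new WindowGen(build(kids.get(0)),
                        Integer.parseInt(e.getAttribute("prevLength")),
                        Integer.parseInt(e.getAttribute("nextLength")));
            case "token":
                return new TokenGen();
            default:
                throw new IllegalArgumentException("Unknown element: " + e.getTagName());
        }
    }

    /** Parses a descriptor string and generates the features for one token. */
    static List<String> featuresFor(String xml, String[] tokens, int index) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            List<String> features = new ArrayList<>();
            build(root).createFeatures(features, tokens, index);
            return features;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Invalid descriptor", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<generators><window prevLength=\"1\" nextLength=\"1\">"
                + "<token/></window></generators>";
        System.out.println(featuresFor(xml, new String[] {"Hello", "World", "again"}, 1));
        // prints [p1w=hello, w=world, n1w=again]
    }
}
```

The one-line descriptor in main is the kind of short snippet that could be
pasted into a mail to discuss a feature generation setup, and the sketch needs
nothing beyond the JDK.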

Jörn
