Jason commented on the Jira issue itself.
If there are no objections, I will now go ahead, finish the work,
and check it in.
I believe this feature is very important, because without it, it is
very difficult to change the feature generation or to support
different language-dependent feature generation setups.
Jörn
On 5/5/11 2:32 PM, Jörn Kottmann wrote:
Hi all,
https://issues.apache.org/jira/browse/OPENNLP-17
This issue has been around and under discussion for quite some time,
and I would like to finally try to reach some consensus here.
The issue proposes ways to solve the problem of defining a file
which can be turned into a bunch of feature generator objects.
These are all solutions which I discussed at length with Tom Morton
during the SourceForge days.
I think the JavaScript solution should not be implemented because of
the security problem it brings: having JavaScript inside the models
makes it easy for someone to do something malicious on your machine,
e.g. delete files. To prevent this we would need some kind of
sandboxing, which brings more complexity and new issues.
The main disadvantage of dependency injection (e.g. Spring), in my
eyes, is that it adds a new dependency to the project. That is not
nice, since OpenNLP is a library which should come without
dependencies. For example, in a bigger project it is very annoying
when every library brings in different dependencies, and using OpenNLP
should of course always be a positive experience.
Another disadvantage I see is that the XML needed to describe the
feature generation is rather long compared to a custom XML-based DSL.
The solution I am +1 for is a custom XML format which defines how the
feature generators are put together. The big advantage I see over
dependency injection is that the XML looks nicer and is also shorter,
which makes it easier to discuss different feature generation setups,
e.g. on the mailing list.
This solution is more or less already implemented, and I would also
extend our documentation to explain how it works. Another concern
raised is that it might need more maintenance than the dependency
injection solution, but I do not think that is really true in the long
run, since the DI library might also change and require updating to
new APIs. Coding only against the Java library usually produces code
which is very stable, because the underlying APIs never change and are
very well tested.
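
To make the custom XML idea concrete, here is a minimal sketch of how
such a descriptor could be turned into feature generator objects using
only the XML parser that ships with Java. It is not the code attached
to the issue: the element names (generators, window, token), the
attributes (prevLength, nextLength), the FeatureGenerator interface,
and the feature string format are all assumptions made up for this
illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class FeatureGeneratorSketch {

    /** Hypothetical interface for this sketch; not the actual OpenNLP API. */
    interface FeatureGenerator {
        void createFeatures(List<String> features, String[] tokens, int index);
    }

    /** Collects the direct child elements of a node, skipping text and comments. */
    static List<Element> childElements(Element parent) {
        List<Element> result = new ArrayList<>();
        NodeList nodes = parent.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node instanceof Element) {
                result.add((Element) node);
            }
        }
        return result;
    }

    /** Recursively maps one XML element (assumed names) to a generator object. */
    static FeatureGenerator build(Element element) {
        switch (element.getTagName()) {
            case "generators": {
                // Aggregate: runs every child generator in document order.
                List<FeatureGenerator> children = new ArrayList<>();
                for (Element child : childElements(element)) {
                    children.add(build(child));
                }
                return (features, tokens, index) -> {
                    for (FeatureGenerator child : children) {
                        child.createFeatures(features, tokens, index);
                    }
                };
            }
            case "token":
                // Emits the token at the current position as a feature.
                return (features, tokens, index) -> features.add("w=" + tokens[index]);
            case "window": {
                // Applies the wrapped generator to neighbouring positions too,
                // prefixing each feature with its offset from the current token.
                int prev = Integer.parseInt(element.getAttribute("prevLength"));
                int next = Integer.parseInt(element.getAttribute("nextLength"));
                FeatureGenerator inner = build(childElements(element).get(0));
                return (features, tokens, index) -> {
                    for (int offset = -prev; offset <= next; offset++) {
                        int position = index + offset;
                        if (position < 0 || position >= tokens.length) {
                            continue; // window reaches past the sentence boundary
                        }
                        List<String> local = new ArrayList<>();
                        inner.createFeatures(local, tokens, position);
                        for (String feature : local) {
                            features.add(offset + ":" + feature);
                        }
                    }
                };
            }
            default:
                throw new IllegalArgumentException("Unknown element: " + element.getTagName());
        }
    }

    /** Parses a descriptor string with the JDK's built-in DOM parser. */
    static FeatureGenerator parse(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
        return build(root);
    }

    public static void main(String[] args) throws Exception {
        String xml = "<generators>"
                + "<window prevLength=\"1\" nextLength=\"1\"><token/></window>"
                + "</generators>";
        List<String> features = new ArrayList<>();
        parse(xml).createFeatures(features, new String[] {"Hi", "Jason", "!"}, 1);
        System.out.println(features);
    }
}
```

The descriptor in main is a single short line, which is the property
that makes a format like this easy to paste into a mail, and the
sketch needs nothing beyond the JDK. Because it is data rather than a
script, a model file can only select generators the library already
knows about.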
If there are no objections from the other committers, I would like to
go ahead with the custom XML solution.
Jörn