In most of the code the interfaces are located in the component-level package,
and in some components we have sub-packages for the different implementations.
Jörn
On 06/02/2013 01:58 AM, Giaconia, Mark [USA] wrote:
I am still becoming familiar with the way the project is internally structured,
but I typically like to separate frameworks from implementations, so perhaps a
framework package that holds factories and interfaces and the like, and another
for implementations?
opennlp.tools.ml.framework
opennlp.tools.ml.impls
Let me know if I can help
Mark Giaconia
-----Original Message-----
From: Samik Raychaudhuri [mailto:[email protected]]
Sent: Friday, May 31, 2013 5:39 PM
To: [email protected]
Subject: [External] Re: Pluggable Machine Learning support
Yep, supporting the move to a new package/namespace.
On 5/31/2013 12:40 AM, Tommaso Teofili wrote:
big +1!
Tommaso
2013/5/31 William Colen <[email protected]>
I don't see any issue. People who use Maxent directly would need to
change how they use it, but that is OK for a major release.
On Thu, May 30, 2013 at 5:56 PM, Jörn Kottmann <[email protected]> wrote:
Are there any objections to moving the maxent/perceptron classes to an
opennlp.tools.ml package as part of this issue? Moving these would avoid
a second interface layer and probably make using OpenNLP Tools a bit
easier, because then we are down to a single jar.
Jörn
On 05/30/2013 08:57 PM, William Colen wrote:
+1 to add pluggable machine learning algorithms
+1 to improve the API and remove deprecated methods in 1.6.0
You can assign related Jira issues to me and I will be glad to help.
On Thu, May 30, 2013 at 11:53 AM, Jörn Kottmann <[email protected]> wrote:
Hi all,
we have spoken about this here and there already: to ensure that OpenNLP
can stay competitive with other NLP libraries, I am proposing to make the
machine learning pluggable.
The extensions should not make it harder to use OpenNLP: if a user loads
a model, OpenNLP should be capable of setting everything up by itself,
without forcing the user to write custom integration code based on the
ml implementation.
We already solved this problem with the extension mechanism we built to
support the customization of our components; I suggest that we reuse this
extension mechanism to load an ml implementation. To use a custom ml
implementation, the user has to specify the class name of the factory in
the Algorithm field of the params file. The params file is available at
both training and tagging time; a sketch of what that could look like
follows.
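For illustration, the params file could then look something like this
(the factory class name is a hypothetical placeholder; Iterations and
Cutoff are the usual training parameters):

    Algorithm=org.example.ml.LiblinearTrainerFactory
    Iterations=100
    Cutoff=5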
Most components in the tools package use the maxent library to do
classification. The Java interfaces for this are currently located in the
maxent package; to be able to swap the implementation, the interfaces
should be defined inside the tools package. To make things easier, I
propose to move the maxent and perceptron implementations as well.
Throughout the code base we use AbstractModel. That is a bit unfortunate,
because the only reason for it is the lack of model serialization support
in the MaxentModel interface; a serialization method should be added to
that interface, which could perhaps be renamed to ClassificationModel.
This will break backward compatibility in non-standard use cases.
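To make that concrete, here is a minimal sketch of what such an interface
could look like; the evaluation methods mirror today's MaxentModel, and
the package, names, and signatures are assumptions, not a final API:

    package opennlp.tools.ml;

    import java.io.IOException;
    import java.io.OutputStream;

    // Sketch only: eval/getOutcome/getNumOutcomes mirror the existing
    // MaxentModel methods; serialize is the proposed addition.
    public interface ClassificationModel {

        // Returns the outcome probabilities for the given context.
        double[] eval(String[] context);

        // Maps an outcome index to its name.
        String getOutcome(int index);

        int getNumOutcomes();

        // Proposed addition: lets a component persist the model without
        // down-casting to AbstractModel.
        void serialize(OutputStream out) throws IOException;
    }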
To be able to test the extension mechanism, I suggest that we implement
an addon which integrates the liblinear and Apache Mahout classifiers;
a rough outline of such a factory follows.
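As a rough skeleton (every name here is hypothetical, and the actual
liblinear training call is deliberately left out):

    package org.example.ml;

    import java.util.Map;

    // Hypothetical addon skeleton: the extension mechanism would
    // instantiate this factory by the class name given in the
    // Algorithm field and ask it for a trainer.
    public class LiblinearTrainerFactory {

        public static class LiblinearTrainer {

            private final Map<String, String> trainParams;

            public LiblinearTrainer(Map<String, String> trainParams) {
                this.trainParams = trainParams;
            }

            // A train method would go here: it would delegate to
            // liblinear and wrap the result in the model interface
            // discussed above.
        }

        public LiblinearTrainer createTrainer(Map<String, String> trainParams) {
            // Hand the params file entries through to the trainer.
            return new LiblinearTrainer(trainParams);
        }
    }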
There are still a few deprecated 1.4 constructors and methods in OpenNLP
which directly reference interfaces and classes in the maxent library;
these need to be removed to be able to move the interfaces to the tools
package.
Any opinions?
Jörn