In most of the code the interfaces are located in the component-level package,
and in some components we have sub-packages for the different implementations.
Jörn
On 06/02/2013 01:58 AM, Giaconia, Mark [USA] wrote:
I am still becoming familiar with the way the project is internally structured,
but I typically like to separate frameworks from implementations, so perhaps a
framework package that holds factories and interfaces and the like, and another
for implementations?
opennlp.tools.ml.framework
opennlp.tools.ml.impls
Let me know if I can help
Mark Giaconia
-----Original Message-----
From: Samik Raychaudhuri [mailto:[email protected]]
Sent: Friday, May 31, 2013 5:39 PM
To: [email protected]
Subject: [External] Re: Pluggable Machine Learning support
Yep, supporting the move to a new package/namespace.
On 5/31/2013 12:40 AM, Tommaso Teofili wrote:
big +1!
Tommaso
2013/5/31 William Colen <[email protected]>
I don't see any issue. People who use Maxent directly would need to
change how they use it, but that is OK for a major release.
On Thu, May 30, 2013 at 5:56 PM, Jörn Kottmann <[email protected]> wrote:
Are there any objections to moving the maxent/perceptron classes to an
opennlp.tools.ml package as part of this issue? Moving these would avoid
a second interface layer and probably make using OpenNLP Tools a bit
easier, because then we are down to a single jar.
Jörn
On 05/30/2013 08:57 PM, William Colen wrote:
+1 to add pluggable machine learning algorithms
+1 to improve the API and remove deprecated methods in 1.6.0
You can assign related Jira issues to me and I will be glad to help.
On Thu, May 30, 2013 at 11:53 AM, Jörn Kottmann <[email protected]> wrote:
Hi all,
we have spoken about this here and there already: to ensure that OpenNLP
can stay competitive with other NLP libraries, I am proposing to make the
machine learning pluggable.
The extensions should not make it harder to use OpenNLP: if a user loads
a model, OpenNLP should be capable of setting everything up by itself,
without forcing the user to write custom integration code based on the
ml implementation.
We already solved this problem with the extension mechanism we built to
support the customization of our components; I suggest that we reuse this
extension mechanism to load an ml implementation. To use a custom ml
implementation, the user has to specify the class name of the factory in
the Algorithm field of the params file. The params file is available at
both training and tagging time; a sketch of what that could look like
follows.
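For illustration, the params file could then look something like this
(the factory class name is a hypothetical placeholder; Iterations and
Cutoff are the usual training parameters):

    Algorithm=org.example.ml.LiblinearTrainerFactory
    Iterations=100
    Cutoff=5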
Most components in the tools package use the maxent library to do
classification. The Java interfaces for this are currently located in the
maxent package; to be able to swap the implementation, the interfaces
should be defined inside the tools package. To make things easier, I
propose to move the maxent and perceptron implementations as well.
Throughout the code base we use AbstractModel. That is a bit unfortunate,
because the only reason for it is the lack of model serialization support
in the MaxentModel interface; a serialization method should be added to
that interface, which could perhaps be renamed to ClassificationModel.
This will break backward compatibility in non-standard use cases.
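To make that concrete, here is a minimal sketch of what such an interface
could look like; the evaluation methods mirror today's MaxentModel, and
the package, names, and signatures are assumptions, not a final API:

    package opennlp.tools.ml;

    import java.io.IOException;
    import java.io.OutputStream;

    // Sketch only: eval/getOutcome/getNumOutcomes mirror the existing
    // MaxentModel methods; serialize is the proposed addition.
    public interface ClassificationModel {

        // Returns the outcome probabilities for the given context.
        double[] eval(String[] context);

        // Maps an outcome index to its name.
        String getOutcome(int index);

        int getNumOutcomes();

        // Proposed addition: lets a component persist the model without
        // down-casting to AbstractModel.
        void serialize(OutputStream out) throws IOException;
    }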
To be able to test the extension mechanism, I suggest that we implement
an addon which integrates the liblinear and Apache Mahout classifiers;
a rough outline of such a factory follows.
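As a rough skeleton (every name here is hypothetical, and the actual
liblinear training call is deliberately left out):

    package org.example.ml;

    import java.util.Map;

    // Hypothetical addon skeleton: the extension mechanism would
    // instantiate this factory by the class name given in the
    // Algorithm field and ask it for a trainer.
    public class LiblinearTrainerFactory {

        public static class LiblinearTrainer {

            private final Map<String, String> trainParams;

            public LiblinearTrainer(Map<String, String> trainParams) {
                this.trainParams = trainParams;
            }

            // A train method would go here: it would delegate to
            // liblinear and wrap the result in the model interface
            // discussed above.
        }

        public LiblinearTrainer createTrainer(Map<String, String> trainParams) {
            // Hand the params file entries through to the trainer.
            return new LiblinearTrainer(trainParams);
        }
    }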
There are still a few deprecated 1.4 constructors and methods in OpenNLP
which directly reference interfaces and classes in the maxent library;
these need to be removed to be able to move the interfaces to the tools
package.
Any opinions?
Jörn