Good answer. It made me realize I may not have expressed myself clearly.
I agree that we should avoid having two releases: that is too much
hassle. So far MaxEnt is a required dependency of OpenNLP and tightly
integrated, so releasing the two separately is a hassle because of the
necessary synchronization. With CLI it is different: CLI depends on OpenNLP,
and the two will remain in the same release.

My proposal is different. The release should remain a single one. Let me
make the proposal more concrete in terms of steps:

1) Create an opennlp-cli module with its own pom. This will make the tools
package smaller and improve the experience of library-only users: less
stuff to drag around.
2) Keep the same single release of OpenNLP.

> Currently most of our code lives in opennlp-tools and is separated by java
> packages
> which I believe works really great and there is no need to cut this down
> further.
>
Exactly! And CLI is already a separate package.


> Maybe we should do even the opposite and also move the maxent code in
> there?
> I am +1 on that, actually.
>
It does make sense given the current tight integration. Conceptually,
however, it would break modularization. Even MaxEnt is not pure maxent
anymore: there is a perceptron inside as well. Nicolas Hernandez mentioned
in a recent thread "for may be considering alternatives to the MaxEnt
algorithm". Rolling everything into one bundle would make such plans more
difficult. If they advance, we would likely end up with an abstraction into
interfaces and several implementations, which could become optional
dependencies. So I would keep the current level of modularization with
respect to maxent.
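
To illustrate what "implementations as optional dependencies" could look
like, here is a minimal sketch. All names in it (ModelTrainer,
PerceptronTrainer, TrainerLoader) are hypothetical and are not existing
OpenNLP types. It assumes an implementation is looked up by class name, so
that one missing from the classpath is simply reported as unavailable.

```java
import java.util.Optional;

// Hypothetical abstraction: one interface, several possible implementations.
interface ModelTrainer {
    String algorithm();
}

// One implementation; a maxent one could live in a separate, optional module.
final class PerceptronTrainer implements ModelTrainer {
    @Override
    public String algorithm() {
        return "perceptron";
    }
}

final class TrainerLoader {
    private TrainerLoader() {}

    // Returns the trainer if its class is on the classpath and implements
    // ModelTrainer; returns empty otherwise instead of failing.
    static Optional<ModelTrainer> load(String className) {
        try {
            Class<?> type = Class.forName(className);
            if (!ModelTrainer.class.isAssignableFrom(type)) {
                return Optional.empty();
            }
            return Optional.of((ModelTrainer) type.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            // Class absent or not instantiable: treat the dependency as not installed.
            return Optional.empty();
        }
    }
}
```

A caller would then pick whichever implementation is present and fall back
to another one when the lookup comes back empty.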


> On the other side you could argument to cut things down, then you might end
> up
> with a couple of different sub-projects. Another prime candidate for moving
> is the coreference package because it introduces an extra dependency,
> which no
> other component needs.
>
Nice point, and it illustrates a similar situation. This is actually a good
argument in favor of per-component modules, but for now that would
complicate things too much. So here I would refactor the code to make the
dependency on JWNL optional, continuing along the lines of the existing
Dictionary interface and providing a means to register your own
implementation. This would get rid of the hard dependency. After all, there
are alternatives.
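
A minimal sketch of that registration idea follows. The names
(LemmaDictionary, IdentityDictionary, DictionaryRegistry) are made up for
illustration and do not mirror the existing Dictionary interface. The sketch
assumes a single process-wide dictionary with a trivial built-in fallback.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the dictionary abstraction used by coreference.
interface LemmaDictionary {
    // Returns the known lemmas for a word.
    Set<String> lemmas(String word);
}

// Built-in fallback with no external dependency: the lower-cased word itself.
final class IdentityDictionary implements LemmaDictionary {
    @Override
    public Set<String> lemmas(String word) {
        return Collections.singleton(word.toLowerCase(Locale.ROOT));
    }
}

final class DictionaryRegistry {
    private static volatile LemmaDictionary current = new IdentityDictionary();

    private DictionaryRegistry() {}

    // A JWNL-backed module, or any alternative, would call this once at startup.
    static void register(LemmaDictionary dictionary) {
        if (dictionary == null) {
            throw new IllegalArgumentException("dictionary must not be null");
        }
        current = dictionary;
    }

    // Components ask the registry and never reference JWNL directly.
    static LemmaDictionary get() {
        return current;
    }
}
```

With something like this, the core module compiles and runs without JWNL,
and users who want WordNet lookups register their implementation of choice.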


> When you think this further you could end up with a sub-project per
> component.
>
Although there are scenarios where this makes sense, I would avoid this for
the moment.

> In my experience having a project which is cut into sub-projects for no
> strong reasons is often more complex and more difficult then it needs to
> be.
>
Modularization needs to be balanced: with too few modules you end up with
a mess in one huge bundle, but too many is not good either.

What do you think?

Aliaksandr
