On 11/24/11 1:39 PM, Aliaksandr Autayeu wrote:
> > I think the additional complexity (and things which can go wrong) isn't
> > worth the small advantage of fewer classes on the user's class path.
>
The primary reason is to have proper modularity, from which other things
follow. The CLI does not belong to the library, and this should be
reflected in the structure, to avoid, for example, CLI classes being used
by accident somewhere in the library. While working on this, I found a
duplicated procedure. Proper code modularization makes one aware of such
things.
James, thank you for your input!
Do you have a sample here? As far as I know, we have almost no dependencies
on the CLI package from other classes, except for the formats package.
Users of our library are responsible for not using API which is not
public. If they use it anyway, it is not our problem when we break
compatibility. Violating this rule might cause them pain.
Sadly, we cannot enforce this in standard Java jar files. In OSGi this
is actually possible: there you can define "private" classes, and that
is possible even for classes which are declared public.
> I kinda agree with you. I download the source frequently and mainly use
> the classes as libraries;
I'm curious. It is always interesting to know the workflow of others; maybe
it is more efficient. I also use sources. I have a development source (or
my experimental branch) and I do "mvn package" and unpack the release into
a new folder, then switch OPENNLP_HOME to it and I'm on another version.
This also enables immediate testing of whether an artifact is missing from
the release, or whether there are badly packaged jars or uncommitted files.
Can you elaborate on your process? Maybe it is more efficient.
I usually go to opennlp-tools, do a "mvn clean install" after a code change
there and then type "bin/opennlp ..." to run the cli tools.
With an additional module I need to build it as well, or I risk things
getting out of sync. That happens once in a while with maxent.
> however, I really need the CLI stuff to train
> and test the models that I use. Separating them would complicate the
> release
Can you elaborate on how exactly having a CLI module would complicate the
release? Would it increase the number of files released? No. Would a
developer have to enter one more command to make a release? No. Is there
anything else I'm not aware of?
> and cause more issues;
Please, can you elaborate on this too? Maybe there is something I don't
know. Which issues can this cause?
Having three separate jar files (maxent, tools, cli) can cause issues when
the versions are incompatible (maybe someone forgot to update maxent),
you need to put three jars on your class path, you can get issues
with inter-module code changes, there is an additional step in the build
which can go wrong, etc.
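
To illustrate the version mismatch case, here is a minimal sketch of a startup
check. It assumes each jar declares Implementation-Version in its manifest,
which I have not verified for our jars. The class name VersionCheck, the
module names used as map keys and the version strings in main are all
hypothetical, not existing OpenNLP code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of OpenNLP.
final class VersionCheck {

    // Version declared in the manifest of the jar that contains the given
    // class; null when the manifest has no Implementation-Version entry.
    static String versionOf(Class<?> anchor) {
        Package p = anchor.getPackage();
        return p == null ? null : p.getImplementationVersion();
    }

    // Compares the versions found on the class path with the versions the
    // build expected and returns one message per module that differs.
    static List<String> mismatches(Map<String, String> expected,
                                   Map<String, String> actual) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> e : expected.entrySet()) {
            String found = actual.get(e.getKey());
            if (!e.getValue().equals(found)) {
                problems.add(e.getKey() + ": expected " + e.getValue()
                        + ", found " + found);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Made-up version numbers, for illustration only.
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("maxent", "2.0");
        expected.put("tools", "1.0");

        Map<String, String> actual = new LinkedHashMap<>();
        actual.put("maxent", "1.9"); // someone forgot to update maxent
        actual.put("tools", "1.0");

        for (String problem : mismatches(expected, actual)) {
            System.err.println("Version mismatch, " + problem);
        }
    }
}
```

In a real setup the "actual" map would be filled with versionOf(...) for one
class from each jar, and the "expected" map would have to be recorded at
build time.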
Additionally, I once in a while use the cli stuff to do testing in my
UIMA-AS system. There it is just handy that the cli stuff is on my server
by default.
In UIMA-AS you have a lib folder where you put the jars of your application.
I can simply go to this lib folder and type something like this:
java -cp tools.jar:maxent.jar opennlp.tools.cmdline.CLI ...
and that gives me access to these tools.
Sure, when it is separate I can copy it there manually, or just let my
maven build put it there, but that is again an additional step, and the
cli classes there don't hurt me.
Well, we are still missing a few commands I wish to have on my server, e.g.
show me the version of a model.
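
A rough sketch of what such a "show model version" command could do. It
assumes a model file is a zip archive containing a manifest.properties entry
with an OpenNLP-Version property; please treat that layout as an assumption
to check against the model code. The class name ModelVersion is hypothetical
and this is not an existing command.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical tool, not an existing OpenNLP command.
final class ModelVersion {

    // Assumed entry name and property key inside the model archive.
    static final String MANIFEST_ENTRY = "manifest.properties";
    static final String VERSION_KEY = "OpenNLP-Version";

    // Scans the zip stream for the manifest entry and returns the version
    // property, or null when the entry or the property is missing.
    static String readVersion(InputStream modelIn) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(modelIn)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (MANIFEST_ENTRY.equals(entry.getName())) {
                    Properties manifest = new Properties();
                    manifest.load(zip); // reads up to the end of this entry
                    return manifest.getProperty(VERSION_KEY);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: ModelVersion <model file>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            String version = readVersion(in);
            System.out.println(version != null ? version : "unknown");
        }
    }
}
```

Because it only uses java.util.zip, this would run on the server without
loading the model itself.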
> The examples in the classes
> themselves are being deprecated soon.
>
+1 I also wanted to get rid of "mains" in the classes. It is a library
after all; it should not contain such things.
The mains have been refactored into the cli package. They are still there
and should be removed with the 1.6 release, as well as other deprecated
code.
We don't have any examples, but if we had some, they should be
distributed as source files, because the whole point of having them is
that people can read and copy them.
Jörn