Hello,
I started working on a guide that explains how to use
the existing tools, but it still needs more work before
it is in a useful state. So far it only explains how
to install the Corpus Server.
It's in our wiki:
https://cwiki.apache.org/OPENNLP/labeling-wikinews-articles-with-the-corpus-server-and-the-uima-cas-editor.html
The next steps will be to explain how to load the Wikinews data, how to
open it in the Cas Editor,
and how to configure the Eclipse plugin.
We really need help with the web-based labeling tools and the tagging
server, and a bit later
also with labeling data.
If you would like to participate, it should not be hard for you
to get involved.
Jörn
On 07/05/2012 11:25 AM, florent andré wrote:
Hi,
A simple, shared annotation tool would really be the way to go, imo.
Thanks for starting this.
I see this:
https://cwiki.apache.org/OPENNLP/opennlp-annotations.html
And some code here:
https://svn.apache.org/repos/asf/opennlp/sandbox/
(corpus-server-*, caseditor-*)
Would it be possible to get some bootstrapping information, so I can try
it out and lend a hand?
What is on the TODO list to get to a working tool?
Thanks for that!
++
On 06/26/2012 10:32 AM, Jörn Kottmann wrote:
On 06/26/2012 10:21 AM, Bertrand Delacretaz wrote:
I cannot speak for OpenNLP but I'm sure Stanbol would be happy to host
your models if that helps - a neutral place like the Apache Software
Foundation is probably good for such efforts.
We would like to offer models over at OpenNLP. The reason we do
not distribute them is that the models we have today come with license
restrictions that prevent us from doing so here at Apache.
It would be really great to have training data under an Open Source
license
that we can use to produce models under AL 2.0.
We started working on annotation tooling over at OpenNLP, but progress
is slow because
we lack resources. It would be nice to have people help out with that.
Wikinews seems to be an interesting source of data and is available in
many languages.
Jörn