Hello,
let's update the proposal a little and extend it with the things we
discussed here.
Hannes and Olivier, do you want to take over the part about the web-based
annotation tooling? For now I have called it Corpus Refiner, but we can of
course change the name to something else.
You should have editing rights now when you log in with a Confluence user.
Here is the link again:
https://cwiki.apache.org/OPENNLP/opennlp-annotations.html
Jörn
2011/6/24 James Kosin <[email protected]>:
Olivier,
No main() in the classes. So, how does one get the collection of
articles started?
It's meant to be used as a library. For instance, it is used by the
following custom pig Loader:
https://github.com/ogrisel/pignlproc/blob/master/src/main/java/pignlproc/storage/ParsingWikipediaLoader.java
which is in turn called in pig scripts such as:
https://github.com/ogrisel/pignlproc/blob/master/examples/extract_links.pig
Apache Pig is a scripting language and runtime environment to perform
distributed data analysis on an Apache Hadoop (HDFS + MapReduce)
cluster.
http://pig.apache.org/