Hi Boris,

I think it would be a good idea to let the 'similarity' module mature a bit
more and to get everyone a bit more familiar with the code base.

I created a Jira issue to move the Porter Stemmer over to the tools package:
https://issues.apache.org/jira/browse/OPENNLP-337

This work includes the definition of an interface. We would also need to
write a test for the stemmer so we know it works; it should be easy to test.
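
As a rough sketch of what such an interface and test could look like: the
names Stemmer, SuffixStrippingStemmer and StemmerCheck are made up here for
illustration and are not taken from the OpenNLP code base. The stand-in
implementation only strips a plural "s" so that the check is runnable; it is
not the Porter algorithm. The real test would run the same kind of
word/stem table against the Porter Stemmer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface; name and signature are assumptions.
interface Stemmer {
    /** Returns the stem of the given word. */
    CharSequence stem(CharSequence word);
}

// Stand-in implementation, only here to make the check runnable.
// It strips a trailing plural "s" and is NOT the Porter algorithm.
final class SuffixStrippingStemmer implements Stemmer {
    @Override
    public CharSequence stem(CharSequence word) {
        String w = word.toString();
        if (w.length() > 3 && w.endsWith("s") && !w.endsWith("ss")) {
            return w.substring(0, w.length() - 1);
        }
        return w;
    }
}

final class StemmerCheck {
    // Returns true if the stemmer maps every word to its expected stem.
    static boolean matchesAll(Stemmer stemmer, Map<String, String> expected) {
        for (Map.Entry<String, String> e : expected.entrySet()) {
            String actual = stemmer.stem(e.getKey()).toString();
            if (!e.getValue().equals(actual)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("cats", "cat");
        expected.put("caress", "caress");
        System.out.println(matchesAll(new SuffixStrippingStemmer(), expected));
    }
}
```

Keeping the expected pairs in a table means more cases can be added later
without touching the test logic.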

I just tried to compile the project and still get a couple of errors. It
would be nice if you could fix these. It looks like the tests are
referencing models which do not exist in my file system.

Furthermore, it would be nice if you could make the same change for the
chunker that you made for the POS tagger, where you extract the POS tags
from the Parse objects instead of running the POS tagger. The Parse object
also includes the chunk information, so there should be no need to run the
chunker.
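
To illustrate the idea, here is a minimal self-contained sketch. Node and
ParseReader are hypothetical stand-ins invented for this example; Node is
not opennlp.tools.parser.Parse. The sketch also assumes that a chunk is a
phrase whose children are all POS tag nodes, which is a simplification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a parse tree node; NOT the OpenNLP Parse class.
final class Node {
    final String type;          // phrase label, POS tag, or "TK" for a token
    final String text;          // covered text, set only for tokens
    final List<Node> children;

    Node(String type, String text, Node... children) {
        this.type = type;
        this.text = text;
        this.children = Arrays.asList(children);
    }

    static Node token(String text) { return new Node("TK", text); }
    static Node tag(String pos, String text) { return new Node(pos, null, token(text)); }
    static Node phrase(String label, Node... children) { return new Node(label, null, children); }

    boolean isToken() { return children.isEmpty(); }
    boolean isTag() { return children.size() == 1 && children.get(0).isToken(); }
}

final class ParseReader {
    // Collects the POS tags from left to right without running a tagger.
    static List<String> posTags(Node root) {
        List<String> tags = new ArrayList<>();
        collectTags(root, tags);
        return tags;
    }

    private static void collectTags(Node n, List<String> out) {
        if (n.isTag()) { out.add(n.type); return; }
        for (Node c : n.children) collectTags(c, out);
    }

    // Collects the labels of phrases whose children are all tag nodes
    // (the simplified notion of a chunk assumed in this sketch).
    static List<String> chunkLabels(Node root) {
        List<String> labels = new ArrayList<>();
        collectChunks(root, labels);
        return labels;
    }

    private static void collectChunks(Node n, List<String> out) {
        if (n.isToken() || n.isTag()) return;
        boolean flat = true;
        for (Node c : n.children) {
            if (!c.isTag()) flat = false;
        }
        if (flat) { out.add(n.type); return; }
        for (Node c : n.children) collectChunks(c, out);
    }
}
```

For the tree (S (NP (DT The) (NN dog)) (VP (VBZ barks))), posTags yields
DT, NN, VBZ and chunkLabels yields NP, VP, so neither the tagger nor the
chunker has to be run on the sentence again.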

We would also need a bit of documentation so that people can understand what
the module does and how it can be used.

What do you think?

Jörn

On 11/7/11 11:38 PM, Boris Galitsky wrote:
Hi Jörn

I think the 'similarity' module is in good shape now. What would be the next steps?

Regards
Boris


