Hi, I was wondering whether the training data for the OpenNLP maxent POS tagger models is publicly available somewhere. I would like to train models for the POS tagger and the chunker that work on sentences without case information (i.e. all uppercase). If I had the training data used for en-pos-maxent.bin, a first pass would simply mean uppercasing the tokens and running the trainer. It appears that the chunker training data comes from CoNLL-2000 (http://www.cnts.ua.ac.be/conll2000/chunking/).
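To make the idea concrete, here is a minimal sketch of that first pass in Python. It assumes the POS training data is in the OpenNLP trainer's one-sentence-per-line word_TAG format and that the chunker data is in the CoNLL-2000 three-column format (word, POS tag, chunk tag, one token per line). The function names are made up for illustration and are not part of OpenNLP.

```python
def uppercase_pos_line(line: str) -> str:
    """Uppercase the word in each word_TAG token, leaving the tag untouched.

    Assumes one sentence per line with whitespace-separated word_TAG tokens.
    """
    out = []
    for token in line.split():
        # Split on the LAST underscore so words that contain "_" survive.
        word, sep, tag = token.rpartition("_")
        if sep:
            out.append(word.upper() + "_" + tag)
        else:
            # No tag found; just uppercase the bare token.
            out.append(token.upper())
    return " ".join(out)


def uppercase_conll_line(line: str) -> str:
    """Uppercase only the first column (the word) of a CoNLL-2000 style line.

    Blank lines separate sentences and are returned as empty strings.
    """
    parts = line.split()
    if not parts:
        return ""
    parts[0] = parts[0].upper()
    return " ".join(parts)


if __name__ == "__main__":
    print(uppercase_pos_line("About_IN 10_CD Euro_NNP ,_, I_PRP reckon_VBP ._."))
    print(uppercase_conll_line("Confidence NN B-NP"))
```

Running each function over every line of the corresponding training file and then feeding the result to the usual trainer would give an all-uppercase model. Only the word column is changed, so the tags stay exactly as annotated.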
I would be happy to share the models with OpenNLP if anyone thinks they would be of use to others. Peace. Michael