[ 
https://issues.apache.org/jira/browse/OPENNLP-368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13148446#comment-13148446
 ] 

Aliaksandr Autayeu commented on OPENNLP-368:
--------------------------------------------

I have a number of private datasets (17 and growing), each containing from
2K to 130K short labels (2-3 tokens on average), like these:

Cleaning_NN Equipment_NN and_CC Supplies_NNS
Commercial_JJ and_CC Military_JJ and_CC Private_JJ Vehicles_NNS and_CC their_PP$ Accessories_NNS and_CC Components_NNS
Defense_NN and_CC Law_NN Enforcement_NN and_CC Security_NN and_CC Safety_NN Equipment_NN and_CC Supplies_NNS

or these:

groups_NNS of_IN animals_NNS
lower_JJR animals_NNS
mammals_NNS
landscapes_NNS with_IN waters_NNS ,_, waterscapes_NNS ,_, seascapes_NNS (_( in_IN the_DT temperate_JJ zone_NN )_)

on which I regularly train and test the tokenizer and the POS tagger (I also
have some NE annotations, but I am not currently working on them). Perhaps I
can test on these datasets, if given proper instructions.
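
For anyone who wants to read labels in the token_TAG form shown above, here
is a minimal sketch of a parser. The class name TaggedLabel and its fields
are hypothetical and are not part of opennlp-tools; the sketch only assumes
that items are whitespace-separated and that the tag follows the last
underscore.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not an OpenNLP class): one label as parallel token/tag lists. */
final class TaggedLabel {
    final List<String> tokens = new ArrayList<>();
    final List<String> tags = new ArrayList<>();

    /** Parses a line such as "groups_NNS of_IN animals_NNS". */
    static TaggedLabel parse(String line) {
        TaggedLabel label = new TaggedLabel();
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return label; // blank line: no tokens, no tags
        }
        for (String item : trimmed.split("\\s+")) {
            // Split on the LAST underscore so that items like ",_," or
            // tokens that themselves contain '_' are handled.
            int sep = item.lastIndexOf('_');
            if (sep <= 0 || sep == item.length() - 1) {
                throw new IllegalArgumentException("Not a token_TAG pair: " + item);
            }
            label.tokens.add(item.substring(0, sep));
            label.tags.add(item.substring(sep + 1));
        }
        return label;
    }

    public static void main(String[] args) {
        TaggedLabel label = parse("groups_NNS of_IN animals_NNS");
        System.out.println(label.tokens + " " + label.tags);
    }
}
```

Splitting on the last underscore rather than the first is a design choice
made for this sketch, so that punctuation items such as ,_, and (_( from
the examples parse without special cases.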

Aliaksandr


On Fri, Nov 11, 2011 at 12:02 PM, Joern Kottmann (Commented) (JIRA) <


                
> loops improved in opennlp-tools
> -------------------------------
>
>                 Key: OPENNLP-368
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-368
>             Project: OpenNLP
>          Issue Type: Improvement
>    Affects Versions: tools-1.5.3-incubating
>            Reporter: Aliaksandr Autayeu
>            Priority: Minor
>              Labels: patch
>         Attachments: 0008-loops-improved-in-tools.patch
>
>
> Many old-style indexed loops replaced with Java 5 for-each loops to improve
> code readability and reduce the possibility of bugs.
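
To illustrate the kind of change the issue describes, here is a small sketch
contrasting an indexed loop with the equivalent Java 5 for-each loop. It is
not taken from the attached patch; the class LoopStyles and its methods are
made up for this example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical example class; not code from opennlp-tools or the patch. */
final class LoopStyles {

    /** Old style: an explicit index that must be kept in bounds by hand. */
    static int totalLengthIndexed(List<String> tokens) {
        int total = 0;
        for (int i = 0; i < tokens.size(); i++) {
            total += tokens.get(i).length();
        }
        return total;
    }

    /** Java 5 for-each: no index variable, so no off-by-one mistakes. */
    static int totalLengthForEach(List<String> tokens) {
        int total = 0;
        for (String token : tokens) {
            total += token.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("groups", "of", "animals");
        // Both variants compute the same sum of token lengths.
        System.out.println(totalLengthIndexed(tokens) == totalLengthForEach(tokens));
    }
}
```

The for-each form applies only where the index itself is not needed, for
example where the loop does not compare neighbouring elements or write back
into an array by position.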

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
