[ 
https://issues.apache.org/jira/browse/OPENNLP-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498191#comment-14498191
 ] 

Łukasz Dróżdż edited comment on OPENNLP-287 at 4/16/15 3:42 PM:
----------------------------------------------------------------

Hi,

Here's my attempt at providing a sample POS dictionary file, along with test 
code that shows programmatic usage: reading the dictionary in, writing it back 
out, and using it to train a POS tagger. See the attached files for details.

The XML structure of a POS dictionary is:

<?xml version="1.0" encoding="UTF-8"?>
<dictionary>
  <entry tags="tag1 tag2">
    <token>token1</token>
  </entry>
  <entry tags="tag1">
    <token>token2</token>
  </entry>
</dictionary>
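
To make the mapping concrete, here is a minimal sketch that reads this structure 
into a token-to-tags map using only the JDK's DOM parser. It is not the attached 
TaggerDictionaryTest.java and it does not use OpenNLP itself; the class name 
PosDictionarySketch and the method parse are hypothetical names chosen for 
illustration, and the sketch assumes each entry holds exactly one token element 
and that tags are separated by whitespace, as in the sample above.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of OpenNLP.
public class PosDictionarySketch {

    /** Reads every <entry tags="..."><token>...</token></entry> into a token -> tags map. */
    public static Map<String, String[]> parse(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        Map<String, String[]> dictionary = new LinkedHashMap<>();
        NodeList entries = doc.getElementsByTagName("entry");
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            // Assumption: the "tags" attribute is a whitespace-separated list.
            String[] tags = entry.getAttribute("tags").trim().split("\\s+");
            // Assumption: one <token> child per entry, as in the sample file.
            String token = entry.getElementsByTagName("token").item(0).getTextContent();
            dictionary.put(token, tags);
        }
        return dictionary;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<dictionary>"
                + "<entry tags=\"tag1 tag2\"><token>token1</token></entry>"
                + "<entry tags=\"tag1\"><token>token2</token></entry>"
                + "</dictionary>";
        Map<String, String[]> dictionary =
                parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        // Print each token with the tags it is allowed to take.
        for (Map.Entry<String, String[]> e : dictionary.entrySet()) {
            System.out.println(e.getKey() + " -> " + String.join(", ", e.getValue()));
        }
    }
}
```

With the sample above, token1 maps to tag1 and tag2, and token2 maps to tag1 
only. For real use, the attached test code is the better reference.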

Hope that helps.


was (Author: ldrozdz):
Hi,

Here's my attempt at providing a sample POS dictionary file, as well as test 
code for programmatic usage, both in reading in and writing back the dictionary 
and using it to train a POS tagger. See the attached files for details.

The XML structure of a POS dictionary is:

<?xml version="1.0" encoding="UTF-8"?>
<dictionary>
  <entry tags="tag1 tag2">
    <token>token1</token>
  </entry>
  <entry tags="tag1">
    <token>token2</token>
  </entry>
</dictionary>

Hope that helps.

> Extend POS Tagger documentation with more information about the tag dictionary
> ------------------------------------------------------------------------------
>
>                 Key: OPENNLP-287
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-287
>             Project: OpenNLP
>          Issue Type: Improvement
>          Components: Documentation, POS Tagger
>            Reporter: Joern Kottmann
>            Priority: Minor
>         Attachments: TaggerDictionaryTest.java, dictionary.xml, en-pos.train
>
>
> Extend the POS Tagger tag dictionary section as described in the 
> documentation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
