Re: Some questions about Dictionary and DictionaryNameFinder

James Kosin Wed, 21 Dec 2011 16:27:59 -0800

Loic,

First, find() returns a Span[].  Each Span wraps the start and end
indexes of one name found in the text, and has getStart() and getEnd()
methods that return those values.  The indexes refer to the token array
you passed to find(), and getEnd() is one past the last token of the
name.  To get the name for one of the Spans, take the tokens you
provided as the input to find() and use something like this:


    String sentence[] = {"The", "match", "was", "won", "by", "Vanessa",
                         "Williams", "."};

    Span names[] = mNameFinder.find(sentence);

    for (int i = 0; i < names.length; i++) {
        Span entry = names[i];

        for (int j = entry.getStart(); j < entry.getEnd(); j++) {
            // use sentence[j] here for each index, to build the full
            // name found or to collect a list of tokens.
        }
        // here, you should have built something in the loop above for
        // the current name.
        // each sentence can have multiple names found... so, keep looping.
    }
    // here you are done with this sentence.
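
The inner loop above can be folded into a small helper.  This is only a
minimal sketch and not OpenNLP code: SpanText and join are hypothetical
names, and the start and end arguments stand in for the values you get
from getStart() and getEnd() on a Span.

```java
public class SpanText {
    // Joins tokens[start] .. tokens[end - 1] with single spaces.
    // start and end stand in for Span.getStart() and Span.getEnd();
    // end is exclusive, as in the loop above.
    public static String join(String[] tokens, int start, int end) {
        StringBuilder name = new StringBuilder();
        for (int j = start; j < end; j++) {
            if (j > start) {
                name.append(' ');
            }
            name.append(tokens[j]);
        }
        return name.toString();
    }

    public static void main(String[] args) {
        String[] sentence = {"The", "match", "was", "won", "by", "Vanessa",
                             "Williams", "."};
        // Assumption for illustration: the finder reported one span
        // covering tokens 5 and 6.
        System.out.println(join(sentence, 5, 7));
    }
}
```

With the assumed span this prints "Vanessa Williams".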

To find the tag, you will have to keep a copy of the Dictionary you
passed into the DictionaryNameFinder constructor.  Then, once the inner
loop has finished, look up the tokens it collected.  The Span covers
the longest match in the dictionary, so those tokens identify the
entry.  Unfortunately, I don't see anything that returns the matching
entry from the dictionary... or I'm overlooking something.
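
Until there is a way to get the entry back, one possible workaround is
to keep your own map from entry text to its "ref" value, filled at the
same time you build the Dictionary.  A minimal sketch, assuming
single-string keys: RefLookup, put and refFor are hypothetical names,
not part of OpenNLP, and the keys are lowercased to mirror
case_sensitive="false".

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class RefLookup {
    // Entry text (lowercased) -> value of the "ref" attribute.
    private final Map<String, String> refs = new HashMap<>();

    // Call once per dictionary entry, next to the code that adds the
    // entry to the Dictionary.
    public void put(String entryText, String ref) {
        refs.put(normalize(entryText), ref);
    }

    // matchedText is the text rebuilt from the tokens of one Span.
    // Returns null when nothing was registered for that text.
    public String refFor(String matchedText) {
        return refs.get(normalize(matchedText));
    }

    private static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        RefLookup lookup = new RefLookup();
        lookup.put("cheddar", "cheese");
        lookup.put("tomato", "vegetable");
        System.out.println(lookup.refFor("Tomato"));
    }
}
```

For a multi-token entry, the key would be the tokens joined with a
single space, matching however you rebuild the text from the Span.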

Jorn:  Any comments on this?

Hope this helps some,
James



On 12/21/2011 7:54 AM, Loic Descotte wrote:
>  
>>>>> On 12/20/2011 8:38 AM, Loic Descotte wrote:
>>>>>> Hello,
>>>>>> I'm trying to use OpenNLP Dictionary and DictionaryNameFinder to do a
>>>>>> dictionary lookup.
>>>>>>
>>>>>> I'm building my dictionary with the DictionarySerializer class.
>>>>>> My dictionary contains entries with attributes.
>>>>>>
>>>>>> Example :
>>>>>>
>>>>>> <dictionary case_sensitive="false">
>>>>>> <entry ref="cheese">
>>>>>>     <token>cheddar</token>
>>>>>> </entry>
>>>>>> <entry ref="vegetable">
>>>>>>     <token>tomato</token>
>>>>>> </entry>
>>>>>> </dictionary>
>>>>>>
>>>>>>
>>>>>> The keyword lookup is working but there are things I don't know how to
>>>>>> do.
>>>>>>
>>>>>> 1.
>>>>>> When I find a token in a text, I get a list of Span objects:
>>>>>>
>>>>>> Span[] spans = finder.find(tokenizedText);
>>>>>>
>>>>>> I don't know how to retrieve the found token attributes:
>>>>>> For example, if I find "tomato", I would like to be able to retrieve
>>>>>> the "ref" attribute (vegetable).
>>>>>>
>>>>>>
