Hi James,

Thanks for the tips. For now I use a separate Map to store the attributes, which I build at the same time as the dictionary. It works pretty well too.

Loic

On 22.12.2011 01:27, James Kosin wrote:

Loic,

First, the Span[] it returns is an array of Span objects that contain the start and end markers for the names found in the text. Span is a class that wraps the start and end markers to make them easy to use. Each Span has getStart() and getEnd() methods that return these values.

To get the name found for one of the Spans, take the list of tokens you provided as the input to find() and use something like this:

    String[] sentence = {"The", "match", "was", "won", "by", "Vanessa", "Williams", "."};
    Span[] names = mNameFinder.find(sentence);
    for (int i = 0; i < names.length; i++) {
        Span entry = names[i];
        for (int j = entry.getStart(); j < entry.getEnd(); j++) {
            // use sentence[j] here for each index, to build the full name found or a list of tokens.
        }
        // here, you should have built something in the loop above for the name just found.
        // each sentence can have multiple names found, so keep looping.
    }
    // here you are done with this sentence.

To find the tag, you will have to keep a copy of the Dictionary you passed into the DictionaryNameFinder constructor. Then look up the tokens at the end of the second loop. This will allow you to get the longest match in the dictionary for the Span specified.

Unfortunately, I see we don't have anything to get the matching entry from the dictionary... or I'm overlooking something.

Jorn: Any comments on this?

Hope this helps some,
James

On 12/21/2011 7:54 AM, Loic Descotte wrote:

On 12/20/2011 8:38 AM, Loic Descotte wrote:

Hello,

I'm trying to use OpenNLP Dictionary and DictionaryNameFinder to do a dictionary lookup. I'm building my dictionary with the DictionarySerializer class. My dictionary contains entries with attributes.
Example:

    <dictionary case_sensitive="false">
      <entry ref="cheese">
        <token>cheddar</token>
      </entry>
      <entry ref="vegetable">
        <token>tomato</token>
      </entry>
    </dictionary>

The keyword lookup is working, but there are things I don't know how to do.

1. When I find a token in a text, I get a list of Span objects:

    Span[] spans = finder.find(tokenizedText);

I don't know how to retrieve the attributes of the token that was found. For example, if I find "tomato", I would like to be able to retrieve the "ref" attribute (vegetable).
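The side-map workaround mentioned at the top of the thread could look roughly like the sketch below. It is only an illustration under stated assumptions: `TokenSpan` is a hypothetical stand-in for OpenNLP's `Span` (start inclusive, end exclusive, as in the loop above) so that the example runs without the library, and `SpanAttributeLookup`, `key` and `attributesFor` are made-up names, not OpenNLP API. Keys are lower-cased to mirror `case_sensitive="false"` in the example dictionary.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Sketch: resolve the "ref" attribute of each matched span through a side map. */
final class SpanAttributeLookup {

    /** Hypothetical stand-in for OpenNLP's Span: start is inclusive, end is exclusive. */
    static final class TokenSpan {
        final int start;
        final int end;

        TokenSpan(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Joins tokens[start..end) with single spaces, lower-cased, to form the map key. */
    static String key(String[] tokens, int start, int end) {
        StringBuilder sb = new StringBuilder();
        for (int j = start; j < end; j++) {
            if (j > start) {
                sb.append(' ');
            }
            sb.append(tokens[j].toLowerCase(Locale.ROOT));
        }
        return sb.toString();
    }

    /** Returns one attribute per span, or null where the side map has no entry. */
    static List<String> attributesFor(String[] sentence, TokenSpan[] spans,
                                      Map<String, String> refs) {
        List<String> result = new ArrayList<>();
        for (TokenSpan span : spans) {
            result.add(refs.get(key(sentence, span.start, span.end)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Side map filled at the same time as the dictionary (assumed workflow).
        Map<String, String> refs = new HashMap<>();
        refs.put("cheddar", "cheese");
        refs.put("tomato", "vegetable");

        String[] sentence = {"I", "ate", "Cheddar", "and", "tomato", "."};
        // In real code these spans would come from finder.find(sentence).
        TokenSpan[] spans = {new TokenSpan(2, 3), new TokenSpan(4, 5)};

        System.out.println(attributesFor(sentence, spans, refs)); // prints [cheese, vegetable]
    }
}
```

With OpenNLP itself, the `start` and `end` fields would be replaced by `getStart()` and `getEnd()` on each returned Span. Multi-token entries work the same way, as long as the side map key is built with the same joining rule as `key`.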
- Some questions about Dictionary and DictionaryNameFinder Loic Descotte