Loic,

For #2:
Put more than one <token></token> element in the same entry, one per
token of the name. For example,
> <entry ref="cheese">
>    <token>cheddar</token>
>    <token>cheese</token>
> </entry>
would label the two-token sequence "cheddar cheese" as a "cheese".
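
To illustrate the idea of a multi-token entry, here is a rough sketch in plain Java. It is not OpenNLP's code and does not use the OpenNLP API; the class name MultiTokenLookup, its put/find methods, and the "start..end:ref" result format are all made up for this example. It only shows that an entry is matched as a whole sequence of tokens, trying the longest candidate first at each position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of OpenNLP.
public class MultiTokenLookup {

    // Maps a token sequence (one dictionary entry) to its "ref" label.
    private final Map<List<String>, String> entries = new HashMap<>();
    private int maxLength = 0;

    // Registers one entry made of one or more tokens.
    public void put(String ref, String... tokens) {
        List<String> key = new ArrayList<>(Arrays.asList(tokens));
        entries.put(key, ref);
        maxLength = Math.max(maxLength, key.size());
    }

    // Returns matches formatted as "start..end:ref" (end is exclusive),
    // trying the longest possible entry first at each position.
    public List<String> find(String[] text) {
        List<String> tokens = Arrays.asList(text);
        List<String> found = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            int matched = 0;
            int longest = Math.min(maxLength, tokens.size() - i);
            for (int len = longest; len >= 1; len--) {
                String ref = entries.get(tokens.subList(i, i + len));
                if (ref != null) {
                    found.add(i + ".." + (i + len) + ":" + ref);
                    matched = len;
                    break;
                }
            }
            // Skip past the match, or advance by one token if nothing matched.
            i += Math.max(matched, 1);
        }
        return found;
    }

    public static void main(String[] args) {
        MultiTokenLookup dict = new MultiTokenLookup();
        dict.put("cheese", "cheddar", "cheese");
        dict.put("vegetable", "green", "cabbage");
        String[] text = {"i", "like", "green", "cabbage", "and", "cheddar", "cheese"};
        System.out.println(dict.find(text));
    }
}
```

In this sketch "cheddar" on its own does not match, because the entry is the pair "cheddar", "cheese"; add a separate single-token entry if you want both.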

I'm still looking at the bug for #3.

James

On 12/20/2011 8:38 AM, Loic Descotte wrote:
> Hello,
> I'm trying to use the OpenNLP Dictionary and DictionaryNameFinder
> classes to do a dictionary lookup.
>
> I'm building my dictionary with the DictionarySerializer class.
> My dictionary contains entries with attributes.
>
> Example:
>
> <dictionary case_sensitive="false">
> <entry ref="cheese">
>    <token>cheddar</token>
> </entry>
> <entry ref="vegetable">
>    <token>tomato</token>
> </entry>
> </dictionary>
>
>
> The keyword lookup is working but there are things I don't know how to
> do.
>
> 1.
> When I find a token in a text, I get an array of Span objects:
>
> Span[] spans = finder.find(tokenizedText);
>
> I don't know how to retrieve the found token attributes:
> For example, if I find "tomato", I would like to be able to retrieve
> the "ref" attribute (vegetable).
>
> 2.
> If I want my dictionary to match a compound name (e.g. "green
> cabbage"), I am able to find "green" and "cabbage" separately, but
> not "green cabbage". Is there a special way to insert compound names
> in the dictionary?
>
> 3. I've set my dictionary to case_sensitive="false", but if the text
> contains "Tomato", the entry "tomato" is not found.
>
> Thanks a lot for your help
>
> -- 
> Loic
>
