Hi, thanks. It is still being improved.
I am not sure what you mean by type or database ID. Currently, the sense
source and the sense ID are returned.

For example:

"I went to the bank to deposit money."
target : bank (index : 4)
expected output : [WORDNET bank%1:14:00:: 21.6, WORDNET bank%1:21:01:: 5.8, ... etc]

where "bank%1:14:00::" is a SenseKey that you can use to query WordNet for
the sense definition.

You can do this using the default dictionary:

Dictionary.getDefaultResourceInstance().getWordBySenseKey(sensekey).getSynset().getGloss();

Hope this is what you are looking for; otherwise, please clarify.

Anthony Beylerian

On Tue, Sep 8, 2015 at 5:34 PM, Cristian Petroaca <cristian.petro...@gmail.com> wrote:

> Hi Anthony,
>
> I had a chance to test the wsd component. That's great work. Thanks.
> One question, is it possible to return the wordnet type (or database id) of
> the disambiguated word?
>
> Thanks,
> Cristian
>
> On Fri, Jul 24, 2015 at 1:14 PM, Anthony Beylerian <anthonybeyler...@hotmail.com> wrote:
>
> > Hi,
> >
> > To try out the ongoing implementations, after checking out the sandbox
> > repository please try these steps:
> >
> > 1- Create a resource models directory:
> >
> > - src
> >   - test
> >     - resources
> >       + models
> >
> > 2- Include the following pre-trained models and dictionary in that
> > directory. You can find those here [1] if you like, or pre-train your
> > own models.
> >
> > {
> > en-token.bin,
> > en-pos-maxent.bin,
> > en-sent.bin,
> > en-ner-person.bin,
> > en-lemmatizer.dict
> > }
> >
> > To train the IMS approach, you need to include training data like
> > senseval3 [2].
> > For now, please add these folders:
> >
> > - src
> >   - test
> >     - resources
> >       - supervised
> >         + raw
> >         + models
> >         + dictionary
> >
> > You can find the data files here [2].
> >
> > 3- We included two examples [LeskTester.java] and [IMSTester.java] that
> > you can run directly, or make your own tests.
> >
> > To run a custom test, you minimally need a tokenized text or
> > sentence. For example, for Lesk:
> >
> >     String[] words = Loader.getTokenizer().tokenize(sentence);
> >
> > Choose the index of the word to disambiguate in the token array:
> >
> >     int wordIndex = 6;
> >
> > Then just create a WSDisambiguator object, for example for Lesk:
> >
> >     Lesk lesk = new Lesk();
> >
> > And you can call the default disambiguation method:
> >
> >     lesk.disambiguate(words, wordIndex);
> >
> > You will get an array of strings with the following format:
> >
> > Lesk : [Source SenseKey Score]
> >
> > To read the sense definitions you can use the method:
> > [opennlp.tools.disambiguator.Constants.printResults]
> >
> > To use the variations of Lesk, you will need to create and configure a
> > parameters object:
> >
> >     LeskParameters leskParams = new LeskParameters();
> >     leskParams.setLeskType(LeskParameters.LESK_TYPE.LESK_BASIC_CTXT_WIN_BF);
> >     leskParams.setWin_b_size(4);
> >     leskParams.setDepth(3);
> >     lesk.setParams(leskParams);
> >
> > Typically, IMS should perform better than Lesk, since Lesk is a classic
> > method that is usually used as a baseline, along with the most frequent
> > sense (MFS).
> > However, we will be testing and adding more techniques.
> >
> > In any case, please feel free to ask for more details.
> >
> > Best,
> >
> > Anthony
> >
> > [1] :
> > https://drive.google.com/folderview?id=0B67Iu3pf6WucfjdYNGhDc3hkTXd1a3FORnNUYzd3dV9YeWlyMFczeHU0SE1TcWwyU1lhZFU&usp=sharing
> > [2] :
> > https://drive.google.com/file/d/0ByL0dmKXzHVfSXA3SVZiMnVfOGc/view?usp=sharing
> >
> > > Date: Fri, 24 Jul 2015 09:54:02 +0200
> > > Subject: Re: Word Sense Disambiguator
> > > From: kottm...@gmail.com
> > > To: dev@opennlp.apache.org
> > >
> > > It would be nice if you could share instructions on how to run it.
> > > I also would like to give it a try.
> > >
> > > Jörn
> > >
> > > On Fri, Jul 24, 2015 at 4:54 AM, Anthony Beylerian <anthonybeyler...@hotmail.com> wrote:
> > >
> > > > Hello,
> > > > Yes, for the moment we are only using WordNet for sense definitions.
> > > > The plan is to complete the package by mid to late August, but if you
> > > > like you can follow the progress in the sandbox.
> > > > Best regards,
> > > > Anthony
> > > > > Date: Thu, 23 Jul 2015 15:36:57 +0300
> > > > > Subject: Word Sense Disambiguator
> > > > > From: cristian.petro...@gmail.com
> > > > > To: dev@opennlp.apache.org
> > > > >
> > > > > Hi,
> > > > >
> > > > > I saw that there are people actively working on a Word Sense
> > > > > Disambiguator.
> > > > > Do you guys know when the module will be ready to use? Also, I
> > > > > assume that WordNet is used to define the disambiguated word meaning?
> > > > >
> > > > > Thanks,
> > > > > Cristian
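
On the "wordnet type" question: the SenseKey itself already encodes the part of speech, so it may be possible to read it straight from a result entry without a dictionary lookup. Below is a minimal sketch, assuming each entry is a space-separated "Source SenseKey Score" string as in the example output in this thread, and relying on the standard WordNet sense key layout (lemma%ss_type:lex_filenum:lex_id:head_word:head_id). The class SenseResultParser and its methods are hypothetical helpers for illustration, not part of the sandbox WSD component.

```java
public class SenseResultParser {

    /** One parsed result entry of the form "SOURCE senseKey score". */
    static final class SenseResult {
        final String source;
        final String senseKey;
        final double score;

        SenseResult(String source, String senseKey, double score) {
            this.source = source;
            this.senseKey = senseKey;
            this.score = score;
        }
    }

    /** Splits an entry such as "WORDNET bank%1:14:00:: 21.6" into its three fields. */
    static SenseResult parse(String entry) {
        String[] parts = entry.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 'SOURCE senseKey score': " + entry);
        }
        return new SenseResult(parts[0], parts[1], Double.parseDouble(parts[2]));
    }

    /** Returns the lemma, i.e. everything before the '%' in the sense key. */
    static String lemma(String senseKey) {
        int pct = senseKey.indexOf('%');
        if (pct < 0) {
            throw new IllegalArgumentException("Not a sense key: " + senseKey);
        }
        return senseKey.substring(0, pct);
    }

    /** Decodes the ss_type field (the number right after '%') into a part of speech. */
    static String partOfSpeech(String senseKey) {
        int pct = senseKey.indexOf('%');
        int colon = (pct < 0) ? -1 : senseKey.indexOf(':', pct);
        if (pct < 0 || colon < 0) {
            throw new IllegalArgumentException("Not a sense key: " + senseKey);
        }
        int ssType = Integer.parseInt(senseKey.substring(pct + 1, colon));
        switch (ssType) {
            case 1: return "noun";
            case 2: return "verb";
            case 3: return "adjective";
            case 4: return "adverb";
            case 5: return "adjective satellite";
            default:
                throw new IllegalArgumentException("Unknown ss_type in: " + senseKey);
        }
    }

    public static void main(String[] args) {
        SenseResult r = parse("WORDNET bank%1:14:00:: 21.6");
        // prints: WORDNET | bank | noun | 21.6
        System.out.println(r.source + " | " + lemma(r.senseKey) + " | "
                + partOfSpeech(r.senseKey) + " | " + r.score);
    }
}
```

The gloss itself still has to come from the dictionary lookup described in the reply; this only covers the fields that can be read from the returned string.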