Hi,

To try out the ongoing implementations, check out the sandbox repository and then follow these steps:

1- Create a resource models directory:
- src
  - test
    - resources
      + models

2- Include the following pre-trained models and dictionary in that directory. You can find them here [1], or you can train your own models.

{ en-token.bin, en-pos-maxent.bin, en-sent.bin, en-ner-person.bin, en-lemmatizer.dict }

To train the IMS approach, you also need to include training data such as Senseval-3 [2]. For now, please add these folders:

- src
  - test
    - resources
      - supervised
        + raw
        + models
        + dictionary

You can find the data files here [2].

3- We included two examples, [LeskTester.java] and [IMSTester.java], that you can run directly, or you can write your own tests.

To run a custom test, you need at minimum a tokenized text or sentence. For example, for Lesk:

    String[] words = Loader.getTokenizer().tokenize(sentence);

Choose the index of the word to disambiguate in the token array:

    int wordIndex = 6;

Then create a WSDisambiguator object, for example Lesk:

    Lesk lesk = new Lesk();

And call the default disambiguation method:

    lesk.disambiguate(words, wordIndex);

You will get an array of strings with the following format:

    Lesk : [Source SenseKey Score]

To read the sense definitions you can use the method [opennlp.tools.disambiguator.Constants.printResults].

To use the variations of Lesk, you need to create and configure a parameters object:

    LeskParameters leskParams = new LeskParameters();
    leskParams.setLeskType(LeskParameters.LESK_TYPE.LESK_BASIC_CTXT_WIN_BF);
    leskParams.setWin_b_size(4);
    leskParams.setDepth(3);
    lesk.setParams(leskParams);

Typically, IMS should perform better than Lesk. Lesk is a classic method, but it is usually used as a baseline, along with the most frequent sense (MFS). However, we will be testing and adding more techniques.

In any case, please feel free to ask for more details.
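For anyone new to Lesk, here is a rough, self-contained sketch of the gloss-overlap idea behind it. This is not the sandbox implementation: the class name SimplifiedLeskSketch, its methods, and the two toy glosses are made up for illustration, and it does no WordNet lookup or stop-word filtering. It only counts how many context words around the target also appear in each candidate sense's gloss.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of simplified Lesk; not part of the sandbox code.
public class SimplifiedLeskSketch {

    /** Lower-cases a text and splits it into a set of word tokens. */
    public static Set<String> bagOfWords(String text) {
        Set<String> bag = new HashSet<>(
                Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\W+")));
        bag.remove(""); // split() can yield an empty leading token
        return bag;
    }

    /** Counts distinct context words within +/- window of wordIndex that occur in the gloss. */
    public static int overlap(String[] tokens, int wordIndex, int window, String gloss) {
        Set<String> glossWords = bagOfWords(gloss);
        Set<String> counted = new HashSet<>();
        int from = Math.max(0, wordIndex - window);
        int to = Math.min(tokens.length - 1, wordIndex + window);
        int score = 0;
        for (int i = from; i <= to; i++) {
            if (i == wordIndex) {
                continue; // skip the target word itself
            }
            String w = tokens[i].toLowerCase(Locale.ROOT);
            if (glossWords.contains(w) && counted.add(w)) {
                score++;
            }
        }
        return score;
    }

    /** Returns the sense id with the largest overlap; ties keep the earlier sense. */
    public static String bestSense(String[] tokens, int wordIndex, int window,
                                   Map<String, String> glosses) {
        String best = null;
        int bestScore = -1;
        for (Map.Entry<String, String> sense : glosses.entrySet()) {
            int score = overlap(tokens, wordIndex, window, sense.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = sense.getKey();
            }
        }
        return best; // null when no candidate senses were given
    }

    public static void main(String[] args) {
        // Toy sense inventory (made-up ids and glosses, not WordNet sense keys).
        Map<String, String> glosses = new LinkedHashMap<>();
        glosses.put("bank#river", "sloping land beside a body of water");
        glosses.put("bank#finance",
                "an institution that accepts deposits of money and makes loans");

        String[] words = "I deposited the money at the bank near the river".split(" ");
        int wordIndex = 6; // "bank"
        System.out.println(bestSense(words, wordIndex, 4, glosses));
    }
}
```

Here the window of 4 tokens on each side plays a role similar in spirit to the context window size set on LeskParameters above, but the mapping is only an analogy.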

Best,
Anthony

[1] : https://drive.google.com/folderview?id=0B67Iu3pf6WucfjdYNGhDc3hkTXd1a3FORnNUYzd3dV9YeWlyMFczeHU0SE1TcWwyU1lhZFU&usp=sharing
[2] : https://drive.google.com/file/d/0ByL0dmKXzHVfSXA3SVZiMnVfOGc/view?usp=sharing

> Date: Fri, 24 Jul 2015 09:54:02 +0200
> Subject: Re: Word Sense Disambiguator
> From: kottm...@gmail.com
> To: dev@opennlp.apache.org
>
> It would be nice if you could share instructions on how to run it.
> I also would like to give it a try.
>
> Jörn
>
> On Fri, Jul 24, 2015 at 4:54 AM, Anthony Beylerian <
> anthonybeyler...@hotmail.com> wrote:
>
> > Hello,
> > Yes for the moment we are only using WordNet for sense definitions. The
> > plan is to complete the package by mid to late August, but if you like you
> > can follow up on the progress from the sandbox.
> > Best regards,
> > Anthony
> > > Date: Thu, 23 Jul 2015 15:36:57 +0300
> > > Subject: Word Sense Disambiguator
> > > From: cristian.petro...@gmail.com
> > > To: dev@opennlp.apache.org
> > >
> > > Hi,
> > >
> > > I saw that there are people actively working on a Word Sense Disambiguator.
> > > Do you guys know when will the module be ready to use? Also I assume that
> > > wordnet is used to define the disambiguated word meaning?
> > >
> > > Thanks,
> > > Cristian