Dear Rodrigo,

As Anthony mentioned in his previous email, I have already started implementing the IMS approach. Pre-processing and feature extraction are finished.

According to the author, the approach shows some potential, although the proposed features are few and fairly basic. I think it could be improved by adding more context-specific features from other approaches. (That will require running many experiments with different feature combinations, but that should not be a problem.)

However, the approach requires a linear SVM classifier, and as far as I know OpenNLP only provides a Maximum Entropy classifier. Is it OK to use libsvm?

Regarding the training data, I have started collecting some from different sources. Most of the existing rich corpora are licensed (including the ones mentioned in the paper). The free ones I have found so far come from the Senseval and Semeval websites. However, these were built only to evaluate the methods proposed in the workshops, so they cover few target words, although the training data for each word is rich enough. In any case, the first tests on the collected Senseval and Semeval data should be finished soon. I am not sure, though, whether there is a dataset rich enough to train the model for the WSD module in the OpenNLP library. If you have any recommendations, I would be grateful for your help on this point.

Separately, we are cleaning up our implementation of the different Lesk variations. We currently use JWNL; if there are no objections, we will migrate to extJWNL.

As Jörn mentioned sending an initial patch: should we separate our code and upload two different patches to the two issues we created in Jira (which would mean a lot of redundant code), or should we keep everything in one project and upload that? If we opt for the latter, which issue should we upload the patch to?

Thanks,

Mondher, Anthony

On Mon, Jun 8, 2015 at 7:51 PM, Rodrigo Agerri <rage...@apache.org> wrote:

> Hello,
>
> +1 for using extJWNL instead of JWNL, I use it in some other projects
> too and it is very nice IMHO.
>
> R
>
> On Sat, Jun 6, 2015 at 12:55 PM, Aliaksandr Autayeu
> <aliaksa...@autayeu.com> wrote:
> > Thinking of impartiality... Anyway, I'm the author of extJWNL in case you
> > have questions.
> >
> > Aliaksandr
> >
> > On 6 June 2015 at 11:43, Richard Eckart de Castilho <
> > richard.eck...@gmail.com> wrote:
> >
> >> On 05.06.2015, at 14:24, Anthony Beylerian <anthonybeyler...@hotmail.com>
> >> wrote:
> >>
> >> > So just to make sure, we are currently relying on JWNL to access WordNet
> >> > as a resource.
> >>
> >> There is a more modern fork of JWNL available called
> >> http://extjwnl.sourceforge.net .
> >> It includes provisions of loading WordNet from the classpath, e.g.
> >> from Maven dependencies. It might be a nice replacement for JWNL and is
> >> also licensed under the BSD license. Pre-packaged WordNet Maven
> >> artifacts are also available.
> >>
> >> Cheers,
> >>
> >> -- Richard
>