Hi,

I have finished implementing most of the IMS approach. I also wrote a
parser for the Senseval-3 data.

However, I am currently working on two main points:

- I am trying to figure out how to use the MaxEnt classifier. Unfortunately
there is not enough documentation, so I am looking at how it is used by
the other components of OpenNLP. Any recommendations?

- I am training on SemCor, and I will use it as soon as I finish
implementing the "train", "load" and "disambiguate" methods of the IMS
approach.
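
For the "load" step, here is a minimal sketch of the per-word lookup
described in the quoted message below (use the ".bin" file if it exists,
otherwise train from the ".xml" file and save the result). The class name
WordModelCache, the Trainer interface and the raw byte[] model are
hypothetical placeholders of mine, not OpenNLP API; the real trainer would
wrap the MaxEnt classifier.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Hypothetical helper: loads a cached per-word model, or trains and caches it. */
public class WordModelCache {

    /** Hypothetical stand-in for the real trainer (e.g. a MaxEnt wrapper). */
    public interface Trainer {
        byte[] train(Path xmlFile) throws IOException;
    }

    private final Path xmlDir;   // folder holding one "<word>.xml" per word
    private final Path binDir;   // folder holding one "<word>.bin" per word
    private final Trainer trainer;

    public WordModelCache(Path xmlDir, Path binDir, Trainer trainer) {
        this.xmlDir = xmlDir;
        this.binDir = binDir;
        this.trainer = trainer;
    }

    /**
     * Returns the serialized model for the word, or an empty Optional
     * when neither a ".bin" nor an ".xml" file exists for it.
     */
    public Optional<byte[]> loadOrTrain(String word) throws IOException {
        Path bin = binDir.resolve(word + ".bin");
        if (Files.exists(bin)) {
            // Model already trained: reuse it.
            return Optional.of(Files.readAllBytes(bin));
        }
        Path xml = xmlDir.resolve(word + ".xml");
        if (!Files.exists(xml)) {
            // No training data for this word.
            return Optional.empty();
        }
        // Train once and store the result so the next call finds the ".bin".
        byte[] model = trainer.train(xml);
        Files.createDirectories(binDir);
        Files.write(bin, model);
        return Optional.of(model);
    }
}
```

With this layout, "disambiguate" would only need to call loadOrTrain for the
target word and fall back to some default when the result is empty.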

Regarding extJWNL, I worked with JWNL in a previous project. I checked
extJWNL, and the WordNet queries I need would be no different.

I will upload a patch as soon as I implement the aforementioned methods.

Best regards,

Mondher

On Fri, Jun 19, 2015 at 5:09 PM, Rodrigo Agerri <[email protected]>
wrote:

> Hi Mondher,
>
> On Fri, Jun 12, 2015 at 1:01 PM, Mondher Bouazizi
> <[email protected]> wrote:
> > Dear Rodrigo,
> >
> > Here is what I am planning to do in the next step:
> >
> > 1- I am currently implementing the IMS method, and using Senseval 3 data.
>
> Hi, I guess you are training on semcor?
>
> http://web.eecs.umich.edu/~mihalcea/downloads.html#semcor
>
>
> > Since the disambiguation training set has to be very big (a few hundred
> > MB if we want it to contain all the words), I thought it might be better
> > to load only the model for the word being disambiguated. Therefore, I
> > set up the following:
> >
> >    - A folder containing the ".xml" files where the data related to each
> >    word are stored.
> >    - A folder containing the ".bin" files (one for each word)
> >
> > The idea is that each time the module is called to disambiguate a word,
> > we first check whether the ".bin" file exists. If it does, we use it to
> > disambiguate the word. Otherwise (i.e., the ".bin" file is not there), we
> > check whether the ".xml" file exists. If it does, its data is used to
> > train the model and create the ".bin" file (that way, the next time the
> > user wants to disambiguate the same word, the ".bin" file is already
> > there).
> >
> > Is that OK?
>
> OK.
>
> >
> > 2- For the implementation, I will need to refer to some external sources
> > (e.g., WordNet) to get the sense definition, because the classifier will
> > return only the ID of the sense. I can either query WordNet for the
> > senses (note: in the case of the Senseval 3 data, the senses are already
> > extracted and put in a separate file; however, when we generalize the
> > approach, the senses probably won't be given), or collect all the senses
> > once, store them, and look them up when needed. I am planning to
> > implement the first approach; however, since I am experimenting on the
> > Senseval 3 data set, I will first use the resources I have there.
>
> Good. You can do that with extJWNL, right?
>
> Cheers,
>
> Rodrigo
>
