Hi Mondher,

On Fri, Jun 12, 2015 at 1:01 PM, Mondher Bouazizi
<[email protected]> wrote:
> Dear Rodrigo,
>
> Here is what I am planning to do in the next step:
>
> 1- I am currently implementing the IMS method, and using Senseval 3 data.

I guess you are training on SemCor?

http://web.eecs.umich.edu/~mihalcea/downloads.html#semcor


> Since the disambiguation training set has to be very big (a few hundred
> MB if we want it to contain all the words), I thought it might be
> better to load only the model for the word to disambiguate. Therefore, I
> set up the following:
>
>    - A folder containing the ".xml" files where the data related to each
>    word are stored.
>    - A folder containing the ".bin" files (one for each word)
>
> The idea is that each time the module is called to disambiguate a word, we
> first check if the ".bin" file exists. If it is there, we use the file to
> disambiguate the word. Otherwise (i.e., the bin file is not there), we
> check if the ".xml" file exists. If it does, the xml file data will be used
> to train the model and create the ".bin" file (that way, the next time
> the user wants to disambiguate the same word, the ".bin" file is already
> there).
>
> Is that OK ?

OK.
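
To make sure we mean the same thing, here is a minimal sketch of that lookup
order. The class name WordModelCache and the Trainer interface are made up for
illustration (Trainer stands in for whatever the IMS training step ends up
being), and I am assuming one "<word>.xml" and one "<word>.bin" file per word,
as you describe:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/** Sketch: lazily trains and caches one serialized model file per word. */
final class WordModelCache {

    /** Hypothetical stand-in for the real training step (not an existing API). */
    interface Trainer {
        byte[] train(Path xmlTrainingData) throws IOException;
    }

    private final Path xmlDir;   // folder holding "<word>.xml" training data
    private final Path binDir;   // folder holding "<word>.bin" trained models
    private final Trainer trainer;

    WordModelCache(Path xmlDir, Path binDir, Trainer trainer) {
        this.xmlDir = xmlDir;
        this.binDir = binDir;
        this.trainer = trainer;
    }

    /**
     * Returns the path of the model file for the given word, training it
     * first if only the ".xml" data exists. Returns an empty Optional when
     * neither file is available.
     */
    Optional<Path> modelFor(String word) throws IOException {
        Path bin = binDir.resolve(word + ".bin");
        if (Files.exists(bin)) {
            return Optional.of(bin);          // already trained: reuse it
        }
        Path xml = xmlDir.resolve(word + ".xml");
        if (!Files.exists(xml)) {
            return Optional.empty();          // no model and no training data
        }
        byte[] model = trainer.train(xml);    // train once from the xml data
        Files.createDirectories(binDir);
        Files.write(bin, model);              // cache for the next call
        return Optional.of(bin);
    }
}
```

With this shape, the second request for the same word never reaches the
trainer, and a word with no training data is reported explicitly instead of
failing later in the classifier.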

>
> 2- For the implementation, I will refer to some external sources
> (e.g., WordNet) to get the sense definition, because the classifier will
> return only the ID of the sense. I can either query WordNet
> for the senses (note: in the case of Senseval 3 data, the senses are
> already extracted and put in a separate file; however, when we generalize
> the approach, the senses probably won't be given) or collect all the senses
> once, store them, and refer to them to get the sense. I am planning to
> implement the first approach; however, since I am experimenting on the
> Senseval 3 data set, I will first use the resources I have there.

Good. You can do that with extJWNL, right?
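
One way to keep both options open is to hide the lookup behind a small class:
check the senses you already have first, and fall back to a WordNet query for
anything else. The sketch below is only an illustration. SenseDefinitions is a
made-up name, the preloaded map stands for the senses shipped with the
Senseval 3 data (however you end up parsing that file), and the fallback
function is a placeholder for the actual WordNet call, which is not shown:

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Sketch: resolves a sense ID to its definition, local data first. */
final class SenseDefinitions {

    // Senses collected ahead of time (e.g. from the data set's own sense file).
    private final Map<String, String> preloaded;

    // Placeholder for a WordNet query; supplied by the caller (assumption).
    private final Function<String, Optional<String>> wordNetQuery;

    SenseDefinitions(Map<String, String> preloaded,
                     Function<String, Optional<String>> wordNetQuery) {
        this.preloaded = preloaded;
        this.wordNetQuery = wordNetQuery;
    }

    /** Returns the definition for a sense ID, or empty if it is unknown. */
    Optional<String> definitionOf(String senseId) {
        String local = preloaded.get(senseId);
        if (local != null) {
            return Optional.of(local);        // found in the stored senses
        }
        return wordNetQuery.apply(senseId);   // otherwise ask WordNet
    }
}
```

That way the classifier code only ever calls definitionOf, and switching from
the Senseval 3 resources to live WordNet queries does not touch it.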

Cheers,

Rodrigo
