[jira] [Updated] (OPENNLP-790) Add an evaluator to the WSDisambiguator component

Mondher Bouazizi (JIRA) Mon, 27 Jul 2015 21:02:33 -0700

     [ 
https://issues.apache.org/jira/browse/OPENNLP-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Mondher Bouazizi updated OPENNLP-790:
-------------------------------------
    Attachment: opennlp-wsd-20150728.patch

Dear all,

Please find attached the patch, made after I wrote our own semcor reader. 
Please note that, since the previous patch hasn't been applied, this patch 
contains the following:

- Fix for the IMS approach to support semcor3.0 data.
- The output format is now [Source SenseKey] so it corresponds to that of Lesk.
- Removed some unused variables.
- Added some parameters to let users select the source of data they want to 
use.
- Implemented the IMS Evaluator.
- Added and clarified some parts of the documentation.

Please note that the following classes are no longer needed:
- IMSFactory
- Tester

Please also find below the links to download the semcor3.0 [1] and senseval3 
[2] data sets. The files differ slightly from the original ones, since many of 
the originals contain some issues (later, I will make sure to handle these 
issues in the program itself).
(@Anthony: please consider the new changes.)

Please note that the folders in src/test/resources are now as follows:
- src
  - test
    - resources
      + semcor3.0
      + senseval3
      - supervised
        + models


---------------

To run the IMS evaluator, please follow these steps:

1- Create a resource models directory:

- src
  - test
    - resources
      + models

2- Include the following pre-trained models and dictionary in that directory.
You can find them here [1] if you like, or train your own models:

{
en-token.bin,
en-pos-maxent.bin,
en-sent.bin,
en-ner-person.bin,
en-lemmatizer.dict
}

3- Please add these folders:

- src
  - test
    - resources
      - semcor3.0
      - senseval3

As mentioned before, they are to be downloaded from [1] and [2].

4- I included an example [IMSTester.java] that you can run directly, or you 
can write your own tests. The result is in the format:
[Source SenseKey]

5- The IMS evaluator is designed to evaluate senseval3 data, using semcor3.0 
as the source of training data.
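
The resource layout from steps 1 to 3 can be checked before running the 
evaluator. The following is only a minimal sketch in Java and is not part of 
the attached patch: the class name WsdResourceCheck and its missing() method 
are hypothetical, and it assumes the models live in src/test/resources/models 
as described in step 1.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of the patch): checks the test-resource layout. */
public class WsdResourceCheck {

    // File names listed in step 2.
    static final List<String> MODEL_FILES = Arrays.asList(
        "en-token.bin", "en-pos-maxent.bin", "en-sent.bin",
        "en-ner-person.bin", "en-lemmatizer.dict");

    // Data folders listed in step 3.
    static final List<String> DATA_DIRS = Arrays.asList("semcor3.0", "senseval3");

    /** Returns the entries (relative to resourcesRoot) that are missing. */
    static List<String> missing(Path resourcesRoot) {
        List<String> missing = new ArrayList<>();
        for (String name : MODEL_FILES) {
            // Assumption: models sit directly under <resources>/models.
            if (!Files.isRegularFile(resourcesRoot.resolve("models").resolve(name))) {
                missing.add("models/" + name);
            }
        }
        for (String dir : DATA_DIRS) {
            if (!Files.isDirectory(resourcesRoot.resolve(dir))) {
                missing.add(dir + "/");
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Defaults to the path used in the steps; another root can be passed as an argument.
        Path root = Paths.get(args.length > 0 ? args[0] : "src/test/resources");
        List<String> missing = missing(root);
        if (missing.isEmpty()) {
            System.out.println("All expected resources found under " + root);
        } else {
            System.out.println("Missing under " + root + ": " + missing);
        }
    }
}
```

Running it from the module root with no arguments checks src/test/resources; 
any entry it reports as missing corresponds to one of the files or folders 
named in steps 2 and 3.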

If you need any further details, please feel free to ask.

Yours sincerely,

Mondher

[1] 
https://drive.google.com/file/d/0ByL0dmKXzHVfd05ONGxKblQ4QXc/view?usp=sharing
[2] 
https://drive.google.com/file/d/0ByL0dmKXzHVfUlRhRnZUR003bUk/view?usp=sharing

> Add an evaluator to the WSDisambiguator component
> -------------------------------------------------
>
>                 Key: OPENNLP-790
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-790
>             Project: OpenNLP
>          Issue Type: Task
>          Components: wsd
>            Reporter: Anthony Beylerian
>         Attachments: edu.mit.jsemcor_1.0.1.jar, evaluatorv1.patch, 
> evaluatorv2.patch, helpers_data.patch, opennlp-wsd-20150718.patch, 
> opennlp-wsd-20150728.patch
>
>
> The WSDisambiguator needs an evaluator to measure the performance of its 
> different implementations. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
