[
https://issues.apache.org/jira/browse/OPENNLP-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mondher Bouazizi updated OPENNLP-790:
-------------------------------------
Attachment: opennlp-wsd-20150728.patch
Dear all,
Please find attached the patch, which now includes our own semcor reader. Please
note that, since the previous patch hasn't been applied, this patch contains the
following:
- Fix for the IMS approach to support Semcor3.0 data
- The output format is now [Source SenseKey] so it corresponds to that of Lesk.
- Removed some unused variables.
- Added some parameters to let users select the source of data they want to
use.
- Implemented the IMS Evaluator.
- Added and clarified some parts of the documentation.
Please note that the following classes are no longer needed:
- IMSFactory
- Tester
Please also find the links to download the semcor3.0 [1] and senseval3 [2]
data sets. The files differ slightly from the original ones, since many of
the originals contain some issues (later, I will make sure to handle these
issues in the program itself).
(@Anthony: please consider the new changes.)
Please note that the folders in src/test/resources are now as follows:
- src
  - test
    - resources
      + semcor3.0
      + senseval3
      - supervised
        + models
---------------
To run the IMS evaluator, please follow these steps:
1- Create a resource models directory:
- src
  - test
    - resources
      + models
2- Include the following pre-trained models and dictionary in that directory.
You can find them here [1] if you like, or train your own models.
{
en-token.bin,
en-pos-maxent.bin,
en-sent.bin,
en-ner-person.bin,
en-lemmatizer.dict
}
3- Please add these folders:
- src
  - test
    - resources
      - semcor3.0
      - senseval3
As mentioned before, they are to be downloaded from [1] and [2].
4- I included an example [IMSTester.java] that you can run directly, or you can
write your own tests. The result is in the format:
[Source SenseKey]
5- The IMS evaluator is made to evaluate senseval3 data, using semcor3.0 as a
source for training data.
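
Before running the evaluator, the layout from steps 1 to 3 can be checked with a
small standalone program. This is only a rough sketch and not part of the patch:
the class name WsdResourceCheck and the method missingResources are made up for
illustration, and it assumes the models sit directly in
src/test/resources/models as described in step 1.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the patch): checks that the files and
// folders listed in steps 1 to 3 exist under a given resources directory.
public class WsdResourceCheck {

    // Model and dictionary files listed in step 2.
    static final List<String> MODEL_FILES = List.of(
            "en-token.bin",
            "en-pos-maxent.bin",
            "en-sent.bin",
            "en-ner-person.bin",
            "en-lemmatizer.dict");

    // Data set folders listed in step 3.
    static final List<String> DATA_DIRS = List.of("semcor3.0", "senseval3");

    // Returns the relative paths of everything that is missing.
    static List<String> missingResources(Path resourcesDir) {
        List<String> missing = new ArrayList<>();
        // Assumption: models live in <resources>/models, as in step 1.
        Path models = resourcesDir.resolve("models");
        for (String name : MODEL_FILES) {
            if (!Files.isRegularFile(models.resolve(name))) {
                missing.add("models/" + name);
            }
        }
        for (String dir : DATA_DIRS) {
            if (!Files.isDirectory(resourcesDir.resolve(dir))) {
                missing.add(dir + "/");
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Defaults to the path used in this message; can be overridden.
        Path resources = Paths.get(args.length > 0 ? args[0] : "src/test/resources");
        List<String> missing = missingResources(resources);
        if (missing.isEmpty()) {
            System.out.println("All resources found under " + resources);
        } else {
            for (String m : missing) {
                System.out.println("Missing: " + m);
            }
        }
    }
}
```

Run it from the project root, or pass the resources directory as the first
argument if it is located elsewhere.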
If you need any further details, please feel free to ask.
Yours sincerely,
Mondher
[1]
https://drive.google.com/file/d/0ByL0dmKXzHVfd05ONGxKblQ4QXc/view?usp=sharing
[2]
https://drive.google.com/file/d/0ByL0dmKXzHVfUlRhRnZUR003bUk/view?usp=sharing
> Add an evaluator to the WSDisambiguator component
> -------------------------------------------------
>
> Key: OPENNLP-790
> URL: https://issues.apache.org/jira/browse/OPENNLP-790
> Project: OpenNLP
> Issue Type: Task
> Components: wsd
> Reporter: Anthony Beylerian
> Attachments: edu.mit.jsemcor_1.0.1.jar, evaluatorv1.patch,
> evaluatorv2.patch, helpers_data.patch, opennlp-wsd-20150718.patch,
> opennlp-wsd-20150728.patch
>
>
> The WSDisambiguator needs an evaluator to measure the performance of its
> different implementations.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)