Re: Word sense disambiguation

Cristian Petroaca Thu, 01 Mar 2018 11:40:09 -0800

I know it's not open source. I was referring to replicating their
graph-based model using WordNet.
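
To make the idea concrete, here is a minimal sketch of a graph-based approach, assuming a toy, hand-written sense inventory in place of WordNet. It is not Babelfy's algorithm; it only scores each candidate sense by how many relation edges connect it to candidate senses of the other context words. The sense labels, the SENSES and RELATED tables, and the disambiguate function are all hypothetical names made up for illustration.

```python
# Toy sense inventory (hypothetical, hand-written for illustration; a real
# setup would read candidate senses from WordNet).
SENSES = {
    "bank": ["bank.finance", "bank.river"],
    "deposit": ["deposit.money", "deposit.sediment"],
    "loan": ["loan.money"],
}

# Undirected semantic-relation edges between senses (also hypothetical).
RELATED = {
    frozenset(("bank.finance", "deposit.money")),
    frozenset(("bank.finance", "loan.money")),
    frozenset(("deposit.money", "loan.money")),
    frozenset(("bank.river", "deposit.sediment")),
}


def disambiguate(words):
    """For each word, pick the candidate sense with the most edges to
    candidate senses of the other words in the context."""
    result = {}
    for word in words:
        def degree(sense):
            # Count edges from this sense to any sense of any other word.
            return sum(
                frozenset((sense, other)) in RELATED
                for w in words if w != word
                for other in SENSES[w]
            )
        # On ties, max() keeps the first-listed sense.
        result[word] = max(SENSES[word], key=degree)
    return result


if __name__ == "__main__":
    print(disambiguate(["bank", "deposit", "loan"]))
```

With real data the edges would come from WordNet relations (hypernymy, meronymy and so on), and the scoring would probably need something stronger than a plain degree count.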


On Wed, Feb 28, 2018 at 8:47 AM, Rodrigo Agerri <rodrigo.age...@ehu.eus>
wrote:

> Hello,
>
> Babelfy is not open source software. DBpedia Spotlight performs Named
> Entity Disambiguation (APL 2.0), UKB (GPL) does WSD and obtains very
> good results, and the IMS system is available for download. There will
> be others, I am sure, but just talking off the top of my head.
>
> HTH
>
> R
>
> On Tue, Feb 27, 2018 at 9:22 PM, Cristian Petroaca
> <cristian.petro...@gmail.com> wrote:
> > I agree with you. WSD should be included in OpenNLP once it has a
> > reasonably good performance.
> > On the other hand, I have seen few libraries or APIs doing WSD and almost
> > none doing it right. That may be indicative of how hard the problem is.
> >
> > The only promising API I found is Babelfy: http://babelfy.org/about. It
> > uses a graph-based model built on their BabelNet knowledge base to
> > predict word senses. I think it's based on this paper:
> > http://www.aclweb.org/anthology/Q14-1019. Any thoughts on this?
> >
> > On Sat, Feb 24, 2018 at 7:49 PM, Anthony Beylerian <
> > anthony.beyler...@gmail.com> wrote:
> >
> >> Hey Cristian,
> >>
> >> We have tried different approaches such as:
> >>
> >> - Lesk (original) [1]
> >> - Most frequent sense from the data (MFS)
> >> - Extended Lesk (with different scoring functions)
> >> - It Makes Sense (IMS) [2]
> >> - A sense clustering approach (I don't immediately recall the reference)
> >>
> >> Lesk and MFS are meant to be used as baselines for evaluation purposes
> >> only. The extended version of Lesk is an effort to improve the original
> >> through additional information from semantic relationships.
> >> Although it's not very accurate, it could be useful since it is an
> >> unsupervised method (no need for large training data).
> >> However, there were some caveats, as both approaches need to pre-load
> >> dictionaries as well as score a semantic graph from WordNet at runtime.
> >>
> >> IMS is a supervised method which we were hoping to mainly use, since it
> >> scored around 80% accuracy on SemEval; however, that is only for the
> >> coarse-grained case. In reality, words have various degrees of
> >> polysemy, and when tested in the fine-grained case the results were much
> >> lower.
> >> We have also experimented with a simple clustering approach but the
> >> improvements were not considerable as far as I remember.
> >>
> >> I just checked the latest results on SemEval-2015 [3] and they look a bit
> >> improved in the fine-grained case (~65% F1).
> >> However, in some particular domains it looks like the accuracy
> >> increases, so it could depend on the use case.
> >>
> >> On the other hand, there could be some more recent studies that could
> >> yield better results, but that would need some more investigation.
> >>
> >> There are also some other issues such as lack of direct multi-lingual
> >> support from WordNet, missing sense definitions etc.
> >> We were also still looking for a better source of sense definitions back
> >> then.
> >> In any case, I believe it would be better to have higher performance
> >> before putting this in the official distribution; however, that highly
> >> depends on the team.
> >> Otherwise, different parts of the code just need some simple
> >> refactoring as well.
> >>
> >> Best,
> >>
> >> Anthony
> >>
> >> [1] : M. Lesk, Automatic sense disambiguation using machine readable
> >> dictionaries
> >> [2] : https://www.comp.nus.edu.sg/~nght/pubs/ims.pdf
> >> [3] : http://alt.qcri.org/semeval2015/task13/index.php?id=results
> >>
> >> On Wed, Feb 21, 2018 at 5:26 AM, Cristian Petroaca <
> >> cristian.petro...@gmail.com> wrote:
> >>
> >> > Hi Anthony,
> >> >
> >> > I'd be interested to discuss this further.
> >> > What are the WSD methods used? Any links to papers?
> >> > How does the module perform when being evaluated against Senseval?
> >> >
> >> > How much work do you think is necessary in order to have a
> >> > functioning WSD module in the context of OpenNLP?
> >> >
> >> > Thanks,
> >> > Cristian
> >> >
> >> >
> >> >
> >> > On Tue, Feb 20, 2018 at 8:09 AM, Anthony Beylerian <
> >> > anthony.beyler...@gmail.com> wrote:
> >> >
> >> >> Hi Cristian,
> >> >>
> >> >> Thank you for your interest.
> >> >>
> >> >> The WSD module is currently experimental, so as far as I am aware
> >> >> there is no timeline for it.
> >> >>
> >> >> You can find the sandboxed version here:
> >> >> https://github.com/apache/opennlp-sandbox/tree/master/opennlp-wsd
> >> >>
> >> >> I personally didn't have the time to revisit this for a while and
> >> >> there are still some details to work out.
> >> >> But if you are really interested, you are welcome to discuss and
> >> >> contribute.
> >> >> I will assist as much as possible.
> >> >>
> >> >> Best,
> >> >>
> >> >> Anthony
> >> >>
> >> >> On Sun, Feb 18, 2018 at 5:52 AM, Cristian Petroaca <
> >> >> cristian.petro...@gmail.com> wrote:
> >> >>
> >> >>> Hi,
> >> >>>
> >> >>> I'm interested in word sense disambiguation (particularly based on
> >> >>> WordNet). I noticed that the latest OpenNLP version doesn't have any,
> >> >>> but I remember that a couple of years ago there was somebody working
> >> >>> on implementing it. Why isn't it in the official OpenNLP jar? Is
> >> >>> there a timeline for adding it?
> >> >>>
> >> >>> Thanks,
> >> >>> Cristian
> >> >>>
> >> >>
> >> >>
> >> >
> >>
>
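
For reference, the Lesk and MFS baselines mentioned in the quoted thread can be sketched roughly as follows. This is the simplified Lesk variant (overlap between the context and each sense gloss), not the original gloss-to-gloss formulation and not the code in the opennlp-wsd sandbox. The glosses, the stopword list, and the function names are hypothetical and hand-written for illustration; senses are assumed to be listed most-frequent first.

```python
# Toy glosses, most frequent sense listed first (hypothetical data; a real
# setup would take glosses and sense order from WordNet).
GLOSSES = {
    "bank": [
        ("bank.finance", "an institution that accepts deposits and lends money"),
        ("bank.river", "sloping land beside a body of water such as a river"),
    ],
}

STOPWORDS = {"a", "an", "the", "of", "and", "that", "as", "such", "on", "to", "in"}


def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {t for t in text.lower().split() if t not in STOPWORDS}


def mfs(word):
    """Most-frequent-sense baseline: always return the first-listed sense."""
    return GLOSSES[word][0][0]


def simplified_lesk(word, context):
    """Return the sense whose gloss shares the most tokens with the context,
    falling back to the MFS baseline when no gloss overlaps at all."""
    ctx = tokens(context) - {word}
    best, best_overlap = mfs(word), 0
    for sense, gloss in GLOSSES[word]:
        overlap = len(ctx & tokens(gloss))
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best


if __name__ == "__main__":
    print(simplified_lesk("bank", "they sat on the bank of the river and watched the water"))
    print(simplified_lesk("bank", "I went to the bank"))
```

The sketch needs no training data, but it does need the glosses loaded up front, which matches the pre-loading caveat raised in the thread.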
