I know it's not open source. I was referring to replicating their graph-based model using WordNet.
On Wed, Feb 28, 2018 at 8:47 AM, Rodrigo Agerri <rodrigo.age...@ehu.eus> wrote:
> Hello,
>
> Babelfy is not open source software. DBpedia Spotlight performs Named
> Entity Disambiguation (APL 2.0), UKB (GPL) does WSD and obtains very
> good results, and the IMS system is available for download. There will
> be others, I am sure, but just talking off the top of my head.
>
> HTH
>
> R
>
> On Tue, Feb 27, 2018 at 9:22 PM, Cristian Petroaca
> <cristian.petro...@gmail.com> wrote:
> > I agree with you. WSD should be included in OpenNLP once it has
> > reasonably good performance.
> > On the other hand, I have seen few libraries or APIs doing WSD and almost
> > none doing it right. That may be indicative of how hard the problem is.
> >
> > The only promising API I found is Babelfy: http://babelfy.org/about. It
> > uses a graph-based model built on their BabelNet knowledge base in order to
> > predict word senses. I think it's based on this paper:
> > http://www.aclweb.org/anthology/Q14-1019. Any thoughts on this?
> >
> > On Sat, Feb 24, 2018 at 7:49 PM, Anthony Beylerian <
> > anthony.beyler...@gmail.com> wrote:
> >
> >> Hey Cristian,
> >>
> >> We have tried different approaches such as:
> >>
> >> - Lesk (original) [1]
> >> - Most frequent sense from the data (MFS)
> >> - Extended Lesk (with different scoring functions)
> >> - It Makes Sense (IMS) [2]
> >> - A sense clustering approach (I don't immediately recall the reference)
> >>
> >> Lesk and MFS are meant to be used as baselines for evaluation purposes only.
> >> The extended version of Lesk is an effort to improve the original, through
> >> additional information from semantic relationships.
> >> Although it's not very accurate, it could be useful since it is an
> >> unsupervised method (no need for large training data).
> >> However, there were some caveats, as both approaches need to pre-load
> >> dictionaries as well as score a semantic graph from WordNet at runtime.
> >>
> >> IMS is a supervised method which we were hoping to mainly use, since it
> >> scored around 80% accuracy on SemEval, but that is only for the
> >> coarse-grained case. In reality, words have various degrees of
> >> polysemy, and when tested in the fine-grained case the results were much
> >> lower.
> >> We have also experimented with a simple clustering approach but the
> >> improvements were not considerable as far as I remember.
> >>
> >> I just checked the latest results on SemEval-2015 [3] and they look a bit
> >> improved on the fine-grained case, ~65% F1.
> >> However, in some particular domains it looks like the accuracy increases,
> >> so it could depend on the use case.
> >>
> >> On the other hand, there could be some more recent studies that could yield
> >> better results, but that would need some more investigation.
> >>
> >> There are also some other issues such as lack of direct multilingual
> >> support from WordNet, missing sense definitions, etc.
> >> We were also still looking for a better source of sense definitions back
> >> then.
> >> In any case, I believe it would be better to have higher performance before
> >> putting this in the official distribution; however, that highly depends on
> >> the team.
> >> Otherwise, different parts of the code just need some simple refactoring as
> >> well.
> >>
> >> Best,
> >>
> >> Anthony
> >>
> >> [1] : M. Lesk, Automatic sense disambiguation using machine readable
> >> dictionaries
> >> [2] : https://www.comp.nus.edu.sg/~nght/pubs/ims.pdf
> >> [3] : http://alt.qcri.org/semeval2015/task13/index.php?id=results
> >>
> >> On Wed, Feb 21, 2018 at 5:26 AM, Cristian Petroaca <
> >> cristian.petro...@gmail.com> wrote:
> >>
> >> > Hi Anthony,
> >> >
> >> > I'd be interested to discuss this further.
> >> > What are the WSD methods used? Any links to papers?
> >> > How does the module perform when being evaluated against Senseval?
> >> >
> >> > How much work do you think is necessary in order to have a functioning
> >> > WSD module in the context of OpenNLP?
> >> >
> >> > Thanks,
> >> > Cristian
> >> >
> >> > On Tue, Feb 20, 2018 at 8:09 AM, Anthony Beylerian <
> >> > anthony.beyler...@gmail.com> wrote:
> >> >
> >> >> Hi Cristian,
> >> >>
> >> >> Thank you for your interest.
> >> >>
> >> >> The WSD module is currently experimental, so as far as I am aware there
> >> >> is no timeline for it.
> >> >>
> >> >> You can find the sandboxed version here:
> >> >> https://github.com/apache/opennlp-sandbox/tree/master/opennlp-wsd
> >> >>
> >> >> I personally didn't have the time to revisit this for a while and there
> >> >> are still some details to work out.
> >> >> But if you are really interested, you are welcome to discuss and
> >> >> contribute.
> >> >> I will assist as much as possible.
> >> >>
> >> >> Best,
> >> >>
> >> >> Anthony
> >> >>
> >> >> On Sun, Feb 18, 2018 at 5:52 AM, Cristian Petroaca <
> >> >> cristian.petro...@gmail.com> wrote:
> >> >>
> >> >>> Hi,
> >> >>>
> >> >>> I'm interested in word sense disambiguation (particularly based on
> >> >>> WordNet). I noticed that the latest OpenNLP version doesn't have any, but I
> >> >>> remember that a couple of years ago there was somebody working on
> >> >>> implementing it. Why isn't it in the official OpenNLP jar? Is there a
> >> >>> timeline for adding it?
> >> >>>
> >> >>> Thanks,
> >> >>> Cristian
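
For reference, the Lesk baseline mentioned in the thread can be sketched roughly as below. This is only an illustration of the simplified Lesk idea (pick the sense whose gloss shares the most words with the context), not the code in the opennlp-wsd sandbox. The sense inventory TOY_SENSES, the stopword list, and the function names are all made up for the example; a real setup would pull glosses from WordNet. Senses are assumed to be listed in frequency order, so the first one serves as a most-frequent-sense fallback when nothing overlaps.

```python
import re

# Hypothetical toy sense inventory (NOT WordNet data): word -> [(sense id, gloss)].
# Senses are assumed to be ordered by frequency, so the first entry doubles
# as the most-frequent-sense (MFS) fallback.
TOY_SENSES = {
    "bank": [
        ("bank.finance", "an institution that accepts deposits and lends money"),
        ("bank.river", "sloping land beside a river or other body of water"),
    ],
}

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "in", "on", "by", "at", "i"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword alphabetic tokens."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def simplified_lesk(word, context, senses=TOY_SENSES):
    """Return the sense id whose gloss overlaps most with the context words.

    Falls back to the first listed sense when no gloss overlaps, and
    returns None when the word is not in the inventory.
    """
    candidates = senses.get(word.lower())
    if not candidates:
        return None
    # Exclude the target word itself so it cannot inflate the overlap.
    context_tokens = tokenize(context) - {word.lower()}
    best_sense, best_overlap = candidates[0][0], 0
    for sense_id, gloss in candidates:
        overlap = len(context_tokens & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("bank", "We sat on the bank of the river and watched the water"))
    print(simplified_lesk("bank", "I need to deposit money at the bank"))
```

The extended Lesk variant discussed above would additionally fold in glosses of related synsets (hypernyms, hyponyms and so on) before counting overlap; that part is left out here to keep the sketch short.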