Hello Abhishek,
Great to have you here! For the keyword search topic, please check out
* http://goo.gl/dPbP3F
* http://dl.acm.org/citation.cfm?id=2488488
Feel free to contact me for questions and/or a warm-up task.
Best regards,
Axel
Hi all,
Recently I checked out the DBpedia ideas list for GSoC 2015, and I
must admit that every idea is more interesting than the previous
one. While looking for ideas that interest me, I found the following
most fascinating. I wish I could work on all of them, but
unfortunately I can't:
1) 5.1 Fact Extraction from Wikipedia Text
2) 5.9 Keyword Search on DBpedia Data
3) 5.16 DBpedia Spotlight - Better Context Vectors
4) 5.17 DBpedia Spotlight - Better Surface form Matching
5) 5.19 DBpedia Spotlight - Confidence / Relevance Scores
But among these I found several ideas to be interlinked; in other
words, one solution might lead to another. For example, in 5.1, 5.16
and 5.17 the primary problems are Entity Linking (EL) and Word Sense
Disambiguation (WSD): mapping raw text to DBpedia entities so as to
understand the text and disambiguate senses or entities. So if we can
address these two tasks efficiently, we can solve the problems
associated with these three ideas.
The following are some methods described in the research papers
listed in the references of these ideas.
1) FrameNet: Identify frames (each describing a particular type of
situation along with its participants, i.e. task, doer and props),
then identify Lexical Units and their associated Frame Elements
using models trained primarily on crowd-sourced data. Primarily used
for Automatic Semantic Role Labeling.
2) Babelfy: Using a wide-coverage semantic network that encodes
structural and lexical information of both encyclopedic and
lexicographic kinds (e.g. Wikipedia and WordNet, respectively), we
can also accomplish our tasks (EL and WSD). It uses a graph-based
method along with some heuristics to extract the most relevant
meaning from the text.
3) Word2vec / GloVe - Methods for learning word vectors based on the
context in which words appear. These can be employed for WSD.
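To illustrate the word-vector idea in item 3, here is a minimal sketch
in Python. It assumes toy 3-dimensional vectors invented for this
example and two hypothetical candidate entities for the surface form
"bank"; a real system would load vectors trained with word2vec or
GloVe. The candidate whose vector is closest (by cosine similarity) to
the averaged context vector is chosen.

```python
import math

# Toy vectors, invented purely for illustration; real vectors would
# come from a trained word2vec or GloVe model.
WORD_VECTORS = {
    "river": [0.9, 0.1, 0.0],
    "water": [0.8, 0.2, 0.1],
    "money": [0.1, 0.9, 0.1],
    "loan": [0.0, 0.8, 0.2],
}

# Hypothetical candidate entities for the ambiguous surface form "bank".
CANDIDATES = {
    "Bank_(geography)": [0.85, 0.15, 0.05],
    "Bank_(finance)": [0.05, 0.9, 0.1],
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def context_vector(words):
    """Average the vectors of the known context words; None if none are known."""
    known = [WORD_VECTORS[w] for w in words if w in WORD_VECTORS]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def disambiguate(context_words):
    """Return the candidate most similar to the context, or None."""
    ctx = context_vector(context_words)
    if ctx is None:
        return None
    return max(CANDIDATES, key=lambda name: cosine(ctx, CANDIDATES[name]))


if __name__ == "__main__":
    print(disambiguate(["river", "water"]))
    print(disambiguate(["money", "loan"]))
```

Averaging the context is the simplest possible choice here; it is only
meant to show the principle, not how Spotlight builds its context
vectors.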
Moreover, if those problems are solved, we can address Keyword
Search (5.9) and Confidence Scoring (5.19) effectively, as both
require associating entities with the raw text, which provides the
relevant entity and its attributes to search with, as well as the
confidence score.
So I would like to work on 5.16 or 5.17, which encompass those two
tasks (EL and WSD). Which method would be best for these two tasks?
In my opinion, the Babelfy method is appropriate for both.
Thanks,
Abhishek Gupta
On Feb 23, 2015 5:46 PM, "Thiago Galery" <[email protected]> wrote:
Hi Abhishek, if you are interested in contributing to any DBpedia
project or participating in GSoC this year, it might be a good idea
to take a look at this page: http://wiki.dbpedia.org/gsoc2015/ideas
This might help you to specify how and where you can contribute.
Hope this helps,
Thiago
On Sun, Feb 22, 2015 at 2:09 PM, Abhishek Gupta <[email protected]> wrote:
Hi all,
I am Abhishek Gupta, a student of Electrical Engineering at IIT
Delhi. Recently I have worked on projects related to Machine
Learning and Natural Language Processing (specifically Information
Extraction), in which I extracted Named Entities from raw text to
populate a knowledge base with new entities. Hence I am inclined to
work in this area. Besides this, I am familiar with programming
languages, primarily C, C++ and Java.
So I believe I can contribute a lot towards extracting structured
data from Wikipedia, which is one of the primary steps towards
DBpedia's main goal.
Could anyone please help me figure out where to start so that I can
contribute to this?
Regards
Abhishek Gupta
_______________________________________________
Dbpedia-gsoc mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
--
Axel Ngonga, Dr. rer. nat
Head of AKSW
Augustusplatz 10
Room P905
04109 Leipzig
http://aksw.org/AxelNgonga
Tel: +49 (0)341 9732341
Fax: +49 (0)341 9732239