Hi,
I am just taking this opportunity to introduce myself to the group. My name
is Ved Mathai, and I am from Bangalore, India. I am a Masters student in
Computer Science at the International Institute of Information Technology,
Bangalore (IIIT-B). My research interests include web ontology and
semantics, NLP, and information retrieval. I am learning machine learning,
game theory and agent-based modeling as part of my coursework.
For the last 8 months, though, I have been working on a project (actually
another student's PhD project) that uses DBpedia very closely. It takes a
simple CSV file from, let's say, data.gov about some topic, say commodity
prices or traffic details. Many of these tables aren't topically mapped to
an ontology. So, using type information from DBpedia (and SKOS classes), we
find the most common non-trivial types for each data value and store their
frequency of occurrence. Then we look for properties ?p in triples
(?s ?p ?o), where ?s is an item from one column of a row and ?o is the data
from another column of the same row, and we record how often each property
occurs between each pair of columns. From this we learn not only the topic
(theme) of the table but also how the columns relate to each other
(scheme), so that we can suggest tuples back to a dataset (say DBpedia
itself).
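
A rough sketch of the two counting steps above, in Python. This is only an
illustration and not the project's actual code: the TYPES and TRIPLES
dictionaries are toy stand-ins for DBpedia lookups, and the function names
and the choice of owl:Thing as the only "trivial" type are assumptions.

```python
from collections import Counter

# Toy stand-ins for DBpedia lookups (assumed data, not real query results).
TYPES = {
    "Bangalore": ["dbo:City", "dbo:Place", "owl:Thing"],
    "Mumbai": ["dbo:City", "dbo:Place", "owl:Thing"],
    "India": ["dbo:Country", "dbo:Place", "owl:Thing"],
}
TRIPLES = {
    ("Bangalore", "India"): ["dbo:country"],
    ("Mumbai", "India"): ["dbo:country"],
}
# Types considered too generic to be informative (an assumption).
TRIVIAL = {"owl:Thing"}


def column_type_counts(rows):
    """For each column, count how often each non-trivial type occurs."""
    counts = [Counter() for _ in rows[0]]
    for row in rows:
        for i, value in enumerate(row):
            for t in TYPES.get(value, []):
                if t not in TRIVIAL:
                    counts[i][t] += 1
    return counts


def column_property_counts(rows):
    """For each ordered column pair (i, j), count properties ?p such that
    (?s ?p ?o) holds with ?s from column i and ?o from column j of a row."""
    counts = {}
    for row in rows:
        for i, s in enumerate(row):
            for j, o in enumerate(row):
                if i == j:
                    continue
                for p in TRIPLES.get((s, o), []):
                    counts.setdefault((i, j), Counter())[p] += 1
    return counts


rows = [["Bangalore", "India"], ["Mumbai", "India"]]
type_counts = column_type_counts(rows)
prop_counts = column_property_counts(rows)
print(type_counts[0].most_common(1))
print(prop_counts[(0, 1)].most_common(1))
```

The most frequent types per column would suggest the theme, and the most
frequent property per column pair would suggest the scheme.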
We are awaiting acceptance of our paper on this at VLDB this year. The
code, however, is available on GitHub (not the most recent version). In this
project, we came across tables in which a lot of columns hold date values
that exist on Wikipedia but are either missing from DBpedia or not in an
xsd date format. So it seemed plausible that this whole domain of date and time
<http://wiki.dbpedia.org/ideas/idea/156/parsing-time-information-as-xsddates-from-wikipedia-plain-text/>
could use some work in the DBpedia extractor, so that other projects
built upon it benefit. And I have some time on my hands, so I thought I
might as well make a proper contribution to the main code through GSoC,
rather than make ad hoc versions here for my research.
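
To give an idea of the kind of normalization I mean, here is a minimal
sketch in Python. It is not the DBpedia extractor's code: the function name
to_xsd_date and the short list of input formats are assumptions, and it
relies on English month names in the default locale.

```python
from datetime import datetime

# A few plain-text date layouts to try (an assumed, incomplete list).
FORMATS = ["%d %B %Y", "%B %d, %Y", "%Y-%m-%d"]


def to_xsd_date(text):
    """Return the date in xsd:date lexical form (YYYY-MM-DD),
    or None if no known format matches."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


print(to_xsd_date("15 August 1947"))
print(to_xsd_date("August 15, 1947"))
print(to_xsd_date("sometime in 1947"))
```

Anything that does not match is returned as None rather than guessed at, so
partial dates would need their own handling.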
Thanks,
Ved
_______________________________________________
Dbpedia-gsoc mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
