Hi Debajyoti,

I am not quite sure what your idea is, or what you mean by the "creation and generation of basic triples, chaining of triples etc. in Python". Didn't you use the extraction framework to get triples from Wikipedia/Wiktionary? Actually, I am not sure whether you are talking about Wiktionary or Wikipedia.
All the best,
Sebastian




On 17.04.2013 12:01, Debajyoti Datta wrote:
Dear Dbpedia Developers,

After a couple of days of fiddling with the creation and generation of basic triples, chaining of triples, etc. in Python, here are some of the things I have finally managed to pen down.

This is related to giving users the ability to configure and extract triples, and to having a very responsive, functional GUI for the same.

A GUI can be the key here, because a visual cue for feed-forward inference will be a great and effective way of generating triples and of allowing users to modify existing ones. (Discussed in detail later.)

Firstly, almost every user can generate some new triples which are obvious choices, like the deterministic ones: if I know the distance between two places is 16 km, then the corresponding distance between the two is 9.942 miles. A GUI can make this efficient by providing similar mappings (like km to miles, dollars to other currencies, and so on...). Essentially, from an already existing triple one can infer new information, and the GUI will give users the option to create these new links, or to add a completely different link, again based on a certain dictionary of words or pre-specified rules.
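A minimal sketch of what I mean, in Python (the predicate names and the conversion table are placeholders I made up, not actual DBpedia properties):

```python
# Derive new triples from existing ones via known unit conversions.
# Triples are plain (subject, predicate, object) tuples for illustration.

CONVERSIONS = {
    # (source predicate, derived predicate): multiplication factor
    ("distanceKm", "distanceMiles"): 1 / 1.609344,
}

def derive_triples(triples):
    """Yield the new triples implied by the conversion table."""
    for subj, pred, obj in triples:
        for (src, dst), factor in CONVERSIONS.items():
            if pred == src:
                yield (subj, dst, round(float(obj) * factor, 3))

base = [("Kolkata_Mumbai_route", "distanceKm", 16)]
print(list(derive_triples(base)))
# → [('Kolkata_Mumbai_route', 'distanceMiles', 9.942)]
```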

The GUI will have options (maybe based on already-used keywords!) to extract these triples for other feed-forward inferences, like classifications (e.g. if Kolkata, Mumbai and Chennai are port cities, then they are close to the sea, and so on...).
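For instance, a toy forward-chaining pass (the "PortCity"/"closeToSea" rule is invented just to illustrate the idea):

```python
def apply_rules(triples, rules):
    """One forward-chaining pass: for each triple whose (predicate, object)
    matches a rule's antecedent, add the rule's consequent for that subject."""
    derived = set(triples)
    for (pred, obj), (new_pred, new_obj) in rules.items():
        for s, p, o in triples:
            if (p, o) == (pred, obj):
                derived.add((s, new_pred, new_obj))
    return derived

rules = {("type", "PortCity"): ("closeToSea", "true")}
facts = {("Kolkata", "type", "PortCity"),
         ("Mumbai", "type", "PortCity"),
         ("Chennai", "type", "PortCity")}
print(sorted(apply_rules(facts, rules)))
```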

Obviously, what counts as "information" and which rules are appropriate will vary with the context, but the idea is that by using rules, along with some knowledge and outside services, we can generate new assertions from our existing set of assertions. This part is slightly easier to achieve when one has a web front-end (or rather, a visually pleasing web front-end) that guides the user.

This is applicable not just in this context but also to geocoding rules: when someone is googling the location of a place via its address, the web frontend can also return the latitude, the longitude and an image. This would be another way in which Wiktionary users can create new triples involving places.
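Roughly like this (the in-memory gazetteer is hypothetical; a real frontend would query a geocoding service, and the `geo:` property names are assumed):

```python
# Sketch: turn a gazetteer/geocoder hit into geo triples for a place.
GAZETTEER = {
    "Leipzig": (51.3397, 12.3731),  # illustrative coordinates
}

def geo_triples(place):
    """Return latitude/longitude triples for a known place."""
    lat, lon = GAZETTEER[place]
    return [(place, "geo:lat", lat), (place, "geo:long", lon)]

print(geo_triples("Leipzig"))
```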

Another way for Wiktionary users to create new triples would again involve the kind of information that users themselves provide best, for example the affairs of famous personalities (like "Britney broke up with the famous actor Harry to be with another famous actor, Potter"). This sort of triple generation can also be very helpful, and having a frontend GUI for it would only add to the existing knowledge base.

The GUI should provide a way for users to create these triples in a logical way, without duplication, based on certain rules. For example, if someone wants to add that famous individual X and famous individual Y are "dating", then using other words for dating, like "going_out", "affair" or "girlfriend/boyfriend", should not result in duplicated or redundant triples. (I wish I had thought of a better example!)
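One simple way to do this is to canonicalise synonymous predicates before a triple is stored (the synonym table here is invented for illustration):

```python
# Map synonymous predicate labels to one canonical predicate, so that
# "going_out", "affair" etc. all collapse into a single "dating" triple.
CANONICAL = {
    "going_out": "dating",
    "affair": "dating",
    "girlfriend/boyfriend": "dating",
}

def normalise(triple):
    """Replace a synonymous predicate with its canonical form."""
    s, p, o = triple
    return (s, CANONICAL.get(p, p), o)

store = set()
for t in [("X", "dating", "Y"), ("X", "going_out", "Y"), ("X", "affair", "Y")]:
    store.add(normalise(t))
print(store)  # a single ("X", "dating", "Y") triple, no duplicates
```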

Also, these logs of user actions should all be stored for the further creation of new triples, which can be done with machine learning algorithms in multiple ways. For example, a new restaurant in an area popular with the Chinese population and culture will likely be a Chinese restaurant, and this can be learned with popular machine learning algorithms like k-means. The problem here is the amount of data and the number of iterations... (I guess coming up with optimal algorithms will be difficult and/or resource-intensive, but it will be something challenging to try out!)
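To make the k-means idea concrete, here is a toy run on made-up 2-D points (a real pipeline would cluster features extracted from the user-action logs, probably with a library rather than this hand-rolled version):

```python
import random

def kmeans(points, k, iters=10, seed=0):
    """Plain Lloyd's algorithm: assign points to nearest centroid, recompute."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: sum((a - b) ** 2
                                            for a, b in zip(p, centroids[i])))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(c) / len(c) for c in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    return centroids, clusters

# Two well-separated groups of "locations"
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(pts, 2)
print(centroids)
```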

The technical stuff: I still have to figure out the correct approach to processing and displaying RDF data. A lot of that has already been done with XML technologies; highly sophisticated frameworks like Cocoon provide means for complex XML-based output generation tasks. Apparently, processing RDF data at the XML level with XML tools is possible if one preprocesses the RDF and derives a canonical serialization of it. Actually, a lot more needs to be figured out in this respect, and I will wait for your feedback before going ahead.
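As a small proof of the XML-level idea: once the RDF is in a predictable serialization, plain XML tooling can pull the triples out (the RDF/XML snippet below is invented, and this only handles the simplest Description/property shape):

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"

doc = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:ex="{EX}">
  <rdf:Description rdf:about="{EX}Kolkata">
    <ex:portCity>true</ex:portCity>
  </rdf:Description>
</rdf:RDF>"""

# Walk rdf:Description elements and read off (subject, predicate, object).
triples = []
for desc in ET.fromstring(doc):
    subj = desc.get(f"{{{RDF}}}about")
    for prop in desc:
        pred = prop.tag.replace("{", "").replace("}", "")
        triples.append((subj, pred, prop.text))

print(triples)
```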

About me:

I am a student of information systems with a lot of programming experience. I fell in love with MOOCs, and thus Human Computer Interaction <http://www.google.com/url?q=https%3A%2F%2Fdocs.google.com%2Fopen%3Fid%3D0BwMKr-KwTT8KWU44OHQ0WGVKX2s>, Machine Learning <http://www.google.com/url?q=https%3A%2F%2Fdocs.google.com%2Fopen%3Fid%3D0BwMKr-KwTT8KWTUxeU9BWkZUaXc> and Social Network Analysis <http://www.google.com/url?q=https%3A%2F%2Fdocs.google.com%2Fopen%3Fid%3D0BwMKr-KwTT8KNjdQb1RNQXk1cEE> happened! I came across a brilliant initiative by openHPI, the "Semantic Web" course, and will probably go into semantic web research in the future. I would love feedback on the above ideas and on their feasibility. It is true that my experience with the semantic web is limited, but I have a lot of coding experience in machine learning and Python, as well as other open source contributions (like being designer and developer of open-advice.org <http://open-advice.org> under Lydia Pintscher of KDE), winner of a Google Developer Group hackathon for the best business app, and some more similar but not so interesting stuff...

Last summer I was a Google Summer of Code intern at Connexions, and here <http://blog.cnx.org/2012/08/google-summer-of-code-2012-comes-to.html> is a post from my mentoring organization about my work.

Looking forward to all your feedback.

Best Regards,
Debajyoti Datta


------------------------------------------------------------------------------
Precog is a next-generation analytics platform capable of advanced
analytics on semi-structured data. The platform includes APIs for building
apps and a phenomenal toolset for data science. Developers can use
our toolset for easy data analysis & visualization. Get a free account!
http://www2.precog.com/precogplatform/slashdotnewsletter


_______________________________________________
Dbpedia-gsoc mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc


--
Dipl. Inf. Sebastian Hellmann
Department of Computer Science, University of Leipzig
Projects: http://nlp2rdf.org , http://linguistics.okfn.org , http://dbpedia.org/Wiktionary , http://dbpedia.org
Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
Research Group: http://aksw.org
