Dear Sebastian,
I did not use the extraction framework to get the triples. I did go through
the resources Dimitris sent me and I will draft a proposal after some more
research in the next couple of days.
Thank you for your reply!
Best Regards,
Debajyoti Datta
On Tue, Apr 23, 2013 at 12:32 AM, Sebastian Hellmann <
[email protected]> wrote:
> Hi Debajyoti,
>
> I am not quite sure what your idea is, or what you mean by the
> "creation and generation of basic triples, chaining of triples etc in
> python". Didn't you use the extraction framework to get triples from
> Wikipedia/Wiktionary? Actually, I am not sure whether you are talking
> about Wiktionary or Wikipedia.
> All the best,
> Sebastian
>
>
>
>
> Am 17.04.2013 12:01, schrieb Debajyoti Datta:
>
> Dear Dbpedia Developers,
>
> After a couple of days of fiddling with the creation and generation of
> basic triples, chaining of triples, etc. in Python, here are some of the
> things I have finally managed to pen down.
>
> This is related to giving users the ability to configure and extract
> triples and having a very responsive functional GUI for the same.
>
> A GUI can be the key here, because a visual cue for feed-forward
> inference will be an effective way of generating triples and
> allowing users to modify existing triples. (Discussed in detail later.)
>
> Firstly, almost every user can generate some new triples that are obvious
> choices, like the deterministic ones: if I know the distance between two
> places is 16 km, then the corresponding distance between the two is 9.941
> miles. A GUI can make this efficient by offering similar
> mappings (km to miles, dollars to other currencies, and so on...).
> Essentially, one can infer new information from an already
> existing triple, and the GUI will give users the option to create these new
> links, or to add a completely different link, again based on a certain
> dictionary of words or pre-specified rules.
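> As a minimal sketch of such a deterministic conversion rule (in Python,
> with invented predicate names like "distance_km" rather than real DBpedia
> properties):

```python
# Minimal sketch of "feed-forward" inference on triples: from an existing
# (subject, predicate, object) triple, derive a new one via a
# unit-conversion rule. The predicate names are illustrative only.

KM_TO_MILES = 0.621371  # conversion factor

def infer_converted(triples):
    """Yield new triples derived from km-distance triples."""
    for subj, pred, obj in triples:
        if pred == "distance_km":
            yield (subj, "distance_miles", round(float(obj) * KM_TO_MILES, 3))

existing = [("CityA_CityB", "distance_km", "16")]
print(list(infer_converted(existing)))
# [('CityA_CityB', 'distance_miles', 9.942)]
```

> The same shape generalizes to any deterministic mapping (currencies,
> temperatures, ...) by adding more predicate/rule pairs.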
>
> The GUI will have options (perhaps based on already-used keywords) to
> extract these triples for other feed-forward inferences such as
> classifications (e.g. if Kolkata, Mumbai and Chennai are port cities, then
> they are close to the sea, and so on...).
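> A small, purely illustrative sketch of such a classification rule (the
> predicates "is_a" and "close_to" are made up for the example, not real
> vocabulary terms):

```python
# Rule-based classification inference: if a triple matches a rule's
# antecedent (predicate, object), assert the rule's consequent for the
# same subject. Rule table and predicates are invented for illustration.

RULES = {
    # (antecedent predicate, antecedent object) -> (new predicate, new object)
    ("is_a", "port_city"): ("close_to", "sea"),
}

def apply_rules(triples, rules):
    derived = []
    for subj, pred, obj in triples:
        if (pred, obj) in rules:
            new_pred, new_obj = rules[(pred, obj)]
            derived.append((subj, new_pred, new_obj))
    return derived

facts = [("Kolkata", "is_a", "port_city"),
         ("Mumbai", "is_a", "port_city"),
         ("Chennai", "is_a", "port_city")]
print(apply_rules(facts, RULES))
```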
>
> Obviously, what counts as "information" and which rules are appropriate
> will vary depending on the context, but the idea is that by using rules
> along with some knowledge and outside services we can generate new
> assertions from our existing set of assertions, and this part is slightly
> easier to achieve when one has a web front-end (or rather, a visually
> pleasing web front-end) that guides the user.
>
> This applies beyond this context: even for geocoding rules, when
> someone is googling the location of a place via its address, the web front-end
> can also return the latitude, longitude and an image. This will also be
> another way in which Wiktionary users can create new triples involving
> places.
>
> Another way for Wiktionary users to create new triples would again
> be the kind of information that users provide best! For example, affairs
> of famous personalities (say, Britney broke up with some famous actor
> Harry to be with another famous actor Potter). This sort of triple
> generation can also be very helpful, and having a front-end GUI to do it
> would only add to the existing knowledge base.
>
> The GUI should provide a way for users to create these triples in a
> logical way, without duplication, based on certain rules. For example, if someone
> wants to add that famous individual X and famous individual Y are "dating",
> then synonymous relation words like "going_out", "affair",
> "girlfriend/boyfriend", etc. should map to the same triple instead of
> producing duplicate or redundant triples. (I wish I had thought of a better example!)
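> One hedged sketch of this deduplication idea: normalize synonymous
> relation words to a canonical predicate before storing, so a set of
> triples silently absorbs the duplicates (the synonym table below is
> invented, not an existing vocabulary):

```python
# Predicate normalization to avoid redundant triples: synonymous relation
# words all map to one canonical predicate before the triple is stored.
# The CANONICAL table is illustrative only.

CANONICAL = {
    "dating": "dating",
    "going_out": "dating",
    "affair": "dating",
    "girlfriend/boyfriend": "dating",
}

def add_triple(store, subj, pred, obj):
    pred = CANONICAL.get(pred, pred)   # fall back to the word itself
    store.add((subj, pred, obj))       # a set silently drops exact duplicates

store = set()
add_triple(store, "X", "dating", "Y")
add_triple(store, "X", "going_out", "Y")
print(store)  # {('X', 'dating', 'Y')} -- only one triple survives
```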
>
> Also, these logs of user actions should all be stored for further
> creation of new triples, which can be done through machine learning
> algorithms in multiple ways. For example, a new restaurant in an area
> popular with the Chinese population and culture will likely be a Chinese
> restaurant, and this can be achieved with popular machine
> learning algorithms like k-means; the problem here is the amount of data
> and the number of iterations... (I guess coming up with optimal algorithms will
> be difficult and/or resource-intensive, but it will be something challenging
> to try out!)
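> To make the clustering idea concrete, here is a tiny pure-Python k-means
> sketch in one dimension; the feature (share of Chinese-cuisine venues
> nearby) and all the numbers are entirely made up:

```python
# Minimal 1-D k-means: alternate between assigning each point to its
# nearest center and recomputing each center as its cluster's mean.

def kmeans_1d(points, centers, iters=10):
    for _ in range(iters):
        # assignment step: each point goes to the nearest center
        clusters = [[] for _ in centers]
        for p in points:
            i = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[i].append(p)
        # update step: each center becomes the mean of its cluster
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

# Invented feature: fraction of nearby venues that are Chinese restaurants.
share_chinese_nearby = [0.05, 0.1, 0.08, 0.7, 0.8, 0.75]
print(kmeans_1d(share_chinese_nearby, centers=[0.0, 1.0]))
```

> At DBpedia scale the cost concern above is real; this sketch only shows
> the shape of the algorithm, not an efficient implementation.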
>
> The technical stuff: I still have to figure out the correct approach to
> processing and displaying RDF data. A lot of that has already been done
> based on XML technologies. Highly sophisticated frameworks like Cocoon
> provide means for complex XML-based output-generation
> tasks. Apparently, processing RDF data at the XML level with XML tools is
> possible when one preprocesses the RDF and derives a canonical
> serialization of it. A lot more needs to be figured out in this respect,
> and I will wait for your feedback before I go ahead.
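> One possible shape for that preprocessing step, sketched with nothing but
> the standard library: sort the triples and emit a simplified
> N-Triples-style text, so the same graph always yields byte-identical
> output for downstream XML tooling. (This is not a full RDF
> canonicalization algorithm, which would also have to handle blank nodes.)

```python
# Derive a deterministic ("canonical") serialization of a set of
# RDF-like triples by sorting them before emitting one line per triple.
# The output format is a simplified N-Triples-style text, not full RDF.

def canonical_serialization(triples):
    lines = [f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(triples)]
    return "\n".join(lines)

g1 = {("b", "p", "c"), ("a", "p", "b")}
g2 = {("a", "p", "b"), ("b", "p", "c")}  # same triples, different set order
print(canonical_serialization(g1) == canonical_serialization(g2))  # True
```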
>
> About me:
>
> I am a student of information systems with a lot of programming
> experience. I fell in love with MOOCs, and thus Human Computer
> Interaction<http://www.google.com/url?q=https%3A%2F%2Fdocs.google.com%2Fopen%3Fid%3D0BwMKr-KwTT8KWU44OHQ0WGVKX2s>
> , Machine
> Learning<http://www.google.com/url?q=https%3A%2F%2Fdocs.google.com%2Fopen%3Fid%3D0BwMKr-KwTT8KWTUxeU9BWkZUaXc>
> and Social Network
> Analysis<http://www.google.com/url?q=https%3A%2F%2Fdocs.google.com%2Fopen%3Fid%3D0BwMKr-KwTT8KNjdQb1RNQXk1cEE>
> happened!
> I came across a brilliant initiative by OPEN HPI, the "Semantic Web" course,
> and will probably go into semantic web research in the future. I would love
> your feedback on the above ideas and their feasibility. It is true
> that my experience with the semantic web is limited, but I have a lot of coding
> experience in machine learning and Python, and other open source
> contributions. (For example, designer and developer of open-advice.org under
> Lydia Pintscher of KDE.) Winner of a Google Developer Group hackathon for the
> best business app, and some more similar but not-so-interesting stuff....
>
> Last summer I was a Google Summer of Code intern at Connexions, and
> here<http://blog.cnx.org/2012/08/google-summer-of-code-2012-comes-to.html>
> is a post from my mentoring organization about my work.
>
> Looking forward to all your feedback.
>
> Best Regards,
> Debajyoti Datta
>
>
>
>
>
> --
> Dipl. Inf. Sebastian Hellmann
>
> Department of Computer Science, University of Leipzig
> Projects: http://nlp2rdf.org, http://linguistics.okfn.org,
> http://dbpedia.org/Wiktionary, http://dbpedia.org
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org
>
------------------------------------------------------------------------------
Precog is a next-generation analytics platform capable of advanced
analytics on semi-structured data. The platform includes APIs for building
apps and a phenomenal toolset for data science. Developers can use
our toolset for easy data analysis & visualization. Get a free account!
http://www2.precog.com/precogplatform/slashdotnewsletter
_______________________________________________
Dbpedia-gsoc mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc