Re: [Dbpedia-gsoc] Dbpedia-gsoc Digest, Vol 12, Issue 2

Saumyarajsinh zala Tue, 03 Mar 2015 01:09:08 -0800

unsubscribe me

On Mon, Mar 2, 2015 at 7:18 PM, <[email protected]>
wrote:


> Send Dbpedia-gsoc mailing list submissions to
>         [email protected]
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
> or, via email, send a message with subject or body 'help' to
>         [email protected]
>
> You can reach the person managing the list at
>         [email protected]
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Dbpedia-gsoc digest..."
>
>
> Today's Topics:
>
>    1. Re: GSoC Participation (Marco Fossati)
>    2. Fact extraction from wikipedia text (Emilio Dorigatti)
>    3. Want to Contribute in GSOC 2015 for DBpedia (Harsh Garg)
>    4. Fwd: Re: [Dbpedia-discussion] Getting started with DBpedia
>       (GSoC 2015) (Marco Fossati)
>    5. Re: Fwd: GSOC_2015 Fact Extraction from Wikipedia Text
>       (Marco Fossati)
>    6. Re: Fact extraction from wikipedia text (Marco Fossati)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 02 Mar 2015 10:58:47 +0100
> From: Marco Fossati <[email protected]>
> Subject: Re: [Dbpedia-gsoc] GSoC Participation
> To: Emilio Dorigatti <[email protected]>,
>         [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Hi Emilio,
>
> The following page will give you all the details of our ideas:
> http://wiki.dbpedia.org/gsoc2015/ideas#h460-6
>
> Cheers!
>
> On 2/23/15 8:07 PM, Emilio Dorigatti wrote:
> > Hello,
> > my name is Emilio Dorigatti. Currently I am attending the second year of
> > a bachelor degree in computer science at the university of Povo (Trento,
> > Italy). I am very interested in artificial intelligence and plan to get
> > a master degree in that field; in the meantime I am getting my hands
> > dirty with some projects for fun, especially about machine learning and
> > planning. Lately I've been taking an interest in linked data and the
> > semantic web and I would really love to deepen my knowledge of this
> > subject; DBpedia seems to be just the right project!
> >
> > Nowadays I program mostly in python, but I also know C, C# and, to some
> > extent, java. I am proficient in object oriented programming and have a
> > fairly good knowledge of functional programming. I will start with some
> > of the warm up tasks on github to get acquainted with scala and the
> > codebase, and I will also read about the proposed projects for the
> > Summer of Code. Is there anything else I can do?
> >
> > Emilio.
> >
> >
> >
> >
> >
> >
> > _______________________________________________
> > Dbpedia-gsoc mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
> >
>
> --
> Marco Fossati
> http://about.me/marco.fossati
> Twitter: @hjfocs
> Skype: hell_j
>
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 2 Mar 2015 11:40:48 +0100
> From: Emilio Dorigatti <[email protected]>
> Subject: [Dbpedia-gsoc] Fact extraction from wikipedia text
> To: [email protected]
> Message-ID:
>         <CAJi717dD6f=
> [email protected]>
> Content-Type: text/plain; charset="utf-8"
>
> Hello,
> I am also interested in working on the project about fact extraction from
> Wikipedia text, and I would like to ask for some clarifications about the
> machine learning part of it. The core of the project is to train a
> classifier using a training set built following the approaches described in
> the linked papers. As I understood it, the following tasks are needed,
> given a sentence:
>
>  1a. Identify all the LUs using NLP techniques;
>  1b. Identify all the entities in the sentence which may represent FEs
> using again NLP techniques (ASRL perhaps?)
>  2. Use the FrameNet definition for the identified LUs to find the required
> FEs;
>  3. Ask the user whether a certain entity fits a certain FE (for all
> entities and FEs);
>  4. Understand which is the correct LU based on the meanings given in step
> (3).
>
> In the linked papers little is mentioned about steps (1a) and (1b) (but
> clarification has already been asked for), step (2) is straightforward and
> step (4) has already been implemented; the classifier is needed for step
> (3). Thus, it has to answer questions such as "can this entity be this
> FE?" or "is this entity this FE in this context?" (the latter being a lot
> harder in my opinion). It is not clear to me, though, which features should
> be used to train this classifier.
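
A minimal sketch of how steps (2) and (3) above could generate the yes/no questions that the classifier has to answer. The `FRAMES` table, its frame and FE names, and the function `candidate_questions` are hypothetical placeholders for illustration; they are not taken from FrameNet or from the linked papers.

```python
from itertools import product

# Hypothetical, hand-written frame definitions keyed by lexical unit (LU).
# Real definitions would come from (a simplified version of) FrameNet.
FRAMES = {
    "win": {"frame": "Victory", "fes": ["Winner", "Competition", "Loser"]},
}


def candidate_questions(lus, entities):
    """Return one (LU, FE, entity) triple per question of step (3)."""
    questions = []
    for lu in lus:
        frame = FRAMES.get(lu)
        if frame is None:
            continue  # LU without a known frame: nothing to ask
        for fe, entity in product(frame["fes"], entities):
            questions.append((lu, fe, entity))
    return questions


questions = candidate_questions(["win"], ["Germany", "2014 FIFA World Cup"])
for lu, fe, entity in questions:
    print(f"Can '{entity}' be the {fe} of '{lu}'?")
```

Each triple is one instance for the classifier (or one question for a human annotator), so the number of questions grows as the number of FEs times the number of entities per sentence.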
>
> Frequently, in text classification, there is a one-to-one mapping between
> words and features; in this case FEs have to be used instead of words
> (FrameNet currently recognizes slightly more than 10k FEs). There is also a
> need for features identifying the possible entities, but clearly we cannot
> use the whole DBpedia knowledge base (roughly 4.6 million entities) for
> this. I see that FEs belonging to a frame are usually of different types,
> so I think using *classes* instead of *instances* could be a promising
> alternative (DBpedia has 685 classes). Probably other features are needed
> though.
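
A small sketch of the encoding suggested above, assuming one indicator per FE and one indicator per DBpedia class instead of one per entity. The tiny `FES` and `CLASSES` vocabularies and the function `encode` are made up for illustration; with the figures quoted above, the real vector would have roughly 10k + 685 positions rather than millions.

```python
# Hypothetical, tiny vocabularies standing in for the full FE and class lists.
FES = ["Winner", "Competition", "Loser"]
CLASSES = ["SoccerClub", "SoccerPlayer", "SportsEvent"]


def encode(fe, entity_classes):
    """Concatenate a one-hot FE vector with a multi-hot class vector."""
    fe_part = [1 if fe == name else 0 for name in FES]
    class_part = [1 if name in entity_classes else 0 for name in CLASSES]
    return fe_part + class_part


# An entity typed as a soccer club, proposed as the "Winner" FE.
vector = encode("Winner", {"SoccerClub"})
print(vector)
```

The vector length depends only on the number of FEs and classes, not on the number of entities in the knowledge base.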
>
> Sorry for the long wall of text, I tried to express my thoughts in the
> shortest way I could. What do you think?
>
> Emilio.
>
> ------------------------------
>
> Message: 3
> Date: Mon, 2 Mar 2015 16:42:22 +0530
> From: Harsh Garg <[email protected]>
> Subject: [Dbpedia-gsoc] Want to Contribute in GSOC 2015 for DBpedia
> To: [email protected]
> Message-ID:
>         <
> cakq5ir5retrs5d97sghch9h92xby+b_ef8tavf3j6qt5uo-...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hello sir,
>
> My name is Harsh Garg.
> I am currently in the 3rd year of my B.Tech degree at Jaypee Institute of
> Information Technology. I want to contribute to the DBpedia project.
> I know I am a newbie, but I have written many small programs as academic
> projects, such as a Twitter crawler using the TwitterSearch library in
> Python, text file compression in C++, a video-related website, and more.
> I am comfortable coding in Python and C++. Please help me get started with
> this project.
> I am interested in
> *New Dynamic Extractors from Wikipedia Content with JSONpedia Faceted
> Browser*
>
> ------------------------------
>
> Message: 4
> Date: Mon, 02 Mar 2015 12:27:15 +0100
> From: Marco Fossati <[email protected]>
> Subject: [Dbpedia-gsoc] Fwd: Re: [Dbpedia-discussion] Getting started
>         with DBpedia (GSoC 2015)
> To: dbpedia-gsoc <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Forwarding this to the specific mailing list.
> @Alberto, please continue the conversation here.
> Cheers!
>
>
> -------- Forwarded Message --------
> Subject: Re: [Dbpedia-discussion] Getting started with DBpedia (GSoC 2015)
> Date: Mon, 02 Mar 2015 12:25:37 +0100
> From: Marco Fossati <[email protected]>
> To: Dimitris Kontokostas <[email protected]>, Alberto Nicoletti
> <[email protected]>
> CC: [email protected]
> <[email protected]>
>
> Hi Alberto,
>
> have a look at our idea page:
> http://wiki.dbpedia.org/gsoc2015/ideas#h460-6
>
> Cheers!
>
> On 2/27/15 9:52 AM, Dimitris Kontokostas wrote:
> > Hi Alberto and welcome to DBpedia
> >
> > please look at the suggested topics we provided. Then, depending on your
> > preferences we could give you some warm-up tasks related to the topics
> > of your interest.
> >
> > For everyone, a very good introduction to DBpedia is the following
> > article:
> > Jens Lehmann, Robert Isele, Max Jakob, Anja Jentzsch, Dimitris
> > Kontokostas, Pablo N. Mendes, Sebastian Hellmann, Mohamed Morsey,
> > Patrick van Kleef, Sören Auer, Christian Bizer. DBpedia - A Large-scale,
> > Multilingual Knowledge Base Extracted from Wikipedia. Semantic Web
> > Journal, Vol. 6 No. 2, pp 167-195, 2015.
> >
> > Cheers,
> > Dimitris
> >
> > On Tue, Feb 24, 2015 at 8:44 PM, Alberto Nicoletti <[email protected]
> > <mailto:[email protected]>> wrote:
> >
> >     Hi everyone,
> >     I'm a Computer Science student from University of Bologna, in Italy,
> >     i'm looking forward to this year's Google Summer of Code and i've
> >     seen DBpedia has been selected in 2013 and 2014 so, hoping it will
> >     be selected this year too, i'm interested in this organization :)
> >
> >     I just wanted to ask if you could give me some advice to get me
> >     started and some documentation I can read to comprehend your work.
> >
> >     I noticed there are some issues on GitHub tagged as "GSoC Warmup
> >     task", so I think they are small issues that can be resolved
> >     by a "newbie" like me, am I right?
> >
> >     I'm also new to open source development in a real organization so if
> >     there is something I should know, please let me know.
> >
> >     Thank you very much in advance,
> >     Alberto Nicoletti
> >
> >
>  
> >     _______________________________________________
> >     Dbpedia-discussion mailing list
> >     [email protected]
> >     <mailto:[email protected]>
> >     https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
> >
> >
> >
> >
> > --
> > Kontokostas Dimitris
> >
> >
> >
> >
>
> --
> Marco Fossati
> http://about.me/marco.fossati
> Twitter: @hjfocs
> Skype: hell_j
>
>
>
>
>
> ------------------------------
>
> Message: 5
> Date: Mon, 02 Mar 2015 12:53:51 +0100
> From: Marco Fossati <[email protected]>
> Subject: Re: [Dbpedia-gsoc] Fwd: GSOC_2015 Fact Extraction from
>         Wikipedia       Text
> To: kasun perera <[email protected]>,      dbpedia-gsoc
>         <[email protected]>
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Hi Kasun and thanks for the feedback on the project idea!
> You can find my answers inline.
> Cheers!
>
> On 3/2/15 5:10 AM, kasun perera wrote:
> >
> > Forwarding my last email since I didn't get any feedback.
> > Thanks
> >
> > ---------- Forwarded message ----------
> >
> > Hi Marco and others
> >
> > I would like to work on the GSoC project "Fact Extraction from Wikipedia
> > Text" during this summer.
> >
> > I went through the project description and the research papers mentioned
> > under the description. I have a few questions to clarify.
> >
> > 1- As mentioned in the project idea the main objective is the
> > implementation of a new text extractor. Will this need to be implemented
> > inside the current extraction-framework?
> Ideally yes.
> > Or would it be a completely new
> > tool?
> >
> > 2- It also mentions the use of NLP techniques to process Wikipedia
> > text. Does this mean extracting dependency relationships to get the
> > frame elements (FEs) and lexical units (LUs)?
> Dependency parsing may not be needed, since entity linking can be
> applied to fulfill the task.
> > There are several NLP
> > libraries like Stanford parser, RelEx, NLTK etc. Is there any decision
> > made which NLP library to use?
> NLTK could be a way to go if we decide to use Python, but there is no
> constraint on libraries.
> The ones that serve our purposes are the good ones. :-)
> >
> > 3- Also regarding the content of a Wikipedia page: do we use all the
> > sentences from the Wikipedia page? My idea is that it's better to use
> > important sentences rather than all the sentences. If that is the better
> > idea, we have to come up with criteria to select important sentences.
> Good point.
> I would first proceed with a domain-specific use case (i.e., soccer) to
> assess the feasibility of the idea. Then, we can generalize.
> Hence, we want to extract specific facts from sentences that may trigger
> soccer-related frames.
> Verb extraction and ranking (i.e., step A of the idea) would cater for
> this task.
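
A rough sketch of what verb extraction and ranking over a domain corpus might look like. It assumes the sentences are already lemmatized and POS-tagged with Penn Treebank style tags (for instance by NLTK, which is not called here), and it ranks by raw frequency, which is only an assumption: the project idea may use a different score. The toy `corpus` and the function `rank_verbs` are hypothetical.

```python
from collections import Counter


def rank_verbs(tagged_sentences):
    """Count verb lemmas (tags starting with 'VB') and sort by frequency."""
    counts = Counter(
        lemma.lower()
        for sentence in tagged_sentences
        for lemma, pos in sentence
        if pos.startswith("VB")
    )
    return counts.most_common()


# Toy soccer corpus of (lemma, POS tag) pairs, invented for illustration.
corpus = [
    [("Milan", "NNP"), ("win", "VBD"), ("the", "DT"), ("match", "NN")],
    [("Inter", "NNP"), ("lose", "VBD"), ("the", "DT"), ("derby", "NN")],
    [("Juventus", "NNP"), ("win", "VBD"), ("the", "DT"), ("title", "NN")],
]

ranking = rank_verbs(corpus)
print(ranking)
```

The top-ranked verbs would then be the candidate LUs, and only sentences containing them would be passed on for fact extraction.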
>
> Cheers!
> >
> >
> >
> > --
> > Regards
> >
> > Kasun Perera
> >
> >
> >
> >
> > --
> > Regards
> >
> > Kasun Perera
> >
>
> --
> Marco Fossati
> http://about.me/marco.fossati
> Twitter: @hjfocs
> Skype: hell_j
>
>
>
> ------------------------------
>
> Message: 6
> Date: Mon, 02 Mar 2015 14:48:39 +0100
> From: Marco Fossati <[email protected]>
> Subject: Re: [Dbpedia-gsoc] Fact extraction from wikipedia text
> To: Emilio Dorigatti <[email protected]>,
>         [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=utf-8; format=flowed
>
> Hi Emilio,
>
> On 3/2/15 11:40 AM, Emilio Dorigatti wrote:
> > Hello,
> > I am also interested in working on the project about fact extraction
> > from Wikipedia text, and I would like to ask for some clarifications
> > about the machine learning part of it. The core of the project is to
> > train a classifier using a training set built following the approaches
> > described in the linked papers. As I understood it, the following tasks
> > are needed, given a sentence:
> >
> >   1a. Identify all the LUs using NLP techniques;
> >   1b. Identify all the entities in the sentence which may represent FEs
> > using again NLP techniques (ASRL perhaps?)
> Entity linking is the way to go.
> >   2. Use the FrameNet definition for the identified LUs to find the
> > required FEs;
> FrameNet may be either too specific or too complex for crowdsourcing.
> Hence, we should adapt/simplify the frame and FEs definitions accordingly.
> >   3. Ask the user whether a certain entity fits a certain FE (for all
> > entities and FEs);
> >   4. Understand which is the correct LU based on the meanings given in
> > step (3).
> The correct LU should be already there, and we want to minimize LU
> ambiguity, i.e., how many frames can be triggered by one LU.
> Thus, the selection of LU via verb ranking will be a VERY important step.
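
The notion of LU ambiguity mentioned above (how many frames one LU can trigger) can be sketched as a simple count over an LU-to-frames mapping. The mapping `LU_TO_FRAMES`, its frame labels, and both functions are hypothetical examples, not data from FrameNet.

```python
# Hypothetical LU -> frames mapping; the frame labels are invented.
LU_TO_FRAMES = {
    "win": {"Victory"},
    "play": {"Competition", "Performance", "Making_music"},
    "score": {"Scoring"},
}


def ambiguity(lu):
    """Number of frames that the given LU can trigger (0 if unknown)."""
    return len(LU_TO_FRAMES.get(lu, ()))


def least_ambiguous(lus):
    """Sort LUs from least to most ambiguous, breaking ties alphabetically."""
    return sorted(lus, key=lambda lu: (ambiguity(lu), lu))


ordered = least_ambiguous(["play", "win", "score"])
print(ordered)
```

Combining such a count with a verb ranking would favor LUs that are both frequent in the domain and tied to few frames.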
> >
> > In the linked papers little is mentioned about steps (1a) and (1b) (but
> > clarification has already been asked for), step (2) is straightforward
> > and step (4) has already been implemented; the classifier is needed for
> > step (3). Thus, it has to answer questions such as "can this entity be
> > this FE?" or "is this entity this FE in this context?" (the latter being
> > a lot harder in my opinion). It is not clear to me, though, which
> > features should be used to train this classifier.
> Good point.
> I already have a baseline including linguistic features other than the
> FEs and frames themselves (that will come as output of the crowdsourced
> annotation).
> We should first test it, and then tune the features if needed.
> >
> > Frequently, in text classification, there is a one-to-one mapping
> > between words and features; in this case FEs have to be used instead of
> > words (FrameNet currently recognizes slightly more than 10k FEs). There
> > is also a need for features identifying the possible entities, but
> > clearly we cannot use the whole DBpedia knowledge base (roughly 4.6
> > million entities) for this. I see that FEs belonging to a frame are
> > usually of different types, so I think using /classes/ instead of
> > /instances/ could be a promising alternative (DBpedia has 685 classes).
> +1 for the entity types. This feature is actually implemented as a
> suggestion mechanism in the referenced workshop paper, and we could
> reuse it as an extra feature.
> But first we need to focus on something that works, then we can tune.
> > Probably other features are needed though.
> >
> > Sorry for the long wall of text, I tried to express my thoughts in the
> > shortest way I could. What do you think?
> That's great feedback, please keep it up!
> Cheers!
> >
> > Emilio.
> >
> >
> >
> >
>
> --
> Marco Fossati
> http://about.me/marco.fossati
> Twitter: @hjfocs
> Skype: hell_j
>
>
>
> ------------------------------
>
>
>
>
> End of Dbpedia-gsoc Digest, Vol 12, Issue 2
> *******************************************
>



-- 
Thanks and Regards,
Saumyarajsinh Zala,
+918007405601


