Re: [Dbpedia-gsoc] GSOC_2015 Fact Extraction from Wikipedia Text

kasun perera Sun, 15 Mar 2015 23:29:51 -0700

Hi Marco

After going through the warm-up tasks, I have started writing the GSoC
proposal. However, going through the code repo and the warm-up tasks, I see
that the current code takes a Wikipedia corpus as input and performs the
following steps:


   1. Verb extraction and ranking
   2. Frame Classifier Training
   3. Frame Extraction


However, project idea 5.1 says these steps are to be implemented during
the GSoC period. As far as I can see, the steps above are already implemented
in the current fact-extractor code, right? So what are the project
expectations during the GSoC period? Please clarify.
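
For illustration, step 1 (verb extraction and ranking) could be sketched as below. This is only a minimal sketch under stated assumptions, not the method used in the fact-extractor repo: the thread does not say how verbs are ranked, so a TF-IDF-style score across sub-domains is assumed here. The function name rank_verbs, the toy corpus, and the input format (pages already POS-tagged with Penn Treebank tags, no lemmatization) are all hypothetical.

```python
import math
from collections import Counter


def rank_verbs(tagged_pages_by_subdomain, target):
    """Rank the verbs of the `target` sub-domain with a TF-IDF-style score.

    Assumption: each page is a list of (word, POS tag) pairs, tagged
    beforehand with Penn Treebank tags (verb tags start with "VB").
    """
    # Count verb occurrences separately for every sub-domain.
    counts = {
        name: Counter(
            word.lower()
            for page in pages
            for word, tag in page
            if tag.startswith("VB")
        )
        for name, pages in tagged_pages_by_subdomain.items()
    }
    n_subdomains = len(counts)
    scores = {}
    for verb, tf in counts[target].items():
        # Number of sub-domains in which the verb occurs at least once.
        df = sum(1 for c in counts.values() if verb in c)
        # Verbs shared by many sub-domains get a lower weight.
        scores[verb] = tf * math.log((1 + n_subdomains) / df)
    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy corpus: two sub-domains of the Sports domain.
corpus = {
    "Soccer": [
        [("Messi", "NNP"), ("scored", "VBD"), ("a", "DT"), ("goal", "NN")],
        [("He", "PRP"), ("scored", "VBD"), ("and", "CC"), ("played", "VBD")],
    ],
    "Cricket": [
        [("He", "PRP"), ("bowled", "VBD"), ("and", "CC"), ("played", "VBD")],
    ],
}

ranked = rank_verbs(corpus, "Soccer")
print([verb for verb, _ in ranked])
```

In this toy corpus "scored" ranks above "played" for Soccer, because "played" also occurs in Cricket. The top-ranked verbs would then be the candidate LUs for the sub-domain.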


On Wed, Mar 4, 2015 at 1:59 PM, [email protected] <[email protected]>
wrote:

> Hi Kasun,
>
> The sentence substrings identified via entity linking will be the FE
> candidates.
> Then I think you got the idea behind the verb ranking step.
>
> Cheers!
>
> ----- Reply message -----
> From: "kasun perera" <[email protected]>
> To: "Marco Fossati" <[email protected]>
> Cc: "dbpedia-gsoc" <[email protected]>
> Subject: GSOC_2015 Fact Extraction from Wikipedia Text
> Date: Wed, Mar 4, 2015 07:08
>
>
> Hi Marco
>
> On Mon, Mar 2, 2015 at 5:23 PM, Marco Fossati <[email protected]>
> wrote:
>
>>
>>
>>> 2- Also, it mentions the use of NLP techniques to process Wikipedia
>>> text. Does this mean extracting dependency relations to get the
>>> frame elements (FE) and lexical units (LU)?
>>>
>> Dependency parsing may not be needed, since entity linking can be applied
>> to fulfill the task.
>
>
> I'm not clear on what you mean by using entity linking to identify FE
> candidates. In general, named entity linking (NEL) means linking the
> mentions of entities in text to a central knowledge base (e.g. Wikipedia).
> Do you mean to use this concept to find FEs? Can you please clarify a
> bit more how entity linking is used to identify FEs?
>
> This is my understanding of step 1 of the idea, i.e. verb
> extraction and ranking.
> We take a list of domains (e.g. Sports), then dig into more specific
> sub-domains (e.g. Soccer, Cricket, Rugby, etc.) of Wikipedia, then navigate
> to the specific wiki pages under each sub-domain. For each wiki page we
> extract and rank the verbs based on the sub-domain, and the higher-ranked
> verbs are used as LUs.
> What are your comments on this idea?
>
> Thanks
>
>
>
>
> --
> Regards
>
> Kasun Perera
>
>


-- 
Regards

Kasun Perera


