Hi Jo,

Thanks for your suggestions. I checked Mahout's website, and it does support
LDA, so I think we can train the entity-topic model on Hadoop by following
Mahout's LDA implementation.
Given the limited time, I listed support for the Google mention corpus
as optional work in the proposal. Is that OK?

Best Regards,
Wei


On Fri, May 3, 2013 at 12:24 AM, Wang Wei <[email protected]> wrote:

> Hi all,
>
> Thanks a lot for your comments. I have just updated my proposal. Looking
> forward to your further feedback.
>
> Best Regards,
> Wei
>
>
> On Wed, May 1, 2013 at 5:59 PM, Joachim Daiber <[email protected]> wrote:
>
>> I was referring to the layout of the formulas. Here is a screenshot.
>>
>> Jo
>>
>>
>> On Wed, May 1, 2013 at 11:14 AM, Wang Wei <[email protected]> wrote:
>>
>>> Hi Jo,
>>>
>>> Thanks for your comments.
>>>
>>>
>>>
>>>    - there is a formatting problem for me starting with your github
>>>    account
>>>
>>> What do you mean by a formatting problem?
>>>
>>>
>>>    - could you fix the formulas?
>>>
>>> Do you mean the formulas in my proposal?
>>>
>>>
>>>
>>> I revised the paragraph introducing the three models. (It is still not
>>> good enough; I'll try to improve it later.)
>>>
>>>
>>>
>>>
>>> On Wed, May 1, 2013 at 2:37 PM, Wang Wei <[email protected]> wrote:
>>>
>>>> Hi Jo,
>>>>
>>>> I revised my proposal by adding the topic-based disambiguation method.
>>>> Actually, I found a mistake in my previous discussion of the topic-based
>>>> model. The topic-based model jointly models *context compatibility
>>>> (which I missed)* and topic coherence. Thus, it can perform well for
>>>> both topic-related and non-topic-related entities. It is also a
>>>> generative model.
>>>>
>>>> Thanks.
>>>>
>>>> Regards,
>>>> Wei
>>>>
>>>>
>>>> On Mon, Apr 29, 2013 at 10:03 PM, Wang Wei <[email protected]> wrote:
>>>>
>>>>> Hi Jo,
>>>>>
>>>>> I will think about the disambiguation part and revise my proposal.
>>>>> Thanks.
>>>>>
>>>>> Best Regards,
>>>>> Wei Wang
>>>>>
>>>>>
>>>>> On Mon, Apr 29, 2013 at 7:57 PM, Wang Wei <[email protected]> wrote:
>>>>>
>>>>>> Hi Jo,
>>>>>>
>>>>>> I read the proposal for the topic extraction work from last year's
>>>>>> GSoC. Yes, the topic-based disambiguation method is based on the LDA
>>>>>> model, but the two have different objectives: topic extraction assigns
>>>>>> topic categories to a document, while topic-based disambiguation
>>>>>> disambiguates entities based on the document's topic. For example, if a
>>>>>> document's topic is 'mobile phones', then the word 'Apple' would likely
>>>>>> be resolved to Apple Inc. However, as I mentioned in my proposal,
>>>>>> topic-related entities can be disambiguated correctly, but other
>>>>>> entities are not guaranteed to be disambiguated correctly by the
>>>>>> topic-based disambiguation method.
>>>>>>
>>>>>> In addition, the generative model (the default disambiguation
>>>>>> model) makes a strong assumption: p(c|e) = p_e(t_1) p_e(t_2) ... p_e(t_n),
>>>>>> i.e., the terms are independent given the entity e. Some improvement
>>>>>> may be achieved if this assumption is relaxed.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> Best Regards,
>>>>>> Wei Wang
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 22, 2013 at 2:30 AM, Joachim Daiber <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Hey,
>>>>>>>
>>>>>>> you can have a look at Hector's github repository from last GSoC
>>>>>>> (this is not merged into the main branch yet):
>>>>>>>
>>>>>>> https://github.com/hunterhector/dbpedia-spotlight
>>>>>>>
>>>>>>> I think this was the paper he implemented:
>>>>>>>
>>>>>>> Han, X., 2011. Collective Entity Linking in Web Text: A Graph-Based
>>>>>>> Method. In *Proceedings of the 34th international ACM SIGIR
>>>>>>> conference on Research and development in Information Retrieval*.
>>>>>>> pp. 765-774.
>>>>>>>
>>>>>>> Best,
>>>>>>> Jo
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Apr 21, 2013 at 6:22 PM, Wang Wei <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Jo,
>>>>>>>>
>>>>>>>> I am trying to learn about the idea for "efficient graph based
>>>>>>>> disambiguation". However, the introduction is very short. Do you
>>>>>>>> have any further material on the disambiguation methods used by
>>>>>>>> DBpedia? In the disambiguation code directory[1], which one is
>>>>>>>> graph based?
>>>>>>>>
>>>>>>>> Thanks a lot.
>>>>>>>>
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Wei Wang
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
_______________________________________________
Dbpedia-gsoc mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
