Hello Amine,
I created a better description of the project idea here:
http://wiki.dbpedia.org/gsoc2014/ideas/ExtractionwithMapReduce?v=5uv
We are still looking for a (Hadoop) mentor but I am very confident that we
will find one in the following days.
Best,
Dimitris
On Sat, Feb 15, 2014 at 5:40 PM, Andrea Di Menna <[email protected]> wrote:
> Hello Mohamed and welcome!
> Unfortunately I have very little experience with Hadoop myself but I would
> love to help with this task.
> Looking forward to discussing your suggestions.
> Cheers
> Andrea
> On 15 Feb 2014 08:42, "Dimitris Kontokostas" <[email protected]> wrote:
>
>> Hello Mouhoub and welcome to the community,
>>
>> I thought of this idea after Andrea Di Menna committed the cool
>> dump-split feature, so I am a possible (co-)mentor for this project.
>> The reason I haven't listed a mentor (or a full description) here yet
>> is that we don't currently have any mentor (including me) with
>> experience in MapReduce.
>> I have a good idea of how it works but have never had any hands-on
>> experience.
>>
>> We will try to find someone before the application period starts, but in
>> the meantime I can set out some of the requirements. You are also welcome
>> to suggest a mentor for this project.
>> You seem familiar with the DBpedia extraction framework, so in your
>> application you can suggest your own extensions to the idea.
>>
>> DBpedia has not yet been accepted as an organization, so you cannot use
>> the Melange system at the moment. We can continue on the public mailing
>> list if you are comfortable with it, or otherwise wait. Either is fine for us.
>> I will be traveling next week but I will try to find some time to extend
>> the idea description.
>>
>> Best,
>> Dimitris
>>
>>
>> On Sat, Feb 15, 2014 at 7:54 AM, Mohamed Amine MOUHOUB <
>> [email protected]> wrote:
>>
>>> Hi,
>>>
>>> I am a PhD student at the University of Paris Dauphine, working on
>>> search over linked data and linked services. Basically, I am interested in
>>> searching and integrating data from the LOD, and DBpedia is at the center
>>> of my interests, as it is at the core of the LOD graph and is, in my
>>> opinion, a starting point to the rest of the LOD.
>>>
>>> Anyway, I am particularly interested in Hadoop. I started experimenting
>>> with Hadoop in 2011, and then with Amazon EMR in 2012. I have worked on
>>> some data mining projects with Hadoop. As a trainee at an open data company
>>> in Paris (Data Publica), I worked on a project to discover open data
>>> sources in France using Hadoop and the Common Crawl web archive (120 TB of
>>> web documents to be analyzed, clustered, etc.).
>>> In September 2012, my project won Common Crawl's Code Contest.
>>>
>>> I am very interested in the proposed idea of extraction using
>>> MapReduce. I think it would be a valuable contribution to the performance
>>> of the extraction framework. Moreover, the nature of the Wikipedia input
>>> data, the nature of the output (RDF triples), and the independence of the
>>> processing for each entry make the extraction highly parallelisable using
>>> MapReduce. I am ready to submit a proposal for this idea, but I don't see
>>> any mentors assigned to it. The idea is not described in much detail on
>>> the wiki page, but in the coming days I can provide a brief but detailed
>>> proposal for an implementation. I am also interested in co-authoring a
>>> conference paper about the project.
>>> Is any mentor interested? Should I send my proposal via this mailing
>>> list or submit it directly on the Google Summer of Code page?
>>>
>>> Best regards,
>>> Amine Mouhoub
>>>
>>>
>>> _______________________________________________
>>> Dbpedia-gsoc mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
>>>
>>>
>>
>>
>> --
>> Kontokostas Dimitris
>>
>>
--
Kontokostas Dimitris