On 24 April 2013 20:55, Rahul Sharnagat <[email protected]> wrote:
> Hi Dimitris,
>     For the last few days I have been trying to understand the dataparser
> and mapping code. I also went a little higher in the hierarchy to understand
> the dependencies. Things are getting clearer now, but it will take some more
> time to understand all the nuances. I have also successfully installed the
> extraction framework.
>     But there is one problem with getting a dump to work on. Following the
> documentation (here and here), I could not find the download.properties file
> in the dump folder of the master branch. But I explored the folder and found
> download.minimal.properties. I tweaked it according to the instructions for
> my requirements, but I am getting an error (attached are the full debug log
> and the tweaked minimal.properties). I looked for a similar error in the
> archived messages but could not find one. Can you help me with this?

Strange. Could you just try again? It works for me. Maybe it was a
temporary problem at Wikimedia. Or maybe something is wrong with your
network? What does http://dumps.wikimedia.org/enwiki/ look like in
your browser?

I updated extraction-framework to the latest version from GitHub,
copied your download.minimal.properties file into my dump/ folder,
changed the value of base-dir and executed

../clean-install-run download config=download.minimal.properties

Below is an excerpt from the result.

Cheers,
JC

[INFO] launcher 'download' selected =>
org.dbpedia.extraction.dump.download.Download
done: 0 -
todo: 1 - wiki=en,locale=en
downloading 'http://dumps.wikimedia.org/enwiki/' to
'/Users/jcsahnwaldt/tmp/enwiki/index.html'
read 3.6132812 KB of 3.6132812 KB in 0.014 seconds (258.0915 KB/s)
downloading 'http://dumps.wikimedia.org/enwiki/20130403/' to
'/Users/jcsahnwaldt/tmp/enwiki/20130403/index.html'
read 102.23535 KB of 102.23535 KB in 0.907 seconds (112.71813 KB/s)
date page 'http://dumps.wikimedia.org/enwiki/20130403/' has all files
[pages-articles.xml.bz2]
downloading 
'http://dumps.wikimedia.org/enwiki/20130403/enwiki-20130403-pages-articles.xml.bz2'
to 
'/Users/jcsahnwaldt/tmp/enwiki/20130403/enwiki-20130403-pages-articles.xml.bz2'
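
For anyone reading the log above: the final URL appears to be assembled
from the wiki code, the dump date and the file name. Below is a minimal
Python sketch of that pattern, in case it helps when checking a URL by
hand. It is not the framework's actual (Scala) code. The function names
dump_dates and dump_file_url are hypothetical, and the assumption that
the index page links date directories as href="20130403/" is a guess
about the HTML, not something the log confirms.

```python
import re

# Base URL as it appears in the log excerpt above.
BASE_URL = "http://dumps.wikimedia.org/"


def dump_dates(index_html):
    """Return the 8-digit date directories linked from an index page, oldest first.

    Assumption: date directories show up as links such as href="20130403/".
    """
    return sorted(set(re.findall(r'href="(\d{8})/"', index_html)))


def dump_file_url(wiki, date, suffix, base_url=BASE_URL):
    """Build a dump file URL following the naming pattern visible in the log."""
    name = wiki + "wiki"  # e.g. "en" -> "enwiki"
    return "{0}{1}/{2}/{1}-{2}-{3}".format(base_url, name, date, suffix)


if __name__ == "__main__":
    # Hypothetical index page content, for illustration only.
    sample = '<a href="20130304/">20130304</a> <a href="20130403/">20130403</a>'
    latest = dump_dates(sample)[-1]
    print(dump_file_url("en", latest, "pages-articles.xml.bz2"))
```

If the URL built this way opens in a browser but the downloader still
fails, the problem is more likely in the local configuration than at
Wikimedia.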


>     I am also reading the DBpedia mapping wiki to understand how the ontology
> is created and how infoboxes are mapped to the ontology, and to relate that to
> the code. Since little more than a week is left before the final proposal, I
> want to have a good draft ready by the 1st. I will try to send a rough draft
> by tomorrow.
>
> Thanks.
>
>
>
> On Tue, Apr 23, 2013 at 11:58 AM, Rahul Sharnagat <[email protected]>
> wrote:
>>
>> Thanks Dimitris.
>> I will look into this issue and the related code and get back to you if I
>> face any problems.
>>
>>
>> On Mon, Apr 22, 2013 at 6:07 PM, Dimitris Kontokostas <[email protected]>
>> wrote:
>>>
>>> Hi Rahul,
>>>
>>> A very good warm-up task for this idea is issue #36
>>> (https://github.com/dbpedia/extraction-framework/issues/36).
>>> With this task you will get to know the parser internals and see the
>>> actual need to crowd-source the rules.
>>>
>>> Take a first look, and we'll be available for further details.
>>>
>>> Cheers,
>>> Dimitris
>>>
>>>
>>> On Mon, Apr 22, 2013 at 5:02 AM, Rahul Sharnagat <[email protected]>
>>> wrote:
>>>>
>>>> Sorry, I forgot to add the mailing list. I just hit the reply button. :)
>>>>
>>>>
>>>> On Mon, Apr 22, 2013 at 2:19 AM, Dimitris Kontokostas
>>>> <[email protected]> wrote:
>>>>>
>>>>> Please put the mailing list in cc :)
>>>>>
>>>>> Cheers,
>>>>> Dimitris
>>>>>
>>>>> ----
>>>>> Sent from my mobile
>>>>>
>>>>> On 21 Apr 2013 at 7:55 PM, user "Rahul Sharnagat"
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Hi Dimitris,
>>>>>>         Thanks for the reply.
>>>>>>         I am looking for a warm-up task related to this idea. I have
>>>>>> started reading about Scala and DBpedia. It should not take much time
>>>>>> to get accustomed to Scala, since I have previously worked in Haskell.
>>>>>> Please give me some direction for a warm-up task.
>>>>>>
>>>>>>
>>>>>> On Sun, Apr 21, 2013 at 9:39 PM, Dimitris Kontokostas
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi Rahul,
>>>>>>>
>>>>>>> The application period has not started yet, so there is still time
>>>>>>> left :)
>>>>>>>
>>>>>>> Did you read the idea page [1]? The description is pretty long, but you
>>>>>>> can ask about anything you don't understand completely.
>>>>>>> Everything should be clear when you write your application ;)
>>>>>>>
>>>>>>> Best,
>>>>>>> Dimitris
>>>>>>>
>>>>>>> [1] http://wiki.dbpedia.org/gsoc2013/ideas/CrowdsourceTestsAndRules
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Apr 21, 2013 at 4:06 PM, Rahul Sharnagat
>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi Dimitris,
>>>>>>>>
>>>>>>>>     I am Rahul Sharnagat, a master's student at IIT Bombay. I am
>>>>>>>> planning to apply for the DBpedia GSoC project.
>>>>>>>>
>>>>>>>>     I am interested in the project "Crowdsource tests and extraction
>>>>>>>> rules". I am working on Named Entity Recognition (NER) and entity
>>>>>>>> mining for my master's project, and I think working with DBpedia would
>>>>>>>> help me a lot with that. I interned at Yahoo last summer, working on
>>>>>>>> refining news indexes.
>>>>>>>>
>>>>>>>>     I know I am late because of my final exams, but it would be great
>>>>>>>> if you could help me get started. I have been reading the DBpedia wiki
>>>>>>>> pages and have also downloaded the code from GitHub.
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best Regards,
>>>>>>>> Rahul Sharnagat
>>>>>>>> CSE MTech, IITB
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Dbpedia-gsoc mailing list
>>>>>>>> [email protected]
>>>>>>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Kontokostas Dimitris
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards,
>>>>>> Rahul Sharnagat
>>>>>> CSE MTech, IITB
>>>>>> H14, B505
>>>>>> +91.9860.451.056
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Rahul Sharnagat
>>>> CSE MTech, IITB
>>>> H14, B505
>>>> +91.9860.451.056
>>>
>>>
>>>
>>>
>>> --
>>> Kontokostas Dimitris
>>
>>
>>
>>
>> --
>> Best Regards,
>> Rahul Sharnagat
>> CSE MTech, IITB
>>
>
>
>
> --
> Best Regards,
> Rahul Sharnagat
> CSE MTech, IITB
> H14, B505
> +91.9860.451.056
>
>

