On 24 April 2013 20:55, Rahul Sharnagat <[email protected]> wrote:
> Hi Dimitris,
> For the last few days I have been trying to understand the dataparser and
> mapping code. I also went a little higher in the hierarchy to understand the
> dependencies. Things are getting clearer now, but it will take some more time
> to understand all the nuances. I also successfully installed the extraction
> framework.
> But there is one problem with getting the dump to work on. Following the
> documentation (here and here), I could not find download.properties.file in
> the dump folder of the master branch. I explored the folder and found
> download.minimal.properties instead. I tweaked it for my requirements
> according to the instructions, but I am getting an error (attached are the
> full debug log and the tweaked minimal.properties). I looked for a similar
> error in the archived messages but could not find one. Can you help me with
> this?

Strange. Could you just try again? It works for me. Maybe it was a temporary
problem at Wikimedia. Or maybe something is wrong with your network? What does
http://dumps.wikimedia.org/enwiki/ look like in your browser?

I updated extraction-framework to the latest version from GitHub, copied your
download.minimal.properties file into my dump/ folder, changed the value of
base-dir and executed

../clean-install-run download config=download.minimal.properties

Below is an excerpt from the result.

Cheers,
JC

[INFO] launcher 'download' selected => org.dbpedia.extraction.dump.download.Download
done: 0 - todo: 1 - wiki=en,locale=en
downloading 'http://dumps.wikimedia.org/enwiki/' to '/Users/jcsahnwaldt/tmp/enwiki/index.html'
read 3.6132812 KB of 3.6132812 KB in 0.014 seconds (258.0915 KB/s)
downloading 'http://dumps.wikimedia.org/enwiki/20130403/' to '/Users/jcsahnwaldt/tmp/enwiki/20130403/index.html'
read 102.23535 KB of 102.23535 KB in 0.907 seconds (112.71813 KB/s)
date page 'http://dumps.wikimedia.org/enwiki/20130403/' has all files [pages-articles.xml.bz2]
downloading 'http://dumps.wikimedia.org/enwiki/20130403/enwiki-20130403-pages-articles.xml.bz2' to '/Users/jcsahnwaldt/tmp/enwiki/20130403/enwiki-20130403-pages-articles.xml.bz2'

> I am also reading the DBpedia mapping wiki to understand how the ontology is
> created and how the infobox-to-ontology mapping is done, and to relate it to
> the code. Since little more than a week is left until the final proposal, I
> want to create a good draft by the 1st. I will try to send a rough draft by
> tomorrow.
>
> Thanks.
>
> On Tue, Apr 23, 2013 at 11:58 AM, Rahul Sharnagat <[email protected]>
> wrote:
>>
>> Thanks Dimitris.
>> I will look into this issue and the related code and get back to you if I
>> face any problems.
>>
>> On Mon, Apr 22, 2013 at 6:07 PM, Dimitris Kontokostas <[email protected]>
>> wrote:
>>>
>>> Hi Rahul,
>>>
>>> A very good warm-up task for this idea is issue #36
>>> (https://github.com/dbpedia/extraction-framework/issues/36).
>>> With this task you will get to know the parser internals and see the
>>> actual need to crowd-source the rules.
>>>
>>> Take a first look and we'll be available for further details.
>>>
>>> Cheers,
>>> Dimitris
>>>
>>> On Mon, Apr 22, 2013 at 5:02 AM, Rahul Sharnagat <[email protected]>
>>> wrote:
>>>>
>>>> Sorry, I forgot to add the mailing list. I just hit the reply button. :)
>>>>
>>>> On Mon, Apr 22, 2013 at 2:19 AM, Dimitris Kontokostas
>>>> <[email protected]> wrote:
>>>>>
>>>>> Please put the mailing list in cc :)
>>>>>
>>>>> Cheers,
>>>>> Dimitris
>>>>>
>>>>> ----
>>>>> Sent from my mobile
>>>>>
>>>>> On 21 Apr 2013 7:55 PM, "Rahul Sharnagat"
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Hi Dimitris,
>>>>>> Thanks for the reply.
>>>>>> I am looking for a warm-up task related to this idea. I have started
>>>>>> reading about Scala and DBpedia. It should not take much time to get
>>>>>> accustomed to Scala since I have previously worked in Haskell. Please
>>>>>> give me some direction for a warm-up task.
>>>>>>
>>>>>> On Sun, Apr 21, 2013 at 9:39 PM, Dimitris Kontokostas
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi Rahul,
>>>>>>>
>>>>>>> The application period has not started yet, so there is still time
>>>>>>> left :)
>>>>>>>
>>>>>>> Did you read the idea page [1]? The description is pretty long, but you
>>>>>>> can ask about anything you don't understand completely.
>>>>>>> Everything should be clear when you write your application ;)
>>>>>>>
>>>>>>> Best,
>>>>>>> Dimitris
>>>>>>>
>>>>>>> [1] http://wiki.dbpedia.org/gsoc2013/ideas/CrowdsourceTestsAndRules
>>>>>>>
>>>>>>> On Sun, Apr 21, 2013 at 4:06 PM, Rahul Sharnagat
>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi Dimitris,
>>>>>>>>
>>>>>>>> I am Rahul Sharnagat, a master's student at IIT Bombay. I am planning
>>>>>>>> to apply for the DBpedia GSoC project.
>>>>>>>>
>>>>>>>> I am interested in the project "Crowdsource tests and extraction
>>>>>>>> rules". I am working on Named Entity Recognition (NER) and entity
>>>>>>>> mining as my master's project, and I think working with DBpedia would
>>>>>>>> help me a lot with that. I interned at Yahoo last summer, working on
>>>>>>>> refining news indexes.
>>>>>>>>
>>>>>>>> I know I am late because of my final exams, but it would be great if
>>>>>>>> you could help me get started. I have been reading the DBpedia wiki
>>>>>>>> pages and have downloaded the code from GitHub.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best Regards,
>>>>>>>> Rahul Sharnagat
>>>>>>>> CSE MTech, IITB
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Dbpedia-gsoc mailing list
>>>>>>>> [email protected]
>>>>>>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
>>>>>>>
>>>>>>> --
>>>>>>> Kontokostas Dimitris
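
For anyone following the download log excerpt in JC's reply: the sequence it shows (fetch the wiki's index page, open a date page, check that the page lists every requested file, then download) can be sketched roughly as below. This is only an illustration in Python, not the extraction framework's Scala code. The function names, the href pattern and the idea of trying the newest date first are assumptions; only the file naming (enwiki-20130403-pages-articles.xml.bz2) is taken from the log.

```python
import re


def dump_dates(index_html):
    """Return dump dates (YYYYMMDD) linked from a wiki index page, newest first.

    Assumes date directories appear as links of the form href="20130403/".
    """
    dates = set(re.findall(r'href="(\d{8})/"', index_html))
    return sorted(dates, reverse=True)


def has_all_files(date_html, wiki, date, suffixes):
    """True if the date page mentions every expected file name,
    e.g. enwiki-20130403-pages-articles.xml.bz2 for suffix pages-articles.xml.bz2.
    """
    return all(f"{wiki}-{date}-{suffix}" in date_html for suffix in suffixes)


def pick_dump(index_html, fetch_date_page, wiki, suffixes):
    """Return the newest date whose page lists all requested files, else None.

    fetch_date_page is a hypothetical callable mapping a date string to the
    HTML of that date's page; a real run would fetch it over HTTP.
    """
    for date in dump_dates(index_html):
        if has_all_files(fetch_date_page(date), wiki, date, suffixes):
            return date
    return None
```

Calling pick_dump with the saved index.html, a function that returns the saved date pages, "enwiki" and ["pages-articles.xml.bz2"] returns the date to download from, or None if no date page is complete. If the index page itself cannot be fetched, that points to the network or Wikimedia rather than to the properties file.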
