Hi Rahul,
You should put your main effort into your application, but I think this task
will also give you a better idea of what to expect.
Regarding the proxy, we have the following launcher in dump/pom.xml. Please
uncomment the jvmArgs block and adapt the proxy settings:
<launcher>
    <id>download</id>
    <mainClass>org.dbpedia.extraction.dump.download.Download</mainClass>
    <!--
    <jvmArgs>
        <jvmArg>-Dhttp.proxyHost=proxy.server.com</jvmArg>
        <jvmArg>-Dhttp.proxyPort=80</jvmArg>
        <jvmArg>-Dhttp.nonProxyHosts="localhost|127.0.0.1"</jvmArg>
    </jvmArgs>
    -->
    <!-- ../run download config=download.properties -->
</launcher>
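To check that the JVM actually picks up these settings, a small standalone test can help. The sketch below is only an illustration and not part of the framework: ProxyCheck and proxiesFor are made-up names, and proxy.server.com:80 is the placeholder from the launcher above. It assumes the download goes through the JVM's standard java.net stack, which reads the http.proxy* system properties. As far as I know, that stack consults neither the proxy section of Maven's settings.xml nor the http_proxy environment variable.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

public class ProxyCheck {
    // Returns the proxies the JVM's default selector would use for the given URL.
    static List<Proxy> proxiesFor(String url) {
        return ProxySelector.getDefault().select(URI.create(url));
    }

    public static void main(String[] args) {
        // Same effect as passing the -D flags from the launcher; the values are placeholders.
        System.setProperty("http.proxyHost", "proxy.server.com");
        System.setProperty("http.proxyPort", "80");
        System.setProperty("http.nonProxyHosts", "localhost|127.0.0.1");

        // A remote host should be routed through the proxy, localhost should not.
        System.out.println(proxiesFor("http://dumps.wikimedia.org/enwiki/"));
        System.out.println(proxiesFor("http://localhost/"));
    }
}
```

If the first line printed shows your proxy host and the second shows a direct connection, the same -D values should work in the launcher.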
On Thu, Apr 25, 2013 at 5:29 AM, Rahul Sharnagat <[email protected]> wrote:
> Hi Jona,
> I think I know the problem. I am on my institute network, which works
> through a proxy server. To get Maven working I had to put the proxy
> settings in settings.xml and pass it to the mvn command, but currently I
> keep it in the $HOME/.m2/ folder. Does the wiki dump download honor the
> Maven proxy settings or the global http_proxy environment variable? Maybe
> this is the source of the error. I will try to get on a network without a
> proxy and try again.
>
>
>
> On Thu, Apr 25, 2013 at 4:09 AM, Jona Christopher Sahnwaldt <[email protected]> wrote:
>
>> On 24 April 2013 20:55, Rahul Sharnagat <[email protected]> wrote:
>> > Hi Dimitris,
>> > For the last few days I have been trying to understand the dataparser and
>> > mapping code. I also went a little higher in the hierarchy to understand
>> > the dependencies. Things are getting clearer now, but it will take some
>> > more time to understand all the nuances. I also successfully installed the
>> > extraction framework.
>> > But there is one problem with getting the dump to work on. As per the
>> > documentation (here and here), I could not find the download.properties
>> > file in the dump folder on the master branch. But I explored the folder
>> > and found download.minimal.properties. I tweaked it according to the
>> > instructions for my requirements, but I am getting an error (attached are
>> > the full debug log and the tweaked minimal.properties). I tried to find a
>> > similar error in the archived messages but could not find one. Can you
>> > help me in this regard?
>>
>> Strange. Could you just try again? It works for me. Maybe it was a
>> temporary problem at Wikimedia. Or maybe something is wrong with your
>> network? What does http://dumps.wikimedia.org/enwiki/ look like in
>> your browser?
>>
>> I updated extraction-framework to the latest version from GitHub,
>> copied your download.minimal.properties file into my dump/ folder,
>> changed the value of base-dir and executed
>>
>> ../clean-install-run download config=download.minimal.properties
>>
>> Below is an excerpt from the result.
>>
>> Cheers,
>> JC
>>
>> [INFO] launcher 'download' selected =>
>> org.dbpedia.extraction.dump.download.Download
>> done: 0 -
>> todo: 1 - wiki=en,locale=en
>> downloading 'http://dumps.wikimedia.org/enwiki/' to
>> '/Users/jcsahnwaldt/tmp/enwiki/index.html'
>> read 3.6132812 KB of 3.6132812 KB in 0.014 seconds (258.0915 KB/s)
>> downloading 'http://dumps.wikimedia.org/enwiki/20130403/' to
>> '/Users/jcsahnwaldt/tmp/enwiki/20130403/index.html'
>> read 102.23535 KB of 102.23535 KB in 0.907 seconds (112.71813 KB/s)
>> date page 'http://dumps.wikimedia.org/enwiki/20130403/' has all files
>> [pages-articles.xml.bz2]
>> downloading 'http://dumps.wikimedia.org/enwiki/20130403/enwiki-20130403-pages-articles.xml.bz2' to
>> '/Users/jcsahnwaldt/tmp/enwiki/20130403/enwiki-20130403-pages-articles.xml.bz2'
>>
>>
>> > I am also reading the DBpedia mapping wiki to understand how the ontology
>> > is created and how the infobox-to-ontology mapping is done, and to relate
>> > it to the code. Since little more than a week is left for the final
>> > proposal, I want to create a good draft by the 1st. I will try to send a
>> > rough draft by tomorrow.
>> >
>> > Thanks.
>> >
>> >
>> >
>> > On Tue, Apr 23, 2013 at 11:58 AM, Rahul Sharnagat <[email protected]> wrote:
>> >>
>> >> Thanks Dimitris.
>> >> I will look into this issue and the related code and get back to you if I
>> >> face any problems.
>> >>
>> >>
>> >> On Mon, Apr 22, 2013 at 6:07 PM, Dimitris Kontokostas <[email protected]> wrote:
>> >>>
>> >>> Hi Rahul,
>> >>>
>> >>> A very good warm-up task for this idea is issue #36
>> >>> (https://github.com/dbpedia/extraction-framework/issues/36)
>> >>> With this task you will get to know the parser internals and see the
>> >>> actual need to crowd-source the rules.
>> >>>
>> >>> Take a first look and we'll be available for further details
>> >>>
>> >>> Cheers,
>> >>> Dimitris
>> >>>
>> >>>
>> >>> On Mon, Apr 22, 2013 at 5:02 AM, Rahul Sharnagat <[email protected]> wrote:
>> >>>>
>> >>>> Sorry, forgot to add mailing list. Just hit the reply button. :)
>> >>>>
>> >>>>
>> >>>> On Mon, Apr 22, 2013 at 2:19 AM, Dimitris Kontokostas
>> >>>> <[email protected]> wrote:
>> >>>>>
>> >>>>> Please put the mailing list in cc :)
>> >>>>>
>> >>>>> Cheers,
>> >>>>> Dimitris
>> >>>>>
>> >>>>> ----
>> >>>>> Sent from my mobile
>> >>>>>
>> >>>>> On 21 Apr 2013 7:55 PM, "Rahul Sharnagat"
>> >>>>> <[email protected]> wrote:
>> >>>>>
>> >>>>>> Hi Dimitris,
>> >>>>>> Thanks for the reply.
>> >>>>>> I am looking for a warm-up task related to this idea. I have
>> >>>>>> started reading about Scala and DBpedia. It should not take much
>> >>>>>> time to get accustomed to Scala since I have previously worked in
>> >>>>>> Haskell. Please give me some direction for a warm-up task.
>> >>>>>>
>> >>>>>>
>> >>>>>> On Sun, Apr 21, 2013 at 9:39 PM, Dimitris Kontokostas
>> >>>>>> <[email protected]> wrote:
>> >>>>>>>
>> >>>>>>> Hi Rahul,
>> >>>>>>>
>> >>>>>>> The application period has not started yet, so there is still time
>> >>>>>>> left :)
>> >>>>>>>
>> >>>>>>> Did you read the idea page [1]? The description is pretty long, but
>> >>>>>>> you can ask about anything you don't understand completely.
>> >>>>>>> Everything should be clear when you write your application ;)
>> >>>>>>>
>> >>>>>>> Best,
>> >>>>>>> Dimitris
>> >>>>>>>
>> >>>>>>> [1] http://wiki.dbpedia.org/gsoc2013/ideas/CrowdsourceTestsAndRules
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Sun, Apr 21, 2013 at 4:06 PM, Rahul Sharnagat
>> >>>>>>> <[email protected]> wrote:
>> >>>>>>>>
>> >>>>>>>> Hi Dimitris,
>> >>>>>>>>
>> >>>>>>>> I am Rahul Sharnagat, a master's student at IIT Bombay. I am
>> >>>>>>>> planning to apply for the DBpedia GSoC project.
>> >>>>>>>>
>> >>>>>>>> I am interested in the project "Crowdsource tests and extraction
>> >>>>>>>> rules". I am working on Named Entity Recognition (NER) and entity
>> >>>>>>>> mining as my master's project, and I think working with DBpedia
>> >>>>>>>> would help me a lot with that. I interned at Yahoo last summer,
>> >>>>>>>> working on refining news indexes.
>> >>>>>>>>
>> >>>>>>>> I know I am late due to my final exams, but it would be great if
>> >>>>>>>> you could help me get started. I have been reading the DBpedia
>> >>>>>>>> wiki pages and have also downloaded the code from GitHub.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>> Best Regards,
>> >>>>>>>> Rahul Sharnagat
>> >>>>>>>> CSE MTech, IITB
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> _______________________________________________
>> >>>>>>>> Dbpedia-gsoc mailing list
>> >>>>>>>> [email protected]
>> >>>>>>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Kontokostas Dimitris
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>> Best Regards,
>> >>>>>> Rahul Sharnagat
>> >>>>>> CSE MTech, IITB
>> >>>>>> H14, B505
>> >>>>>> +91.9860.451.056
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Best Regards,
>> >>>> Rahul Sharnagat
>> >>>> CSE MTech, IITB
>> >>>> H14, B505
>> >>>> +91.9860.451.056
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> --
>> >>> Kontokostas Dimitris
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Best Regards,
>> >> Rahul Sharnagat
>> >> CSE MTech, IITB
>> >>
>> >
>> >
>> >
>> > --
>> > Best Regards,
>> > Rahul Sharnagat
>> > CSE MTech, IITB
>> > H14, B505
>> > +91.9860.451.056
>> >
>> >
>> >
>>
>
>
>
> --
> Best Regards,
> Rahul Sharnagat
> CSE MTech, IITB
> H14, B505
> +91.9860.451.056
>
--
Kontokostas Dimitris
_______________________________________________
Dbpedia-gsoc mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc