Sorry, that should have been
mvn scala:run -Dlauncher=download

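For anyone who ends up fetching the dumps by hand instead, here is a minimal Python sketch of what the retry-max and retry-millis settings quoted below appear to mean. It is an illustration only, not the code in org.dbpedia.extraction.dump.download. The names fetch_to_file and with_retries are made up for this example, and it assumes that retry-max counts retries after the first attempt, which may not match the real downloader.

```python
import http.client
import time
import urllib.request


def fetch_to_file(url, path, chunk_size=1 << 20):
    """Stream url into path in chunks (hypothetical helper).

    Network failures surface as OSError (URLError, timeouts) or
    http.client.HTTPException (e.g. IncompleteRead on a dropped connection).
    """
    with urllib.request.urlopen(url, timeout=60) as response, open(path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)


def with_retries(action, retry_max=5, retry_millis=10000, sleep=time.sleep):
    """Run action(); on a network error wait retry_millis and try again.

    Assumption: retry_max is the number of retries after the first attempt.
    Once the retries are used up, the last error is re-raised.
    """
    retries = 0
    while True:
        try:
            return action()
        except (OSError, http.client.HTTPException):
            retries += 1
            if retries > retry_max:
                raise
            sleep(retry_millis / 1000.0)
```

A call would look like with_retries(lambda: fetch_to_file(url, path)). Note that this sketch restarts the file from the beginning on every retry, which is wasteful for a multi-gigabyte dump; wget -c resumes a partial download instead.
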
On Fri, Mar 30, 2012 at 00:24, Jona Christopher Sahnwaldt
<[email protected]> wrote:
> Hi David,
>
> Pablo is right - if you only download a few files, wget is great. :-)
>
> The old downloader was broken. I recently rewrote it, but didn't
> integrate it with the extraction code yet (I'm not even sure that's a
> good idea), so it's a separate step. Try using
>
> mvn scala:run download
>
> in the directory extraction_framework/dump.
>
> The configuration is in download.properties or directly in the
> pom.xml. These settings should work for you. (I hope the line breaks
> survive intact...)
>
> # NOTE: format is not java.util.Properties, but
> # org.dbpedia.extraction.dump.download.Config
> dir=K:/Work/Eclipse Workspace/DBpedia_Dumps/to_update
> base=http://dumps.wikimedia.org/
> dump=commons,en:pages-articles.xml.bz2
> unzip=true
> retry-max=5
> retry-millis=10000
> #the following is only needed when you download
> #wikipedia language editions by their article count
> #csv=http://s23.org/wikistats/wikipedias_csv
> #the following are only needed if you want to run the
> #AbstractExtractor, which uses a local MediaWiki
> #installation and takes several days to run.
> #dump=en:image.sql.gz,imagelinks.sql.gz,langlinks.sql.gz,templatelinks.sql.gz,categorylinks.sql.gz
> #other=http://svn.wikimedia.org/svnroot/mediawiki/trunk/phase3/maintenance/tables.sql
>
> Cheers,
> Christopher
>
> On Thu, Mar 29, 2012 at 17:42, Pablo Mendes <[email protected]> wrote:
>> Hi David,
>> What about downloading with wget?
>>
>> Cheers,
>> Pablo
>>
>>
>> On Thu, Mar 29, 2012 at 5:33 PM, David Gösenbauer
>> <[email protected]> wrote:
>>>
>>> Hi dbpedia-community!
>>>
>>> I'm having serious problems getting the extraction framework to run.
>>> The step I'm stuck at is downloading the dumps. My config file seems
>>> to be correct, since the framework starts the download when I run
>>> "mvn scala:run". Nevertheless, the download times out at a random
>>> point, after an arbitrary amount of data has been downloaded.
>>>
>>> Downloading this file
>>>
>>> http://dumps.wikimedia.org/enwiki/20120307/enwiki-20120307-pages-articles.xml.bz2
>>> with my browser is 10x slower than downloading it with the framework.
>>> The browser reports the download as completed, but the resulting
>>> archive is corrupted every time, presumably because the download
>>> timed out or failed in some other way.
>>>
>>> At the moment it's impossible for me to get the dumps. I hope someone
>>> can please help me out since I need the most recent data at hand!
>>>
>>> Regards,
>>> David
>>>
>>> My config-file:
>>>
>>> dumpDir=K:/Work/Eclipse Workspace/DBpedia_Dumps/to_update
>>> outputDir=K:/Work/Eclipse Workspace/DBpedia_Dumps/updated
>>> updateDumps=true
>>>
>>> extractors=org.dbpedia.extraction.mappings.LabelExtractor \
>>>            org.dbpedia.extraction.mappings.WikiPageExtractor \
>>>            org.dbpedia.extraction.mappings.InfoboxExtractor \
>>>            org.dbpedia.extraction.mappings.PageLinksExtractor \
>>>            org.dbpedia.extraction.mappings.GeoExtractor
>>>
>>> extractors.en=org.dbpedia.extraction.mappings.CategoryLabelExtractor \
>>>               org.dbpedia.extraction.mappings.ArticleCategoriesExtractor \
>>>               org.dbpedia.extraction.mappings.ExternalLinksExtractor \
>>>               org.dbpedia.extraction.mappings.HomepageExtractor \
>>>               org.dbpedia.extraction.mappings.DisambiguationExtractor \
>>>               org.dbpedia.extraction.mappings.PersondataExtractor \
>>>               org.dbpedia.extraction.mappings.PndExtractor \
>>>               org.dbpedia.extraction.mappings.SkosCategoriesExtractor \
>>>               org.dbpedia.extraction.mappings.RedirectExtractor \
>>>               org.dbpedia.extraction.mappings.MappingExtractor \
>>>               org.dbpedia.extraction.mappings.PageIdExtractor \
>>>               org.dbpedia.extraction.mappings.AbstractExtractor \
>>>               org.dbpedia.extraction.mappings.RevisionIdExtractor
>>>
>>> languages=en
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> This SF email is sponsored by:
>>> Try Windows Azure free for 90 days Click Here
>>> http://p.sf.net/sfu/sfd2d-msazure
>>> _______________________________________________
>>> Dbpedia-discussion mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion

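Since the original report mentions archives that the browser showed as completed but that turned out to be corrupted, it may be worth checking a downloaded dump before starting an extraction. The following is a rough sketch using only the Python standard library. The function name is_complete_bz2 is made up for this example, and the check only confirms that the file decompresses cleanly to the end; it does not compare the file against the checksums published alongside the dumps.

```python
import bz2


def is_complete_bz2(path, chunk_size=1 << 20):
    """Return True if the .bz2 file at path decompresses cleanly to its end.

    A truncated download raises EOFError while reading, and damaged data
    raises OSError; both are reported as False. The decompressed data is
    discarded chunk by chunk, so memory use stays small even for big dumps.
    """
    try:
        with bz2.open(path, "rb") as stream:
            while stream.read(chunk_size):
                pass
    except (OSError, EOFError):
        return False
    return True
```

Reading through a full pages-articles dump this way takes a while, but it catches a truncated file before the extraction does.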