Hi everyone,
I'm working on an updated 2018 release of the French DBpedia chapter (DBpediaFR),
and I would like to know where the extracted files are stored.
Some background first:
I downloaded:
- dbpedia/extraction-framework from GitHub
(https://github.com/dbpedia/extraction-framework/)
- the commons + fr + wiki dumps, 2018-08 version, from the Wikimedia dumps
website (http://dumps.wikimedia.your.org/frwiki/20180801/)
(source: **-pages-articles.xml.bz2)
I would like to extract commonswiki first.
To do that, I configured two files:
=> "extraction.commons.properties" (content of file):
source=pages-articles.xml.bz2
require-download-complete=false
languages=commons
extractors=
extractors.commons=.MappingExtractor,.ContributorExtractor,.TemplateParameterExtractor,.FileTypeExtractor,.GalleryExtractor,.ImageAnnotationExtractor,.CommonsKMLExtractor,.DBpediaResourceExtractor
copyrightCheck=false
=> "universal.properties" (content of file):
dbpedia-version=2018-08
base-dir=/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-dump/2018-08
log-dir=/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-data/2018-08
wiki-name=wiki
source=pages-articles.xml.bz2
parallel-processes=4
ontology=../ontology.xml
mappings=../mappings
uri-policy.iri=generic:en
format.ttl.bz2=turtle-triples
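
For reference, here is how I understand these settings to combine into the
location of the input dump. This is only a minimal sketch in Python, not
framework code: the layout base-dir/<language><wiki-name>/<date>/ is inferred
from the path in the log messages quoted further down, and the helper names
parse_properties and expected_dump_path are my own.

```python
from pathlib import Path


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def expected_dump_path(props, language, date):
    """Build base-dir/<language><wiki-name>/<date>/<wiki>-<date>-<source>.

    ASSUMPTION: this layout is inferred from the log output, not taken
    from the framework's documentation.
    """
    wiki = f"{language}{props['wiki-name']}"
    filename = f"{wiki}-{date}-{props['source']}"
    return Path(props["base-dir"]) / wiki / date / filename


# Only the three keys from universal.properties that the sketch needs.
UNIVERSAL = """\
base-dir=/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-dump/2018-08
wiki-name=wiki
source=pages-articles.xml.bz2
"""

props = parse_properties(UNIVERSAL)
dump = expected_dump_path(props, "commons", "20180801")
print(dump)
print("exists:", dump.exists())
```

If the second line prints "exists: False", the dump is probably not where I
assume the framework looks for it.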
After that, I ran these commands:
- cd extraction-framework/dump
- ../clean-install-run extraction extraction.commons.properties
Now I get messages like these:
- AVERTISSEMENT: Language not found: cbk-zam. To extract this language, please
edit the addonLanguage.json in core.
- INFOS: Will extract redirects from source for commons wiki, could not load
cache file
'/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-dump/2018-08/commonswiki/20180801/commonswiki-20180801-template-redirects.obj':
java.io.FileNotFoundException:
/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-dump/2018-08/commonswiki/20180801/commonswiki-20180801-template-redirects.obj
(No such file or directory)
- AVERTISSEMENT: wrong redirect. page:
[title=UNC;ns=0/Main/;language:wiki=commons,locale=en].
- found by dbpedia: [title=University of North Carolina at Chapel
Hill;ns=0/Main/;language:wiki=commons,locale=en].
- found by wikipedia: [title=University of North Carolina at Chapel
Hill;ns=0/Main/;language:wiki=commons,locale=en]
That seems OK, right?
Do I just have to wait for the extraction to finish before the extracted
files appear? I ask because I need to load the files into Virtuoso.
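
To make the question concrete, this is how I would look for the results once
the run finishes. It is a sketch under assumptions: that the output lands in
the same base-dir/commonswiki/20180801/ directory that appears in the log
above, and that the files are named commonswiki-20180801-<dataset>.ttl.bz2
(the suffix is taken from format.ttl.bz2 in universal.properties).
list_extracted_files is a hypothetical helper of mine, not a framework
function.

```python
from pathlib import Path


def list_extracted_files(base_dir, wiki, date, suffix=".ttl.bz2"):
    """Return sorted files named <wiki>-<date>-*<suffix> in base_dir/wiki/date.

    ASSUMPTION: extracted datasets are written next to the input dump.
    Returns an empty list if the directory does not exist.
    """
    wiki_dir = Path(base_dir) / wiki / date
    if not wiki_dir.is_dir():
        return []
    prefix = f"{wiki}-{date}-"
    return sorted(
        p for p in wiki_dir.iterdir()
        if p.is_file() and p.name.startswith(prefix) and p.name.endswith(suffix)
    )


BASE_DIR = (
    "/Users/macbookpro/Documents/web_pro/github/dbpedia/"
    "extraction-framework/dump/extraction-dump/2018-08"
)

for path in list_extracted_files(BASE_DIR, "commonswiki", "20180801"):
    # Print the name and size in bytes of each candidate dataset file.
    print(path.name, path.stat().st_size)
```

If this prints nothing after the extraction has completed, my assumption about
the output directory is wrong, which is exactly what I would like to confirm.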
Thank you for your help
_______________________________________________
DBpedia-developers mailing list
DBpedia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-developers