Hi Elmahdi,
These log messages look normal. Nothing appears to be broken.
- AVERTISSEMENT: Language not found: cbk-zam. To extract this language, please
edit the addonLanguage.json in core
This is only relevant if you want to extract from the cbk-zam language.
- INFOS: Will extract redirects from source for commons wiki, could not load
cache file
This is normal when you extract from a language for the first time. The
framework generates a file called template-redirects.obj that is then used for
the extraction. That process takes some time and may emit a few warning
messages, which you can safely ignore.
Once the template-redirects.obj file has been generated, the extraction itself
should start, and you should see files appear in your specified data directory
next to the input file. With your configuration that would be:
/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-dump/2018-08/commonswiki/20180801/
The whole process can take a few hours for big languages like commons, en or
fr, but no more input from you is required.
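To check on progress while the extraction runs, a small script along these lines could help. It is only a sketch: the function name describe_extraction_dir is hypothetical, and it assumes that any file in the directory that is neither the input dump nor the template-redirects.obj cache is an extraction output.

```python
from pathlib import Path


def describe_extraction_dir(directory, source_suffix="pages-articles.xml.bz2"):
    """Summarise what is currently in one dump directory.

    Assumption: files that are neither the input dump nor the redirects
    cache are treated as extraction outputs.
    """
    directory = Path(directory)
    if directory.is_dir():
        files = sorted(p.name for p in directory.iterdir() if p.is_file())
    else:
        files = []
    # The cache file from the log message ends in "-template-redirects.obj".
    cache = [n for n in files if n.endswith("-template-redirects.obj")]
    # The input dump ends in the value of the "source" property.
    inputs = [n for n in files if n.endswith(source_suffix)]
    outputs = [n for n in files if n not in cache and n not in inputs]
    return {"cache_ready": bool(cache), "inputs": inputs, "outputs": outputs}
```

Calling describe_extraction_dir on the directory above should report cache_ready as False until the redirects file has been written, and an empty outputs list until the extraction proper has started.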
Hope that this was helpful,
Robert Bielinski
________________________________
From: Sebastian Hellmann <hellm...@informatik.uni-leipzig.de>
Sent: Tuesday, 28 August 2018 10:19
To: Elmahdi Korfed; DBpedia-developers@lists.sourceforge.net
Cc: Fabien Gandon; micbuffa; Robert Bielinski
Subject: Re: [DBpedia-developers] Hi DBpedians, where the extracted files are
stored?
Hi Elmahdi,
@Robert: could you have a look at this email?
We made the first release of the "Generic" DBpedia Core module yesterday. It
contains all the files you can find here:
http://downloads.dbpedia.org/repo/lts/generic-spark/
Since we are establishing more frequent releases now, we split up the
publishing into LTS for long term and then "dev" for things we will eventually
delete.
The most important changes are:
- clearer release and versioning methodology
- metadata provided
We hope that you will also join in with some datasets.
By the way, we moved most of the communication to the "#releasea" channel on
Slack.
All the best,
Sebastian
On 24.08.2018 16:54, Elmahdi Korfed wrote:
Hi everyone,
I'm working on an updated version of DBpediaFR chapter 2018 and I just want to
know where the extracted files are stored.
Some explanation from the beginning:
I downloaded:
- dbpedia/extraction-framework from
github<https://github.com/dbpedia/extraction-framework/>
- the commons + fr + wiki dumps (2018-08 version) from the Wikimedia dumps
website<http://dumps.wikimedia.your.org/frwiki/20180801/> (source:
**-pages-articles.xml.bz2)
Now I would like to extract, first, commonswiki.
To do that, I configured 2 files:
=> "extraction.commons.properties" (content of file):
source=pages-articles.xml.bz2
require-download-complete=false
languages=commons
extractors=
extractors.commons=.MappingExtractor,.ContributorExtractor,.TemplateParameterExtractor,.FileTypeExtractor,.GalleryExtractor,.ImageAnnotationExtractor,.CommonsKMLExtractor,.DBpediaResourceExtractor
copyrightCheck=false
=> "universal.properties" (content of file):
dbpedia-version=2018-08
base-dir=/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-dump/2018-08
log-dir=/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-data/2018-08
wiki-name=wiki
source=pages-articles.xml.bz2
parallel-processes=4
ontology=../ontology.xml
mappings=../mappings
uri-policy.iri=generic:en
format.ttl.bz2=turtle-triples
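As a rough illustration of how these two files relate to the output location, the sketch below parses simple key=value properties and joins base-dir, the language plus wiki-name, and the dump date. The helper names parse_properties and expected_output_dir are hypothetical, and the path layout is an assumption inferred from the directory named in the log messages, not taken from the framework's code.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def expected_output_dir(universal, extraction, dump_date):
    """Assumed layout: <base-dir>/<language><wiki-name>/<dump date>."""
    language = extraction["languages"].split(",")[0].strip()
    wiki = language + universal["wiki-name"]
    return f'{universal["base-dir"].rstrip("/")}/{wiki}/{dump_date}'
```

With the values above (languages=commons, wiki-name=wiki) and the dump date 20180801, this yields the commonswiki/20180801 directory under base-dir.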
After that, I launched these commands:
- cd extraction-framework/dump
- ../clean-install-run extraction extraction.commons.properties
Now I get messages like these:
- AVERTISSEMENT: Language not found: cbk-zam. To extract this language, please
edit the addonLanguage.json in core.
- INFOS: Will extract redirects from source for commons wiki, could not load
cache file
'/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-dump/2018-08/commonswiki/20180801/commonswiki-20180801-template-redirects.obj':
java.io.FileNotFoundException:
/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-dump/2018-08/commonswiki/20180801/commonswiki-20180801-template-redirects.obj
(No such file or directory)
- AVERTISSEMENT: wrong redirect. page:
[title=UNC;ns=0/Main/;language:wiki=commons,locale=en].
- found by dbpedia: [title=University of North Carolina at Chapel
Hill;ns=0/Main/;language:wiki=commons,locale=en].
- found by wikipedia: [title=University of North Carolina at Chapel
Hill;ns=0/Main/;language:wiki=commons,locale=en]
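To get a quick overview of a longer log, the messages can be tallied by level. The sketch below is hypothetical and assumes every message starts with a French java.util.logging level name followed by a colon, as in the lines above. AVERTISSEMENT and INFOS appear in this log; GRAVE (the French name for SEVERE) is assumed from the JDK's French localization and would be the level to look out for.

```python
import re
from collections import Counter

# French level names mapped to their English java.util.logging equivalents.
# GRAVE does not occur in the log above; it is an assumption.
LEVELS = {"GRAVE": "SEVERE", "AVERTISSEMENT": "WARNING", "INFOS": "INFO"}

# Skip leading punctuation such as "- ", then capture the level name.
LINE_RE = re.compile(r"^\W*(GRAVE|AVERTISSEMENT|INFOS):")


def tally_levels(lines):
    """Count log messages per level; continuation lines are ignored."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[LEVELS[match.group(1)]] += 1
    return counts
```

A run with only WARNING and INFO entries and no SEVERE ones matches the output shown above.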
It seems OK, right?
Do you know if I just have to wait for the extraction to finish to see the
extracted files? I ask because I need to store the files in VirtuosoDB.
Thank you for your help