Hi Elmahdi,

These log messages look normal; nothing should be broken.
- AVERTISSEMENT: Language not found: cbk-zam. To extract this language, please 
edit the addonLanguage.json in core

This is only relevant if you want to extract from the cbk-zam language.
- INFOS: Will extract redirects from source for commons wiki, could not load 
cache file
This is normal when you extract from a language for the first time. The 
framework generates a file called template-redirects.obj that will be used for 
the extraction. That process takes some time and may emit a few warning 
messages, which you can safely ignore.
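
If you want to check whether that cache file has been written yet, a small 
Python sketch like the one below might help. The directory layout and file name 
pattern are inferred only from the log message in this thread, and the function 
name redirects_cache_path is hypothetical; it is not part of the extraction 
framework.

```python
from pathlib import Path


def redirects_cache_path(base_dir, wiki, dump_date):
    """Build the expected path of the template-redirects cache file.

    Assumption: the layout
    <base-dir>/<wiki>/<date>/<wiki>-<date>-template-redirects.obj
    is inferred from the log message in this thread, not from the
    framework's source code.
    """
    file_name = f"{wiki}-{dump_date}-template-redirects.obj"
    return Path(base_dir) / wiki / dump_date / file_name


if __name__ == "__main__":
    # Values taken from the configuration and log output quoted in this thread.
    cache = redirects_cache_path(
        "/Users/macbookpro/Documents/web_pro/github/dbpedia/"
        "extraction-framework/dump/extraction-dump/2018-08",
        "commonswiki",
        "20180801",
    )
    state = "exists" if cache.exists() else "not generated yet"
    print(f"{cache}: {state}")
```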

After the template-redirects.obj file is generated the extraction itself should 
start and you should see files appear in your specified data directory next to 
the input file. With your configuration that would be:

/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-dump/2018-08/commonswiki/20180801/
The whole process can take a few hours for big languages like commons, en or 
fr, but no more input from you is required.
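
To watch the output appear while the extraction runs, you could list that 
directory from time to time. The following is only a rough Python sketch: the 
helper list_output_files is a hypothetical name, and it assumes the input dump 
can be recognised by the "source" suffix from your configuration 
(pages-articles.xml.bz2).

```python
from pathlib import Path


def list_output_files(dump_dir, input_suffix="pages-articles.xml.bz2"):
    """Return (name, size in bytes) for each file in dump_dir except the input dump.

    Assumption: the input dump's file name ends with the `source` value
    from the extraction configuration.
    """
    directory = Path(dump_dir)
    if not directory.is_dir():
        # Nothing to report if the directory does not exist.
        return []
    return sorted(
        (entry.name, entry.stat().st_size)
        for entry in directory.iterdir()
        if entry.is_file() and not entry.name.endswith(input_suffix)
    )


if __name__ == "__main__":
    # Directory taken from the configuration quoted in this thread.
    dump_dir = (
        "/Users/macbookpro/Documents/web_pro/github/dbpedia/"
        "extraction-framework/dump/extraction-dump/2018-08/commonswiki/20180801/"
    )
    for name, size in list_output_files(dump_dir):
        print(f"{size:>14}  {name}")
```

Running it again after a while should show new files or growing sizes if the 
extraction is making progress.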


Hope that this was helpful,

Robert Bielinski
________________________________
From: Sebastian Hellmann <hellm...@informatik.uni-leipzig.de>
Sent: Tuesday, 28 August 2018 10:19
To: Elmahdi Korfed; DBpedia-developers@lists.sourceforge.net
Cc: Fabien Gandon; micbuffa; Robert Bielinski
Subject: Re: [DBpedia-developers] Hi DBpedians, where the extracted files are 
stored?


Hi Elmahdi,

@Robert: could you have a look at this email?

We made the first release of the "Generic" DBpedia Core module yesterday. It 
contains all the files you can find here: 
http://downloads.dbpedia.org/repo/lts/generic-spark/

Since we are moving to more frequent releases now, we have split the 
publishing into "LTS" for long-term releases and "dev" for things we will 
eventually delete.

The most important changes are:

- clearer release and versioning methodology

- metadata provided


We hope that you will also join in with some datasets.

By the way, we moved most of the communication to the "#releasea" channel on 
Slack.


All the best,

Sebastian

On 24.08.2018 16:54, Elmahdi Korfed wrote:
Hi everyone,

I'm working on an updated 2018 version of the DBpediaFR chapter and I just 
want to know where the extracted files are stored.
Some explanation from the beginning:

I downloaded:
- dbpedia/extraction-framework from 
github<https://github.com/dbpedia/extraction-framework/>
- the commons and fr wiki dumps, version 2018-08, from the Wikimedia dumps 
website<http://dumps.wikimedia.your.org/frwiki/20180801/>  (source: 
**-pages-articles.xml.bz2)

Now I would like to extract, first, commonswiki.
To do that, I configured 2 files:

=> "extraction.commons.properties" (content of file):

source=pages-articles.xml.bz2
require-download-complete=false
languages=commons
extractors=
extractors.commons=.MappingExtractor,.ContributorExtractor,.TemplateParameterExtractor,.FileTypeExtractor,.GalleryExtractor,.ImageAnnotationExtractor,.CommonsKMLExtractor,.DBpediaResourceExtractor
copyrightCheck=false

=> "universal.properties" (content of file):
dbpedia-version=2018-08
base-dir=/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-dump/2018-08
log-dir=/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-data/2018-08
wiki-name=wiki
source=pages-articles.xml.bz2
parallel-processes=4
ontology=../ontology.xml
mappings=../mappings
uri-policy.iri=generic:en
format.ttl.bz2=turtle-triples


After that, I ran these commands:
- cd extraction-framework/dump
- ../clean-install-run extraction extraction.commons.properties

Now I get messages like these:
- AVERTISSEMENT: Language not found: cbk-zam. To extract this language, please 
edit the addonLanguage.json in core.
- INFOS: Will extract redirects from source for commons wiki, could not load 
cache file 
'/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-dump/2018-08/commonswiki/20180801/commonswiki-20180801-template-redirects.obj':
 java.io.FileNotFoundException: 
/Users/macbookpro/Documents/web_pro/github/dbpedia/extraction-framework/dump/extraction-dump/2018-08/commonswiki/20180801/commonswiki-20180801-template-redirects.obj
 (No such file or directory)

- AVERTISSEMENT: wrong redirect. page: 
[title=UNC;ns=0/Main/;language:wiki=commons,locale=en].
- found by dbpedia: [title=University of North Carolina at Chapel 
Hill;ns=0/Main/;language:wiki=commons,locale=en].
- found by wikipedia: [title=University of North Carolina at Chapel 
Hill;ns=0/Main/;language:wiki=commons,locale=en]

It seems OK, right?
Do you know if I just have to wait for the extraction to finish to see the 
extracted files? I ask because I need to store the files in VirtuosoDB.

Thank you for your help
_______________________________________________
DBpedia-developers mailing list
DBpedia-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-developers
