Hi Olivier,
On 09/06/2012 07:32 PM, Sollier Olivier wrote:
Hello,
I need to extract wikipedia articles for a project I'm working on.
Everything seems to go smoothly until the Abstract Extractor starts.
Then I get tons of errors like these two:
sept. 06, 2012 1:15:53 PM org.dbpedia.extraction.mappings.AbstractExtractor$$anonfun$retrievePage$1 apply$mcVI$sp
INFO: Error retrieving abstract of title=Andre Agassi;ns=0/Main/;language:wiki=en,locale=en. Retrying...
java.net.ConnectException: Connexion refusée
    at org.dbpedia.extraction.mappings.AbstractExtractor$$anonfun$retrievePage$1.apply$mcVI$sp(AbstractExtractor.scala:118)
sept. 06, 2012 1:15:53 PM org.dbpedia.extraction.dump.extract.ExtractionJob$$anonfun$1 apply
WARNING: error processing page 'title=Austro-Asiatic languages;ns=0/Main/;language:wiki=en,locale=en'
java.lang.Exception: Could not retrieve abstract for page: title=Austro-Asiatic languages;ns=0/Main/;language:wiki=en,locale=en
    at org.dbpedia.extraction.mappings.AbstractExtractor.retrievePage(AbstractExtractor.scala:134)
Any idea how to fix this problem? Thank you!
The abstract extractor requires the installation of a local mirror of
Wikipedia, at least for the required language, in order for it to be
able to resolve templates.
A template is a short sequence of characters that has a special meaning
for Wikipedia, which expands it into other content when rendering the page.
Example: {{convert|1010000|km2|sp=us}}
This template tells Wikipedia that the area of some country is
1,010,000 square kilometers and that, when the page is rendered, the
area should be displayed in both square kilometers and square miles.
Wikipedia therefore renders it as “1,010,000 square kilometers
(390,000 sq mi)”.
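To make the expansion concrete, here is a rough Python sketch of what
rendering this one template involves. It is only an illustration, not
MediaWiki's or DBpedia's actual code: the function names, the fixed
conversion factor, and the choice to round to two significant figures
are all my own assumptions, and it handles nothing but the km2 case.

```python
import re
from math import floor, log10

# Assumption: 1 square kilometer is roughly 0.386102 square miles.
KM2_TO_SQ_MI = 0.386102


def round_sig(value, sig):
    """Round value to `sig` significant figures."""
    if value == 0:
        return 0.0
    digits = sig - 1 - floor(log10(abs(value)))
    return round(value, digits)


def render_convert(wikitext, sig=2):
    """Expand a {{convert|<n>|km2|...}} template into display text.

    Hypothetical helper: it only understands integer km2 values and
    ignores any further options such as sp=us.
    """
    match = re.fullmatch(r"\{\{convert\|(\d+)\|km2(?:\|[^{}]*)?\}\}", wikitext)
    if match is None:
        raise ValueError("unsupported template: " + wikitext)
    km2 = int(match.group(1))
    sq_mi = int(round_sig(km2 * KM2_TO_SQ_MI, sig))
    return f"{km2:,} square kilometers ({sq_mi:,} sq mi)"


print(render_convert("{{convert|1010000|km2|sp=us}}"))
```

The real template supports many more units and options, which is why
the extractor leaves the rendering to a MediaWiki installation instead
of reimplementing it.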
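So, the abstract extractor cannot work correctly without this local
Wikipedia. The java.net.ConnectException ("Connexion refusée", i.e.
connection refused) in your log most likely means that the extractor
could not connect to such a mirror.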
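Before re-running the extraction, it may help to check that the mirror
answers at all. The Python sketch below is one possible way to do that.
It assumes the mirror exposes the standard MediaWiki api.php endpoint;
the function name and the URL in the usage note are placeholders of
mine, so substitute whatever address your installation and extractor
configuration actually use.

```python
import http.client
import json
import urllib.request


def mirror_is_reachable(api_url, timeout=5.0):
    """Return True if a MediaWiki API answers a siteinfo query at api_url.

    Hypothetical helper: any network error (including "connection
    refused") or a non-JSON reply is treated as "not reachable".
    """
    query = api_url + "?action=query&meta=siteinfo&format=json"
    try:
        with urllib.request.urlopen(query, timeout=timeout) as response:
            payload = json.load(response)
    except (OSError, ValueError, http.client.HTTPException):
        return False
    return isinstance(payload, dict) and "query" in payload
```

For example, mirror_is_reachable("http://localhost/mediawiki/api.php")
(an assumed address) should return True once the local wiki is up. If
it returns False, the extractor will very likely keep failing with the
same connection errors.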
More details about the abstract extraction can be found in [1].
Olivier Sollier
[1] http://jens-lehmann.org/files/2012/program_el_dbpedia_live.pdf
--
Kind Regards
Mohamed Morsey
Department of Computer Science
University of Leipzig