Hi Olivier,

On 09/06/2012 07:32 PM, Sollier Olivier wrote:
Hello,

I need to extract Wikipedia articles for a project I'm working on. Everything seems to go smoothly until the Abstract Extractor starts.
Then I get tons of errors like these two:

sept. 06, 2012 1:15:53 PM org.dbpedia.extraction.mappings.AbstractExtractor$$anonfun$retrievePage$1 apply$mcVI$sp
INFO: Error retrieving abstract of title=Andre Agassi;ns=0/Main/;language:wiki=en,locale=en. Retrying...
java.net.ConnectException: Connexion refusée
    at org.dbpedia.extraction.mappings.AbstractExtractor$$anonfun$retrievePage$1.apply$mcVI$sp(AbstractExtractor.scala:118)

sept. 06, 2012 1:15:53 PM org.dbpedia.extraction.dump.extract.ExtractionJob$$anonfun$1 apply
WARNING: error processing page 'title=Austro-Asiatic languages;ns=0/Main/;language:wiki=en,locale=en'
java.lang.Exception: Could not retrieve abstract for page: title=Austro-Asiatic languages;ns=0/Main/;language:wiki=en,locale=en
    at org.dbpedia.extraction.mappings.AbstractExtractor.retrievePage(AbstractExtractor.scala:134)

Any idea how to fix this problem? Thank you!

The abstract extractor requires a local mirror of Wikipedia, at least for the language you are extracting, so that it can resolve templates. A template is a short sequence of characters that has a special meaning for Wikipedia and is expanded in a special way when the page is rendered.

   Ex: {{convert|1010000|km2|sp=us}}
   This template tells Wikipedia that the area of some country is
   1010000 square kilometers and that, when the page is rendered, the
   area should be displayed in both square kilometers and square
   miles. So Wikipedia will render it as “1,010,000 square kilometers
   (390,000 sq mi)”.
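
To make the example concrete, here is a rough Python sketch of the arithmetic behind that rendering. It is not how Wikipedia implements the template; the function names and the choice of three significant digits are my own assumptions, picked only to reproduce the output quoted above.

```python
import math

# Square miles per square kilometer (1 sq mi is about 2.589988 km2).
KM2_TO_SQ_MI = 0.386102159


def round_sig(value, digits):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)


def render_area(km2, sig_digits=3):
    """Hypothetical helper: mimic the text produced for the example above.

    sig_digits=3 is an assumption; the real template chooses its own
    output precision.
    """
    sq_mi = round_sig(km2 * KM2_TO_SQ_MI, sig_digits)
    return f"{km2:,} square kilometers ({int(sq_mi):,} sq mi)"
```

Calling render_area(1010000) gives the same string as in the example. The point is that the raw article text only contains the template call, so something has to expand it before a readable abstract exists.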

So the abstract extractor cannot work correctly without this local Wikipedia mirror, which is most likely why the connection is being refused in your log.
More details about the abstract extraction can be found in [1].
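
If you want to check quickly whether anything is listening where the extractor expects the mirror, a small Python sketch like the following may help. "Connexion refusée" is simply "Connection refused", i.e. nothing accepted the TCP connection. The function name is hypothetical, and the URL you pass in is whatever your own configuration points at; I am not assuming a particular default here.

```python
import socket
from urllib.parse import urlparse


def endpoint_reachable(api_url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds.

    This only tests that something is listening; it does not verify
    that a working MediaWiki API is answering on that address.
    """
    parsed = urlparse(api_url)
    # Fall back to the scheme's usual port when the URL names none.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and name resolution errors.
        return False
```

For example, endpoint_reachable("http://localhost/mediawiki/api.php") (an assumed URL, adjust it to your setup) returning False would be consistent with the errors you are seeing.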


Olivier Sollier

[1] http://jens-lehmann.org/files/2012/program_el_dbpedia_live.pdf

--
Kind Regards
Mohamed Morsey
Department of Computer Science
University of Leipzig
