I use Nutch 1.4 and Solr 3.4.
I think the error occurs when parsing an XML file that contains a comment with this structure:
<!--text with -- inside the comment-->
I have been reading about it but did not find much. This is my error log.
Please help.
*************************************************************************************************
2012-05-21 10:17:53,398 INFO  fetcher.Fetcher - Fetcher: starting at 2012-05-21 10:17:53
2012-05-21 10:17:53,399 INFO  fetcher.Fetcher - Fetcher: segment: crawl/segments/20120521101752
2012-05-21 10:17:53,762 INFO  fetcher.Fetcher - Using queue mode : byHost
2012-05-21 10:17:53,762 INFO  fetcher.Fetcher - Fetcher: threads: 20
2012-05-21 10:17:53,762 INFO  fetcher.Fetcher - Fetcher: time-out divisor: 2
2012-05-21 10:17:53,777 INFO  fetcher.Fetcher - QueueFeeder finished: total 9 records + hit by time limit :0
2012-05-21 10:17:53,804 WARN  parse.ParsePluginsReader - Unable to parse [null].Reason is [org.xml.sax.SAXParseException; lineNumber: 37; columnNumber: 7; The string "--" is not permitted within comments.]
2012-05-21 10:17:53,809 WARN  mapred.LocalJobRunner - job_local_0005
java.lang.RuntimeException: Parse Plugins preferences could not be loaded.
        at org.apache.nutch.parse.ParserFactory.<init>(ParserFactory.java:73)
        at org.apache.nutch.parse.ParseUtil.<init>(ParseUtil.java:53)
        at org.apache.nutch.fetcher.Fetcher$FetcherThread.<init>(Fetcher.java:581)
        at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:1075)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
****************************************************************************************************
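
A note on narrowing this down: the warning comes from parse.ParsePluginsReader and the exception says "Parse Plugins preferences could not be loaded", which suggests the malformed file is probably Nutch's own parse plugin configuration rather than a fetched document. Assuming that file is conf/parse-plugins.xml (the usual default, not confirmed in this thread), a small well-formedness check can locate the offending comment. The sketch below uses Python's standard library parser instead of the Java parser Nutch uses, so the message wording differs from the log above; check_well_formed and the two sample strings are hypothetical names made up for illustration.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text is well-formed XML, else (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # err.position is a (line, column) tuple reported by the parser.
        line, column = err.position
        return line, column, str(err)
    return None


# A comment containing "--" is not allowed by the XML specification.
bad = "<parse-plugins>\n<!-- text with -- inside the comment -->\n</parse-plugins>"

# The same comment with the double hyphen removed parses without error.
good = "<parse-plugins>\n<!-- text with - inside the comment -->\n</parse-plugins>"

print(check_well_formed(bad))
print(check_well_formed(good))
```

To check the real file, read its contents and pass them to check_well_formed, then edit the comment near the reported line (line 37 in the log above) so that it no longer contains "--".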




----- Original message -----
From: "Markus Jelsma" <[email protected]>
To: [email protected]
Sent: Monday, 21 May 2012 11:41:40
Subject: RE: error parsing some xml

Hi

Which version do you use? It should list the troubling URL. What's the stack 
trace?

Cheers

 
 
-----Original message-----
> From:Ing. Eyeris Rodriguez Rueda <[email protected]>
> Sent: Mon 21-May-2012 17:07
> To: [email protected]
> Subject: error parsing some xml
> 
> Hi all.
> When I try to crawl, I have a problem parsing some XML and get the exception 
> below. I want to know which XML file is failing to parse.
> **************************************************************************************
> WARN  parse.ParsePluginsReader - Unable to parse [null].Reason is 
> [org.xml.sax.SAXParseException; lineNumber: 37; columnNumber: 7; The string 
> "--" is not permitted within comments.]
> ***************************************************************************************
> Any help will be appreciated.
> 
> 
> 10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS 
> INFORMATICAS...
> CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION
> 
> http://www.uci.cu
> http://www.facebook.com/universidad.uci
> http://www.flickr.com/photos/universidad_uci
> 
