Hi,

I have a question about a problem I ran into at work. I am using
Apache Nutch 1.5 and Apache Solr to crawl news from 10 different
news websites. For 2 of them I get the same java.io.IOException,
after which the program crashes and all the crawled news is lost. This is
the message I get from the exception:


Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
    at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1318)
    at org.apache.nutch.crawl.Crawl.run(Crawl.java:136)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)


Do you have any suggestions as to where the problem might be, given that
the other 8 news websites I am crawling work just fine?


Best Regards,

Mihai Capatana

