Hi,
On 7/3/07, Jason Ma <[EMAIL PROTECTED]> wrote:
> I'm running Nutch on Red Hat Linux with Java 1.6.0_01. I have
> successfully crawled and indexed smaller quantities of data in the
> past. However, after I tried to scale up the crawl, Nutch throws an
> exception during indexing (the bottom of the log is included below).
> Please let me know if there is more information I should provide.
>
> I'd be very grateful for any suggestions or advice you may have.
>
> Thanks in advance,
> Jason Ma
>
> ....
>
> Indexing [http://64.13.133.31/pics/up-VC2GQ0CA9QSHHGHM-s] with
> analyzer [EMAIL PROTECTED]
> (null)
> Indexing [http://64.13.133.31/pics/up-VET16648L9TBU53B-s] with
> analyzer [EMAIL PROTECTED]
> (null)
> Indexing [http://64.13.133.31/pics/up-VHIUOB6N8CVESR52-s] with
> analyzer [EMAIL PROTECTED]
> (null)
> Indexing [http://64.13.133.31/pics/user_promo_mini.png] with analyzer
> [EMAIL PROTECTED] (null)
> Optimizing index.
> merging segments _73 (1 docs) _74 (1 docs) _75 (1 docs) _76 (1 docs)
> _77 (1 docs) _78 (1 docs) _79 (1 docs) _7a (1 docs) _7b (1 docs) _7c
> (1 docs) _7d (1 docs) _7e (1 docs) _7f (1 docs) _7g (1 docs) _7h (1
> docs) _7i (1 docs) _7j (1 docs) _7k (1 docs) _7l (1 docs) _7m (1 docs)
> _7n (1 docs) _7o (1 docs) _7p (1 docs) _7q (1 docs) into _7r (24 docs)
> merging segments _1e (50 docs) _2t (50 docs) _48 (50 docs) _5n (50
> docs) _72 (50 docs) _7r (24 docs) into _7s (274 docs)
> Exception in thread "main" java.io.IOException: Job failed!
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:357)
>         at org.apache.nutch.indexer.Indexer.index(Indexer.java:296)
>         at org.apache.nutch.indexer.Indexer.main(Indexer.java:313)
This exception is just the JobClient reporting that your job failed; it
doesn't show where the actual problem is. Check your logs/hadoop.log or
your tasktracker's log files and you should find a more detailed stack
trace for the underlying error.
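For example, something like the following sketch should surface the real
stack trace. The paths are the usual defaults for a local run and are
assumptions on my part, so adjust them to match your installation:

```shell
# Sketch: search the local Nutch/Hadoop logs for the underlying error.
# logs/hadoop.log and logs/userlogs/*/syslog are assumed default
# locations -- adjust them to wherever your setup writes its logs.
for f in logs/hadoop.log logs/userlogs/*/syslog; do
    if [ -f "$f" ]; then
        echo "== $f =="
        # Print each exception with a little surrounding context,
        # keeping only the tail so the output stays readable.
        grep -B 2 -A 20 'Exception' "$f" | tail -60
    fi
done
```

The generic "Job failed!" you pasted comes from the driver JVM; the
per-task logs are where Hadoop records what actually went wrong (out of
disk, out of memory, a bad record, etc.).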
--
Doğacan Güney