Hello 

While merging segments with ...

nutch mergesegs $crawldir/MERGEDsegments $crawldir/segments/* -slice 50000

... I got the following message in hadoop.log:

-------------------------
.
.
2010-03-03 03:27:01,849 INFO  segment.SegmentMerger - Slice size: 50000 URLs.
2010-03-03 22:47:43,130 INFO  segment.SegmentMerger - Slice size: 50000 URLs.
2010-03-03 22:47:43,149 INFO  segment.SegmentMerger - Slice size: 50000 URLs.
2010-03-03 23:47:00,029 WARN  mapred.LocalJobRunner - job_local_0001
java.lang.ThreadDeath
        at java.lang.Thread.stop(Thread.java:715)
        at 
org.apache.hadoop.mapred.LocalJobRunner.killJob(LocalJobRunner.java:310)
        at 
org.apache.hadoop.mapred.JobClient$NetworkedJob.killJob(JobClient.java:315)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1239)
        at org.apache.nutch.segment.SegmentMerger.merge(SegmentMerger.java:620)
        at org.apache.nutch.segment.SegmentMerger.main(SegmentMerger.java:665)
-------------------------



In the folder $crawldir/MERGEDsegments, there is also a subdirectory named 
"_temporary".


What am I doing wrong?

Can I use the generated segments for indexing and then continue the crawling 
process later?


Thanks for your help.
Pat
