Hi. What is this command supposed to do?

Also, do you know whether there is any way I can parse and save the text of
HTML files while crawling?

On 7 April 2010 14:32, Gareth Gale <gareth.g...@hp.com> wrote:

> I have been running Nutch 0.9 for a long time without problems, but have
> just started to see this error when executing (all from within the Nutch 0.9
> bin directory) :-
>
> ./nutch mergesegs $crawldir/MERGEDsegments $crawldir/segements/*
>
> The error is :-
>
> Exception in thread "main" java.io.IOException: No input paths specified in input
>        at org.apache.hadoop.mapred.InputFormatBase.validateInput(InputFormatBase.java:99)
>        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:326)
>        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:543)
>        at org.apache.nutch.segment.SegmentMerger.merge(SegmentMerger.java:590)
>        at org.apache.nutch.segment.SegmentMerger.main(SegmentMerger.java:638)
>
>
> I've tried both a 1.5 and a 1.6 Java VM but get the same result.
>
> I have no idea how this is happening or why, but need to fix it asap - any
> help much appreciated !
>
