:(
I read something about creating a 'fetcher.done' file which can do some
magic.
Could that help us out?
Mathijs
rubdabadub wrote:
Hi:
I'm sorry to say that you need to fetch your last segment again.
I know the feeling :-( AFAIK there is no way in 0.8 to restart a failed
crawl. I have found that keeping segments small, i.e. generating small
fetch lists and merging all the segments later, is the only way to
avoid such a situation.
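The small-segment workflow could look roughly like this. This is only a sketch: the `crawl/crawldb` and `crawl/segments` paths and the `-topN` value are placeholders, and the exact command names (`generate`, `fetch`, `mergesegs`) assume a Nutch 0.8-era `bin/nutch` script.

```shell
# Generate a small fetch list (a modest -topN) instead of one huge segment:
bin/nutch generate crawl/crawldb crawl/segments -topN 100000

# Fetch only the newest segment; if this run dies, you lose just this batch:
s=`ls -d crawl/segments/* | tail -1`
bin/nutch fetch $s

# Repeat generate/fetch as often as needed, then merge the pieces:
bin/nutch mergesegs crawl/MERGEDsegments -dir crawl/segments
```

Each generate/fetch round is small enough to re-run cheaply after a failure, and the merge at the end gives you a single segment for indexing.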
Regards
On 2/25/07, Mathijs Homminga <[EMAIL PROTECTED]> wrote:
Hi,
While fetching a segment with 4M documents, we ran out of disk space.
We guess that the fetcher has fetched (and parsed) about 80 percent of
the documents, so it would be great if we could continue our crawl
somehow.
The segment directory does not contain a crawl_fetch subdirectory yet.
But we have a /tmp/hadoop/mapred/ (Local FS) directory.
Is there some way we can use the data in the temporary mapred directory
to create the crawl_fetch data in order to continue our crawl?
Thanks!
Mathijs