Re: how to force set fetch-status without actually fetching

feng lu Mon, 08 Apr 2013 08:16:24 -0700

Hi Sourajit

Why do you want to index unfetched web pages? Indexing will fail if these
pages lack fields that the indexer requires, such as the digest.
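
To illustrate the concern, here is a minimal sketch in Python. It is not
Nutch code: PageRecord, build_index_doc, and the field names are hypothetical
stand-ins for whatever the indexer sees for one URL. It only shows why
forcing the fetch status alone may not be enough when the digest is missing.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PageRecord:
    """Hypothetical stand-in for the data the indexer sees for one URL."""
    url: str
    fetched: bool = False               # the status the question wants to force
    parse_text: Optional[str] = None    # text produced by the parser, if any
    metadata: dict = field(default_factory=dict)  # e.g. {"digest": "..."}


def build_index_doc(record: PageRecord) -> Optional[dict]:
    """Return an index document, or None if the record is skipped."""
    # Unfetched records are skipped silently (the behaviour described
    # in the quoted question).
    if not record.fetched:
        return None
    # A record marked as fetched still needs the fields the indexer relies on.
    if record.parse_text is None or "digest" not in record.metadata:
        raise ValueError(f"{record.url}: missing parse text or digest")
    return {
        "url": record.url,
        "content": record.parse_text,
        "digest": record.metadata["digest"],
    }


if __name__ == "__main__":
    generated = PageRecord(url="http://example.com/feed.xml#record-1")
    print(build_index_doc(generated))   # skipped because it was never fetched

    generated.fetched = True            # force the status, nothing else
    try:
        build_index_doc(generated)
    except ValueError as err:
        print("indexing failed:", err)
```

Under these assumptions, a generated record would also need its parse text
and a digest filled in before it could be indexed.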



On Mon, Apr 8, 2013 at 7:15 PM, Sourajit Basak <[email protected]> wrote:

> We have a use case where we are generating multiple parse outputs per URL.
> In short, the URL hosts a custom XML file which is parsed to generate
> several records.
>
> However, the discovered or generated URLs are never actually fetched.
> According to NUTCH-514, anything which isn't fetched will be skipped
> during indexing.
>
> We need to override this behavior. Any ideas how this can be accomplished?
>



-- 
Don't Grow Old, Grow Up... :-)
