This is built into Nutch. Instead of injecting http:// URLs, inject file:// URLs, and Nutch will use the protocol-file plugin to fetch the files directly from the local disk.
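As a rough illustration of preparing such URLs, here is a minimal Python sketch that walks a directory of static pages and writes one file:// URL per line to a seed file for injection. The helper names (file_urls, write_seed_list), the suffix filter, and the seed-file layout are assumptions made for this example; they are not part of Nutch.

```python
from pathlib import Path


def file_urls(root, suffixes=(".html", ".htm")):
    """Return sorted file:// URLs for the static pages found under root.

    Hypothetical helper for illustration; not part of Nutch.
    """
    root_path = Path(root).resolve()  # as_uri() requires an absolute path
    urls = []
    for path in root_path.rglob("*"):
        # Keep only regular files whose extension looks like a static page.
        if path.is_file() and path.suffix.lower() in suffixes:
            urls.append(path.as_uri())  # percent-encodes unsafe characters
    return sorted(urls)


def write_seed_list(root, seed_file):
    """Write one file:// URL per line to seed_file and return the count."""
    urls = file_urls(root)
    text = "".join(url + "\n" for url in urls)
    Path(seed_file).write_text(text, encoding="utf-8")
    return len(urls)
```

Depending on the Nutch version and configuration, the protocol-file plugin may also need to be listed in plugin.includes, and the URL filter rules may need to admit file: URLs, so check both before injecting the list.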
Andy

On 8/12/05, Dawid Weiss <[EMAIL PROTECTED]> wrote:
>
> Has anyone considered/implemented injecting static pages with a
> different URL scheme? I mean the rare scenario when you have tons of
> static HTML pages and would want to avoid rerouting queries through your
> own Web server, but rather fetch them directly from disk prefixing their
> disk path with a given URL prefix.
>
> I looked at the problem briefly (I admit) and it seems it'd require some
> manual coding because of the split between indexer and fetcher pipeline.
>
> Any comments and suggestions are very welcome.
> Dawid
