On Fri, Sep 26, 2008 at 5:04 PM, Edward Quick <[EMAIL PROTECTED]> wrote:
>
> When I run the updatedb, it states URL normalizing and filtering are set to
> false. I think they are already active though? If not, could someone tell me
> how I switch those on please?
>

You don't normally need filtering or normalizing during updatedb, since all
URLs should already be filtered and normalized by other jobs at that point.
Still, you can switch them on by passing -normalize -filter to updatedb.

> Thanks,
> Ed.
>
> $ bin/nutch updatedb crawl/crawldb crawl/segments/20080926135817
> CrawlDb update: starting
> CrawlDb update: db: crawl/crawldb
> CrawlDb update: segments: [crawl/segments/20080926135817]
> CrawlDb update: additions allowed: true
> CrawlDb update: URL normalizing: false
> CrawlDb update: URL filtering: false
> CrawlDb update: Merging segment data into db.
> CrawlDb update: done

-- 
Doğacan Güney
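
For reference, here is a minimal sketch of the updatedb call from this thread with the two flags appended. The crawldb and segment paths are copied from the quoted command. The variable names and the `bin/nutch` existence check are illustrative assumptions, not part of Nutch.

```shell
# Paths copied from the command quoted in this thread.
CRAWLDB=crawl/crawldb
SEGMENT=crawl/segments/20080926135817

# Same updatedb invocation as before, with -normalize and -filter appended.
# UPDATEDB_CMD is a hypothetical helper variable used only for illustration.
UPDATEDB_CMD="bin/nutch updatedb $CRAWLDB $SEGMENT -normalize -filter"
echo "$UPDATEDB_CMD"

# Run it only when invoked from a Nutch install directory (assumed layout).
if [ -x bin/nutch ]; then
  $UPDATEDB_CMD
fi
```

If the flags are picked up, the "URL normalizing" and "URL filtering" lines in the log would be expected to read true rather than false.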
