Hello nutch-general,
Thanks for your answer!

I am also interested in how other Nutch users keep their databases fresh. Is there a script that I can run from cron, give it the db path and the segments path, and have it update the database? I still do not fully understand which steps are needed to "freshen" my current database by hand... ;-/

At the moment I think an update requires:

1) Generate a fetch list (bin/nutch generate <old_db_dir> <new_segment_dir>)
2) Fetch this list (bin/nutch fetch <new_segment_dir>)

Now <new_segment_dir> holds the content of all the links that needed updating. But what is the best next step? Should I merge this new segment with the old segment (mergesegs?), or can I run something like "bin/nutch updatedb <old_db_dir> <new_segment_dir>"? I have tried both mergesegs and updatedb, but I do not see any changes in the db afterwards.

I have also noticed that if I simply delete the old db that the searcher uses and put a new db in its place, the searcher keeps using the old db until I restart Tomcat. If I make changes directly in the db that the searcher uses, will it pick them up? Or how can I update the database without stopping the web search?

Thanks in advance!

-----------Original Message-----------
> To refetch you need to generate a fetch list of urls that need to be refetched.
> See http://www.nutch.org/cgi-bin/twiki/view/Main/GenerateOptions
> You can configure the timespan until a url needs to be refetched in the config
> file.
> See http://www.nutch.org/conf/nutch-default.xml
> "db.default.fetch.interval 30 The default number of days between re-fetches of
> a page."
> Then you just fetch, and in the end you need to merge the segments.
>
> HTH
> Stefan
>
> Quoting NGS <[EMAIL PROTECTED]>:
>
>> Hello,
>>
>> I have made "bin/nutch crawl". Now I want to re-fetch database. I
>> make "bin/nutch updatedb db <segments directory>". Some seconds and
>> I see "Finishing Update". I think this because default re-fetch
>> time is set to 30 days.
>> Then I have made "bin/nutch generate db
>> segments -adddays 31". Now all links must be marked as out-of-date?
>> I again make "bin/nutch updatedb db segments" and see that again
>> really nothing was updated...
>>
>> Maybe I have skipped some stages?

--
Best regards,
NGS mailto:[EMAIL PROTECTED]
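
P.S. To make the question concrete, here is a rough sketch of the cron script I have in mind. It assumes that the generate / fetch / updatedb sequence described above is the right one, which is exactly what I am unsure about. The function name refresh_db, the NUTCH variable, and the way the newest segment directory is picked are my own assumptions, not something taken from the Nutch documentation.

```shell
#!/bin/sh
# Hypothetical refresh helper: a sketch under the assumptions stated above,
# not a tested Nutch recipe.
# Usage: refresh_db <db_dir> <segments_dir>
# NUTCH may point at the launcher script; it defaults to bin/nutch.

refresh_db() {
    db_dir=$1
    segments_dir=$2
    nutch=${NUTCH:-bin/nutch}

    if [ -z "$db_dir" ] || [ -z "$segments_dir" ]; then
        echo "usage: refresh_db <db_dir> <segments_dir>" >&2
        return 2
    fi

    # 1) Generate a fetch list of the pages that are due for a re-fetch.
    "$nutch" generate "$db_dir" "$segments_dir" || return 1

    # Assumption: generate creates a new subdirectory under segments_dir
    # whose name sorts last (for example a timestamp).
    segment=$(ls -d "$segments_dir"/*/ 2>/dev/null | sort | tail -n 1)
    segment=${segment%/}
    if [ -z "$segment" ]; then
        echo "no segment found in $segments_dir" >&2
        return 1
    fi

    # 2) Fetch the pages on that list.
    "$nutch" fetch "$segment" || return 1

    # 3) Write the fetch results back into the db.
    "$nutch" updatedb "$db_dir" "$segment" || return 1
}
```

From cron I would then call something like "refresh_db /path/to/db /path/to/segments" once a day. I have left out mergesegs and any re-indexing because I do not know yet whether they are needed, or how the running searcher would see the result.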
