You can find plenty of cron tutorials on the web. Use the command crontab -e to add your script to the crontab, and make sure crond is running.
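As a rough sketch, a crontab entry could look like what the snippet below prints. The /opt/nutch install path, the 02:00 daily schedule and the crawl.log file name are assumptions for illustration only; the crawl command itself is the one you posted. Depending on your Nutch version, re-running it against an existing crawl directory may behave differently, so check that first by hand.

```shell
#!/bin/sh
# Hypothetical install path (an assumption); override by exporting NUTCH_HOME.
NUTCH_HOME="${NUTCH_HOME:-/opt/nutch}"

# Build the crawl command quoted in this thread, run from the Nutch directory.
crawl_cmd() {
  printf 'cd %s && bin/nutch crawl urls -dir crawl -depth 4 -topN 200' "$NUTCH_HOME"
}

# Build a crontab line: minute 0, hour 2, every day (schedule is an assumption).
# Output and errors are appended to a log file so failures are not lost.
cron_line() {
  printf '0 2 * * * %s >> %s/crawl.log 2>&1\n' "$(crawl_cmd)" "$NUTCH_HOME"
}

# Print the line; paste it into the editor opened by "crontab -e".
cron_line
```

This only prints the entry and does not install anything, so you can review it before adding it with crontab -e.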
When you start the crawler next time, not all URLs get updated, only those that are old enough to be due for refetching. The refetch interval is configured in nutch-site.xml (see the fetch settings), so you can reduce it if you want to refetch all sites frequently.

Alexander

On 14/08/2008, plat hpc <[EMAIL PROTECTED]> wrote:
>
> Hi Alex,
>
> Thanks for pointing out the cron. Which file do I set to the cron for
> firing up the regular search?
>
> And secondly, how do I get my site reindex again? I did the first time
> months ago, it reindex pages at that moment. But now when I did it with the
> same command, it doesn't seems to reindex new pages.
>
> Please advise.
>
> Thanks alot.
>
> On Wed, Aug 13, 2008 at 5:37 PM, Alexander Aristov <
> [EMAIL PROTECTED]> wrote:
>
> > Add a cron job which will fire crawler on regular basis. It's a standard
> > approach
> >
> > Alex
> >
> >
> > On 13/08/2008, plat hpc <[EMAIL PROTECTED]> wrote:
> > >
> > > Hi,
> > >
> > > I am new to nutch, managed to installed nutch and set it up. Did the
> > > first crawl few months back. Now as my site has some new posts and
> > > updates, but the nutch wasn't reflecting. So i did another crawl :
> > >
> > > bin/nutch crawl urls -dir crawl -depth 4 -topN 200
> > >
> > > But then the new posts and updates didn't seems to be updated.
> > >
> > > Would anyone please kindly tell me the steps/command to get nutch to
> > > crawl my whole site and on regular updates?
> > >
> > > Thanks.
> > >
> >
> >
> >
> > --
> > Best Regards
> > Alexander Aristov
> >

--
Best Regards
Alexander Aristov
