Hello sg,

smsc> I run nutch with a script that runs in a never ending loop but does sleep for
smsc> some time between the steps.

Is this script big?
Could you show it, please?
I really do not understand what needs to go into the loop.

I have tried to put together something like the example in the tutorial:
             ------update.sh content--------
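  # create a new segment with a fetchlist
  bin/nutch generate db segments
  # pick the newest segment (segment names are timestamps)
  s1=`ls -d segments/2* | tail -1`
  echo "$s1"
  bin/nutch fetch "$s1"
  bin/nutch updatedb db "$s1"
  bin/nutch analyze db 5
  bin/nutch index "$s1"
  # remove duplicate pages across all segments
  bin/nutch dedup segments dedup.tmp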
            ---------------------------------

But each run creates a separate segment, so I think it will take up a
lot of disk space ;-/
The second problem is updating the database in real time: after each
run of the script I have to restart Tomcat before the web searcher
shows the updated results :(
How do you solve this problem?

smsc> The problem with cron jobs is that you do not know when a task is ready and the
smsc> next should start.

We could create a flag file at the beginning of the script to indicate
that a nutch process is running, and delete it at the end. If the
script is started while the flag file still exists, it just exits.
That would make sure only one run of the task is active at a time...
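
Something like the following is what I have in mind. It is only a rough
sketch: it uses a directory instead of a file as the flag, because mkdir
fails atomically when the directory already exists. The lock path and
the function name run_locked are my own invention, not part of Nutch.

```shell
#!/bin/sh
# Hypothetical lock location (an assumption); any path writable by the
# user that runs the crawl would do.
LOCK_DIR="${LOCK_DIR:-/tmp/nutch-update.lock}"

# Run the given command only if no other run holds the lock.
# Returns 1 without running anything when the lock is already taken,
# otherwise returns the exit status of the command.
run_locked() {
    # mkdir either creates the directory or fails, in one step, so two
    # runs started at the same moment cannot both get past this line.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "another update is still running, skipping" >&2
        return 1
    fi

    "$@"
    status=$?

    # Remove the flag so that the next run can start.
    rmdir "$LOCK_DIR"
    return $status
}
```

From cron this could then be called as "run_locked sh update.sh". If the
script is killed in the middle, the flag stays behind and has to be
removed by hand, so it is not a complete solution.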

---
Best regards,
 NGS                            mailto:[EMAIL PROTECTED]


