Hello all Nutch users,
Can somebody show me a script for starting Nutch from cron, please?
I think it is something easy, but I have not understood which
steps are necessary just to "freshen" my current
database... ;-/
I would be grateful for any help!
--
Best regards,
NGS
I run Nutch with a script that runs in a never-ending loop but sleeps for
some time between the steps.
The problem with cron jobs is that you do not know when one task has finished
and the next should start.
Stefan
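
The looping approach described above could look roughly like the sketch below. It is not the script from this thread: the names run_cycle, crawl_forever and PAUSE_SECONDS are made up for illustration, and the steps to run are passed in as plain command strings. Each step has to exit before the next one starts, which is exactly what a set of independent cron entries cannot guarantee.

```shell
#!/bin/bash
# Sketch only: run_cycle, crawl_forever and PAUSE_SECONDS are illustrative
# names, not part of Nutch.

PAUSE_SECONDS=${PAUSE_SECONDS:-300}   # pause between steps, in seconds

# Run each argument as one shell command, in order.
# A step must exit before the next one starts; the cycle stops on the
# first step that fails.
run_cycle() {
    local step
    for step in "$@"; do
        echo "** start: $step **"
        if ! bash -c "$step"; then
            echo "** failed: $step **" >&2
            return 1
        fi
        sleep "$PAUSE_SECONDS"
    done
}

# Repeat the whole cycle until the script is killed.
crawl_forever() {
    while true; do
        run_cycle "$@" || echo "** cycle aborted, retrying after pause **" >&2
        sleep "$PAUSE_SECONDS"
    done
}
```

A call such as crawl_forever 'bin/nutch generate db segments -topN 100' would repeat that one step; the remaining crawl steps would be added as further arguments in the order they must run.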
Quoting NGS <[EMAIL PROTECTED]>:
> Hello all Nutch users,
>
> Can somebody sh
[EMAIL PROTECTED] wrote:
I run Nutch with a script that runs in a never-ending loop but sleeps for
some time between the steps.
The problem with cron jobs is that you do not know when one task has finished
and the next should start.
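
If cron is used anyway, one common way around the overlap problem is a lock that makes a new run skip itself while the previous one is still working. The sketch below uses a lock directory; LOCK_DIR and run_exclusive are hypothetical names chosen for this example, and nothing here comes from Nutch itself.

```shell
#!/bin/bash
# Sketch only: LOCK_DIR and run_exclusive are illustrative names.

LOCK_DIR=${LOCK_DIR:-/tmp/nutch-crawl.lock}

# Run the given command only if no earlier run still holds the lock.
run_exclusive() {
    # mkdir is atomic: of several concurrent callers, only one succeeds.
    if ! mkdir "$LOCK_DIR" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        return 75
    fi
    local status=0
    "$@" || status=$?
    # Release the lock whether the command succeeded or not.
    rmdir "$LOCK_DIR"
    return "$status"
}
```

A cron entry would then call a wrapper script that ends with something like run_exclusive ./crawl-step.sh (a placeholder name). Note that if the job is killed hard, the lock directory stays behind and has to be removed by hand.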
If someone has a few spare cycles he could investigate how to driv
Hello sg,
smsc> I run nutch with a script that runs in a never-ending loop but does sleep for
smsc> some time between the steps.
Is this script big?
Can you show it, please?
I really do not understand what needs to be done in the loop.
I have tried to create something like the one in the tutorial:
Can you show it please?
# number of crawl loops to run; set as high as you wish
LIMIT=100
for ((a=1; a <= LIMIT; a++))
do
  echo "** start new crawl loop $a **"
  #bin/nutch generate db segments -topN 100
  # write the output of each generate run to its own log file
  bin/nutch generate db segments -topN 5 > "gen_$a.log" 2>&1
cat gen_$a.log | mail -s'nutch gener
Hello,
Attached is a tool to prune indexed segments of unwanted content.
Strictly speaking, only the segment indexes are pruned; the segment content is
still there, it just no longer shows up when running queries.
This tool helps you in a situation when you end up with unwanted content
in your segments, which you'd