Gilles Detillieux <[EMAIL PROTECTED]> writes:

[db.words.db_weakcmpr]

> > I used the rundig sample from 3.2.0b4 and there isn't any "move" in
> > it for this database. When I ran the script, everything seemed to
> > work, but the engine didn't find anything during the searches. :((
> > I had some suspicions, so I renamed db.words.db.work_weakcmpr to
> > db.words.db_weakcmpr and... automagically everything really worked :)))
> > Please, could someone modify that script? :))
>
> The script was fixed many months ago to do this (Jan 10 to be exact).
> I suspect you're still running an old copy of the script. A "make
> install" will only copy the new version of rundig if the old version
> isn't around, to avoid clobbering a customized script.

Sorry Gilles, but I used the rundig.sh script that was in the
htdig-3.2.0b4-110401.tar.gz package :((

[db.worddump & db.docs]

> > But on the htdig site there was something about a db.wordlist
> > database (or ASCII file) and nothing about the ones I got :(( BTW, in
> > the updatedig script there was a move command from db.wordlist.old to
> > db.wordlist, but I never found these files.
>
> The db.wordlist file is from the 3.1.x series, not 3.2.x. Your updatedig
> script is probably not updated correctly for 3.2. Have a look at the
> contrib/examples/rundig.sh script in your 3.2.0b4 source snapshot for an
> example of a working update script. (Hmm. I just noticed it's missing
> a command to copy the .work_weakcmpr file, though, so you'd need to
> fix that.)

I already fixed rundig.sh by adding a line that copies the .work_weakcmpr
file to a .db_weakcmpr file. This is my current "copy section":

    cp $BASEDIR/db/db.docs.index.work $BASEDIR/db/db.docs.index
    cp $BASEDIR/db/db.docdb.work $BASEDIR/db/db.docdb
    cp $BASEDIR/db/db.excerpts.work $BASEDIR/db/db.excerpts
    cp $BASEDIR/db/db.words.db.work $BASEDIR/db/db.words.db
    cp $BASEDIR/db/db.words.db.work_weakcmpr $BASEDIR/db/db.words.db_weakcmpr

The only difference from the supplied script is in the first cp command
(the docs.index database).
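
The copy step above could also be written as a loop that reports any missing .work file instead of failing silently. This is only a sketch: the function name publish_work_files and its directory argument are assumptions made for illustration, and the file list simply mirrors the cp commands quoted above.

```shell
#!/bin/sh
# Sketch: publish freshly dug ".work" databases under their live names,
# leaving the .work copies in place for later update digs.
# The function name and its argument are hypothetical; the file names
# come from the cp commands in the message.

publish_work_files() {
    dbdir=$1
    status=0
    # Each entry is "work-name:live-name". Note the weakcmpr file: its
    # ".work" part sits in the middle of the name, not at the end.
    for pair in \
        db.docs.index.work:db.docs.index \
        db.docdb.work:db.docdb \
        db.excerpts.work:db.excerpts \
        db.words.db.work:db.words.db \
        db.words.db.work_weakcmpr:db.words.db_weakcmpr
    do
        src=$dbdir/${pair%%:*}   # text before the colon
        dst=$dbdir/${pair#*:}    # text after the colon
        if [ -f "$src" ]; then
            cp "$src" "$dst" || status=1
        else
            echo "missing: $src" >&2
            status=1
        fi
    done
    return $status
}
```

It would be called as publish_work_files "$BASEDIR/db" after the dig has finished; a non-zero return means at least one database was not published.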

...Anyway... you said that rundig.sh builds the databases from scratch, so
I didn't use rundig.sh to update them. I used updatedig (the one included
in 3.2.0b4). Well, the only differences between the supplied script and
mine are that these lines are missing from the supplied one:

    mv /var/www/htdig/db/db.excerpts /var/www/htdig/db/db.excerpts.old
    mv /var/www/htdig/db/db.excerpts.work /var/www/htdig/db/db.excerpts
    mv /var/www/htdig/db/db.words.db_weakcmpr /var/www/htdig/db/db.words.db_weakcmpr.old
    mv /var/www/htdig/db/db.words.db.work_weakcmpr /var/www/htdig/db/db.words.db_weakcmpr

while it contains these:

    mv /web/webdocs/htdig/db/db.wordlist /web/webdocs/htdig/db/db.wordlist.old
    mv /web/webdocs/htdig/db/db.wordlist.work /web/webdocs/htdig/db/db.wordlist
    mv /web/webdocs/htdig/db/db.words.gdbm /web/webdocs/htdig/db/db.words.gdbm.old
    mv /web/webdocs/htdig/db/db.words.gdbm.work /web/webdocs/htdig/db/db.words.gdbm

which are useless, I suppose.

> > > 3. I modified the updatedig script to have a report of the updating
> > > every time.
> >
> > The first report has a lot of "not changed" and a few "changed"
> > (...and "pushing"). The second and following reports were totally
> > different. They looked like the rundig report... no more changed/not
> > changed... just...
> ...
> > Could someone explain why? Why don't I simply get changed (--> pushing)
> > / not changed in my report?
>
> It seems to me that your updatedig script isn't managing the .work files
> correctly, so htdig ends up reindexing from scratch. htdig -a needs to
> have all the .work files in place in order to do an update dig, so the
> script needs either to leave these copies around, or copy them before
> running htdig -a.

Mmm... yes, maybe something isn't working as it should. You're right, my
rundig.sh builds the .work files and then copies them to the .db files.
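
The second option Gilles mentions, copying the databases back to their .work names before running htdig -a, could look roughly like the sketch below. The function name seed_work_files and its directory argument are assumptions for illustration only, and the file list is taken from the cp and mv commands quoted in this message rather than from any official list.

```shell
#!/bin/sh
# Sketch: make sure every ".work" database exists before an update dig.
# If a .work file is missing but its live counterpart exists, copy the
# live file back to the .work name. Existing .work files are left alone.
# The function name and its argument are hypothetical.

seed_work_files() {
    dbdir=$1
    status=0
    # Each entry is "live-name:work-name".
    for pair in \
        db.docs.index:db.docs.index.work \
        db.docdb:db.docdb.work \
        db.excerpts:db.excerpts.work \
        db.words.db:db.words.db.work \
        db.words.db_weakcmpr:db.words.db.work_weakcmpr
    do
        live=$dbdir/${pair%%:*}
        work=$dbdir/${pair#*:}
        if [ -f "$work" ]; then
            continue                      # already in place
        elif [ -f "$live" ]; then
            cp "$live" "$work" || status=1
        else
            echo "cannot seed $work: $live not found" >&2
            status=1
        fi
    done
    return $status
}
```

An update script would call seed_work_files on the database directory first and run htdig -a only if it returns zero; otherwise a full dig is probably the safer choice.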
updatedig uses the htdig -a command too. The first time it runs, it finds
the .work files generated by htdig -a in the rundig.sh script, but then it
moves them to the .db databases:

    mv /var/www/htdig/db/db.docdb /var/www/htdig/db/db.docdb.old
    mv /var/www/htdig/db/db.docdb.work /var/www/htdig/db/db.docdb

So the next time I launch the updatedig script, htdig (with the -a option)
doesn't find the .work files and rebuilds the databases from scratch...

OK, I'm beginning to understand. Now, if I want to launch rundig.sh once a
month (at 00:00) and run updatedig every day (at 03:00), what changes do I
need to make? Since I want to update the databases every day and rebuild
them from scratch once a month, I suppose I have to change my scripts.
First, I have to add the following line at the top of my rundig script:

    rm $DBDIR/*

Then, in updatedig, I have to change these "move" commands into "copy"
commands:

    mv /var/www/htdig/db/db.docdb /var/www/htdig/db/db.docdb.old
    mv /var/www/htdig/db/db.docdb.work /var/www/htdig/db/db.docdb

so that "htdig -a" can find the .work databases every time. Am I right?

> > > 4. Do I need a purge phase between digging and merging in the
> > > updatedig script?
> >
> > In my Update Report I got a lot of "Not found:
> > http://www.unina.it/universit/....... Ref:
> > http://www.unina.it/universit/....".
> > Do I need to purge all these references?
>
> Yes, again it seems you're running an outdated script. You need the
> htpurge command after htdig. You don't need htmerge unless you're merging
> two databases together. That's all htmerge does now since 3.2.0b3.

:) Indeed, I suspected as much. Anyway, I used the updatedig script
supplied with the 3.2.0b4-110401 snapshot, and these are the commands you
can find in it:

    /web/webdocs/htdig/bin/htdig -a -t $verbose -s
    /web/webdocs/htdig/bin/htmerge -a $verbose -s
    /web/webdocs/htdig/bin/htnotify $verbose

> E.g.
> if you use contrib/examples/rundig.sh, which leaves copies of the
> .work files around for next time, you could add a hook like this before
> calling htdig in that script, to remove the .work files and force a full
> reindexing at the start of the month:
>
> case "`LC_TIME=C date`" in
> Sun\ ???\ \ [1-7]\ *) # remove old database on first Sunday of month
>     rm -f $DBDIR/db.*.work $DBDIR/db.words.db.work_weakcmpr
>     ;;
> esac

OK, so if I add these few lines to my rundig.sh script, I can avoid using
updatedig altogether, can't I?

Thank you very much for your help.

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html

