According to Gabriele Bartolini:
> At 13.44 12/03/2002 +0200, Greg wrote:
> >I am running htdig 3.1.4 and want to re-index the existing URL list
> >within the current database.  The conf file no longer contains the
> >original URL list.  Is there a way to redig the existing URL list
> >without the start_url list and, if possible, place the new db into a
> >separate set of files so that general access is not affected?  If not,
> >how do I extract the existing URL list?  Please be very specific, as I
> >am not at all familiar with htdig command syntax.
> 
> Ciao,
> 
>     Please correct me if I am wrong, guys, but I think that you, Greg,
> should probably switch to version 3.1.6 if you can. It should be almost
> painless. It's just a thought I am having now ... :-)
> 
>     Geoff, Gilles & co, is the 3.1.4 database compatible with version
> 3.1.6?

Yes, 3.1.6 should be able to handle a 3.1.4 database without any
difficulty.  3.1.6 also includes an htdump utility to extract the
whole document database as an ASCII file.  You could probably fairly
easily extract the list of URLs from the db.docs file produced by
htdump, using an awk/sed/perl script.  See http://www.htdig.org/ for
all ht://Dig documentation, including syntax for individual commands.
See the documentation for awk, sed or Perl for information about how
you could use one of these to strip out the URLs from db.docs.

Something like "sed -n 's/^.*<TAB>u:\([^<TAB>]*\)<TAB>.*/\1/p' db.docs"
would probably do it, where each <TAB> in the s/// command stands for a
literal tab character.
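
Because literal tabs are easy to lose when a command is copied from an
email, the same extraction could also be sketched with awk, which can
name the tab separator explicitly.  This is only a minimal sketch: it
assumes, as the sed command above does, that db.docs has one record per
line, that fields are separated by tabs, and that the URL sits in a field
starting with "u:".  The function name extract_urls is hypothetical and
not part of ht://Dig.

```shell
# Hypothetical helper (not an ht://Dig command): print the URL of each
# record in an htdump-style file.
# Assumptions: one record per line, tab-separated fields, and the URL
# stored in a field that begins with "u:".
extract_urls() {
    awk -F'\t' '{
        # Scan the fields of this record for the first one tagged "u:".
        for (i = 1; i <= NF; i++) {
            if (substr($i, 1, 2) == "u:") {
                # Print the field without its two-character "u:" tag.
                print substr($i, 3)
                break
            }
        }
    }' "$@"
}
```

Running "extract_urls db.docs > url_list.txt" would then write one URL
per line to url_list.txt (a file name chosen here only for illustration).
With no file argument, the function reads from standard input.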

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
