Retrieving a document DB

You should now have a compiled version of ht://Dig, so we are ready to retrieve documents for our database.

First, create another directory named <BaseDir>/db.

Then create a ht://Dig configuration file with the following contents:

database_base: <BaseDir>/db/test
start_url: http://www.cs.aue.auc.dk/~vok
template_name: builtin-long
template_map: builtin-long

Save the file as <BaseDir>/db/htdig.conf.
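
The directory and configuration file can also be created from the shell. The following is only a minimal sketch, assuming a POSIX shell; the variable BASE_DIR is a hypothetical stand-in for <BaseDir> and is not part of ht://Dig. It falls back to the current directory if unset.

```shell
# BASE_DIR is a hypothetical stand-in for <BaseDir>; it defaults to the
# current directory when not set by the caller.
BASE_DIR="${BASE_DIR:-$PWD}"

# Create the database directory (no error if it already exists).
mkdir -p "$BASE_DIR/db"

# Write the configuration file with the four attributes listed above.
# The heredoc delimiter is unquoted so that $BASE_DIR is expanded.
cat > "$BASE_DIR/db/htdig.conf" <<EOF
database_base: $BASE_DIR/db/test
start_url: http://www.cs.aue.auc.dk/~vok
template_name: builtin-long
template_map: builtin-long
EOF
```

Note that this overwrites any existing htdig.conf in that directory.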

If you are using a proxy, the following line should also be added to the file:

http_proxy: http://proxy_name:1234

Replace proxy_name with the host name of your proxy and 1234 with the proxy port number. See http://www.htdig.org/attrs.html#http_proxy for more information about proxies and ht://Dig.
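
If the configuration file is maintained from a script, the proxy line can be appended in the same way. This is a sketch under assumptions: CONF and PROXY_URL are hypothetical shell variables introduced here for illustration, and the default proxy address is the placeholder from the text, not a real host.

```shell
# CONF is the path of the configuration file; PROXY_URL holds the proxy
# address. Both are hypothetical names, and the default proxy value is
# only the placeholder used in the text above.
CONF="${CONF:-htdig.conf}"
PROXY_URL="${PROXY_URL:-http://proxy_name:1234}"

# Append the http_proxy attribute unless the file already defines one,
# so that running this twice does not duplicate the line.
if ! grep -q '^http_proxy:' "$CONF" 2>/dev/null; then
    printf 'http_proxy: %s\n' "$PROXY_URL" >> "$CONF"
fi
```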

To retrieve the documents execute the following commands:

$ PATH=$PATH:<BaseDir>/htInstall/bin/
$ export LD_LIBRARY_PATH=<BaseDir>/htInstall/lib/htdig
$ cd <BaseDir>/db
$ rundig -v -c htdig.conf
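
Note that the export above replaces any existing value of LD_LIBRARY_PATH. If other programs in the same shell depend on it, a variant that keeps the old value may be preferable. The sketch below assumes a POSIX shell and the install layout used above; BASE_DIR, HT_BIN and HT_LIB are hypothetical variable names. It only checks whether rundig can be found and does not run it.

```shell
# BASE_DIR is a hypothetical stand-in for <BaseDir>.
BASE_DIR="${BASE_DIR:-$PWD}"
HT_BIN="$BASE_DIR/htInstall/bin"
HT_LIB="$BASE_DIR/htInstall/lib/htdig"

# Append the ht://Dig binaries to the command search path.
PATH="$PATH:$HT_BIN"
export PATH

# Prepend the ht://Dig library directory, keeping any existing value.
# The ${VAR:+...} form adds the separator only if VAR is already set.
LD_LIBRARY_PATH="$HT_LIB${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH

# Report whether rundig is now reachable, without executing it.
if command -v rundig >/dev/null 2>&1; then
    echo "rundig found: $(command -v rundig)"
else
    echo "rundig not found; check BASE_DIR" >&2
fi
```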

We can now test the database by running htsearch:

$ ../htdig/htsearch/htsearch -c htdig.conf
> Enter value for words:
$ computer
> Content-type: text/html
> 
> Enter value for format:
$ <return>

All documents that contain the word 'computer' are now located, and an HTML file containing links to those documents is generated and printed on standard output. If no document contains the word 'computer', the HTML file printed on standard output instead states that no documents were found.



Mads Lindstrøm