On Mon, 19 Feb 2001, Geoff Hutchison wrote:
> Date: Mon, 19 Feb 2001 18:25:23 -0500 (EST)
> From: Geoff Hutchison <[EMAIL PROTECTED]>
> To: Gilles Detillieux <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED],
> "ht://Dig mailing list" <[EMAIL PROTECTED]>
> Subject: Re: [htdig] 3.2.0b3 on BSDI, light at the end of tunnel;)
>
> On Mon, 19 Feb 2001, Gilles Detillieux wrote:
>
> > But what did lines 62 and 63 look like before? It's perfectly valid to
> > start a line with "#", as long as the previous line wasn't an incomplete
> > definition ending with a "\" at the end of the line. This is what we
> > need to know. Was the code choking on valid syntax or not???
>
> Or more to the point, can we get a copy of the config file that was
> causing problems? It's one thing if you can index and quite another if we
> have a bug that you exposed that needs to get fixed.
I just checked the original htdig.conf from the source tree. It does not
have the problem. When I first tried a 3.2.0bx I copied the conf file
from the source tree to the conf folder; I then appended my 3.1.5 conf
file to it. Then I commented out the duplicates, without meticulously placing
"#"s at the start of every line ;(
Anyway, my bad ;(( I have attached the corrected conf file just in case.
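For anyone who wants to catch this kind of slip before running htdig, here is a rough Python sketch of a checker. It is not part of ht://Dig and does not reproduce its real parser; the function name and the two heuristics are my own assumptions. It flags a "#" line that lands inside a definition continued with "\", and any uncommented line that does not look like "name: value".

```python
import re

# Heuristic only: an attribute line is assumed to look like "name: value".
ATTR = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*\s*:")


def find_suspect_lines(text):
    """Return (line_number, reason) pairs for lines that look mis-commented."""
    problems = []
    continuing = False  # True when the previous definition line ended with "\"
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if continuing:
            # A comment here interrupts an unfinished definition.
            if line.startswith("#"):
                problems.append((number, "comment inside a continued definition"))
            continuing = line.endswith("\\")
            continue
        if not line or line.startswith("#"):
            # Blank lines and ordinary comments are fine outside a definition.
            continue
        if not ATTR.match(line):
            # Probably a continuation line whose first line was commented out.
            problems.append((number, "stray text outside any definition"))
        continuing = line.endswith("\\")
    return problems
```

Calling find_suspect_lines(open("htdig.conf").read()) returns a list of line numbers to look at by hand; an empty list only means these two heuristics found nothing.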
Regards,
Joe
--
_/ _/_/_/ _/ ____________ __o
_/ _/ _/ _/ ______________ _-\<,_
_/ _/ _/_/_/ _/ _/ ......(_)/ (_)
_/_/ oe _/ _/. _/_/ ah [EMAIL PROTECTED]
#
# Example config file for ht://Dig.
#
# This configuration file is used by all the programs that make up ht://Dig.
# Please refer to the attribute reference manual for more details on what
# can be put into this file. (http://www.htdig.org/confindex.html)
# Note that most attributes have very reasonable default values so you
# really only have to add attributes here if you want to change the defaults.
#
# What follows are some of the common attributes you might want to change.
#
#
# Specify where the database files need to go. Make sure that there is
# plenty of free disk space available for the databases. They can get
# pretty big.
#
# database_dir: @DATABASE_DIR@
#
# This specifies the URL where the robot (htdig) will start. You can specify
# multiple URLs here. Just separate them by some whitespace.
# The example here will cause the ht://Dig homepage and related pages to be
# indexed.
# You could also index all the URLs in a file like so:
# start_url: `${common_dir}/start.url`
#
# start_url: http://www.htdig.org/
#
# This attribute limits the scope of the indexing process. The default is to
# set it to the same as the start_url above. This way only pages that are on
# the sites specified in the start_url attribute will be indexed and it will
# reject any URLs that go outside of those sites.
#
# Keep in mind that the value for this attribute is just a list of string
# patterns. As long as URLs contain at least one of the patterns it will be
# seen as part of the scope of the index.
#
# limit_urls_to: ${start_url}
#
# If there are particular pages that you definitely do NOT want to index, you
# can use the exclude_urls attribute. The value is a list of string patterns.
# If a URL matches any of the patterns, it will NOT be indexed. This is
# useful to exclude things like virtual web trees or database accesses. By
# default, all CGI URLs will be excluded. (Note that the /cgi-bin/ convention
# may not work on your web server. Check the path prefix used on your web
# server.)
#
# exclude_urls: /cgi-bin/ .cgi
#
# Since ht://Dig does not (and cannot) parse every document type, this
# attribute is a list of strings (extensions) that will be ignored during
# indexing. These are *only* checked at the end of a URL, whereas
# exclude_url patterns are matched anywhere.
#
# Also keep in mind that while other attributes allow regex, these must be
# actual strings.
#
# bad_extensions: .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif \
# .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi
#
# The string htdig will send in every request to identify the robot. Change
# this to your email address.
#
# maintainer: [EMAIL PROTECTED]
#
# The excerpts that are displayed in long results rely on stored information
# in the index databases. The compiled default only stores 512 characters of
# text from each document (this excludes any HTML markup...) If you plan on
# using the excerpts you probably want to make this larger. The only concern
# here is that more disk space is going to be needed to store the additional
# information. Since disk space is cheap (! :-)) you might want to set this
# to a value so that a large percentage of the documents that you are going
# to be indexing are stored completely in the database. At SDSU we found
# that by setting this value to about 50k the index would get 97% of all
# documents completely and only 3% was cut off at 50k. You probably want to
# experiment with this value.
# Note that if you want to set this value low, you probably want to set the
# excerpt_show_top attribute to false so that the top excerpt_length characters
# of the document are always shown.
#
# max_head_length: 10000
#
# To limit network connections, ht://Dig will only pull up to a certain limit
# of bytes. This prevents the indexing from dying because the server keeps
# sending information. However, several FAQs happen because people have files
# bigger than the default limit of 100KB. This sets the default a bit higher.
# (see <http://www.htdig.org/FAQ.html> for more)
#
# max_doc_size: 200000
#
# Most people expect some sort of excerpt in results. By default, if the
# search words aren't found in context in the stored excerpt, htsearch shows
# the text defined in the no_excerpt_text attribute:
# (None of the search words were found in the top of this document.)
# This attribute instead will show the top of the excerpt.
#
# no_excerpt_show_top: true
#
# Depending on your needs, you might want to enable some of the fuzzy search
# algorithms. There are several to choose from and you can use them in any
# combination you feel comfortable with. Each algorithm will get a weight
# assigned to it so that in combinations of algorithms, certain algorithms get
# preference over others. Note that the weights only affect the ranking of
# the results, not the actual searching.
# The available algorithms are:
# accents
# exact
# endings
# metaphone
# prefix
# regex
# soundex
# speling [sic]
# substring
# synonyms
# By default only the "exact" algorithm is used with weight 1.
# Note that if you are going to use the endings, metaphone, soundex, accents,
# or synonyms algorithms, you will need to run htfuzzy to generate
# the databases they use.
#
# search_algorithm: exact:1 synonyms:0.5 endings:0.1
#
# The following are the templates used in the builtin search results
# The default is to use compiled versions of these files, which produces
# slightly faster results. However, uncommenting these lines makes it
# very easy to change the format of search results.
# See <http://www.htdig.org/hts_templates.html> for more details.
#
# template_map: Long long ${common_dir}/long.html \
# Short short ${common_dir}/short.html
# template_name: long
#
# The following are used to change the text for the page index.
# The defaults are just boring text numbers. These images spice
# up the result pages quite a bit. (Feel free to do whatever, though)
#
# next_page_text: <img src="/Search/Images/buttonr.gif" border="0" align="middle" width="30" height="30" alt="next">
# no_next_page_text:
# prev_page_text: <img src="/Search/Images/buttonl.gif" border="0" align="middle" width="30" height="30" alt="prev">
# no_prev_page_text:
# page_number_text: '<img src="/Search/Images/button1.gif" border="0" align="middle" width="30" height="30" alt="1">' \
# '<img src="/Search/Images/button2.gif" border="0" align="middle" width="30" height="30" alt="2">' \
# '<img src="/Search/Images/button3.gif" border="0" align="middle" width="30" height="30" alt="3">' \
# '<img src="/Search/Images/button4.gif" border="0" align="middle" width="30" height="30" alt="4">' \
# '<img src="/Search/Images/button5.gif" border="0" align="middle" width="30" height="30" alt="5">' \
# '<img src="/Search/Images/button6.gif" border="0" align="middle" width="30" height="30" alt="6">' \
# '<img src="/Search/Images/button7.gif" border="0" align="middle" width="30" height="30" alt="7">' \
# '<img src="/Search/Images/button8.gif" border="0" align="middle" width="30" height="30" alt="8">' \
# '<img src="/Search/Images/button9.gif" border="0" align="middle" width="30" height="30" alt="9">' \
# '<img src="/Search/Images/button10.gif" border="0" align="middle" width="30" height="30" alt="10">'
#
# To make the current page stand out, we will put a border around the
# image for that page.
#
# no_page_number_text: '<img src="/Search/Images/button1.gif" border="2" align="middle" width="30" height="30" alt="1">' \
# '<img src="/Search/Images/button2.gif" border="2" align="middle" width="30" height="30" alt="2">' \
# '<img src="/Search/Images/button3.gif" border="2" align="middle" width="30" height="30" alt="3">' \
# '<img src="/Search/Images/button4.gif" border="2" align="middle" width="30" height="30" alt="4">' \
# '<img src="/Search/Images/button5.gif" border="2" align="middle" width="30" height="30" alt="5">' \
# '<img src="/Search/Images/button6.gif" border="2" align="middle" width="30" height="30" alt="6">' \
# '<img src="/Search/Images/button7.gif" border="2" align="middle" width="30" height="30" alt="7">' \
# '<img src="/Search/Images/button8.gif" border="2" align="middle" width="30" height="30" alt="8">' \
# '<img src="/Search/Images/button9.gif" border="2" align="middle" width="30" height="30" alt="9">' \
# '<img src="/Search/Images/button10.gif" border="2" align="middle" width="30" height="30" alt="10">'
# local variables:
# mode: text
# eval: (if (eq window-system 'x) (progn (setq font-lock-keywords (list '("^#.*" . font-lock-keyword-face) '("^[a-zA-Z][^ :]+" . font-lock-function-name-face) '("[+$]*:" . font-lock-comment-face) )) (font-lock-mode)))
# end:
#
# Example config file for ht://Dig.
# Last modified 2-Sep-1996 by Andrew Scherpbier
#
# This configuration file is used by all the programs that make up ht://Dig.
# Please refer to the attribute reference manual for more details on what
# can be put into this file. (http://htdig.sdsu.edu/configfile.html)
# Note that most attributes have very reasonable default values so you
# really only have to add attributes here if you want to change the defaults.
#
# What follows are some of the common attributes you might want to change.
#
#
# Specify where the database files need to go. Make sure that there is
# plenty of free disk space available for the databases. They can get
# pretty big.
#
database_dir: /Search/db
#
# This specifies the URL where the robot (htdig) will start. You can specify
# multiple URLs here. Just separate them by some whitespace.
# The example here will cause the ht://Dig homepage and related pages to be
# indexed.
#
start_url: http://www.ccsf.cc.ca.us/
#
# This attribute limits the scope of the indexing process. The default is to
# set it to the same as the start_url above. This way only pages that are on
# the sites specified in the start_url attribute will be indexed and it will
# reject any URLs that go outside of those sites.
#
# Keep in mind that the value for this attribute is just a list of string
# patterns. As long as URLs contain at least one of the patterns it will be
# seen as part of the scope of the index.
#
limit_urls_to: ${start_url}
#
# Access certain URLs on the local filesystem.
# For example, local_urls: http://www.foo.com/=/usr/www/htdocs/
#
local_urls: http://www.ccsf.cc.ca.us/Associated_Students/=/Organizations/Associated_Students/\
http://www.ccsf.cc.ca.us/Campuses/=/dptweb/Campuses/\
http://www.ccsf.cc.ca.us/Catalog/=/dptweb/Catalog/\
http://www.ccsf.cc.ca.us/Channel_52/=/Departments/Channel_52/\
http://www.ccsf.cc.ca.us/Continuing_Education/=/Services/Continuing_Education/
#
# If there are particular pages that you definitely do NOT want to index, you
# can use the exclude_urls attribute. The value is a list of string patterns.
# If a URL matches any of the patterns, it will NOT be indexed. This is
# useful to exclude things like virtual web trees or database accesses. By
# default, all CGI URLs will be excluded. (Note that the /cgi-bin/ convention
# may not work on your web server. Check the path prefix used on your web
# server.)
#
exclude_urls: /cgi-bin/ /title3-cgi/ /Guardsman/ .shtml/ ?
#
# Max keywords,
# max_keywords: "-1"
#
max_keywords: 11
#
# This is a weight of "how important" a page is, based on the number of URLs
# pointing to it. It's actually multiplied by the ratio of the incoming URLs
# (backlinks) and outgoing URLs, to balance out pages with lots of links to
# pages that link back to them. This factor can be changed without changing
# the database in any way. The default may be a bit high.
#
backlink_factor: 0
#
# Since ht://Dig does not (and cannot) parse every document type, this
# attribute is a list of strings (extensions) that will be ignored during
# indexing. These are *only* checked at the end of a URL, whereas
# exclude_url patterns are matched anywhere.
#
bad_extensions: .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif \
.jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi
#
# This factor, like backlink_factor, can be changed without modifying the
# database. It gives higher rankings to newer documents and lower rankings to
# older documents. Before setting this factor, it's advised to make sure your
# servers are returning accurate dates (check the dates returned in the long
# format).
#
date_factor: 0
# Plain old "descriptions" are the text of a link pointing to a document. This
# factor gives weight to the words of these descriptions of the document. Not
# surprisingly, these can be pretty accurate summaries of a document's
# content. See also title_factor or text_factor. Changing this factor will
# require updating your database.
#description_factor: "150"
#
description_factor: 1
# This is a factor which will be used to multiply the weight of words between
# <h1> and </h1> tags. It is used to assign the level of importance to certain
# headers. Setting a factor to 0 will cause words in this heading to be ignored.
# The number may be a floating point number. See also the title_factor and
# text_factor attributes.
heading_factor_1: 5
heading_factor_2: 4
heading_factor_3: 3
heading_factor_4: 0
heading_factor_5: 0
heading_factor_6: 0
# This is a factor which will be used to multiply the weight of words in the
# list of keywords of a document. The number may be a floating point number.
# See also the title_factor and text_factor attributes.
keywords_factor: 100
#
# This is a factor which will be used to multiply the weight of words in any
# META description tags in a document. The number may be a floating point
# number. See also the title_factor and text_factor attributes.
#meta_description_factor: "50"
#
meta_description_factor: 20
# This is a factor which will be used to multiply the weight of words that are not
# in any special part of a document. Setting a factor to 0 will cause normal
# words to be ignored. The number may be a floating point number. See also
# the heading_factor_[1-6], title_factor, and keywords_factor attributes.
text_factor: 1
# This is a factor which will be used to multiply the weight of words in the title
# of a document. Setting a factor to 0 will cause words in the title to be
# ignored. The number may be a floating point number. See also the
# heading_factor_[1-6] attribute.
title_factor: 100
#
# Depending on your needs, you might want to enable some of the fuzzy search
# algorithms. There are several to choose from and you can use them in any
# combination you feel comfortable with. Each algorithm will get a weight
# assigned to it so that in combinations of algorithms, certain algorithms get
# preference over others. Note that the weights only affect the ranking of
# the results, not the actual searching.
# The available algorithms are:
# exact
# endings
# synonyms
# soundex
# metaphone
# By default only the "exact" algorithm is used with weight 1.
# Note that if you are going to use any of the algorithms other than "exact",
# you need to use the htfuzzy program to generate the databases that each
# algorithm requires.
#
#search_algorithm: "exact:1"
#search_algorithm: exact:1 synonyms:0.5 endings:0.1
search_algorithm: exact:1 synonyms:.2 prefix:0.005 #endings:0.1
#
# The following are the templates used in the builtin search results
# The default is to use compiled versions of these files, which produces
# slightly faster results. However, uncommenting these lines makes it
# very easy to change the format of search results.
# See <http://www.htdig.org/hts_templates.html> for more details.
#
#template_map: "Long builtin-long builtin-long Short builtin-short builtin-short"
#template_name: "builtin-long"
# template_map: Long long ${common_dir}/long.html \
# Short short ${common_dir}/short.html
# template_name: long
#
# The following are used to change the text for the page index.
# The defaults are just boring text numbers. These images spice
# up the result pages quite a bit. (Feel free to do whatever, though)
#
next_page_text: <img src=/Pub/Search/Graphics/buttonr.gif border=0 align=middle width=30 height=30 alt=next>
no_next_page_text:
prev_page_text: <img src=/Pub/Search/Graphics/buttonl.gif border=0 align=middle width=30 height=30 alt=prev>
no_prev_page_text:
page_number_text: "<img src=/Pub/Search/Graphics/button1.gif border=0 align=middle width=30 height=30 alt=1>" \
"<img src=/Pub/Search/Graphics/button2.gif border=0 align=middle width=30 height=30 alt=2>" \
"<img src=/Pub/Search/Graphics/button3.gif border=0 align=middle width=30 height=30 alt=3>" \
"<img src=/Pub/Search/Graphics/button4.gif border=0 align=middle width=30 height=30 alt=4>" \
"<img src=/Pub/Search/Graphics/button5.gif border=0 align=middle width=30 height=30 alt=5>" \
"<img src=/Pub/Search/Graphics/button6.gif border=0 align=middle width=30 height=30 alt=6>" \
"<img src=/Pub/Search/Graphics/button7.gif border=0 align=middle width=30 height=30 alt=7>" \
"<img src=/Pub/Search/Graphics/button8.gif border=0 align=middle width=30 height=30 alt=8>" \
"<img src=/Pub/Search/Graphics/button9.gif border=0 align=middle width=30 height=30 alt=9>" \
"<img src=/Pub/Search/Graphics/button10.gif border=0 align=middle width=30 height=30 alt=10>"
#
# To make the current page stand out, we will put a border around the
# image for that page.
#
no_page_number_text: "<img src=/Pub/Search/Graphics/button1.gif border=2 align=middle width=30 height=30 alt=1>" \
"<img src=/Pub/Search/Graphics/button2.gif border=2 align=middle width=30 height=30 alt=2>" \
"<img src=/Pub/Search/Graphics/button3.gif border=2 align=middle width=30 height=30 alt=3>" \
"<img src=/Pub/Search/Graphics/button4.gif border=2 align=middle width=30 height=30 alt=4>" \
"<img src=/Pub/Search/Graphics/button5.gif border=2 align=middle width=30 height=30 alt=5>" \
"<img src=/Pub/Search/Graphics/button6.gif border=2 align=middle width=30 height=30 alt=6>" \
"<img src=/Pub/Search/Graphics/button7.gif border=2 align=middle width=30 height=30 alt=7>" \
"<img src=/Pub/Search/Graphics/button8.gif border=2 align=middle width=30 height=30 alt=8>" \
"<img src=/Pub/Search/Graphics/button9.gif border=2 align=middle width=30 height=30 alt=9>" \
"<img src=/Pub/Search/Graphics/button10.gif border=2 align=middle width=30 height=30 alt=10>"
#
# If set to true, numbers are considered words. This means that searches
# can be done on number as well as regular words. All the same rules
# apply to numbers as to words. See the description of valid_punctuation
# for the rules used to determine what a word is.
#
allow_numbers: true
#
#
#
#allow_virtual_hosts: false
# This attribute is used to specify a list of content-type/parsers that are to
# be used to parse documents that cannot be parsed by any of the internal
# parsers. The list of external parsers is examined before the builtin parsers
# are checked, so this can be used to override the internal behavior without
# recompiling htdig.
# The external parsers are specified as pairs of strings. The first string of
# each pair is the content-type that the parser can handle while the second
# string of each pair is the path to the external parsing program. The parsing
# program will get the document to be parsed on its standard input and it is to
# write information for htdig on its standard output.
# example:
# external_parsers: text/html /usr/local/bin/htmlparser application/ms-word /usr/local/bin/mswordparse
external_parsers: application/msword->text/html /usr/local/bin/conv_doc.pl \
application/postscript->text/html /usr/local/bin/conv_doc.pl \
application/pdf->text/html /usr/local/bin/conv_doc.pl
#externalI_parsers: application/msword /usr/local/bin/parse_doc.pl \
# application/postscript /usr/local/bin/parse_doc.pl \
# application/pdf /usr/local/bin/parse_doc.pl
#
# This specifies the email address that htnotify email messages get sent out
# from. The address is forged using /usr/lib/sendmail. Check
# htnotify/htnotify.cc for detail on how this is done.
htnotify_sender: [EMAIL PROTECTED]
#
# This sets whether htsearch should use the syslog() to log search requests.
# If set, this will log requests with a default level of LOG_INFO and a
# facility of LOG_LOCAL5. For details on redirecting the log into a separate
# file or other actions, see the syslog.conf(5) man page. To set the level and
# facility used in logging, change LOG_LEVEL and LOG_FACILITY in the
# include/htconfig.h file before compiling.
# Log file path /Search/conf/log
#
logging: true
#
# The words in this list are used to search for keywords in HTML META tags.
# This list can contain any number of strings that each will be seen as the
# name for whatever keyword convention is used.
# The META tags have the following format:
# <META name="somename" value="somevalue">
keywords_meta_tag_names: keywords htdig-keywords
#
# The string htdig will send in every request to identify the robot. Change
# this to your email address.
#
maintainer: [EMAIL PROTECTED]
#
# If this is set to a relatively small number, the matches will be shown in
# pages instead of all at once.
#
matches_per_page: 10
#
# While gathering descriptions of URLs, htdig will only record those
# descriptions which are shorter than this length. This is used mostly to deal
# with broken HTML. (If a hyperlink is not terminated with a </a> the
# description will go on until the end of the document.)
#
max_description_length: 60
#
# The excerpts that are displayed in long results rely on stored information
# in the index databases. The compiled default only stores 512 characters of
# text from each document (this excludes any HTML markup...) If you plan on
# using the excerpts you probably want to make this larger. The only concern
# here is that more disk space is going to be needed to store the additional
# information. Since disk space is cheap (! :-)) you might want to set this
# to a value so that a large percentage of the documents that you are going
# to be indexing are stored completely in the database. At SDSU we found
# that by setting this value to about 50k the index would get 97% of all
# documents completely and only 3% was cut off at 50k. You probably want to
# experiment with this value.
# Note that if you want to set this value low, you probably want to set the
# excerpt_show_top attribute to false so that the top excerpt_length characters
# of the document are always shown.
#max_head_length: "512"
#
max_head_length: 500000
# Instead of limiting the indexing process by URL pattern, it can also be limited
# by the number of hops or clicks a document is removed from the starting
# URL. Unfortunately, this only works reliably when a complete index is
# created, not an update.
# The starting page will have hop count 0.
max_hop_count: 999999
# the maximum number of extensions (if you set it to 2 it will only fetch
# abc, abca, abcb)
#
#max_prefix_matches: 1000
# if it's left blank, it will always try to expand the search words
#prefix_match_character: "*"
#
# When stars are used to display the score of a match, this value determines
# the maximum number of stars that can be displayed.
max_stars: 5
#
# This sets the minimum length of words that will be indexed. Words
# shorter than this value will be silently ignored but still put into the
# excerpt.
# Note that by making this value less than 3, a lot more words that are
# very frequent will be indexed. It might be advisable to add some of these
# to the bad_words list.
#
minimum_word_length: 3
#
# If no excerpt is available, this option will act the same as
# excerpt_show_top, that is, it will show the top of the document.
#
no_excerpt_show_top: true
excerpt_show_top: no
# The following line is the default used by PDF.cc if there is no pdf_converter
# in the config file
# pdf_converter: acroread -toPostScript -pairs %src %dest
# Using acroread that is not in the PATH
# pdf_converter: /usr/local/bin/acroread -toPostScript -pairs %src %dest
# Using pdftops that comes in the xpdf package
# pdf_converter: /usr/local/bin/pdftops %src %dest
pdf_parser: /usr/contrib/bin/pdftops
max_doc_size: 1650000
# If TRUE, htmerge will remove any URLs which were marked as unreachable
# by htdig from the database. If FALSE, it will not do this. When htdig is run in
# initial mode, documents which were referred to but could not be accessed
# should probably be removed, and hence this option should then be set to
# TRUE, however, if htdig is run to update the database, this may cause
# documents on a server which is temporarily unavailable to be removed. This
# is probably NOT what was intended, so hence this option should be set to
# FALSE in that case.
remove_bad_urls: true
remove_default_doc: index.shtml index.html index.htm homepage.html homepage.htm home.html home.htm
#
# This directive tells the indexer that servers have several DNS aliases, which
# all point to the same machine and are NOT virtual hosts. This allows you to
# ensure pages are indexed only once on a given machine, despite the alias
# used in a URL.
server_aliases: cloud.ccsf.cc.ca.us:80=www.ccsf.cc.ca.us:80=cloud.ccsf.org:80=www.ccsf.org:80
#
# If set to true, any META description tags will be used as excerpts by
# htsearch. Any documents that do not have META descriptions will retain their
# normal excerpts.
use_meta_description: true
#compression_level: 0
#excerpt_length: 300
#These characters are considered part of a word. In contrast to the characters
#in the valid_punctuation attribute, they are treated just like letter
#characters. Note that the locale attribute is normally used to configure which
#characters constitute letter characters. example: extra_word_characters: _
extra_word_characters: _
local_default_doc: index.shtml index.html index.htm homepage.html homepage.htm home.html home.htm
#local_urls_only: false
#Set this to access user directory URLs through the local filesystem. If you
#leave the "path" portion out, it will look up the user's home directory in
#/etc/passwd (or NIS or whatever). As with local_urls, if the files are not
#found, ht://Dig will try with HTTP. Again, note the example's format. To map
#http://www.my.org/~joe/foo/bar.html to /home/joe/www/foo/bar.html, try the
#example below. The fallback to HTTP can be disabled by setting the
#local_urls_only attribute to true. As of 3.1.5, you can provide multiple
#mappings of a given URL to different directories, and htdig will use the first
#mapping that works. Special characters can be embedded in these names using
#%xx hex encoding. For example, you can use %3D to embed an "=" sign in a URL
#pattern.
#example: local_user_urls: http://www.my.org/=/home/,/www/
local_user_urls: http://www.ccsf.cc.ca.us/=/www/,/dptweb/
#max_descriptions: 5
#max_meta_description_length: 512
#minimum_prefix_length: 1
#valid_punctuation: ".-_/!#$%^&'"
_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
Information: http://lists.sourceforge.net/lists/listinfo/htdig-general
FAQ: http://htdig.sourceforge.net/FAQ.html