Author: brad
Email: [EMAIL PROTECTED]
Message:
I have indexer working fine on RedHat 6.2, but when I try the same setup on Mandrake
7.1, indexer populates all the tables except keywords and description. My indexer.conf
follows.

###########################################################################
# This is a sample indexer config file.
# To start using it please edit and rename to indexer.conf
# You may want to keep the original indexer.conf-dist for future references.
# Use '#' to comment out lines.
# All command names are case insensitive (DBAddr=DBADDR=dbaddr).
# You may use the '\' character to continue the current command on the next line
# when required.
#
# You may include another configuration file at any place in indexer.conf
# using the "Include <filename>" command.
# Absolute path if <filename> starts with "/":
#Include /usr/local/mnogosearch/etc/inc1.conf
# Relative path else:
#Include inc1.conf
###########################################################################



###########################################################################
#  Section 1.
#  Global parameters.


###########################################################################
# DBAddr <URL-style database description>
# Options (type, host, database name, port, user and password) 
# to connect to SQL database.
# Does not matter for built-in text files support.
# Should be used only once and before any other commands.
# The command has global effect for the whole config file.
# Format:
#DBAddr <DBType>:[//[DBUser[:DBPass]@]DBHost[:DBPort]]/DBName/
#
# ODBC notes:
#       Use DBName to specify ODBC data source name (DSN)
#       DBHost does not matter, use "localhost".
# Solid notes:
#       Use DBHost to specify Solid server
#       DBName does not matter for Solid
#
# Currently supported DBType values are 
# mysql, pgsql, msql, solid, mssql, oracle, ibase.
# Actually, it does not matter for native libraries support.
# But ODBC users should specify one of supported values.
# If your database type is not supported, you may use "unknown" instead.
# If you are using PostgreSQL and do not specify hostname,
#       e.g. pgsql://user:password@/dbname/
# then PostgreSQL will not work via TCP, but will use Unix socket.

DBAddr          mysql://xxx:xxxxx@localhost/udmsearch/


#######################################################################
# DBMode single/multi/crc/crc-multi
# Does not matter for built-in text files support
# You may select the SQL database mode of word storage.
# When "single" is specified, all words are stored in the same
# table. If "multi" is selected, words are stored in different
# tables depending on their lengths. "multi" mode is usually faster
# but requires more tables in the database.
#
# If "crc" mode is selected, mnoGoSearch will store 32-bit integer
# word IDs calculated by the CRC32 algorithm instead of words. This
# mode requires less disk space and is faster than the "single"
# and "multi" modes. "crc-multi" uses the same storage structure as
# the "crc" mode, but also stores words in different tables depending on
# word length, like "multi" mode.
#
#Default DBMode value is "single":
DBMode single


#######################################################################
#SyslogFacility <facility>
# This is used if indexer was compiled with syslog support and if you
# don't like the default value. Argument is the same as used in syslog.conf
# file. For list of possible facilities see syslog.conf(5)
#SyslogFacility local7


#######################################################################
#LogdAddr host[:port]
# Use cachelogd at given host and port if specified.
# It is required for "cache mode" only. Default values are localhost 
# and port 7000
#LogdAddr localhost:7000


#######################################################################
# LocalCharset <charset>
# Defines charset of local file system. It is required if you are using 
# 8 bit charsets and does not matter for 7 bit charsets.
# This command should be used once and takes global effect for the config file.
# Choose currently supported one:
#
# Western Europe: Germany
#LocalCharset iso-8859-1
#
# Central Europe: Czech
#LocalCharset iso-8859-2
#
# ISO Cyrillic
#LocalCharset iso-8859-5
#
# Unix Cyrillic
#LocalCharset koi8-r
#
# MS Central Europe: Czech
#LocalCharset windows-1250
#
# MS DOS Cyrillic
#LocalCharset cp866
#
# MS Cyrillic
#LocalCharset windows-1251
#
# MS Arabic
#LocalCharset windows-1256
#
# Mac Cyrillic
#LocalCharset x-mac-cyrillic
#
# ISO Greek
#LocalCharset iso-8859-7
#
# MS Greek
#LocalCharset windows-1253
#
# ISO Hebrew
#LocalCharset iso-8859-8
#
# MS Hebrew
#LocalCharset windows-1255
#
# ISO Baltic
#LocalCharset iso-8859-4
#LocalCharset iso-8859-13
#
# MS Baltic
#LocalCharset windows-1257
#
# ISO Turkish
#LocalCharset iso-8859-9
#
# MS Turkish
#LocalCharset windows-1254


#######################################################################
#ForceIISCharset1251 yes/no
#This option is useful for users who deal with Cyrillic content and broken
#(or misconfigured?) Microsoft IIS web servers, which tend not to report
#the charset correctly. This is a really dirty hack, but if this option is turned on
#it is assumed that all servers which report as 'Microsoft' or 'IIS' have
#content in the Windows-1251 charset.
#This command should be used only once in the configuration file and takes global
#effect.
#Default: no
ForceIISCharset1251 no


###########################################################################
# Ispell support commands. Detailed description is given in /doc/ispell.txt
# Ispell commands MUST be given after LocalCharset definition.
# Set ispell mode. Can be text (default) or db. If set to db then
# the Affix and Spell commands should not be used.
#Ispellmode text
#IspellUsePrefixes yes/no
# If enabled, indexer will use ispell prefixes, not only suffixes
# Default: no
# Load ispell affix file:
#Affix <lang> <ispell affixes file name>
# Load ispell dictionary file
#Spell <lang> <ispell dictionary file name>
# File names are relative to mnoGoSearch /etc directory
# Absolute paths can be also specified.
#
#Affix en en.aff
#Spell en en.dict

###########################################################################
#Phrase yes/no
#  Whether to index with phrase support. Default value is no.
Phrase no


###########################################################################
#CrossWords yes/no
# Whether to build CrossWords index
# Default value is no
CrossWords no


###########################################################################
# StopwordFile <filename>
# Load stop words from the given text file. You may specify either absolute 
# file name or a name relative to mnoGoSearch /etc directory. You may use
# several StopwordFile commands.
#
#StopwordFile stopwords.txt

###########################################################################
# StopwordTable <tablename> [<tablename>...]
# Load stop words from the given SQL table. You may use several
# StopwordTable commands. This command has no effect when compiled
# without SQL database support.
#
StopwordTable stopword

#######################################################################
# Word lengths. You may change default length range of words
# stored in database. By default, words with the length in the
# range from 1 to 32 are stored. Note that setting MaxWordLength more
# than 32 will not work as expected.
#
MinWordLength 1
MaxWordLength 32

#######################################################################
# MaxDocSize bytes
# Default value 1048576 (1 Mb)
# Takes global effect for whole config file
MaxDocSize 1048576


#######################################################################
# HTTPHeader <header>
# You may add your desired headers to the indexer HTTP request.
# You should not use the "If-Modified-Since" and "Accept-Charset" headers;
# these headers are composed by indexer itself.
# "User-Agent: mnoGoSearch/version" is sent too, but you may override it.
# Command has global effect for the whole configuration file.
#
#HTTPHeader User-Agent: My_Own_Agent
#HTTPHeader Accept-Language: ru, en
#HTTPHeader From: [EMAIL PROTECTED]


#######################################################################
# ServerTable <table_name>   (SQL only, not supported with built-in database)
# Load servers with all their parameters from the table "table_name".
# See an example of the table structure in create/mysql/server.txt
# You may use several arguments for this command:
#ServerTable my_servers1 my_servers2 my_servers3
# or only one argument:
#
#ServerTable server


#######################################################################
#DeleteNoServer yes/no
# Use it to choose whether or not to delete URLs that have no
# corresponding "Server" command.
# Default value is "yes".
#DeleteNoServer yes



##########################################################################
# Section 2.
# URL control configuration.


##########################################################################
#Allow [Match|NoMatch] [NoCase|Case] [String|Regex] <arg> [<arg> ... ]
# Use this to allow URLs that match (or do not match) the given argument.
# The first three optional parameters describe the type of comparison.
# Default values are Match, NoCase, String.
# Use "NoCase" or "Case" to choose case insensitive or case sensitive
# comparison.
# Use "Regex" to choose regular expression comparison.
# Use "String" to choose string comparison with wildcards.
# Wildcards are '*' for any number of characters and '?' for one character.
# Note that '?' and '*' have special meaning in "String" match type. Please use
# "Regex" to describe documents with '?' and '*' signs in the URL.
# "String" match is much faster than "Regex". Use "String" where
# possible.
# You may use several arguments for one 'Allow' command.
# You may use this command any number of times.
# Takes global effect for config file.
# Note that mnoGoSearch automatically adds one "Allow regex .*"
# command after reading the config file. It means that everything
# that is not disallowed is allowed.
# Examples
#  Allow everything:
Allow *
#  Allow everything but .php .cgi .pl extensions case insensitively using regex:
#Allow NoMatch Regex \.php$|\.cgi$|\.pl$
#  Allow .HTM extension case sensitively:
#Allow Case *.HTM


##########################################################################
#Disallow [Match|NoMatch] [NoCase|Case] [String|Regex] <arg> [<arg> ... ]
# Use this to disallow URLs that match (or do not match) the given argument.
# The meaning of the first three optional parameters is exactly the same
# as for the "Allow" command.
# You can use several arguments for one 'Disallow' command.
# Takes global effect for config file.
#
# Examples:
# Disallow URLs that are not in udm.net domains using "string" match:
#Disallow NoMatch *.udm.net/*
# Disallow any except known extensions and directory index using "regex" match:
#Disallow NoMatch Regex \/$|\.htm$|\.html$|\.shtml$|\.phtml$|\.php$|\.txt$
# Exclude cgi-bin and non-parsed-headers using "string" match:
#Disallow */cgi-bin/* *.cgi */nph-*
# Exclude anything with '?' sign in URL. Note that '?' sign has a 
# special meaning in "string" match, so we have to use "regex" match here:
#Disallow Regex  \?


# Exclude some known extensions using fast "String" match:
Disallow *.b    *.sh   *.md5  *.rpm
Disallow *.arj  *.tar  *.zip  *.tgz  *.gz   *.z     *.bz2 
Disallow *.lha  *.lzh  *.rar  *.zoo  *.ha   *.tar.Z
Disallow *.gif  *.jpg  *.jpeg *.bmp  *.tiff *.tif   *.xpm  *.xbm *.pcx
Disallow *.vdo  *.mpeg *.mpe  *.mpg  *.avi  *.movie *.mov  *.dat
Disallow *.mid  *.mp3  *.rm   *.ram  *.wav  *.aiff  *.ra
Disallow *.vrml *.wrl  *.png
Disallow *.exe  *.com  *.cab  *.dll  *.bin  *.class *.ex_
Disallow *.tex  *.texi *.xls  *.doc  *.texinfo
Disallow *.rtf  *.pdf  *.cdf  *.ps
Disallow *.ai   *.eps  *.ppt  *.hqx
Disallow *.cpt  *.bms  *.oda  *.tcl
Disallow *.o    *.a    *.la   *.so 
Disallow *.pat  *.pm   *.m4   *.am   *.css
Disallow *.map  *.aif  *.sit  *.sea
Disallow *.m3u  *.qt   *.mov

# Exclude Apache directory list in different sort order using "string" match:
Disallow *D=A *D=D *M=A *M=D *N=A *N=D *S=A *S=D

# More complicated case. RAR .r00-.r99, ARJ a00-a99 files 
# and unix shared libraries. We use "Regex" match type here:
Disallow Regex \.r[0-9][0-9]$ \.a[0-9][0-9]$ \.so\.[0-9]$



##########################################################################
#CheckOnly [Match|NoMatch] [NoCase|Case] [String|Regex] <arg> [<arg> ... ]
# The meaning of the first three optional parameters is exactly the same
# as for the "Allow" command.
# Indexer will use the HEAD instead of the GET HTTP method for URLs that
# match/do not match the given regular expressions. It means that the file
# will only be checked for existence and will not be downloaded.
# Useful for zip, exe, arj and other binary files.
# Note that you can disallow those files with the commands given above.
# You may use several arguments for one "CheckOnly" command.
# Useful for example for searching through the URL names rather than
# the contents (a la FTP-search).
# Takes global effect for config file.
#
# Check some known non-text extensions using "string" match:
#CheckOnly *.b    *.sh   *.md5
#CheckOnly *.arj  *.tar  *.zip  *.tgz  *.gz
#CheckOnly *.lha  *.lzh  *.rar  *.zoo  *.tar*.Z
#CheckOnly *.gif  *.jpg  *.jpeg *.bmp  *.tiff 
#CheckOnly *.vdo  *.mpeg *.mpe  *.mpg  *.avi  *.movie
#CheckOnly *.mid  *.mp3  *.rm   *.ram  *.wav  *.aiff
#CheckOnly *.vrml *.wrl  *.png
#CheckOnly *.exe  *.cab  *.dll  *.bin  *.class
#CheckOnly *.tex  *.texi *.xls  *.doc  *.texinfo
#CheckOnly *.rtf  *.pdf  *.cdf  *.ps
#CheckOnly *.ai   *.eps  *.ppt  *.hqx
#CheckOnly *.cpt  *.bms  *.oda  *.tcl
#CheckOnly *.rpm  *.m3u  *.qt   *.mov
#CheckOnly *.map  *.aif  *.sit  *.sea
#
# or check ANY except known text extensions using "regex" match:
#CheckOnly NoMatch Regex \/$|\.html$|\.shtml$|\.phtml$|\.php$|\.txt$


##########################################################################
#HrefOnly [Match|NoMatch] [NoCase|Case] [String|Regex] <arg> [<arg> ... ]
# The meaning of the first three optional parameters is exactly the same
# as for the "Allow" command.
#
# Use this to scan an HTML page for "href" tags without indexing the contents
# of the page, for URLs that match (or do not match) the given argument.
# Commands have global effect for the whole configuration file.
#
# When indexing large mail list archives for example, the index and thread
# index pages (like mail.10.html, thread.21.html, etc.) should be scanned 
# for links but shouldn't be indexed:
#
#HrefOnly */mail*.html */thread*.html



# How to combine Allow, Disallow, CheckOnly, HrefOnly commands.
#
# indexer compares URLs against all these command arguments in the
# order of their appearance in the indexer.conf file.
# If indexer finds that a URL matches some rule, it decides what
# to do with this URL: allow it, disallow it, or use HEAD instead
# of the GET method. So you may arrange the Allow, Disallow,
# CheckOnly, and HrefOnly commands in different orders.
# If none of these commands is given, mnoGoSearch will allow everything
# by default.
#
# There are many possible combinations. Samples of two of them are here:
#
# Sample of first useful combination.
# Disallow known non-text extensions (zip,wav etc),
# then allow everything else. This sample is uncommented above (note that
# there is actually no "Allow *" command, it is added automatically after
# indexer.conf loading).
#
# Sample of second combination.
# Allow some known text extensions (html, txt) and directory index ( / ), 
# then disallow everything else:
#
#Allow *.html *.txt */
#Disallow *
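#
# A further sketch of how order can be used (the /private/ path is a
# hypothetical example, not part of the original sample). Assuming, as the
# description above suggests, that the first matching rule decides, the
# narrower Disallow has to come before the broader Allow:
#
#Disallow */private/*
#Allow *.html *.txt */
#Disallow *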



################################################################
# Section 3.
# Mime types and external parsers.


################################################################
#UseRemoteContentType yes/no
# This command specifies whether the indexer should get the content type
# from http server headers (yes) or from its AddType settings (no).
# If set to 'no' and the indexer cannot determine the content type
# from its AddType settings, it will use the http header.
# Default: yes
UseRemoteContentType yes


################################################################
#AddType [String|Regex] [Case|NoCase] <mime type> <arg> [<arg>...]
# This command associates filename extensions (for services
# that don't automatically include them) with their mime types.
# Currently the "file:" protocol uses these commands.
# Use the optional first two parameters to choose the comparison type.
# Default type is "String" "NoCase" (case insensitive string match with
# '?' and '*' wildcards for one and several characters respectively).
#
AddType text/plain      *.txt  *.pl *.js *.h *.c *.pm *.e
AddType text/html       *.html *.htm
AddType image/x-xpixmap *.xpm
AddType image/x-xbitmap *.xbm
AddType image/gif       *.gif
#
# You may also use quotes in mime type definition
# for example to specify charset. e.g. Russian webmasters 
# often use *.htm extension for windows-1251 documents and
# *.html for unix koi8-r documents:
#
#AddType "text/html; charset=koi8-r"       *.html
#AddType "text/html; charset=windows-1251" *.htm
#
# More complicated example for rar .r00-.r99 using "Regex" match:
#AddType Regex application/rar  \.r[0-9][0-9]$
#
# Default unknown type for other extensions:
AddType application/unknown *.*


# Mime <from_mime> <to_mime> <command line>
#
# This is used to add support for parsing documents with mime types other
# than text/plain and text/html. It can be done via external parser (which
# must provide output in plain or html text) or just by substituting mime
# type so indexer will understand it.
# 
# <from_mime> and <to_mime> are standard mime types
# <to_mime> is either text/plain or text/html
#
# Optional charset parameter used to change charset if needed.
# Mime command understands case insensitive string match
# with ? and * signs. You may use:
#
# Mime application/pdf*
#
# The command line may have a $1 parameter which stands for a temporary file name.
# Some parsers cannot operate on stdin, so indexer creates a temporary file
# for the parser and passes its name in place of $1. There are many ways to use parsers;
# take a look at doc/parsers.txt for other parser types and an explanation of
# parser usage.
# Examples:
#
#       from_mime                            to_mime[charset]             [command line [$1]]
#
#Mime application/msword                     "text/plain; charset=cp1251"  "catdoc $1"
#Mime "application/pdf; charset=iso-8859-1"  "text/plain"                  "pdftotext $1"
#Mime application/x-troff-man                 text/plain                   "deroff"
#Mime text/x-postscript                       text/plain                   "ps2ascii"



#########################################################################
# Section 4.
# Aliases configuration.


#########################################################################
#Alias <master> <mirror>
# You can use this command for example to organize search through
# a master site by indexing a mirror site. It is also useful for
# indexing your site from the local file system.
# mnoGoSearch will display URLs from <master> while searching
# but go to the <mirror> while indexing.
# This command has global effect for the whole indexer.conf file.
# You may use several aliases in one indexer.conf.
#Alias http://www.mysql.com/ http://mysql.udm.net/
#Alias http://www.site.com/  file:/usr/local/apache/htdocs/


#########################################################################
#AliasProg <command line>
# AliasProg is an external program that can be called, that takes a URL,
# and returns the appropriate alias to stdout. Use $1 to pass a URL. This
# command has global effect for whole indexer.conf.
# Example:
#AliasProg "echo $1 | /usr/local/mysql/bin/replace http://localhost/ file:/home/httpd/"


#######################################################################
# Section 5.
# Servers configuration.


#######################################################################
#Period <time>
# Does not matter for built-in text files support
# Set reindex period.
# <time> is in the form 'xxxA[yyyB[zzzC]]'
# (Spaces are allowed between xxx and A and yyy and so on)
#   where xxx, yyy, zzz are numbers (can be negative!)
#         A, B, C can be one of the following:
#               s - second
#               M - minute
#               h - hour
#               d - day
#               m - month
#               y - year
#      (these letters are the same as in strptime/strftime functions)
#
# Examples:
# 15s - 15 seconds
# 4h30M - 4 hours and 30 minutes
# 1y6m-15d - 1 year and 6 months minus 15 days
# 1h-10M+1s - 1 hour minus 10 minutes plus 1 second
#
# If you specify only number without any character, it is assumed
# that time is given in seconds (this behaviour is for
# compatibility with versions prior to 3.1.7).
#
# Can be set many times before "Server" command and
# takes effect till the end of config file or till next Period command.
Period 7d
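#
# Worked examples of the arithmetic, as a sketch for illustration only.
# The month and year lengths used by indexer are not stated here, so only
# second, minute, hour and day units are shown:
#   4h30M     = 4*3600 + 30*60   = 16200 seconds
#   1h-10M+1s = 3600 - 10*60 + 1 = 3001 seconds
#   7d        = 7*86400          = 604800 seconds
# Note the case of the letters: "M" is minute and "m" is month,
# so 90 seconds is written 1M30s.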


#######################################################################
#Tag <string>
# Use this field for your own purposes. For example for grouping
# some servers into one group, etc...
# Can be set multiple times before "Server" command and
# takes effect till the end of config file or till next Tag command.
# Default value is an empty string


#######################################################################
#Category <string>
#You may distribute documents between nested categories. Category
#is a string in hex number notation. You may have up to 5 levels with
#256 members per level. Empty category means the root of category tree.
#Take a look into doc/categories.txt for more information.
#This command means a category on first level:
#Category AA
#This command means a category on the 5th level:
Category FFAABBCCDD
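#
# Reading the value above level by level, as a sketch. Two hex digits per
# level is an inference from "256 members per level", not a statement
# taken from doc/categories.txt:
#   FF         - level 1
#   FFAA       - level 2
#   FFAABB     - level 3
#   FFAABBCC   - level 4
#   FFAABBCCDD - level 5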


#######################################################################
#DefaultLang <string>
#Default language for server. Can be used if you need language
#restriction while doing search.
DefaultLang en


#######################################################################
#MaxHops <number>
# Maximum distance in "mouse clicks" from the start URL.
# Default value is 256.
# Can be set multiple times before "Server" command and
# takes effect till the end of config file or till next MaxHops command.
MaxHops 200


#######################################################################
#MaxNetErrors <number>
# Maximum network errors for each server.
# Default value is 16. Use 0 for unlimited errors number.
# If there are too many network errors on some server
# (server is down, host unreachable, etc.), indexer will make
# no more than 'number' attempts to connect to this server.
# Takes effect till the end of config file or till next MaxNetErrors command.
MaxNetErrors 16


#######################################################################
#ReadTimeOut <time>
# Connect timeout and stalled connections timeout.
# For <time> format see description of Period above.
# Default value is 30 seconds.
# Can be set any number of times before "Server" command and
# takes effect till the end of config file or till next ReadTimeOut command.
ReadTimeOut 90s


#######################################################################
#DocTimeOut <time>
# Maximum amount of time indexer spends for one document downloading.
# For <time> format see description of Period above.
# Default value is 90 seconds.
# Can be set any number of times before "Server" command and
# takes effect till the end of config file or till next DocTimeOut command.
DocTimeOut 1M30s


########################################################################
#NetErrorDelayTime <time>
# Specify document processing delay time if a network error has occurred.
# For <time> format see description of Period above.
# Default value is one day
#NetErrorDelayTime 1d


#######################################################################
#Robots yes/no
# Allows/disallows using robots.txt and <META NAME="robots">
# exclusions. Use "no", for example for link validation of your server(s).
# Command may be used several times before "Server" command and
# takes effect till the end of config file or till next Robots command.
# Default value is "yes".
Robots yes


#######################################################################
#Clones yes/no
# Allow/disallow clone elimination. If allowed, indexer will
# detect the same documents under different locations, such as
# mirrors, and will index only one document from the group of
# such equal documents. "Clones yes" also reduces space usage.
# Default value is "yes".
Clones yes


#######################################################################
#BodyWeight <number>
# It is better to use a power of 2 as the *Weight commands argument.
# Refer to "Changing different document part weights at search time"
# in doc/search.txt.
#
# Weight of the words in the <body>...</body> of the html documents 
# and in the content of the text/plain documents.
# Can be set multiple times before "Server" command and
# takes effect till the end of config file or till next BodyWeight command.
# Default value is 2
BodyWeight 2


#######################################################################
#CrossWeight <number>
# Weight of the words in a link to html document (CrossWords). 
# CrossWords indexing is turned on or off with "CrossWords" command
# Default value is 32
#CrossWeight 32


#######################################################################
#TitleWeight <number>
# Weight of the words in the <title>...</title>
# Can be set multiple times before "Server" command and
# takes effect till the end of config file or till next TitleWeight command.
# Default value is 4
TitleWeight 4


#######################################################################
#KeywordWeight <number>
# Weight of the words in the <META NAME="Keywords" Content="...">
# Can be set multiple times before "Server" command and
# takes effect till the end of config file or till next KeywordWeight command.
# Default value is 8
KeywordWeight 8


#######################################################################
#DescWeight <number>
# Weight of the words in the <META NAME="Description" Content="...">
# Can be set multiple times before "Server" command and
# takes effect till the end of config file or till next DescWeight command.
# Default value is 16
DescWeight 16


#######################################################################
#UrlWeight <number>
# Weight of the words in the URL of the documents.
# Can be set multiple times before "Server" command and
# takes effect till the end of config file or till next UrlWeight command.
# Default value is 0
UrlWeight 0


#######################################################################
#UrlHostWeight <number>
# Weight of the words in the hostname part of URL of the documents.
# Can be set multiple times before "Server" command and
# takes effect till the end of config file or till next UrlHostWeight command.
# Default value is 0
#UrlHostWeight 0


#######################################################################
#UrlPathWeight <number>
# Weight of the words in the path part of URL of the documents.
# Can be set multiple times before "Server" command and
# takes effect till the end of config file or till next UrlPathWeight command.
# Default value is 0
UrlPathWeight 0


#######################################################################
#UrlFileWeight <number>
# Weight of the words in the filename part of URL of the documents.
# Can be set multiple times before "Server" command and
# takes effect till the end of config file or till next UrlFileWeight command.
# Default value is 0
#UrlFileWeight 0


######################################################################
# Spell checking. You can change the factors of word weight depending on
# whether the word is found in Ispell dictionaries or not. Setting
# "IspellCorrectFactor" to 0 will prevent indexer from storing correctly
# spelled words in the database. Only incorrect words will be stored
# in the database in this case. Then you may easily find incorrect words
# and the corresponding URLs where those words are found. If no
# ispell files are used, all words are considered "incorrect".
#
#IspellCorrectFactor    1
#IspellIncorrectFactor  1


#######################################################################
# Numbers indexing. By default numbers and words which contain both
# digits and letters (like "3a","U2") are stored in the database. You may change
# this behaviour by setting the weight factors to 0. Useful for spell checking
# in combination with the previous commands.
#
#NumberFactor 1
#AlnumFactor  1


#######################################################################
#DeleteBad yes/no
# Use it to choose whether or not to delete bad (not found, forbidden etc.) URLs
# from the database.
# May be used multiple times before "Server" command and
# takes effect till the end of config file or till next DeleteBad command.
# Default value is "no", that means do not delete bad URLs.
DeleteBad yes


#######################################################################
#Index yes/no
# Use "Index no" to prevent indexer from storing words in the database.
# Useful for example for link validation.
# Can be set multiple times before "Server" command and
# takes effect till the end of config file or till next Index command.
# Default value is "yes".
Index yes


#######################################################################
#Follow page/path/site/world/no
# Set indexer behaviour when deciding whether a URL corresponds to a Server
# command. It describes which part of the argument given in the following
# Server command is to be compared with a URL to decide whether the URL
# corresponds to the Server command.
# "page" means that the URL must be the same. It describes a web
# space which consists of one page.
# "path" means that a URL under the same path as the Server argument
# corresponds to the Server command.
# "site" means links from the same host.
# "world" means to follow any link.
# "no" is the same as "page".
# Follow command can be used multiple times before "Server" command and
# takes effect till the end of config file or till next Follow command.
# Default value is "path".
Follow path
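#
# Illustration only, for a hypothetical "Server http://www.site.com/docs/"
# command (the host and path are made up for this sketch):
#   page  - only http://www.site.com/docs/ itself
#   path  - URLs under http://www.site.com/docs/
#   site  - any URL on the host www.site.com
#   world - any URL at all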


#######################################################################
#CheckMp3Tag yes/no
#Works only with servers that support the HTTP/1.1 protocol.
#The "Range: bytes" header is used to download the mp3 tag.
CheckMp3Tag no


#######################################################################
#IndexMP3TagOnly yes/no
#When this option is enabled, indexer checks the file for an id3 tag and
#does nothing if no id3 tag exists.
#Also set CheckMp3Tag to yes.
IndexMP3TagOnly no


########################################################################
#CharSet <charset>
# Useful for 8 bit character sets.
# WWW-servers send data in different charsets.
# <Charset> is the default character set of the server in the next "Server" command(s).
# This is required only for "bad" servers that do not send information
# about charset in the header: "Content-type: text/html; charset=some_charset"
# and do not have <META NAME="Content" Content="text/html; charset=some_charset">
# Can be set before every "Server" command and
# takes effect till the end of config file or till next CharSet command.
#CharSet windows-1251


#########################################################################
#ProxyAuthBasic login:passwd
# Use HTTP proxy basic authorization.
# Can be used before every "Server" command and
# takes effect only for the next "Server" command!
# It should also come before the "Proxy" command.
# Examples:
#ProxyAuthBasic somebody:something  


#########################################################################
#Proxy your.proxy.host[:port]
# Use a proxy rather than connecting directly.
#FTP servers can be indexed when using a proxy.
#Default port value, if not specified, is 3128 (Squid).
#If the proxy host is not specified, a direct connection will be used.
#Can be set before every "Server" command and
# takes effect till the end of the config file or till the next Proxy command.
#If no "Proxy" command is specified, indexer will connect directly.
#
#           Examples:
#           Proxy on atoll.anywhere.com, port 3128:
#Proxy atoll.anywhere.com
#
#           Proxy on lota.anywhere.com, port 8090:
#Proxy lota.anywhere.com:8090
#
#           Disable proxy (direct connect):
#Proxy
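#
#           A sketch of the ordering described above, reusing the
#           placeholder names from this file (not a tested setup).
#           The proxy login comes before "Proxy" and, per the
#           ProxyAuthBasic notes, covers only the next "Server":
#ProxyAuthBasic somebody:something
#Proxy atoll.anywhere.com
#Server http://localhost/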


#########################################################################
#AuthBasic login:passwd
# Use basic HTTP authorization.
# Can be set before every "Server" command and
# takes effect only for the next Server command!
# Examples:
#AuthBasic somebody:something  
#
# If you have password-protected directories but the rest of the server is open, use:
#AuthBasic login1:passwd1
#Server http://my.server.com/my/secure/directory1/
#AuthBasic login2:passwd2
#Server http://my.server.com/my/secure/directory2/
#Server http://my.server.com/


##############################################################
# Mirroring parameters commands.
#
# You may specify a path to root dir to enable sites mirroring
#MirrorRoot /path/to/mirror
#
# You may also specify a root dir for the headers of mirrored documents;
# indexer will then store HTTP headers on local disk too.
#MirrorHeadersRoot /path/to/headers
#
# MirrorPeriod <time>
# You may specify a period during which earlier mirrored files
# will be used while indexing instead of being downloaded again.
# This is very useful when you experiment with mnoGoSearch,
# indexing the same hosts, and do not want much traffic from/to the Internet.
# If MirrorHeadersRoot is not specified and headers are not stored
# on local disk, then the default Content-Types given in AddType commands
# will be used.
# Default value of the MirrorPeriod is -1, which means
# "do not use mirrored files".
#
# For <time> format see Period command description above.
#
# The command below will force using local copies for one day:
#MirrorPeriod 1d


#########################################################################
#Server [subsection] <URL> [alias]
# This is the main command of the indexer.conf file. It is used
# to add servers or parts of servers to be indexed. It also inserts
# the given URL into the database.
# For example:
#Server http://localhost/
#
# You can also specify a path to index only one section of a server:
#Server http://localhost/subsection/
# or one specific page:
#Server http://localhost/path/main.html
#
# Use the optional subsection parameter to specify the server's subsection.
# It specifies which part of the Server command argument is compared
# with a URL. Check follow.txt for details.
# Values of subsection are the same as the "Follow" command arguments.
# If subsection is not specified, the current "Follow" value will be used.
# If subsection is specified, it does not change the current "Follow" value
# for the next "Server" commands without a subsection argument.
# This example will add /path/ section on localhost:
#Server path http://localhost/path/main.html
# This example will add whole server:
#Server site http://localhost/path/main.html
#
# You can also specify the optional parameter "alias". This example will
# index the server "http://search.mnogo.ru/" directly from disk instead of
# fetching from the HTTP server:
#Server http://search.mnogo.ru/  file:/home/httpd/search.mnogo.ru/
#
# You may use the "Server" command once for each different
# server you want to index.
#
Server  http://www.mcleodusa.com/ file:/www/htdocs/mcleodusa/

#########################################################################
#Realm [String|Regex] [Match|NoMatch] <arg> [alias]
# It works almost like the "Server" command but takes a regular expression or
# string wildcards as its argument. String wildcards are the default match type.
# For example, if you want to index all HTTP sites in ".ru" domain, use:
#Realm http://*.ru/*
# The same using "Regex" match:
#Realm Regex ^http://.*\.ru/
# Another example. Use this command to index everything outside the .com domain:
#Realm NoMatch http://*.com/*
#
# The optional "alias" argument allows very complicated URL rewriting,
# more powerful than the other aliasing mechanisms. Take a look at alias.txt
# for an explanation of "alias" argument usage.


#########################################################################
#URL http://localhost/path/to/page.html
# This command inserts the given URL into the database. This is useful for adding
# several entry points to one server. It has no effect if the URL is already
# in the database. When inserting, indexer does not do any checking, and this
# URL may be deleted at the first indexing attempt if it has no corresponding
# Server command or is disallowed by rules given in Allow/Disallow
# commands.
#
#This command will add the /main/index.html page:
#URL http://localhost/main/index.html
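#
# A minimal sketch of several entry points to one server, reusing the
# placeholder host from this file (an illustration, not a tested setup).
# The "Server" command covers both URLs, so neither is dropped for
# lacking a corresponding Server command:
#Server http://localhost/
#URL http://localhost/main/index.html
#URL http://localhost/path/main.html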



Reply: <http://search.mnogo.ru/board/message.php?id=2125>
