Hi there,
I am trying to index about 1000 HTML articles on our website via the local
filesystem. I use the "Alias" command to map the links to the real
location.
The problem is that when I search for any pattern, most results show
directories, not files. It seems that the indexer is parsing the
directory listings for matching filenames. A nice feature, but not
what I need.
For example, a search result for "linux" looks like this:
Displaying documents 1-20 of total 55 found.
1. http://www.linux-magazin.de/ausgabe/1996/08/ [1]
2. [1]
[...]
13. http://www.linux-magazin.de/ausgabe/1996/06/News/biodata.html [1]
The Allow/Disallow settings are kept simple:
Allow \.html$ \.htm$ \.txt$ \/$
Disallow .*
I tried it without the \/$ to prevent directory indexing, but then the
program indexed nothing except the root directory
(http://www.linux-magazin.de/ausgabe/). I have also tried "CheckOnly
\/$", but it doesn't seem to change anything.
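A minimal Python sketch of what may be happening, assuming the indexer checks the Allow/Disallow lines in the order they appear and applies the first one whose pattern matches. That evaluation order is an assumption, and the helper names decide() and without_pattern() are hypothetical; this is not UdmSearch code, only a model of the two rules quoted above.

```python
import re

# Hypothetical model of ordered Allow/Disallow filtering (not UdmSearch code).
# Patterns are copied from the two config lines quoted in the post.
RULES = [
    ("Allow", [r"\.html$", r"\.htm$", r"\.txt$", r"\/$"]),
    ("Disallow", [r".*"]),
]


def decide(url, rules):
    """Return the action of the first rule with a pattern matching url."""
    for action, patterns in rules:
        if any(re.search(p, url) for p in patterns):
            return action
    return "Allow"  # assumed default when no rule matches


def without_pattern(rules, pattern):
    """Return a copy of rules with one pattern removed from every rule."""
    return [(a, [p for p in ps if p != pattern]) for a, ps in rules]


urls = [
    "http://www.linux-magazin.de/ausgabe/1996/08/",
    "http://www.linux-magazin.de/ausgabe/1996/06/News/biodata.html",
]

no_dirs = without_pattern(RULES, r"\/$")
for u in urls:
    # Columns: decision with \/$ allowed, decision without it, URL.
    print(decide(u, RULES), decide(u, no_dirs), u)
```

Under this model, dropping \/$ makes every directory URL fall through to "Disallow .*". If the indexer only discovers the articles by following links in the directory listings, rejecting those listings would also cut off the files below them, which would be consistent with only the root directory being indexed.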
Ideas anyone?
--
Tobias Freitag | http://www.linux-magazin.de
Stefan-George-Ring 24 | Tel: +49 (0) 89 993411-0
D-81929 München | Fax: +49 (0) 89 993411-99
# This is sample indexer config file
# To start using it please edit and rename to indexer.conf
# You may want to keep the original indexer.conf-dist for future references.
# Use '#' to comment out lines.
# All command names are case insensitive (DBHost=DBHOST=dbhost).
# You may use the '\' character to continue the current command on the
# next line when required.
###
# DBAddr URL-style database description
# Database options (type, host, database name, port, user and password)
# to connect to SQL database.
# Does not matter for built-in text file support.
# Should be used only once and before any other commands.
# The command has a global effect for the whole config file.
# Format:
#DBAddr DBType:[//[DBUser[:DBPass]]DBHost[:DBPort]]/DBName/
#
# ODBC notes:
# Use DBName to specify ODBC data source name (DSN)
# DBHost does not matter, use "localhost".
# Solid notes:
# Use DBHost to specify Solid server
# DBName does not matter for Solid
#
# Currently supported DBType values are
# mysql, pgsql, msql, solid, mssql, oracle, ibase.
# Actually, it does not matter for native libraries support.
# But ODBC users should specify one of supported values.
# If your database type is not supported, you may use "unknown" instead.
#DBAddr mysql://:***@localhost/udmsearch/
###
# DBMode single/multi/crc/crc-multi
# Does not matter for built-in text files support
# You may select SQL database mode of words storage.
# When "single" is specified, all words are stored in the same
# table. If "multi" is selected, words will be located in different
# tables depending on their lengths. "multi" mode is usually faster
# but requires more tables in the database.
#
# If "crc" mode is selected, UdmSearch will store 32-bit integer
# word IDs calculated by the CRC32 algorithm instead of words. This
# mode requires less disk space and is faster than the "single"
# and "multi" modes. "crc-multi" uses the same storage structure as
# the "crc" mode, but also stores words in different tables depending on
# word length, like "multi" mode.
#
#Default DBMode value is "single":
DBMode multi
###
#SyslogFacility facility
# This is used if the indexer was compiled with syslog support and you
# don't like the default value. The argument is the same as used in the
# syslog.conf file. For a list of possible facilities see syslog.conf(5)
#SyslogFacility local7
###
# LocalCharset charset
# Defines the charset of the local file system. It is required if you are
# using 8-bit charsets and does not matter for 7-bit charsets.
# This command should be used once and takes global effect for the config file.
# Choose one of the currently supported charsets:
#
# Western Europe: Germany
LocalCharset iso-8859-1
#
# Central Europe: Czech
#LocalCharset iso-8859-2
#
# ISO Cyrillic
#LocalCharset iso-8859-5
#
# Unix Cyrillic
#LocalCharset koi8-r
#
# MS Central Europe: Czech
#LocalCharset cp1250
#
# MS DOS Cyrillic
#LocalCharset cp866
#
# MS Cyrillic
#LocalCharset cp1251
#
# MS Arabic
#LocalCharset cp1256
#
# Mac Cyrillic
#LocalCharset x-mac-cyrillic
###
# Ispell support commands. Detailed description is given in /doc/ispell.txt
# Ispell commands MUST be given after LocalCharset definition.
# Load ispell affix file:
#Affix lang ispell affixes file name
# Load ispell dictionary file
#Spell lang ispell diction