Re: UdmSearch: Prevent directory-matches in results

2001-02-14 Thread Tobias Freitag

Alexander Barkov wrote:
 
 Tobias Freitag wrote:
 
  Alexander Barkov wrote:
  
   Tobias Freitag wrote:
   
    The problem is that when I search for any pattern, most results show
    directories not files. It seems that the indexer is parsing the

    ...
    UrlPathWeight -1
    UrlFileWeight -1
    ...
  
   This is because of -1 in UrlFileWeight and UrlPathWeight.
 
  I have set both values to 0 and to 1 but it doesn't change anything. Is
  this a common problem?
 
 Set it to 0 and reindex everything.

I did that already, but it didn't help.

I also cleared the whole database before indexing (using "echo YES |
sbin/indexer -C ; sbin/indexer"). Maybe it's some strange behavior under
Red Hat 7?

Is it possible to index with just \.html$ \.htm$ \.txt$ allowed? And why
does the program ignore the CheckOnly entry?

And last but not least: Is there a difference between
\.html$|\.htm$|\.txt$ and \.html$ \.htm$ \.txt$ ?
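
The two forms can at least be compared outside the indexer. What follows is a minimal Python sketch, not UdmSearch code. It assumes that space-separated arguments to Allow are tried one after another and that a single match is enough, which this thread does not confirm. Under that assumption both forms accept exactly the same URLs. All sample URLs except the biodata.html one, and both helper names, are made up for illustration.

```python
import re

# Sample URLs; only the first two appear in the thread, the rest are hypothetical.
URLS = [
    "http://www.linux-magazin.de/ausgabe/1996/06/News/biodata.html",
    "http://www.linux-magazin.de/ausgabe/1996/08/",
    "http://www.linux-magazin.de/ausgabe/readme.txt",
    "http://www.linux-magazin.de/ausgabe/index.htm",
    "http://www.linux-magazin.de/ausgabe/logo.gif",
]

# Form 1: a single pattern using alternation.
ALTERNATION = re.compile(r"\.html$|\.htm$|\.txt$")

# Form 2: three separate patterns.
SEPARATE = [re.compile(p) for p in (r"\.html$", r"\.htm$", r"\.txt$")]


def matches_alternation(url):
    """True if the single alternation pattern matches the URL."""
    return ALTERNATION.search(url) is not None


def matches_any(url):
    """True if at least one separate pattern matches.

    Assumption: the indexer ORs the arguments of one Allow line.
    """
    return any(p.search(url) for p in SEPARATE)


for url in URLS:
    print(matches_alternation(url), matches_any(url), url)
```

If the indexer combined the arguments in some other way (for example, treating the whole line as one literal string), the two forms would differ, so this only shows that the regular expressions themselves are equivalent.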

-- 
Tobias Freitag  | http://www.linux-magazin.de
Stefan-George-Ring 24   | Tel:  +49 (0) 89 993411-0
D-81929 München | Fax:  +49 (0) 89 993411-99
__
If you want to unsubscribe send "unsubscribe udmsearch"
to [EMAIL PROTECTED]




Re: UdmSearch: Prevent directory-matches in results

2001-02-13 Thread Tobias Freitag

Alexander Barkov wrote:
 
 Tobias Freitag wrote:
 
  The problem is that when I search for any pattern, most results show
  directories not files. It seems that the indexer is parsing the

   ...
  UrlPathWeight -1
  UrlFileWeight -1
   ...
 
 This is because of -1 in UrlFileWeight and UrlPathWeight.

I have set both values to 0 and to 1 but it doesn't change anything. Is
this a common problem?

Btw, I have udmsearch-3.0.23 installed, running on Red Hat 7.




UdmSearch: Prevent directory-matches in results

2001-02-12 Thread Tobias Freitag

Hi there,

I am trying to index about 1000 HTML-Articles on our website via local
filesystem. I use the "Alias" command to map the links to the real
location.

The problem is that when I search for any pattern, most results show
directories not files. It seems that the indexer is parsing the
directory listings for matching filenames. A nice feature, but not
what I really need.

For example, a search result for "linux" looks like this:

Displaying documents 1-20 of total 55 found. 

1. http://www.linux-magazin.de/ausgabe/1996/08/ [1]

2.  [1]

[...]

13. http://www.linux-magazin.de/ausgabe/1996/06/News/biodata.html [1]


The Allow/Disallow settings are kept simple:

Allow \.html$ \.htm$ \.txt$ \/$
Disallow .*

I tried it without the \/$ to prevent directory indexing, but then the
program indexed nothing except the root directory
(http://www.linux-magazin.de/ausgabe/). I have also tried "CheckOnly
\/$", but it doesn't seem to change anything.
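
The behaviour without \/$ would be consistent with a first-match filter: if a directory URL is rejected, the indexer never fetches it and so never sees the links to the articles inside it. The Python sketch below walks through that reasoning. It assumes the rules are checked in config order and the first matching rule decides, which is an assumption about the indexer and not something confirmed here. The `decide` helper and the rule tuples are hypothetical.

```python
import re


def decide(url, rules):
    """Return the action of the first rule whose pattern matches the URL."""
    for action, pattern in rules:
        if re.search(pattern, url):
            return action
    # Assumption: a URL matching no rule is allowed. With a final
    # "Disallow .*" rule this fallback is never reached.
    return "Allow"


# The rules from the message, one (action, pattern) pair per argument.
WITH_DIRS = [
    ("Allow", r"\.html$"),
    ("Allow", r"\.htm$"),
    ("Allow", r"\.txt$"),
    ("Allow", r"/$"),
    ("Disallow", r".*"),
]

# The same rules with the directory pattern removed.
WITHOUT_DIRS = [rule for rule in WITH_DIRS if rule[1] != r"/$"]

directory = "http://www.linux-magazin.de/ausgabe/1996/08/"
article = "http://www.linux-magazin.de/ausgabe/1996/06/News/biodata.html"

print(decide(directory, WITH_DIRS))     # Allow
print(decide(directory, WITHOUT_DIRS))  # Disallow
print(decide(article, WITHOUT_DIRS))    # Allow
```

In this model the article itself is still allowed without \/$, but it can only be discovered through a directory listing that is now disallowed, which would explain why nothing below the start URL gets indexed.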

Ideas anyone?


# This is sample indexer config file
# To start using it please edit and rename to indexer.conf
# You may want to keep the original indexer.conf-dist for future references.
# Use '#' to comment out lines.
# All command names are case insensitive (DBHost=DBHOST=dbhost).
# You may use the '\' character to continue the current command on the
# next line when required.

###
# DBAddr URL-style database description
# Database options (type, host, database name, port, user and password) 
# to connect to SQL database.
# Does not matter for built-in text file support.
# Should be used only once and before any other commands.
# This command has a global effect for the whole config file.
# Format:
#DBAddr DBType:[//[DBUser[:DBPass]]DBHost[:DBPort]]/DBName/
#
# ODBC notes:
#   Use DBName to specify ODBC data source name (DSN)
#   DBHost does not matter, use "localhost".
# Solid notes:
#   Use DBHost to specify Solid server
#   DBName does not matter for Solid
#
# Currently supported DBType values are 
# mysql, pgsql, msql, solid, mssql, oracle, ibase.
# Actually, it does not matter for native libraries support.
# But ODBC users should specify one of supported values.
# If your database type is not supported, you may use "unknown" instead.

#DBAddr mysql://:***@localhost/udmsearch/


###
# DBMode single/multi/crc/crc-multi
# Does not matter for built-in text files support
# You may select the SQL database mode for word storage.
# When "single" is specified, all words are stored in the same
# table. If "multi" is selected, words are stored in different
# tables depending on their lengths. "multi" mode is usually faster
# but requires more tables in the database.
#
# If "crc" mode is selected, UdmSearch will store 32-bit integer
# word IDs calculated by the CRC32 algorithm instead of words. This
# mode requires less disk space and is faster than the "single"
# and "multi" modes. "crc-multi" uses the same storage structure as
# the "crc" mode, but also stores words in different tables depending on
# word length, like "multi" mode.
#
#Default DBMode value is "single":
DBMode multi


###
#SyslogFacility facility
# This is used if indexer was compiled with syslog support and if you
# don't like the default value. Argument is the same as used in syslog.conf
# file. For list of possible facilities see syslog.conf(5)
#SyslogFacility local7


###
# LocalCharset charset
# Defines charset of local file system. It is required if you are using 
# 8 bit charsets and does not matter for 7 bit charsets.
# This command should be used once and takes global effect for the config file.
# Choose one of the currently supported charsets:
#
# Western Europe: Germany
LocalCharset iso-8859-1
#
# Central Europe: Czech
#LocalCharset iso-8859-2
#
# ISO Cyrillic
#LocalCharset iso-8859-5
#
# Unix Cyrillic
#LocalCharset koi8-r
#
# MS Central Europe: Czech
#LocalCharset cp1250
#
# MS DOS Cyrillic
#LocalCharset cp866
#
# MS Cyrillic
#LocalCharset cp1251
#
# MS Arabic
#LocalCharset cp1256
#
# Mac Cyrillic
#LocalCharset x-mac-cyrillic


###
# Ispell support commands. Detailed description is given in /doc/ispell.txt
# Ispell commands MUST be given after LocalCharset definition.
# Load ispell affix file:
#Affix lang ispell affixes file name
# Load ispell dictionary file
#Spell lang ispell diction