Hello,

>The reason is that charset is not set for page: 
>http://melior.univ-montp3.fr/ and 
>ASPseek indexer treats word Pérez as two words: P and rez.
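
To make the split described above concrete, here is a minimal Python sketch. It is not ASPseek code: the function name split_words and the ASCII-only fallback are assumptions made for illustration. The idea is that when no charset is known, only ASCII letters count as word characters, so the single byte for é behaves like a separator.

```python
import re
from typing import List, Optional

def split_words(raw: bytes, charset: Optional[str] = None) -> List[str]:
    """Hypothetical tokenizer, not ASPseek's actual implementation."""
    if charset is None:
        # No charset known: only ASCII letters are word characters,
        # so any 8-bit byte (0xE9 for é in Latin-1) ends the word.
        return [w.decode("ascii") for w in re.findall(rb"[A-Za-z]+", raw)]
    # Charset known: decode first, then keep runs of Unicode letters.
    text = raw.decode(charset)
    return re.findall(r"[^\W\d_]+", text)

page = "Pérez".encode("iso-8859-1")
print(split_words(page))                # ['P', 'rez']
print(split_words(page, "iso-8859-1"))  # ['Pérez']
```

With the charset supplied, the same bytes come out as one word, which is the behaviour I am trying to get from the indexer.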

But I have this in aspseek.conf:
-----------------------
#CharSet <charset>
# Useful for 8 bit character sets.
# WWW-servers send data in different charsets.
# <Charset> is default character set of server in next "Server" command(s).
# This is required only for "bad" servers that do not send information
# about charset in header: "Content-type: text/html; charset=some_charset"
# and have not <META NAME="Content" Content="text/html; charset=some_charset">
# Can be set before every "Server" command and
# takes effect till the end of config file or till next CharSet command.
CharSet iso88591
-----------------------

>I don't know which value of iso88591 do you use in CharsetTableU1 
>directive, but if you use value iso-8859-1 then this value must be 
>first in the CharsetAlias directive.

-----------------------
CharsetTableU1 windows-1250 ru tables/windows-1250.txt
CharsetTableU1 windows-1251 ru tables/windows-1251.txt
CharsetTableU1 windows-1252 ru tables/windows-1252.txt
CharsetTableU1 windows-1253 ru tables/windows-1253.txt
CharsetTableU1 windows-1254 ru tables/windows-1254.txt
CharsetTableU1 windows-1255 ru tables/windows-1255.txt
CharsetTableU1 windows-1256 ru tables/windows-1256.txt
CharsetTableU1 windows-1257 ru tables/windows-1257.txt
CharsetTableU1 windows-1258 ru tables/windows-1258.txt
CharsetTableU1 windows-874 ru tables/windows-874.txt
CharsetTableU1 iso-8859-1 en tables/iso8859-1.txt
CharsetTableU1 iso-8859-2 ru tables/iso8859-2.txt
CharsetTableU1 iso-8859-3 ru tables/iso8859-3.txt
CharsetTableU1 iso-8859-4 ru tables/iso8859-4.txt
CharsetTableU1 iso-8859-5 ru tables/iso8859-5.txt
CharsetTableU1 iso-8859-6 ru tables/iso8859-6.txt
CharsetTableU1 iso-8859-7 ru tables/iso8859-7.txt
CharsetTableU1 iso-8859-8 ru tables/iso8859-8.txt
CharsetTableU1 iso-8859-9 ru tables/iso8859-9.txt
CharsetTableU1 iso-8859-10 ru tables/iso8859-10.txt
CharsetTableU1 iso-8859-13 ru tables/iso8859-13.txt
CharsetTableU1 iso-8859-14 ru tables/iso8859-14.txt
CharsetTableU1 iso-8859-15 ru tables/iso8859-15.txt
CharsetTableU1 koi8-r ru tables/koi8r.txt
CharsetTableU1 koi8-u ru tables/koi8u.txt
-----------------------

and

-----------------------
## Latin 1; Western European Languages
CharsetTable iso88591   en charsets/iso88591
[...]
CharsetAlias iso88591   iso-8859-1 iso8859-1 iso8859.1 iso-8859.1 iso_8859-1:1988 iso_8859-1 iso_8859.1
-----------------------
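
One possible reading of the advice about alias order, sketched in Python. This is only a model of the configuration above, not how ASPseek is known to work: the helper names build_alias_map and has_u1_table are hypothetical, and the rule that the first name in CharsetAlias is the canonical one is an assumption.

```python
def build_alias_map(alias_args):
    """Map every name on a CharsetAlias line to the first (canonical) name.

    Assumption: the first name is canonical; this is not confirmed
    ASPseek behaviour.
    """
    names = alias_args.split()
    canonical = names[0]
    return {name: canonical for name in names}

def has_u1_table(name, alias_map, u1_tables):
    """True if the resolved charset name has a CharsetTableU1 entry."""
    return alias_map.get(name, name) in u1_tables

# A few of the names registered by the CharsetTableU1 lines.
u1_tables = {"iso-8859-1", "iso-8859-2", "windows-1252"}

# Alias order as in my configuration: iso88591 comes first.
current = build_alias_map("iso88591 iso-8859-1 iso8859-1")
# Alias order as suggested: iso-8859-1 comes first.
reordered = build_alias_map("iso-8859-1 iso88591 iso8859-1")

print(has_u1_table("iso88591", current, u1_tables))    # False
print(has_u1_table("iso88591", reordered, u1_tables))  # True
```

Under that assumption, "CharSet iso88591" would resolve to a name that has no CharsetTableU1 entry unless iso-8859-1 is listed first in CharsetAlias.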

>If cs parameter is set to iso88591 which is different from specified 
>in CharsetTableU1, then searcher also treats word Pérez as two words 
>and finds the page.
>If cs parameter is set to the value specified in CharsetTableU1 then 
>searcher can't find word Pérez because it is not in the index due to 
>absent charset.

I have this in s.htm:

-----------------------
<input type="hidden" name="cs" value="iso-8859-1">
-----------------------

I don't understand...

And I don't see what the language column (en, ru, etc.) is for in 
CharsetTableU1 and in CharsetTable.

Thanks in advance,

Gilles, lost in CharsetSpace...
