Title: Accent problem.

Hi,

I am trying to solve a problem in ht://Dig 3.1.5.

I have tested this and can reproduce the following:

If:
 -- more than one HTML file contains both of the words "tué" and "tue";
 -- or an HTML file contains the word "tue" and the HTML file that refers to it contains the word "tué" (or the reverse case)

    [example: d0.htm containing "<a href="d1.htm">UN HOMME TUE</a>" and d1.htm containing "tué"]

Then a search for "tué" or a search for "tue" finds only the last indexed file that contains both "tué" and "tue".

In the file db.wordlist we can see, for example:
tue    i:0 [...]
tue    i:1 [...]
tué    i:1 [...]
tue    i:2 [...]
tué    i:2 [...]

(Only the file that corresponds to "i:2" will be found.)
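To make the expected behaviour concrete, here is a minimal sketch of a word index that keeps "tue" and "tué" as separate entries and still returns every matching file. This is only an illustration, not how ht://Dig is implemented: the `docs` contents and the `build_index` helper are hypothetical, chosen to mirror the db.wordlist excerpt above.

```python
from collections import defaultdict

# Hypothetical documents mirroring the db.wordlist excerpt:
# file 0 contains only "tue"; files 1 and 2 contain both words.
docs = {
    0: "un homme tue",
    1: "un homme tué et un homme tue",
    2: "il tue ; il est tué",
}

def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.split():
            word = word.strip(";,.")  # drop simple punctuation
            if word:
                index[word].add(doc_id)
    return index

index = build_index(docs)
print(sorted(index["tue"]))  # files containing "tue"
print(sorted(index["tué"]))  # files containing "tué"
```

With such an index, a search for "tue" would return files 0, 1 and 2, and a search for "tué" would return files 1 and 2, rather than only the last file indexed.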


Can this be solved?
(Note: I have the following in htdig.conf:
locale: fr_FR
)

<cultural parenthesis>
In the early days of typewriters (first half of the century), there were no accented uppercase letters such as É or È (the machines were Anglo-Saxon), and so accented uppercase letters disappeared from common usage: nowadays, many teachers in France teach that "there is never an accent on an uppercase letter". (In fact, accented uppercase letters appear in all newspapers and in books printed by professionals, who know the rule that uppercase letters must be accented. Accented uppercase letters have been used in France since the beginning of printing.)

This is a problem because accents carry meaning:
"un homme tué" means "a man killed"
"un homme tue" means "a man kills".
How should one understand "UN HOMME TUE" if uppercase letters carry no accents?
</cultural parenthesis>.


Charles Népote
Paris, France
[and please do forgive my English mistakes...]
