Hi,
I am trying to solve a problem in ht://Dig 3.1.5.
I have tested and can reproduce the following:
If:
-- more than one HTML file contains both words "tué" and "tue" in the same file;
-- or an HTML file contains the word "tue" and the HTML file which refers to it contains the word "tué" (or the reverse case)
[example: d0.htm containing "<a href="d1.htm">UN HOMME TUE</a>" and d1.htm containing "tué"]
then a search for "tué" or a search for "tue" will only find the last indexed file which contains both "tué" and "tue".
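To make the second case easy to reproduce, here is a minimal sketch that writes the two test pages described above. The function name `write_test_pages`, the surrounding `<html><body>` markup and the ISO-8859-1 encoding are assumptions for illustration (the encoding is chosen to match `locale: fr_FR`); only the link text and the word "tué" come from the example.

```python
from pathlib import Path


def write_test_pages(directory):
    """Create d0.htm and d1.htm from the example; return their paths."""
    root = Path(directory)
    root.mkdir(parents=True, exist_ok=True)
    pages = {
        # d0.htm links to d1.htm with the unaccented uppercase text
        "d0.htm": '<html><body><a href="d1.htm">UN HOMME TUE</a></body></html>\n',
        # d1.htm contains the accented word "tué"
        "d1.htm": "<html><body>un homme tu\u00e9</body></html>\n",
    }
    paths = []
    for name, html in pages.items():
        path = root / name
        # Assumption: the pages are served as ISO-8859-1 (Latin-1)
        path.write_bytes(html.encode("iso-8859-1"))
        paths.append(path)
    return paths
```

Indexing the resulting directory and then searching for "tue" and for "tué" should be enough to check whether the behaviour appears.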
In the file db.wordlist we can see, for example:
tue i:0 [...]
tue i:1 [...]
tué i:1 [...]
tue i:2 [...]
tué i:2 [...]
(only the file which corresponds to "i:2" will be found).
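To check which documents the word list really records for each word, a small sketch like the following can group the entries. It assumes only what the excerpt shows: each line starts with the word, followed by `i:` and a document number. The rest of each line is ignored, and the name `documents_per_word` is hypothetical.

```python
import re
from collections import defaultdict

# Assumption: an entry starts with the word, then "i:<document number>"
ENTRY = re.compile(r"^(\S+)\s+i:(\d+)")


def documents_per_word(lines):
    """Map each word to the list of document numbers it is recorded for."""
    docs = defaultdict(list)
    for line in lines:
        match = ENTRY.match(line)
        if match:
            docs[match.group(1)].append(int(match.group(2)))
    return dict(docs)


sample = [
    "tue i:0 [...]",
    "tue i:1 [...]",
    "tué i:1 [...]",
    "tue i:2 [...]",
    "tué i:2 [...]",
]
for word, ids in documents_per_word(sample).items():
    print(word, ids)
# prints:
# tue [0, 1, 2]
# tué [1, 2]
```

For a real file, the lines could be read with `open("db.wordlist", encoding="iso-8859-1")`, again assuming a Latin-1 word list. If the entries are all present, as in the sample, the words are indexed correctly and the loss presumably happens at search time.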
Can this be solved?
(Note: I have in htdig.conf:
locale: fr_FR
)
<cultural parenthesis>
In the early days of typewriters (first half of the century), there were no accented uppercase letters such as É or È (the machines were Anglo-Saxon), and so accented uppercase letters disappeared from common usage: nowadays, many teachers in France teach that "there is never an accent on an uppercase letter". (In fact, accented uppercase letters appear in all newspapers and books printed by professionals, who know the rule that uppercase letters must be accented; accented uppercase letters have existed in France since the beginning of printing.)
This is a problem, as accents carry meaning:
"un homme tué" means "a man killed";
"un homme tue" means "a man kills".
How should one understand "UN HOMME TUE" if uppercase letters carry no accents?
</cultural parenthesis>.
Charles Népote
Paris, France
[and please do forgive my English mistakes...]
