Hi guys,

this is my first adventure with htdig. Together with one of my colleagues, I have successfully installed htdig 3.2.0b6. Like most users, I think, we need the search engine to search in other languages, in my case Hungarian.

I read on the official site that htdig is not able to
search on characters encoded in more than 8 bits. In other words, I need to search for characters such as ő, which is encoded in HTML as &#337;.

First of all, characters with a code greater than 255 are not processed correctly: in the search results they do not appear as characters, but as their HTML code. The second problem is that I cannot search for them. I set up the accents algorithm, but it does not recognize any similarity between o and ő.
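
To make clear what I expect, here is a minimal Python sketch. It is not htdig code and says nothing about how htdig works internally; the helper name fold_accents is only my own illustration. It decodes the HTML entity to the real character and then strips the diacritic, which is the similarity between o and ő that I hoped the accents algorithm would see.

```python
import html
import unicodedata


def fold_accents(text: str) -> str:
    """Return text with diacritics removed (hypothetical helper, not htdig)."""
    # NFD splits an accented letter into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep only the base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The numeric entity from my pages, decoded to the character it stands for.
decoded = html.unescape("&#337;")

print(decoded)                # ő
print(fold_accents(decoded))  # o
```

With this kind of folding, a query containing a plain o would match a document containing ő.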
So the only way I can find these words is to enter &#337; in the search field, which is not a good solution. Does this have the same cause as the first problem? Htdig simply cannot find them. Is there any other solution, or must I wait for a newer version?

My second question is related to the search algorithms. I want to make my search insensitive to diacritics. I used the
accents algorithm in combination with the substring algorithm:

    substring:1 accents:1

The real problem is this. Consider the word “mambómámbo”. I think it would be normal that, if I search for “bomam”, the algorithm finds this word, because the query can be converted to “bómám” according to the accents algorithm, and that is a substring of the word I am looking for.
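
The combined behaviour I have in mind can be sketched in Python as follows. This is only an illustration of my expectation, not of htdig's implementation; fold_accents and matches are names I made up for the example. Both the query and the indexed word are stripped of diacritics first, and then a plain substring test is applied.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return text with diacritics removed (hypothetical helper, not htdig)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query: str, word: str) -> bool:
    """Accent-insensitive substring test: accents first, then substring."""
    return fold_accents(query).casefold() in fold_accents(word).casefold()


word = "mambómámbo"

# All three of my test queries match once both steps are combined.
for query in ["bomam", "mambomambo", "mambó"]:
    print(query, matches(query, word))
```

In this sketch all three queries match, including “bomam”, which is the result I expected from htdig.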
But the search algorithm doesn’t find it. Interestingly, if I search for “mambomambo”, the word is found (only the accents algorithm is used), and if I search for “mambó”, the word is found too (only the substring algorithm is used). It seems that htdig cannot combine the two algorithms. There is probably some problem with my configuration, because I have seen other sites that use htdig where this problem doesn’t appear.

If anybody knows a solution to either of my two problems, please answer. These are my last problems before integrating htdig into my website.

Thank you in advance!

Regards,
Levi