I feel like a fool continuing this debate, being the least intelligent
guy in the room, but here goes:
My point was that Wikipedia (the link I gave, and the other definitions
I have seen) seems to refer to the little markings around a letter as
diacriticals whether or not they turn the letter into a completely
different letter (see the part mentioning Scandinavian, and possibly
Webster's dictionary as well). Marko disputed this in his last comment,
and I don't know that he is wrong. Everything I have seen seems to
indicate this, though.
I also dispute this sentence in the new javadoc patch proposed:
*It will also be impossible to search for the word in its original form.*
If you use the same analyzer at index and query time, there should be no
such problem.
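
To illustrate what I mean, here is a rough sketch. It is not Lucene code:
FoldingDemo, fold(), index() and search() are names I made up, and fold()
is only a stand-in for an accent-stripping analyzer, built on
java.text.Normalizer. The point is just that when the same folding runs
on both the indexed text and the query, the original spelling still
matches.

```java
import java.text.Normalizer;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class FoldingDemo {
    // Hypothetical stand-in for an accent-stripping analyzer (not Lucene's API):
    // decompose to NFD, drop combining marks, lowercase.
    static String fold(String term) {
        String decomposed = Normalizer.normalize(term, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").toLowerCase(Locale.ROOT);
    }

    // Folded term -> ids of the documents that contain it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index time: every token goes through fold() before it is stored.
    void index(int docId, String text) {
        for (String token : text.split("\\s+")) {
            postings.computeIfAbsent(fold(token), k -> new HashSet<>()).add(docId);
        }
    }

    // Query time: the query term goes through the same fold().
    Set<Integer> search(String queryTerm) {
        Set<Integer> hits = postings.get(fold(queryTerm));
        return hits == null ? Collections.<Integer>emptySet() : hits;
    }

    public static void main(String[] args) {
        FoldingDemo idx = new FoldingDemo();
        // "r\u00e4ksm\u00f6rg\u00e5s" is the Swedish word with a-diaeresis,
        // o-diaeresis and a-ring, written with escapes to avoid encoding issues.
        String original = "r\u00e4ksm\u00f6rg\u00e5s";
        idx.index(1, original);
        System.out.println(idx.search(original));     // original form: finds doc 1
        System.out.println(idx.search("raksmorgas")); // folded form: finds doc 1
    }
}
```

What such an index cannot do is tell the accented and unaccented
spellings apart, which is a separate issue from whether the original form
can be searched for at all.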
Doug Cutting wrote:
Mark Miller wrote:
I wouldn't pretend to know the truth on this matter, but you might
update the wikipedia article http://en.wikipedia.org/wiki/Diacritic
if you do, as it does not agree with your comments.
Wikipedia says, "Swedish uses characters identical to a-diaeresis (ä)
and o-diaeresis (ö)". This is a little ambiguous. Identical how? I
think they mean "visually identical to". The distinction is whether
Swedish treats 'ä' as a variant of 'a' or as a completely separate
letter. The latter is the case.
http://en.wikipedia.org/wiki/Umlaut_(diacritic) states:
Swedish [...] treat[s] them as independent letters.
Doug
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]