From: "Geza Lakner MD"
<snip>
> The right order for Hungarian vowels: actually the diaeresis characters
> come first and then the double acute ones (only o and u have double
> accents in the Hungarian alphabet):
> oOóÓöÖőŐ
> uUúÚüÜűŰ

This was easy to fix.

> Unfortunately the case-insensitiveness does not work. Look:
> hungarian-sort ["alom" "Álom" "álom" "Állam"]
> == ["alom" "álom" "Állam" "Álom"]
>
> Though it should read:
> alom Állam Álom álom.

Yes, this is a problem.  My current algorithm will not easily accommodate
this change.  I can even remember thinking last year that the approach
might cause a problem, but the test samples available at that time
apparently did not expose it.  Hmmm.
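
For illustration only, here is a minimal sketch in Python (not REBOL, and
not the new algorithm I have in mind) of one way to get the ordering you
asked for.  It assumes each accented vowel ranks right after its plain
vowel, folds case away before comparing, and ignores the Hungarian digraphs
(cs, gy, sz, ...).  The names ALPHABET, fold_key and hungarian_sort are
made up for this example.

```python
# Lowercase letters in the assumed collation order; accented vowels follow
# their plain vowel, diaeresis before double acute.  Digraphs are ignored.
ALPHABET = "aábcdeéfghiíjklmnoóöőpqrstuúüűvwxyz"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}


def fold_key(word):
    """Case-insensitive key: one rank per letter, with case folded away."""
    # Characters outside ALPHABET are ranked after all known letters.
    return [RANK.get(ch, len(ALPHABET) + ord(ch)) for ch in word.lower()]


def hungarian_sort(words):
    # sorted() is stable, so words that differ only in case keep their
    # input order.
    return sorted(words, key=fold_key)


print(hungarian_sort(["alom", "Álom", "álom", "Állam"]))
# ['alom', 'Állam', 'Álom', 'álom']
```

Because case is folded before comparing, "Álom" and "álom" compare equal
and simply stay in the order they were given.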

Time to go back to the drawing board.  I already have an idea, but it may
take a while before I have some time to create the new algorithm.

> - The /case refinement results in the same result as the one without
> it :-(    :
> >> hungarian-sort/case ["alom" "álom" "Álom" "Állam"]
> == ["alom" "álom" "Állam" "Álom"]
>
> The case-sensitive collation sequence IMHO would be a bit different than
> you have defined, namely:
> aAáÁ...eEéÉ...
>
> Your order was:
> aáAÁ...eéEÉ...

There are two issues at work here.  Having the order as
    aáAÁ...eéEÉ...
was not my intention.  What I was aiming for was
    aá..eé..AÁ..EÉ...
which may also not seem correct to you.  This grouping mirrors REBOL's
default behavior for the /case switch, except that it places the small
letters before the capital letters.  Petr K. said that this was the more
usual convention in Eastern Europe (the Czech language in his case).  So I
was trying to reflect this pattern, but I did make the one ordering error.

REBOL's 'sort with the /case switch sorts all the words first by whether
the letter is capital or not.  In fact, REBOL places all the words that
begin with capital letters _before_ the words that begin with small letters
(because of the ASCII codes assigned to the letters).
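
The same effect shows up in any plain code-point comparison.  As a small
illustration in Python (not REBOL; the word list is made up), the default
string sort puts every ASCII capital ahead of every ASCII small letter:

```python
words = ["alom", "Zebra", "apple", "Alma"]

# Python compares strings by code point: "A".."Z" are 65..90 and
# "a".."z" are 97..122, so all capitals sort before all small letters.
by_code_point = sorted(words)

print(ord("Z"), ord("a"))  # 90 97
print(by_code_point)       # ['Alma', 'Zebra', 'alom', 'apple']
```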

Maybe we need an additional switch that allows smalls before capitals, as
preferred in Eastern Europe, and interleaves the two as you suggest.  These
options would sometimes be handy here in the US too.  I just need a clever
name or names for these switches (or paths in REBOLese).  Any ideas are
welcome.
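
To make the options concrete, here is a rough sketch in Python (not REBOL).
The flag names lower_first and interleave are placeholders I made up, not
proposed switch names, and the sketch simply ranks each character by its
position in a generated alphabet, ignoring the Hungarian digraphs.

```python
LOWER = "aábcdeéfghiíjklmnoóöőpqrstuúüűvwxyz"


def build_alphabet(lower_first=True, interleave=True):
    """Return the case-sensitive collation sequence for the given options."""
    upper = LOWER.upper()
    if interleave:
        # aAáÁ... (or AaÁá... when capitals come first)
        pairs = zip(LOWER, upper) if lower_first else zip(upper, LOWER)
        return "".join(first + second for first, second in pairs)
    # aá...z then AÁ...Z (or the reverse)
    return LOWER + upper if lower_first else upper + LOWER


def sort_case(words, lower_first=True, interleave=True):
    alphabet = build_alphabet(lower_first, interleave)
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def key(word):
        # Characters outside the alphabet are ranked after all letters.
        return [rank.get(ch, len(alphabet) + ord(ch)) for ch in word]

    return sorted(words, key=key)


words = ["Bab", "ábra", "alma", "Ábra", "Alma", "bab"]
print(sort_case(words))
# ['alma', 'Alma', 'ábra', 'Ábra', 'bab', 'Bab']
print(sort_case(words, interleave=False))
# ['alma', 'ábra', 'bab', 'Alma', 'Ábra', 'Bab']
print(sort_case(words, lower_first=False, interleave=False))
# ['Alma', 'Ábra', 'Bab', 'alma', 'ábra', 'bab']
```

The last call gives the capitals-first grouping described above for
sort /case; the first gives the interleaved aAáÁ... sequence.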

> - and so on for all affected special accented chars.

and so on for life in general!
:-)

I'll repost after I have a chance to develop the new algorithm that I have
in mind.  Stay tuned.

Thanks for your feedback!

--Scott Jones
