Asmus,

This discussion reminds me of my ill-fated efforts to produce a manageable
set of rules for automatic title casing, starting with French text.  It
would have required either special dictionaries or entering the text in a
special way.  But if the text has to be entered in a special way, one could
just as well enter it in the proper title case to begin with.

If you are entering Danish city names, enter them with Å, as in Ålborg.  You
should only use Aalborg where the font does not support Å.  For matching
logic you can equate Å to Aa; then the issue of compound words goes away.
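
A minimal sketch of that matching rule in Python, assuming matching is done
by comparing folded keys. The names match_key and matches are hypothetical
helpers for illustration, not part of any library.

```python
import unicodedata


def match_key(text: str) -> str:
    """Return a folded key in which Å/å and Aa/aa compare equal."""
    # NFC turns "A" + combining ring above (U+030A) into the single letter Å.
    composed = unicodedata.normalize("NFC", text)
    # casefold() removes case differences; å is then spelled out as "aa".
    return composed.casefold().replace("å", "aa")


def matches(a: str, b: str) -> bool:
    """True if the two strings are equal under the Å = Aa folding."""
    return match_key(a) == match_key(b)
```

With this folding, matches("Ålborg", "Aalborg") is true. The key is meant
for equality matching only; it does not produce Danish sort order.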

Carl

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
> Behalf Of Asmus Freytag
> Sent: Saturday, September 08, 2001 5:56 PM
> To: Mark Davis; [EMAIL PROTECTED]; Francesco Zappa Nardelli
> Subject: Re: [OT] o-circumflex
>
>
> At 02:45 PM 9/8/01 -0700, Mark Davis wrote:
> >If you use a Danish tailoring of the UCA that equates Å and AA (at least at
> >a primary and secondary level), then they will sort the same way. A string
> >search that uses the same tailoring will also find "Ålborg" when given
> >"Aalborg" (and vice versa).
>
> But if you do this, all compound words starting with "data" and continuing
> with another word starting with "a" will be sorted incorrectly!
>
> To achieve this effect, you would have to mark which AAs are A-Rings and
> which ones are accidental adjacencies. In Danish one can use the SHY (soft
> hyphen) to break the latter, as these accidental pairs occur at legal word
> break points. In fact, that's the recommended solution, but it requires
> that the input data are in a specific form.
>
> A./
>
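
The soft-hyphen convention in the quoted message can be sketched in Python
as follows. This assumes the input already carries a SHY (U+00AD) between
every accidental a+a pair, which is exactly the input requirement noted
above. The function name mark_a_rings and the sample word "dataarkiv" are
illustrative assumptions only.

```python
import re

SHY = "\u00ad"  # SOFT HYPHEN


def mark_a_rings(text: str) -> str:
    """Rewrite unbroken Aa/aa/AA pairs as Å/å, then drop the soft hyphens."""
    # A pair split by SHY is not adjacent, so the pattern skips it.
    ringed = re.sub(
        r"[Aa]a|AA",
        lambda m: "Å" if m.group()[0] == "A" else "å",
        text,
    )
    # SHY has done its job as a marker and is removed from the result.
    return ringed.replace(SHY, "")
```

For example, "Aalborg" becomes "Ålborg", while "data" + SHY + "arkiv" comes
out as "dataarkiv" with its two letters a left untouched.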

