> So contrib/unaccent/ contains a Python script that
> takes UnicodeData.txt, a list of information about all Unicode
> codepoints available at a URL that is shown in a comment, and
> generates unaccent.rules.  The idea was to avoid having to change it
> manually every time someone finds characters that should be in there
> (as you have just done!) by doing it systematically.
>
> Unicode has two ways to represent characters with accents: either with
> composed codepoints like "é" or decomposed codepoints where you say
> "e" and then "´".  The field "00E2 0301" is the decomposed form of
> that character above.  Our job here is to identify the basic letter
> that each composed character contains, by analysing the decomposed
> field that you see in that line.  I failed to realise that characters
> with TWO accents are described as a composed character with ONE accent
> plus another accent.
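
To make the nesting concrete, here is a minimal sketch that uses Python's standard unicodedata module instead of parsing UnicodeData.txt. The helper name decomposition_ids is hypothetical, and U+1EA5 is only assumed to be the character under discussion, chosen because its decomposition field is "00E2 0301".

```python
import unicodedata


def decomposition_ids(ch):
    """Return the canonical decomposition of ch as a list of hex codepoint strings.

    Hypothetical helper for illustration; it is not part of the script.
    """
    field = unicodedata.decomposition(ch)
    # Compatibility decompositions start with a tag such as "<compat>"; skip them.
    if not field or field.startswith("<"):
        return []
    return field.split()


# U+1EA5 LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE (assumed example)
print(decomposition_ids("\u1EA5"))  # ['00E2', '0301']
# U+00E2 LATIN SMALL LETTER A WITH CIRCUMFLEX is itself composed
print(decomposition_ids("\u00E2"))  # ['0061', '0302']
```

The first element of the outer decomposition is not a plain letter, which is why a single-level check misses it.
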
>
> You don't have to worry about decoding that line; it's all done in
> that Python script.  The problem is just in the function
> is_letter_with_marks().  Instead of just checking if combining_ids[0]
> is a plain letter, it looks like it should also check if
> combining_ids[0] itself is a letter with marks.  Also, get_plain_letter()
> would need to be able to recurse to extract the "a".
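
A rough sketch of what that recursion might look like, written against the standard unicodedata module so it runs on its own. The names is_letter_with_marks and get_plain_letter come from the text above, but the signatures, the helpers combining_ids(), is_mark() and is_plain_letter(), and taking a character rather than the script's own data structure are all assumptions made for illustration.

```python
import string
import unicodedata


def combining_ids(ch):
    """Canonical decomposition of ch as a list of characters (assumed helper)."""
    field = unicodedata.decomposition(ch)
    if not field or field.startswith("<"):
        return []
    return [chr(int(cp, 16)) for cp in field.split()]


def is_mark(ch):
    # General categories Mn, Mc and Me are the Unicode mark categories.
    return unicodedata.category(ch) in ("Mn", "Mc", "Me")


def is_plain_letter(ch):
    return ch in string.ascii_letters


def is_letter_with_marks(ch):
    """True if ch is a plain letter plus marks, possibly nested."""
    ids = combining_ids(ch)
    if len(ids) < 2:
        return False
    base, marks = ids[0], ids[1:]
    if not all(is_mark(m) for m in marks):
        return False
    # The base may be a plain letter or itself a letter with marks.
    return is_plain_letter(base) or is_letter_with_marks(base)


def get_plain_letter(ch):
    """Recurse through nested decompositions down to the plain letter."""
    if is_plain_letter(ch):
        return ch
    if is_letter_with_marks(ch):
        return get_plain_letter(combining_ids(ch)[0])
    raise ValueError("no plain base letter for U+%04X" % ord(ch))


print(get_plain_letter("\u1EA5"))  # a
```

The recursion stops as soon as the base is a plain ASCII letter, so characters with a single accent behave exactly as before.
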
>
> I hope that helps!
