Re: Matching decomposable Unicode characters

Ron Aaron Fri, 31 May 2013 02:43:49 -0700

On Friday, May 31, 2013 12:27:21 PM UTC+3, Bram Moolenaar wrote:
 
> I find it a bit annoying that Unicode has two forms for the same character.
> They should have made a choice to either use a base character plus composing
> characters, or the combined form.  Now we need to solve this in software
> everywhere.


Believe me, you're not the only one who finds it annoying!

> Perhaps iconv has a way to specify decomposing characters?
> But we don't want to convert everything.

I didn't see any way, but maybe someone else knows?  

> I suppose decomposing is not an algorithm but a matter of a very big
> table.

Yes, though perhaps not such a big table.  It only needs the characters that
have a canonical equivalent.  Hmm... actually there are quite a number of
those, aren't there?  Not just Hebrew but precomposed Latin chars as well.
ARRRRGH.
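
To get a feel for how big such a table would be, here is a minimal sketch.
It assumes Python's standard unicodedata module as a stand-in for the Unicode
Character Database; it is not anything Vim does today, and the names
canonical_decomposition and table are made up for the illustration.

```python
import sys
import unicodedata


def canonical_decomposition(ch):
    """Return the one-step canonical decomposition of ch, or None."""
    field = unicodedata.decomposition(ch)
    # Compatibility mappings carry a tag such as "<compat>" or "<font>";
    # only untagged entries are canonical equivalents.
    if not field or field.startswith("<"):
        return None
    return "".join(chr(int(cp, 16)) for cp in field.split())


# Build the table by scanning every code point once.
table = {}
for cp in range(sys.maxunicode + 1):
    mapped = canonical_decomposition(chr(cp))
    if mapped is not None:
        table[chr(cp)] = mapped

# Number of entries in the Unicode version this Python ships with.
print(len(table))
# One Latin and one Hebrew entry, shown as escaped code points.
print(ascii(table["\u00e9"]), ascii(table["\ufb2a"]))
```

The entries are one-step mappings, so a real decomposer would still have to
apply them repeatedly and reorder the combining marks.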

I could solve the problem for Hebrew, specifically, with a small table.  But I 
assume that would not be a solution you would favor.
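
For what it's worth, the Hebrew-only variant could look roughly like the
sketch below.  It again assumes Python's unicodedata module purely for
illustration, and it assumes the precomposed Hebrew letters of interest are
the presentation forms in U+FB1D..U+FB4F.  HEBREW_TABLE, decompose_hebrew and
same_text are hypothetical names.

```python
import unicodedata

# Assumption: the precomposed Hebrew letters are the presentation forms
# in the range U+FB1D..U+FB4F.
HEBREW_FORMS = range(0xFB1D, 0xFB50)


def build_hebrew_table():
    """Map each precomposed Hebrew form to its full canonical decomposition."""
    table = {}
    for cp in HEBREW_FORMS:
        ch = chr(cp)
        full = unicodedata.normalize("NFD", ch)
        if full != ch:  # skip unassigned or compatibility-only code points
            table[ch] = full
    return table


HEBREW_TABLE = build_hebrew_table()


def decompose_hebrew(text):
    """Replace precomposed Hebrew forms; leave everything else untouched."""
    return "".join(HEBREW_TABLE.get(ch, ch) for ch in text)


def same_text(a, b):
    """Compare two strings after decomposing the Hebrew forms in both."""
    return decompose_hebrew(a) == decompose_hebrew(b)
```

With this, same_text("\ufb2a", "\u05e9\u05c1") compares shin-with-shin-dot
against shin followed by the combining shin dot and treats them as equal.
Latin text passes through unchanged, which is exactly the limitation.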
