Alexander Belopolsky <belopol...@users.sourceforge.net> added the comment:

The logic suggested by Martin in msg120018 looks right to me, but the whole 
code seems unnecessarily complex.  (And comb1==comb may need to be 
changed to comb1>=comb.)  I don't understand why a linear search through the 
"skipped" array is needed.  At the very least, instead of adding their 
positions to the "skipped" list, combining characters that have already been 
used could be replaced by a non-character that is skipped later.  A better 
algorithm should be able to avoid the whole issue of "skipping" by properly 
computing the length of the decomposed character.  
See internalCompose() at http://www.unicode.org/reports/tr15/Normalizer.java.
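
As a rough illustration of a single-pass composition that needs no "skipped" 
list, here is a minimal Python sketch.  It is not the proposed patch and not 
the unicodedata.c code.  The names compose() and PAIRS are hypothetical, the 
composition table is a tiny hand-picked sample rather than the full Unicode 
data, Hangul composition is ignored, and the input is assumed to be already 
canonically decomposed and ordered.

```python
import unicodedata

# Hypothetical, hand-picked composition pairs for illustration only; a real
# implementation would derive the full table from the Unicode Character Database.
PAIRS = {
    ("a", "\u0302"): "\u00e2",       # a + circumflex -> a with circumflex
    ("a", "\u0323"): "\u1ea1",       # a + dot below  -> a with dot below
    ("\u1ea1", "\u0302"): "\u1ead",  # a with dot below + circumflex
}


def compose(decomposed):
    """Compose an already decomposed, canonically ordered string in one pass."""
    if not decomposed:
        return decomposed
    out = [decomposed[0]]
    starter_pos = 0  # index in `out` of the most recent starter
    last_class = unicodedata.combining(out[0])
    if last_class != 0:
        # The string starts with a combining mark: block composition with it.
        last_class = 256
    for ch in decomposed[1:]:
        ch_class = unicodedata.combining(ch)
        composite = PAIRS.get((out[starter_pos], ch))
        # A mark is blocked if a kept mark of the same or higher class
        # sits between it and the starter.
        if composite is not None and (last_class < ch_class or last_class == 0):
            out[starter_pos] = composite  # overwrite the starter in place
        else:
            if ch_class == 0:
                starter_pos = len(out)    # ch becomes the new starter
            last_class = ch_class
            out.append(ch)                # keep ch; output grows by one
    return "".join(out)


print(ascii(compose("a\u0323\u0302")))  # '\u1ead'
print(ascii(compose("a\u0305\u0302")))  # 'a\u0305\u0302' (circumflex is blocked)
print(ascii(compose("a\u0327\u0302")))  # '\xe2\u0327'
```

Because a consumed mark is simply never appended to the output, there are no 
positions to remember and search later; the output length falls out of the 
loop itself.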

I'll try to come up with a patch.

----------
assignee:  -> belopolsky

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10254>
_______________________________________