On Wed, Jul 21, 2010 at 12:04 AM, Jon Lang <datawea...@gmail.com> wrote:
> Mark J. Reed wrote:
>> Perhaps the syllabic kana could be the "integer" analogs, and what you
>> get when you iterate over the range using ..., while the modifier kana
>> would not be generated by the series ア ... ヴ but would be considered
>> in the range ア .. ヴ? I wouldn't object to such script-specific
>> behavior, though perhaps it doesn't belong in core.
>
> As I understand it, it wouldn't need to be script-specific behavior;
> just behavior that's aware of Unicode properties.

That wouldn't help in this case. For example, U+30A1 KATAKANA LETTER SMALL A - the small "modifier" variety of letter under discussion - is not a modifier in the Unicode sense. Apart from its name, it has exactly the same UnicodeData.txt properties as U+30A2 KATAKANA LETTER A, an actual syllable:

    30A1;KATAKANA LETTER SMALL A;Lo;0;L;;;;;N;;;;;
    30A2;KATAKANA LETTER A;Lo;0;L;;;;;N;;;;;

So, going by these properties, there's no way to distinguish them without script-specific special-case code. As Aaron said, they're treated like lowercase, but that's not an accurate representation of how they're used in actual text, or of the common idea of what constitutes the set of kana.

--
Mark J. Reed <markjr...@gmail.com>
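
P.S. For anyone who wants to check the property comparison themselves, here is a minimal sketch in Python (not Perl 6) using the standard unicodedata module. The props() and is_small_kana() helpers are hypothetical names made up for this illustration. The second one matches on the character name, which is exactly the kind of script-specific special case discussed above and not a real Unicode property.

```python
import unicodedata

def props(ch):
    """Return a few UnicodeData-style properties of one character."""
    return (
        unicodedata.category(ch),       # general category, e.g. "Lo"
        unicodedata.combining(ch),      # canonical combining class
        unicodedata.bidirectional(ch),  # bidi class
    )

def is_small_kana(ch):
    """Hypothetical special case: look for SMALL in a kana's name."""
    name = unicodedata.name(ch, "")
    is_kana = name.startswith(("KATAKANA LETTER", "HIRAGANA LETTER"))
    return is_kana and " SMALL " in name

small_a = "\u30A1"  # KATAKANA LETTER SMALL A
full_a = "\u30A2"   # KATAKANA LETTER A

for ch in (small_a, full_a):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} {props(ch)}")

# The property tuples match; only the name-based check differs.
print(props(small_a) == props(full_a))
print(is_small_kana(small_a), is_small_kana(full_a))
```

The property tuples compare equal for the two characters, so a property-aware series operator would have nothing to go on; only the name-based check separates them.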