New issue 398: Support logograms (like CJK characters) as account names.

https://bitbucket.org/blais/beancount/issues/398/support-logograms-like-cjk-characters-as



Alex Hou:



According to the Beancount language syntax, account names must follow this rule:



> Each component of the account names begin with a capital letter or a number 
> and are followed letters, numbers or dash \(-\) characters. All other 
> characters are disallowed.



But CJK scripts \(Chinese characters, Japanese hiragana and katakana, Korean 
hangul, etc.\) have no notion of a ‘capital letter’, so account names written in them can never satisfy this rule.
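
To make the problem concrete, here is a minimal sketch that approximates the quoted rule as a Python regular expression. The names `COMPONENT`, `ACCOUNT_RE` and `is_valid_account` are made up for illustration, and this is only an approximation of the documented rule, not Beancount's actual lexer.

```python
import re

# Hypothetical approximation of the documented rule (not Beancount's real lexer):
# each component starts with a capital letter or digit, followed by
# letters, digits or dashes; components are separated by ':'.
COMPONENT = r"[A-Z0-9][A-Za-z0-9-]*"
ACCOUNT_RE = re.compile(rf"{COMPONENT}(?::{COMPONENT})+")


def is_valid_account(name: str) -> bool:
    """Return True if `name` satisfies the approximated ASCII-only rule."""
    return ACCOUNT_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for name in ("Assets:Bank:Checking", "Assets:银行:支票"):
        print(name, is_valid_account(name))
```

Under this approximation an ASCII name such as `Assets:Bank:Checking` is accepted, while `Assets:银行:支票` is rejected because no CJK character falls in `[A-Z0-9]`.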



Please support these characters as account names.



[http://www.programminginkorean.com/programming/hangul-in-unicode/](http://www.programminginkorean.com/programming/hangul-in-unicode/)



> Hangul Syllables \(AC00–D7A3\)  

> Hangul Jamo \(1100–11FF\)  

> Hangul Compatibility Jamo \(3130–318F\)  

> Hangul Jamo Extended-A \(A960–A97F\)  

> Hangul Jamo Extended-B \(D7B0–D7FF\)



[https://stackoverflow.com/a/30200250/4458143](https://stackoverflow.com/a/30200250/4458143)
  

[http://www.rikai.com/library/kanjitables/kanji\_codes.unicode.shtml](http://www.rikai.com/library/kanjitables/kanji_codes.unicode.shtml)



> Hiragana \(3040 - 309F\)  

> Katakana \(30A0 - 30FF\)



[https://en.wikipedia.org/wiki/Halfwidth\_and\_Fullwidth\_Forms\_\(Unicode\_block\)](https://en.wikipedia.org/wiki/Halfwidth_and_Fullwidth_Forms_(Unicode_block))



> halfwidth forms of compatibility jamo characters for Hangul \(FFA0–FFDC\)  

> half-width katakana \( FF65–FF9F \)






[https://stackoverflow.com/a/1366113/4458143](https://stackoverflow.com/a/1366113/4458143)
  

[https://en.wikipedia.org/wiki/CJK\_Unified\_Ideographs](https://en.wikipedia.org/wiki/CJK_Unified_Ideographs)



```
Block                                      Range           Comment
CJK Unified Ideographs                     4E00-9FFF       Common
CJK Unified Ideographs Extension A         3400-4DBF       Rare
CJK Unified Ideographs Extension B         20000-2A6DF     Rare, historic
CJK Unified Ideographs Extension C         2A700-2B73F     Rare, historic
CJK Unified Ideographs Extension D         2B740-2B81F     Uncommon, some in current use
CJK Unified Ideographs Extension E         2B820-2CEAF     Rare, historic
CJK Compatibility Ideographs               F900-FAFF       Duplicates, unifiable variants, corporate characters
CJK Compatibility Ideographs Supplement    2F800-2FA1F     Unifiable variants
```



I’ve organized these Unicode ranges:



```python
# Unicode ranges written as regex character-class fragments.
# Code points above U+FFFF need the 8-digit \UXXXXXXXX escape:
# '\u20000' would be read as U+2000 followed by a literal '0'.
CJK_RANGES = (
    r'\u1100-\u11FF',          # Hangul Jamo
    r'\u3040-\u309F',          # Hiragana
    r'\u30A0-\u30FF',          # Katakana
    r'\u3130-\u318F',          # Hangul Compatibility Jamo
    r'\u3400-\u4DBF',          # CJK Unified Ideographs Extension A
    r'\u4E00-\u9FFF',          # CJK Unified Ideographs
    r'\uA960-\uA97F',          # Hangul Jamo Extended-A
    r'\uAC00-\uD7A3',          # Hangul Syllables
    r'\uD7B0-\uD7FF',          # Hangul Jamo Extended-B
    r'\uF900-\uFAFF',          # CJK Compatibility Ideographs
    r'\uFF65-\uFF9F',          # Halfwidth Katakana
    r'\uFFA0-\uFFDC',          # Halfwidth Hangul compatibility jamo
    r'\U00020000-\U0002A6DF',  # CJK Unified Ideographs Extension B
    r'\U0002A700-\U0002B73F',  # CJK Unified Ideographs Extension C
    r'\U0002B740-\U0002B81F',  # CJK Unified Ideographs Extension D
    r'\U0002B820-\U0002CEAF',  # CJK Unified Ideographs Extension E
    r'\U0002F800-\U0002FA1F',  # CJK Compatibility Ideographs Supplement
)
```
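
As a rough sketch of how the ranges above could be used, the snippet below folds them into a character class and allows those characters wherever the current rule allows a capital letter or a letter. The names `CJK_CLASS`, `ACCOUNT_RE` and `is_valid_cjk_account` are hypothetical, and letting every component (including the first) start with a CJK character is an assumption of this sketch, not a description of how Beancount would implement it. Python's `re` module understands `\uXXXX` and `\UXXXXXXXX` escapes inside patterns, so the raw strings can be joined directly.

```python
import re

# Ranges from the list above; supplementary-plane code points use \UXXXXXXXX.
CJK_RANGES = (
    r"\u1100-\u11FF",          # Hangul Jamo
    r"\u3040-\u309F",          # Hiragana
    r"\u30A0-\u30FF",          # Katakana
    r"\u3130-\u318F",          # Hangul Compatibility Jamo
    r"\u3400-\u4DBF",          # CJK Unified Ideographs Extension A
    r"\u4E00-\u9FFF",          # CJK Unified Ideographs
    r"\uA960-\uA97F",          # Hangul Jamo Extended-A
    r"\uAC00-\uD7A3",          # Hangul Syllables
    r"\uD7B0-\uD7FF",          # Hangul Jamo Extended-B
    r"\uF900-\uFAFF",          # CJK Compatibility Ideographs
    r"\uFF65-\uFF9F",          # Halfwidth Katakana
    r"\uFFA0-\uFFDC",          # Halfwidth Hangul compatibility jamo
    r"\U00020000-\U0002A6DF",  # CJK Unified Ideographs Extension B
    r"\U0002A700-\U0002B73F",  # CJK Unified Ideographs Extension C
    r"\U0002B740-\U0002B81F",  # CJK Unified Ideographs Extension D
    r"\U0002B820-\U0002CEAF",  # CJK Unified Ideographs Extension E
    r"\U0002F800-\U0002FA1F",  # CJK Compatibility Ideographs Supplement
)
CJK_CLASS = "".join(CJK_RANGES)

# Assumption: a CJK character may stand in for the leading capital letter.
FIRST = rf"[A-Z0-9{CJK_CLASS}]"
# The trailing '-' is a literal dash because it is last in the class.
REST = rf"[A-Za-z0-9{CJK_CLASS}-]"
COMPONENT = rf"{FIRST}{REST}*"
ACCOUNT_RE = re.compile(rf"{COMPONENT}(?::{COMPONENT})+")


def is_valid_cjk_account(name: str) -> bool:
    """Return True if `name` matches the extended (hypothetical) rule."""
    return ACCOUNT_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for name in ("Assets:银行:支票", "Assets:ぎんこう", "Assets:은행", "Assets:bank"):
        print(name, is_valid_cjk_account(name))
```

With this extended class, names such as `Assets:银行:支票` match, while a component starting with a lowercase ASCII letter, such as `Assets:bank`, is still rejected.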



