In the .grouped.xml file, if a
<char> does not have an attribute, it inherits it from its
containing <group> element. The group containing the digits
has IDC="Y" OIDC="N" XIDC="Y", and so that applies to the digits
as well.
If you don't want to deal with the inheritance mechanism, just use
the .flat.xml files, in which the <char> elements carry all the
attributes.
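If you do want to stay with the grouped file, the inheritance can be resolved in a few lines. What follows is only a minimal sketch in Python: the SAMPLE document is a made-up, heavily abbreviated fragment in the shape of the grouped file (not a copy of the real data), and effective_chars is a hypothetical helper name. It assumes each <char> sits directly inside its <group>, and it ignores ranges and the other element types in the real file.

```python
import xml.etree.ElementTree as ET

# Namespace used by the UCD XML files.
UCD_NS = "{http://www.unicode.org/ns/2003/ucd/1.0}"

# Illustrative, abbreviated fragment (an assumption, not the real file).
SAMPLE = """<ucd xmlns="http://www.unicode.org/ns/2003/ucd/1.0">
  <repertoire>
    <group gc="Nd" IDC="Y" OIDC="N" XIDC="Y">
      <char cp="0034" na="DIGIT FOUR" IDS="N" XIDS="N"/>
    </group>
  </repertoire>
</ucd>"""


def effective_chars(root):
    """Yield one dict per <char>, with group attributes filled in."""
    for group in root.iter(UCD_NS + "group"):
        for char in group.findall(UCD_NS + "char"):
            merged = dict(group.attrib)   # start from the group's defaults
            merged.update(char.attrib)    # the char's own values win
            yield merged


root = ET.fromstring(SAMPLE)
chars = list(effective_chars(root))
digit_four = chars[0]
print(digit_four["cp"], digit_four["IDC"], digit_four["XIDC"])
```

The point of the merge order is that a value written on the <char> itself always overrides the group's value, and anything the <char> omits falls back to the group.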
Eric.
On 7/12/2017 6:35 AM, J Decker via Unicode wrote:
I started looking more deeply at the JavaScript
specification. Identifiers are defined as starting with
characters with ID_Start and continued with ID_Continue
attributes.
I grabbed the XML database (ucd.all.grouped.xml), in which
I was able to find the IDS and IDC flags (also OIDS, OIDC, XIDS,
and XIDC, whose meaning I'm not entirely sure of),
but I started filtering to find characters that are NOT
IDS|IDC....
Something simple like the digits 0x30-0x39 are marked with
IDS='N' but have no [ OX]IDC flags specified. Is a missing
flag assumed to be N or Y? The documentation on the XML file
format (www.unicode.org/reports/tr42/) doesn't specify.
Most languages do support identifiers like a1, a2, etc. as
valid identifiers, so certainly numbers should have IDC even
though they're not IDS.
Are there characters that are IDS without being IDC? There
are certainly characters that are IDC without IDS.
Some examples:
found char { cp: '0034', na: 'DIGIT FOUR', gc: 'Nd',
nt: 'De', nv: '4', bc: 'EN', lb: 'NU', sc: 'Zyyy',
scx: 'Zyyy', Alpha: 'N', Hex: 'Y', AHex: 'Y', IDS: 'N',
XIDS: 'N', WB: 'NU', SB: 'NU', Cased: 'N', CWCM: 'N',
InSC: 'Number' }
(this has IDC notation but not IDS; since it says 'digit' I
assume this is a number type, and should not be IDS.)
found char { cp: '0F32', na: 'TIBETAN DIGIT HALF NINE',
gc: 'No', nt: 'Nu', nv: '17/2', Alpha: 'N', IDC: 'N',
XIDC: 'N', SB: 'XX', InSC: 'Number' }
This might be not IDS but is IDC?
found char { cp: '203F',
na: 'UNDERTIE',
gc: 'Pc',
IDC: 'Y',
XIDC: 'Y',
Pat_Syn: 'N',
WB: 'EX' }
this is sort of IDS but not IDC?
found char { cp: '309B', na: 'KATAKANA-HIRAGANA VOICED SOUND MARK',
gc: 'Sk', dt: 'com', dm: '0020 3099', bc: 'ON', lb: 'NS',
sc: 'Zyyy', scx: 'Hira Kana', Alpha: 'N', Dia: 'Y', OIDS: 'Y',
XIDS: 'N', XIDC: 'N', WB: 'KA', SB: 'XX', NFKC_QC: 'N',
NFKD_QC: 'N', XO_NFKC: 'Y', XO_NFKD: 'Y', CI: 'Y', CWKCF: 'Y',
NFKC_CF: '0020 3099', vo: 'Tu' }