On 6/10/2014 2:17 PM, Wang Weijun wrote:
>>>> 176 st.whitespaceChars(0x00, 0x20);
>>>> 177 st.wordChars(0x21, 0xFF);
>>>> I'm not sure about the code above. Would you like to test for
>>>> non-ASCII characters?
>>>
>>> I cannot find any spec on this, but the source has
>>>
>>> ctype = c < 256 ? ct[c] : CT_ALPHA;
>>>
>>> which means every non-ASCII is a word char (no support for wide
>>> numerals).
>>>
>>> StreamTokenizer only allows you to categorize the ASCII chars.
>>
>> I'm not sure either. If "0x01 0x05" is a character, does the above code
>> treat the "0x01" and "0x05" as white space?
>
> Here the input of StreamTokenizer is a char array. If you mean "0x01 0x05" as
> two chars, then they are both treated as white space. If you mean \u0105,
> it's a word char.

OK.
Xuelei
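
For reference, the behavior described above can be checked with a small standalone program. This is only a sketch: the class name `TokenizerDemo` and the helper `words` are hypothetical, and the call to `resetSyntax()` is an assumption made so that the two calls from lines 176-177 are the only syntax rules in effect (the reviewed code may configure the tokenizer differently).

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class TokenizerDemo {

    // Hypothetical helper: returns the word tokens found in the input.
    static List<String> words(String input) throws IOException {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        // Assumption: start from an empty syntax table so that only the
        // two calls under review decide how each char is classified.
        st.resetSyntax();
        st.whitespaceChars(0x00, 0x20);
        st.wordChars(0x21, 0xFF);

        List<String> out = new ArrayList<>();
        while (st.nextToken() != StreamTokenizer.TT_EOF) {
            if (st.ttype == StreamTokenizer.TT_WORD) {
                out.add(st.sval);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Two separate chars 0x01 and 0x05: both fall in the whitespace
        // range, so they split the input into the words "a" and "b".
        System.out.println(words("a\u0001\u0005b").size()); // prints 2

        // The single char \u0105 is >= 256, so the tokenizer treats it
        // as a word char and the whole input is one word.
        System.out.println(words("a\u0105b").size()); // prints 1
    }
}
```

The first case corresponds to reading "0x01 0x05" as two chars, the second to reading it as the single char \u0105.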
