I wouldn't count on the delimiters being ASCII. Even English text might have, e.g., left and right quotes; French adds more, and for mathematical text all bets are off.
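
To make that concrete, here is a rough Python sketch, not anything from the original thread. It compares a byte scan of UTF-8 that only knows ASCII delimiters with a scan over decoded code points. The function names, the delimiter set, and the "anything non-alphanumeric is a delimiter" rule are all assumptions chosen for illustration; real word segmentation is more involved.

```python
# Hypothetical helpers for illustration only.

ASCII_DELIMS = b" \t\r\n.,;:!?\"'()"  # assumed delimiter set, ASCII only


def count_words_ascii_delims(data: bytes, delims: bytes = ASCII_DELIMS) -> int:
    """Count words in UTF-8 bytes, splitting only on single-byte ASCII delimiters.

    This is safe on UTF-8 because every byte of a multi-byte sequence is
    >= 0x80, so an ASCII delimiter byte never occurs inside one. It is blind
    to non-ASCII delimiters such as curly quotes or guillemets.
    """
    count = 0
    in_word = False
    for b in data:
        if b in delims:
            in_word = False
        elif not in_word:
            in_word = True
            count += 1
    return count


def count_words_unicode(text: str) -> int:
    """Count words in decoded text; any non-alphanumeric code point delimits.

    A crude assumption: it also splits at apostrophes and combining marks.
    """
    count = 0
    in_word = False
    for ch in text:
        if ch.isalnum():
            if not in_word:
                in_word = True
                count += 1
        else:
            in_word = False
    return count


# U+201C and U+201D are the left and right double quotation marks.
sample = "\u201cone\u201d\u201ctwo\u201d three"

print(len("\u201c".encode("utf-8")))                      # 3 bytes for one quote
print(count_words_ascii_delims(sample.encode("utf-8")))   # 2
print(count_words_unicode(sample))                        # 3
```

The byte scan treats the two quoted words as a single word because the only delimiter it recognizes in the sample is the blank; the quotes are three bytes each and none of those bytes is in the ASCII set.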

--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3

________________________________________
From: IBM Mainframe Assembler List <[email protected]> on behalf of Paul Gilmartin <[email protected]>
Sent: Friday, June 15, 2018 12:53 PM
To: [email protected]
Subject: Re: Count Words?

On 2018-06-15, at 10:49:05, Seymour J Metz wrote:

> What about translation and scanning of UTF-8 and UTF-16?
>
If the delimiters are USASCII characters, UTF-8 maps them transparently.
That's the good part; variable-length encoding is the bad part.

-- gil
