I wouldn't count on the delimiters being ASCII; even for English text you might 
have, e.g., left and right quotes. French adds more, and for mathematical text 
all bets are off.
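
As a rough illustration of what that means for a word counter, here is a minimal Python sketch. It assumes that "delimiter" means whitespace plus anything in the Unicode punctuation (P*) or separator (Z*) general categories; that rule, the function name count_words, and the sample string are all made up for illustration, not a settled definition.

```python
import unicodedata

def count_words(text):
    """Count maximal runs of non-delimiter characters.

    Assumption: a delimiter is whitespace or any character whose Unicode
    general category starts with P (punctuation) or Z (separator).
    """
    count = 0
    in_word = False
    for ch in text:
        is_delim = ch.isspace() or unicodedata.category(ch)[0] in ("P", "Z")
        if is_delim:
            in_word = False
        elif not in_word:
            # First character of a new word
            in_word = True
            count += 1
    return count

# Curly quotes (U+201C/U+201D) and guillemets (U+00AB/U+00BB) as delimiters
sample = "\u201cHello,\u201d she said \u00abbonjour\u00bb"
print(count_words(sample))
```

This rule is deliberately crude: it splits "don't" at the apostrophe, and it treats mathematical symbols (category Sm) as word characters, which is one reason mathematical text needs a different definition altogether.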


--
Shmuel (Seymour J.) Metz
http://mason.gmu.edu/~smetz3

________________________________________
From: IBM Mainframe Assembler List <[email protected]> on behalf 
of Paul Gilmartin <[email protected]>
Sent: Friday, June 15, 2018 12:53 PM
To: [email protected]
Subject: Re: Count Words?

On 2018-06-15, at 10:49:05, Seymour J Metz wrote:

> What about translation and scanning of UTF-8 and UTF-16?
>
If the delimiters are USASCII characters, UTF-8 maps them transparently.
That's the good part; variable-length encoding is the bad part.
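
A small Python sketch of both halves of that statement (illustrative only; split_on_ascii and the sample string are hypothetical). In UTF-8 every byte below 0x80 is the ASCII character itself and never part of a multi-byte sequence, so scanning the raw bytes for ASCII delimiters is safe. The byte length of a word, however, no longer equals its character count.

```python
def split_on_ascii(data, delims=b" ,"):
    """Split UTF-8 bytes on single-byte ASCII delimiters, without decoding."""
    words = []
    start = None
    for i, b in enumerate(data):
        if b in delims:              # b is an int; membership tests the byte value
            if start is not None:
                words.append(data[start:i])
                start = None
        elif start is None:
            start = i                # first byte of a new word
    if start is not None:
        words.append(data[start:])
    return words

text = "na\u00efve caf\u00e9, \u20ac5"   # i-diaeresis, e-acute, euro sign
words = split_on_ascii(text.encode("utf-8"))

for w in words:
    # Byte length versus character length of each word
    print(w.decode("utf-8"), len(w), len(w.decode("utf-8")))
```

The same scan does not carry over to UTF-16, where ASCII characters occupy two bytes and the scan has to work on 16-bit code units instead.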

-- gil
