Yes, normalization doesn't deal with those spaces. It does change the
text in ways that are unfriendly, and I often tell DrRacket "no" when
it asks about normalization. I just wanted to put that into the mix
for this conversation, since it is a place that has to deal with
similar issues.
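
For anyone who wants to check this outside DrRacket, here is a rough
sketch in Python using the standard unicodedata module. The helper name
and the sample strings are made up for illustration, and this is not
what DrRacket or charlint actually runs; it only shows the two
behaviors being discussed: normalization can rewrite code points for
the same text, and it leaves a zero width space (U+200B) alone.

```python
import unicodedata

ZWSP = "\u200b"  # ZERO WIDTH SPACE
FORMS = ("NFC", "NFD", "NFKC", "NFKD")


def forms_keeping(char, text):
    """Return the normalization forms under which `char` is still in `text`.

    Hypothetical helper, written only for this illustration.
    """
    return [form for form in FORMS
            if char in unicodedata.normalize(form, text)]


# Normalization can change the code points while the text stays "the same":
# 'e' + COMBINING ACUTE ACCENT composes to a single precomposed character.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)

# A zero width space has no decomposition mapping, so every form keeps it.
sample = "foo" + ZWSP + "bar"
kept_in = forms_keeping(ZWSP, sample)
```

So if the goal is to get rid of invisible characters like that one,
normalization alone will not do it; they would have to be filtered out
in a separate step.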

Robby

On Thu, Apr 12, 2012 at 4:24 PM, Eli Barzilay <[email protected]> wrote:
> 20 minutes ago, Robby Findler wrote:
>> I'm not sure of the right answer, but there is also a notion of
>> normalization of unicode characters that probably fits into whatever
>> solution you come up with here (i.e., the thing DrRacket is doing for
>> normalization probably applies to what you're thinking about).
>
> IIRC, normalization can swap the bytes for a different sequence that
> represents the same text, so that would be fine.  I looked around and
> found this:
> http://www.w3.org/International/charlint/ -- I ran it on a string that
> had a zero width space, and it kept it in its output.
>
> --
>          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
>                    http://barzilay.org/                   Maze is Life!

_________________________
  Racket Developers list:
  http://lists.racket-lang.org/dev
