Yes, normalization doesn't deal with those spaces. It does change the text in unfriendly ways, though, and I often tell DrRacket "no" when it asks about normalization. I just wanted to put that into the mix for this conversation, since it is a place that has to deal with similar issues.
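For anyone who wants to check both points, here is a minimal sketch. It uses Python's standard `unicodedata` module rather than whatever DrRacket does internally, so treat it as an illustration of Unicode normalization in general, not of DrRacket's behavior. The sample strings and the `normalize_all` helper are made up for the example.

```python
import unicodedata

ZWSP = "\u200b"  # ZERO WIDTH SPACE

def normalize_all(text):
    """Return the text under each of the four Unicode normalization forms.

    Hypothetical helper for this example only.
    """
    forms = ("NFC", "NFD", "NFKC", "NFKD")
    return {form: unicodedata.normalize(form, text) for form in forms}

# Normalization can change the bytes while the text stays "the same":
# 'e' followed by COMBINING ACUTE ACCENT composes to a single code point.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)

# A string with a zero width space in the middle (made-up sample).
sample = "foo" + ZWSP + "bar"
results = normalize_all(sample)

if __name__ == "__main__":
    print(decomposed.encode("utf-8"), "->", composed.encode("utf-8"))
    for form, out in results.items():
        # Report whether the zero width space is still present after normalizing.
        print(form, "keeps ZWSP:", ZWSP in out)
```

So a filter for invisible characters would have to be a separate step; as far as this sketch shows, normalization alone won't remove them.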

Robby

On Thu, Apr 12, 2012 at 4:24 PM, Eli Barzilay <[email protected]> wrote:
> 20 minutes ago, Robby Findler wrote:
>> I'm not sure of the right answer, but there is also a notion of
>> normalization of unicode characters that probably fits into whatever
>> solution you come up with here (ie the thing DrRacket is doing for
>> normalization probably applies to what you're thinking about).
>
> IIRC, normalization can change the bytes for something else that is
> the same text, so that would be fine. I looked around and found this:
> http://www.w3.org/International/charlint/ -- I ran it on a string that
> had a zero width space, and it kept it in its output.
>
> --
> ((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
> http://barzilay.org/ Maze is Life!
_________________________
Racket Developers list:
http://lists.racket-lang.org/dev

