Dario Teixeira wrote:
So, can someone find any problems with this reasoning?
No, the kind of compatibility with legacy code you described is
one of the original design goals of UTF-8, see
http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt
- Florian.
Hi,
I'm using Ulex + Menhir to parse UTF-8 encoded source code, and I'm relying
on plain strings for processing and storing data. I *think* I can get away
with using only the String module to handle this variable-length encoding
as long as I am careful with the way I treat these strings.
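To make the "careful treatment" concrete, here is a minimal sketch of what stays safe with the plain String module. It assumes the input is already valid UTF-8 and an OCaml new enough to have String.split_on_char (4.04 or later). The name utf8_length is hypothetical, not something from Ulex or the standard library.

```ocaml
(* Count code points by skipping continuation bytes (those of the form
   10xxxxxx). Hypothetical helper; assumes [s] is already valid UTF-8. *)
let utf8_length (s : string) : int =
  let n = ref 0 in
  String.iter (fun c -> if (Char.code c) land 0xC0 <> 0x80 then incr n) s;
  !n

let () =
  (* "naïve,café" written with explicit byte escapes *)
  let s = "na\xC3\xAFve,caf\xC3\xA9" in
  (* String.length counts bytes, not code points, so the two differ. *)
  Printf.printf "%d bytes, %d code points\n" (String.length s) (utf8_length s);
  (* Every byte of a multibyte sequence is >= 0x80, so splitting on an
     ASCII delimiter can never cut a character in half. *)
  List.iter print_endline (String.split_on_char ',' s)
```

Searching, splitting and concatenating on ASCII bytes are byte-safe in this sense; indexing by position and String.sub with arbitrary offsets are the operations that need care.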
On Wed, Aug 12, 2009 at 10:36:56AM -0700, Dario Teixeira wrote:
Hi,
I'm using Ulex + Menhir to parse UTF-8 encoded source code, and I'm relying
on plain strings for processing and storing data. I *think* I can get away
with using only the String module to handle this variable-length
Hi,
Thank you all for your comments. Ulex has caught all the intentionally
malformed code points I've inserted in the stream, so I'm fairly confident
it's up to the task. But if I find a problem I'll keep Netconversion's
and Extlib's validation functions in mind...
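For reference, a stand-alone well-formedness check can be sketched with nothing but the String module. This is only an illustration of the byte ranges that well-formed UTF-8 allows; it is not how Ulex, Netconversion or Extlib implement their checks, and the name is_valid_utf8 is hypothetical.

```ocaml
(* Returns true if [s] is well-formed UTF-8. Rejects stray continuation
   bytes, truncated sequences, overlong forms, surrogates
   (U+D800..U+DFFF) and values above U+10FFFF. *)
let is_valid_utf8 (s : string) : bool =
  let len = String.length s in
  let byte i = Char.code s.[i] in
  (* A continuation byte has the form 10xxxxxx and must lie in bounds. *)
  let is_cont i = i < len && (byte i) land 0xC0 = 0x80 in
  let rec go i =
    if i >= len then true
    else
      let b = byte i in
      if b < 0x80 then go (i + 1)                 (* ASCII *)
      else if b >= 0xC2 && b <= 0xDF then         (* 2-byte sequence *)
        is_cont (i + 1) && go (i + 2)
      else if b >= 0xE0 && b <= 0xEF then         (* 3-byte sequence *)
        is_cont (i + 1) && is_cont (i + 2)
        && (b <> 0xE0 || byte (i + 1) >= 0xA0)    (* no overlong form *)
        && (b <> 0xED || byte (i + 1) < 0xA0)     (* no surrogates *)
        && go (i + 3)
      else if b >= 0xF0 && b <= 0xF4 then         (* 4-byte sequence *)
        is_cont (i + 1) && is_cont (i + 2) && is_cont (i + 3)
        && (b <> 0xF0 || byte (i + 1) >= 0x90)    (* no overlong form *)
        && (b <> 0xF4 || byte (i + 1) < 0x90)     (* at most U+10FFFF *)
        && go (i + 4)
      else false  (* 0x80..0xC1 and 0xF5..0xFF never start a sequence *)
  in
  go 0
```

A check like this could be run once at the input boundary, after which the rest of the program can treat the strings as trusted.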
By the way, I just
Call for Papers and Participation
IFL 2009
Seton Hall University
South Orange, NJ, USA
http://tltc.shu.edu/blogs/projects/IFL2009/

Register at: http://tltc.shu.edu/blogs/projects/IFL2009/registration.html

* NEW * Registration and talk submission deadline fast approaching: August 23,