Geoffrey Sneddon wrote:
> Yeah, I started an entire Unicode implementation in userland PHP.  
> Let's just say it became rather large while getting nowhere. :)

So, the trick is to only use strpos() on well-formed UTF-8, and you're
golden. :-)
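
To illustrate why that holds, here is a minimal sketch in Python rather than PHP. It assumes both strings are well-formed UTF-8 and the needle is non-empty; utf8_find is a made-up name for this example, not an existing function. A UTF-8 lead byte can never be mistaken for a continuation byte, so a byte-level match always starts on a character boundary:

```python
def utf8_find(haystack: str, needle: str) -> int:
    """Return the character index of needle in haystack, or -1 if absent.

    Hypothetical helper: the search itself runs on raw UTF-8 bytes,
    much like strpos() does on a PHP string.
    """
    h = haystack.encode("utf-8")
    n = needle.encode("utf-8")
    byte_pos = h.find(n)  # plain byte search, no Unicode awareness
    if byte_pos < 0:
        return -1
    # The match starts on a character boundary, so the prefix decodes
    # cleanly and its length is the character offset.
    return len(h[:byte_pos].decode("utf-8"))
```

The byte offset and the character offset differ as soon as a multi-byte character precedes the match, which is the only conversion you need on top of the raw search.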

> This isn't really a case of the built-in implementation not working,  
> it's just the built-in implementation is defined to use either UCS2 or  
> UCS4 depending on a compile-time flag, which can end up being rather  
> fun to deal with (look at ifragment in anolislib/utils.py in Anolis  
> for example).

That's an absolutely horrid piece of code, having to match surrogate
pairs yourself. Does Python use PCRE, by any chance?
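
For anyone following along, this is roughly what handling surrogate pairs by hand involves. It is only a sketch of the UTF-16 arithmetic, not the Anolis code, and the function names are invented for the example:

```python
import struct


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def is_surrogate_pair(hi, lo):
    """True if hi is a high surrogate and lo is a low surrogate."""
    return 0xD800 <= hi <= 0xDBFF and 0xDC00 <= lo <= 0xDFFF


def combine_surrogates(hi, lo):
    """Recover the code point encoded by a valid surrogate pair."""
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)
```

A character outside the Basic Multilingual Plane, such as U+1D11E, comes out as two code units, so any regex or index arithmetic working on code units has to treat the pair as one character.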

Cheers,
Edward
