Unicode delenda est!

On Fri, Apr 4, 2008 at 2:42 PM, Kevin Brown <[EMAIL PROTECTED]> wrote:
> On Fri, Apr 4, 2008 at 2:24 PM, Brian Eaton <[EMAIL PROTECTED]> wrote:
>
> > On Fri, Apr 4, 2008 at 2:20 PM, Kevin Brown <[EMAIL PROTECTED]> wrote:
> > > > Quick! What happens if someone passes in "%ff%4fpensocial_owner_id"
> > > > to our Java code and then we pass it on to .NET code running in a
> > > > Japanese locale?
> > >
> > > Er, shouldn't this always be UTF-8 encoded, in which case %ff isn't a
> > > valid character anyway? It seems to me that if we're decoding and
> > > re-encoding anyway we'd eliminate most of these sorts of things.
> >
> > You're right, %ff%4f is Unicode, there is a UTF-8 version we'd accept
> > though.
> >
> > Java does not canonicalize the halfwidth to ASCII equivalents. Other
> > platforms do (I think for sort order?), so the disconnect causes
> > problems.
>
> This is why everyone should speak Latin.
>
> > Cheers,
> > Brian
>
> --
> ~Kevin
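
To make the mismatch in the quoted thread concrete, here is a minimal Python sketch. It rests on assumptions: the reserved parameter name is taken to be opensocial_owner_id, the "other platform" is modeled with NFKC compatibility normalization (the thread does not say which mapping .NET actually applies), and fold_compat is a hypothetical helper. U+FF4F is FULLWIDTH LATIN SMALL LETTER O, and its UTF-8 percent-encoding is %EF%BD%8F.

```python
import unicodedata
from urllib.parse import unquote

# Assumed reserved parameter name, taken from the example in the thread.
RESERVED = "opensocial_owner_id"


def fold_compat(name: str) -> str:
    """Hypothetical stand-in for a platform that folds width variants.

    NFKC maps fullwidth forms such as U+FF4F to their ASCII equivalents.
    """
    return unicodedata.normalize("NFKC", name)


# The UTF-8 form of the spoofed name: U+FF4F followed by "pensocial_owner_id".
spoofed = unquote("%EF%BD%8Fpensocial_owner_id")

# A plain code-point comparison treats the two names as different,
# so a filter written this way would let the spoofed name through.
passes_naive_filter = spoofed != RESERVED

# After compatibility folding the names collide.
collides_after_folding = fold_compat(spoofed) == RESERVED

print(passes_naive_filter, collides_after_folding)
```

One way to close the gap would be to apply the same folding before checking names against the reserved list, so both sides agree on what counts as equal.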

