>>>>> "Bryan" == Bryan O'Sullivan <b...@serpentine.com> writes:
    Bryan> On Sat, Aug 14, 2010 at 10:46 PM, Michael Snoyman <mich...@snoyman.com> wrote:

    Bryan> When I'm writing a web app, my code is sitting on a Linux
    Bryan> system where the default encoding is UTF-8, communicating
    Bryan> with a database speaking UTF-8, receiving request bodies in
    Bryan> UTF-8 and sending response bodies in UTF-8. So converting all
    Bryan> of that data to UTF-16, just to be converted right back to
    Bryan> UTF-8, does seem strange for that purpose.

    Bryan> Bear in mind that much of the data you're working with can't
    Bryan> be readily trusted. UTF-8 coming from the filesystem, the
    Bryan> network, and often the database may not be valid. The cost of
    Bryan> validating it isn't all that different from the cost of
    Bryan> converting it to UTF-16.

But UTF-16 (apart from being an abomination for creating a hole in the
code point space and making it impossible to ever extend it) is slow to
process compared with UTF-32: you can't get the nth character in
constant time, so it seems an odd choice to me.

-- 
Colin Adams
Preston Lancashire
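
To make the indexing point concrete, here is a minimal sketch of looking up the nth code point in UTF-16. It is not taken from any library discussed in this thread: the names `encodeUtf16` and `nthCodePointUtf16` are hypothetical, and a plain list of `Word16` stands in for a real packed buffer. The point is only that each code point occupies one or two code units, so the lookup has to walk from the start, whereas in UTF-32 the nth code point is simply the nth 32-bit unit.

```haskell
import Data.Bits (shiftL, shiftR, (.&.))
import Data.Char (chr, ord)
import Data.Word (Word16)

-- | High (leading) surrogates occupy 0xD800..0xDBFF.
isHighSurrogate :: Word16 -> Bool
isHighSurrogate w = w >= 0xD800 && w <= 0xDBFF

-- | Low (trailing) surrogates occupy 0xDC00..0xDFFF.
isLowSurrogate :: Word16 -> Bool
isLowSurrogate w = w >= 0xDC00 && w <= 0xDFFF

-- | Encode a String as UTF-16 code units (hypothetical helper).
-- Assumes the input contains no surrogate code points of its own.
encodeUtf16 :: String -> [Word16]
encodeUtf16 = concatMap enc
  where
    enc c
      | cp < 0x10000 = [fromIntegral cp]
      | otherwise =
          let v = cp - 0x10000
          in [ fromIntegral (0xD800 + (v `shiftR` 10))
             , fromIntegral (0xDC00 + (v .&. 0x3FF)) ]
      where
        cp = ord c

-- | The nth code point (0-based) of a UTF-16 sequence.
-- Every code point before position n must be inspected to learn
-- whether it took one unit or two, so this is linear in n.
-- Returns Nothing if n is out of range or an unpaired surrogate is met.
nthCodePointUtf16 :: Int -> [Word16] -> Maybe Char
nthCodePointUtf16 n units
  | n < 0 = Nothing
  | otherwise = go n units
  where
    go _ [] = Nothing
    -- A well-formed surrogate pair counts as a single code point.
    go k (hi : lo : rest)
      | isHighSurrogate hi && isLowSurrogate lo =
          if k == 0 then Just (combine hi lo) else go (k - 1) rest
    go k (w : rest)
      | isHighSurrogate w || isLowSurrogate w = Nothing
      | k == 0 = Just (chr (fromIntegral w))
      | otherwise = go (k - 1) rest

    combine hi lo =
      chr (0x10000
           + ((fromIntegral hi .&. 0x3FF) `shiftL` 10)
           + (fromIntegral lo .&. 0x3FF))
```

For example, `encodeUtf16 ['a', '\x1D11E', 'b']` yields four code units for three code points, so code unit index 2 is the second half of a surrogate pair rather than the third character; `nthCodePointUtf16 2` has to step over the pair to reach `'b'`.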