On approximately 12/8/2008 12:57 AM, came the following characters from
the keyboard of Stephen J. Turnbull:
"Internal decoding" is (or should be) an oxymoron. Why would your
software be passing around text in any format other than internal? So
decoding will happen (a) on I/O, which is itself almost certainly
slower than making a few checks for Unicode hygiene, or (b) on receipt
of data from other software that whose sanitation you shouldn't trust
more than you trust the Internet.
Encoding isn't a problem, AFAICS.
So I can see validating user-supplied data, which always comes in via
I/O. But encoding and decoding are also needed during manipulation of
internal data, including file and database I/O. If all of that data has
already been validated, then there would be no need to revalidate it on
every conversion.
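To make that concrete, here is a minimal sketch of the
boundary-validation idea (my illustration, not code from this thread;
the function names are hypothetical): bytes are checked for UTF-8
well-formedness exactly once, at the I/O boundary, and everything
internal then works on str objects that are already known to be valid.

    def read_user_input(raw):
        # Validation happens here, at the I/O boundary: bytes.decode
        # raises UnicodeDecodeError on malformed UTF-8.
        return raw.decode("utf-8")

    def internal_transform(text):
        # Internal data is already a validated str; manipulation and
        # re-encoding need no further hygiene checks.
        return text.upper()

    raw = b"caf\xc3\xa9"           # well-formed UTF-8 for "café"
    text = read_user_input(raw)    # validated once, on receipt
    print(internal_transform(text).encode("utf-8"))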
I hear you when you say that clever coding can make the validation
nearly free, and I applaud that: the UTF-8 coder I wrote predated most
of the well-formedness rules that have been added since, so I didn't
attempt to be clever in that regard.
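For reference, here is a sketch of what those later rules amount to
(my own illustration, not the coder mentioned above): a well-formed
UTF-8 scanner has to reject overlong forms, surrogate code points, and
anything above U+10FFFF, per the Unicode well-formedness table.

    def is_well_formed_utf8(data):
        # Scan byte by byte, checking each lead byte and the allowed
        # ranges for its continuation bytes.
        i, n = 0, len(data)
        while i < n:
            b0 = data[i]
            if b0 <= 0x7F:                    # 1-byte (ASCII)
                i += 1
            elif 0xC2 <= b0 <= 0xDF:          # 2-byte lead
                if i + 1 >= n or not 0x80 <= data[i + 1] <= 0xBF:
                    return False
                i += 2
            elif 0xE0 <= b0 <= 0xEF:          # 3-byte lead
                lo, hi = 0x80, 0xBF
                if b0 == 0xE0:
                    lo = 0xA0                 # reject overlong forms
                elif b0 == 0xED:
                    hi = 0x9F                 # reject surrogates
                if (i + 2 >= n
                        or not lo <= data[i + 1] <= hi
                        or not 0x80 <= data[i + 2] <= 0xBF):
                    return False
                i += 3
            elif 0xF0 <= b0 <= 0xF4:          # 4-byte lead
                lo, hi = 0x80, 0xBF
                if b0 == 0xF0:
                    lo = 0x90                 # reject overlong forms
                elif b0 == 0xF4:
                    hi = 0x8F                 # reject > U+10FFFF
                if (i + 3 >= n
                        or not lo <= data[i + 1] <= hi
                        or not 0x80 <= data[i + 2] <= 0xBF
                        or not 0x80 <= data[i + 3] <= 0xBF):
                    return False
                i += 4
            else:                             # 0x80-0xC1, 0xF5-0xFF
                return False
        return True

A production decoder folds these range checks into the decode loop
itself, which is why the validation can come nearly for free.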
Thanks to you and Adam for your explanations; I see your points, and if
it is nearly free, I withdraw most of my negativity on this topic.
--
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking