Re: [Python-Dev] Encoding detection in the standard library?

Mike Klaas Tue, 22 Apr 2008 13:47:42 -0700


On 22-Apr-08, at 3:31 AM, M.-A. Lemburg wrote:

I don't think that should be part of the standard library. People
will mistake what it tells them for certain.


+1

I also think that it's better to educate people to add (correct)
encoding information to their text data, rather than give them a
guess mechanism...

That is a fallacious alternative: the programmers that need encoding detection are not the same people who are omitting encoding information.

I only have a small opinion on whether charset detection should appear in the stdlib, but I am somewhat perplexed by the arguments in this thread. I don't see how inclusion in the stdlib would make people more inclined to think that the algorithm is always correct. In terms of the need for this functionality:


Martin wrote:

Can you please explain why that is? Web programs should not normally
have the need to detect the encoding; instead, it should be specified
always - unless you are talking about browsers specifically, which
need to support web pages that specify the encoding incorrectly.

Any program that needs to examine the contents of documents/feeds/whatever on the web needs to deal with incorrectly-specified encodings (which, sadly, is rather common). The set of programs that need this functionality is probably the same set that needs BeautifulSoup--I think that set is larger than just browsers <grin>
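
To make the point concrete, here is a minimal sketch of the kind of fallback logic such programs end up writing when a declared encoding turns out to be wrong. It is not a detection algorithm and not a proposed stdlib API: the function name decode_with_fallback, its parameters, and the choice of fallback encodings are all assumptions made for illustration.

```python
def decode_with_fallback(data, declared=None, fallbacks=("utf-8", "cp1252")):
    """Decode bytes, trying the declared encoding first.

    Hypothetical helper for illustration only. Returns a tuple of
    (text, encoding_used) so the caller can see which guess won.
    """
    candidates = []
    if declared:
        candidates.append(declared)
    candidates.extend(fallbacks)

    for name in candidates:
        try:
            return data.decode(name), name
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: the bytes are not valid in this encoding.
            # LookupError: the declared encoding name is unknown.
            continue

    # latin-1 maps every byte to a code point, so this cannot fail,
    # but the resulting text may still be wrong.
    return data.decode("latin-1"), "latin-1"
```

For example, a page that declares utf-8 but actually contains the cp1252 bytes b"caf\xe9" fails the strict utf-8 decode and falls through to cp1252. The result is only a guess, which is exactly why a caller should not treat it as certain.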


-Mike
