Re: [Python-Dev] Encoding detection in the standard library?

Tony Nelson Mon, 21 Apr 2008 11:35:15 -0700

At 1:14 PM -0400 4/21/08, David Wolever wrote:
>On 21-Apr-08, at 12:44 PM, [EMAIL PROTECTED] wrote:
>>
>>     David> Is there some sort of text encoding detection module in the
>>     David> standard library?  And, if not, is there any reason not to
>>     David> add one?
>> No, there's not.  I suspect the fact that you can't correctly determine
>> the encoding of a chunk of text 100% of the time militates against it.
>Sorry, I wasn't very clear about what I was asking.
>
>I was thinking about making an educated guess -- just like chardet
>(http://chardet.feedparser.org/).
>
>This is useful when you get a hunk of data which _should_ be some
>sort of intelligible text from the Big Scary Internet (say, a posted
>web form or email message), and you want to do something useful with
>it (say, search the content).


Feedparser.org's chardet can't guess 'latin1', so it should be used as a
last resort, just as the docs say.
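
To make "last resort" concrete, here is a minimal sketch of one possible ordering, written for present-day Python 3 rather than the Python of this thread. The helper name decode_best_effort and its declared parameter are hypothetical, chosen only for illustration. The idea is to trust a declared charset first (say, from a Content-Type header), then try UTF-8, then ask chardet if it happens to be installed, and only then fall back to latin-1, which maps every byte to a character and so never raises.

```python
def decode_best_effort(data, declared=None):
    """Return (text, encoding_used) for bytes of unknown encoding.

    Hypothetical helper: the name and the fallback order are
    assumptions made for illustration, not an established API.
    """
    # 1. Cheap, strict attempts: the declared charset, then UTF-8.
    candidates = [declared] if declared else []
    candidates.append("utf-8")
    for enc in candidates:
        try:
            return data.decode(enc), enc
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown codec name: try the next one.
            continue

    # 2. Last resort: a statistical guess from chardet, if available.
    try:
        import chardet  # third-party, optional
    except ImportError:
        chardet = None
    if chardet is not None:
        guess = chardet.detect(data).get("encoding")
        if guess:
            try:
                return data.decode(guess), guess
            except (UnicodeDecodeError, LookupError):
                pass

    # 3. latin-1 decodes any byte sequence, so this cannot fail,
    #    though the resulting text may be wrong.
    return data.decode("latin-1"), "latin-1"
```

For input that is neither validly declared nor valid UTF-8, the result depends on whether chardet is installed and what it guesses, so callers should treat the returned encoding name as a hint rather than a fact.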
-- 
____________________________________________________________________
TonyN.:'                       <mailto:[EMAIL PROTECTED]>
      '                              <http://www.georgeanelson.com/>
