Todd Zullinger wrote:
>
>Related to the second part of Werner's message being scrubbed with the
>message:
>
>    An embedded and charset-unspecified text was scrubbed...
>
>Poking in the email package (on python 2.4.4) shows:
>
>    def get_content_charset(self, failobj=None):
>        """Return the charset parameter of the Content-Type header.
>
>        The returned string is always coerced to lower case.  If there is no
>        Content-Type header, or if that header has no charset parameter,
>        failobj is returned.
>        """
>
>This seems to violate section 5.2 of RFC 2045 which says parts lacking
>a Content-type header should be assumed to be text/plain with a
>charset of us-ascii.  The get_content_type method in email.Message
>does mention RFC 2045 and uses text/plain if the content-type is
>invalid.


It does seem inconsistent, but I don't think we can call it a violation
of the RFC yet; that depends on what the caller does with the returned
value.
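
The inconsistency is easy to see directly. Here is a minimal sketch,
assuming a current Python 3 email package rather than the 2.4.4 one
quoted above; the sample message text is made up for illustration:

```python
import email

# Hypothetical message with no Content-Type header at all.
raw = "Subject: test\n\nplain body\n"
msg = email.message_from_string(raw)

# get_content_type() falls back to the RFC 2045 default type.
content_type = msg.get_content_type()

# get_content_charset() returns failobj (None by default) instead.
charset_default = msg.get_content_charset()

# The caller can supply the RFC 2045 default charset explicitly.
charset_ascii = msg.get_content_charset("us-ascii")

print(content_type, charset_default, charset_ascii)
```

So the type defaults per the RFC, while the charset default is left to
the caller.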


>Would it be appropriate to set failobj="us-ascii" when
>calling this method in Scrubber.py?


It might be, but I'd like to hear from Tokio first.

Clearly, this was considered at one point: a specific case and message
exist for it, where it would have been simpler to just assume us-ascii.
Thus, I think there must be messages in the wild whose parts have
unspecified character sets that aren't us-ascii.
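
To illustrate the concern, here is a rough sketch of what happens if an
unlabelled part is simply assumed to be us-ascii. The helper name
decode_unlabelled and the sample payloads are hypothetical; this is not
what Scrubber.py does:

```python
def decode_unlabelled(payload, assumed="us-ascii"):
    """Decode bytes from a part that declared no charset.

    Returns (text, ok). ok is False when the bytes were not valid in
    the assumed charset and replacement characters had to be used.
    """
    try:
        return payload.decode(assumed), True
    except UnicodeDecodeError:
        return payload.decode(assumed, "replace"), False


# A part that really is us-ascii decodes cleanly.
ascii_text, ascii_ok = decode_unlabelled(b"plain body")

# An unlabelled part that is actually latin-1 does not.
latin_text, latin_ok = decode_unlabelled("caf\u00e9".encode("latin-1"))

print(ascii_ok, latin_ok)
```

A part like the second one would be damaged, not just relabelled, by a
blanket us-ascii default.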

-- 
Mark Sapiro <[EMAIL PROTECTED]>       The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

------------------------------------------------------
Mailman-Users mailing list
mailman-users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: 
http://mail.python.org/mailman/options/mailman-users/archive%40jab.org

Security Policy: 
http://www.python.org/cgi-bin/faqw-mm.py?req=show&file=faq01.027.htp
