On Jun 26, 2014, at 7:02 PM, Philip Prindeville <philipp_s...@redfish-solutions.com> wrote:
> On Jun 25, 2014, at 5:29 PM, RW <rwmailli...@googlemail.com> wrote:
>
>> On Wed, 25 Jun 2014 14:21:33 -0600
>> Philip Prindeville wrote:
>>
>>
>>> Here’s the other thing I don’t get.
>>>
>>> The message claims to be 7-bit and text/plain, yet it uses encoded
>>> characters which exceed 7-bit widths, yet this doesn’t seem to be
>>> firing any rules either.
>>>
>>> &#x042C; would seem to be at least an 11-bit wide character.
>>
>> You are mixing up different levels of encoding. The characters
>> &,#,x,0,4,2 and C are all 7-bit ASCII, and so are consistent with
>> Content-Transfer-Encoding: 7bit.
>
> You’re correct… That is consistent with the CTE.
>
> But the Content-Type omitted a ;charset=“XXX” attribute, which means it
> defaults to “US-ASCII”.
>
> Quoting RFC-2046:
>
> 4.1.2.  Charset Parameter
>
>    A critical parameter that may be specified in the Content-Type field
>    for "text/plain" data is the character set.  This is specified with a
>    "charset" parameter, as in:
>
>       Content-type: text/plain; charset=iso-8859-1
>
>    Unlike some other parameter values, the values of the charset
>    parameter are NOT case sensitive.  The default character set, which
>    must be assumed in the absence of a charset parameter, is US-ASCII.
>
>
> Since &#x042C; is outside the US-ASCII character set, this would be an
> encoding violation.
>
> -Philip
>

Can anyone point me at how to write a test that confirms that the actual
encoded text will fit into the named (or implicit) charset?

I.e. what’s a good template or example to go by?

Thanks.
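
To make the question concrete, here is a rough sketch of the two checks I have in mind, written in Python with the standard email module rather than as an actual SpamAssassin rule. The function names (declared_charset, body_fits_charset, refs_outside_charset) are made up for illustration, and treating numeric character references in a text/plain body as a violation is my assumption, not something RFC 2046 states.

```python
import email
import html
import re

# Matches numeric character references such as &#x042C; or &#1068;
NUMERIC_REF = re.compile(r"&#(?:[xX][0-9A-Fa-f]+|[0-9]+);")


def declared_charset(part):
    """Return the charset named in Content-Type, or the RFC 2046 default."""
    return part.get_content_charset() or "us-ascii"


def body_fits_charset(part):
    """True if the transfer-decoded body bytes are valid in the declared charset."""
    raw = part.get_payload(decode=True) or b""
    try:
        raw.decode(declared_charset(part))
    except (UnicodeDecodeError, LookupError):
        # Invalid bytes for the charset, or a charset name Python does not know.
        return False
    return True


def refs_outside_charset(part):
    """List numeric character references that name a character the
    declared charset cannot represent (hypothetical extra check)."""
    charset = declared_charset(part)
    # latin-1 maps every byte to a code point, so this decode never fails.
    text = (part.get_payload(decode=True) or b"").decode("latin-1")
    bad = []
    for ref in NUMERIC_REF.findall(text):
        try:
            html.unescape(ref).encode(charset)
        except (UnicodeEncodeError, LookupError):
            bad.append(ref)
    return bad


if __name__ == "__main__":
    sample = (
        b"Content-Type: text/plain\r\n"
        b"Content-Transfer-Encoding: 7bit\r\n"
        b"\r\n"
        b"Hello &#x042C; world\r\n"
    )
    msg = email.message_from_bytes(sample)
    print(declared_charset(msg), body_fits_charset(msg), refs_outside_charset(msg))
```

With the sample above, the first check passes (as RW pointed out, the bytes themselves are all 7-bit ASCII) and only the second check flags the reference. Both functions assume a single non-multipart part; a real rule would need to walk the MIME tree.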