>Date: Thu, 19 Apr 2001 12:59:43 -0700
>To: Tomas McGuinness <[EMAIL PROTECTED]>
>From: Asmus Freytag <[EMAIL PROTECTED]>
>Subject: Re: Byte Order Marks
>
>At 02:58 PM 4/19/01 +0200, you wrote:
>>If it's absent, is it safe to assume any particular order (i.e. Big or 
>>Little Endian)?


The default order is big-endian, but I wouldn't call that a 'safe' 
assumption. In the most general case I would attempt autorecognition 
when the text is unlabelled. Where a particular protocol's specification 
states that the default order SHALL apply to unlabelled text, the 
assumption becomes that much stronger, of course.
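
To make the order of checks concrete, here is a minimal Python sketch: 
honour a BOM if present, otherwise try a crude autorecognition, and only 
then fall back to the big-endian default. The function name 
guess_utf16_byte_order and the zero-byte heuristic are assumptions made 
for illustration; the heuristic is a simple stand-in, not a recommended 
or standard detector, and it only helps when the text contains a fair 
number of characters below U+0100.

```python
import codecs


def guess_utf16_byte_order(data: bytes, default: str = "big") -> str:
    """Return "big" or "little" as a guess for a UTF-16 byte stream.

    Hypothetical helper: the name and the heuristic are illustrative only.
    """
    # 1. A byte order mark, when present, settles the question.
    if data.startswith(codecs.BOM_UTF16_BE):
        return "big"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "little"

    # 2. Crude autorecognition (an assumption, not a standard method):
    #    characters below U+0100 have a zero high byte, so count zero
    #    bytes at even and at odd offsets and see which side dominates.
    even_zeros = data[0::2].count(0)
    odd_zeros = data[1::2].count(0)
    if even_zeros > odd_zeros:
        return "big"      # zero high bytes come first
    if odd_zeros > even_zeros:
        return "little"   # zero high bytes come second

    # 3. No evidence either way: fall back to the default order.
    return default
```

For text with no such characters (pure Han ideographs, say) the counts 
tie and the sketch simply returns the default, which is exactly the case 
where the assumption is weakest.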

A./

PS, as an aside: the SCSU encoder can be used to do this form of 
autorecognition. If text shows much better compression in one byte order 
than the other, that byte order is overwhelmingly likely to be the true 
one. The exception would be strings of pure Han ideographs. For these it's 
necessary to

