RE: Big and Little Endians [was: Re: Procedure How to Write a Manual!]

2009-05-22 Thread Jeff Coatsworth
Wow! What an education for a Friday ;)

I remember playing with DEC PDP-11s when I was a kid visiting my
father's office. I used to play some pseudo-D&D command-line game and
fool around with some graphics software that would draw overlapping
circles and fill them with a limited palette of colours (sort of early
Venn diagrams). Good times.

-----Original Message-----
From: framers-boun...@lists.frameusers.com
[mailto:framers-boun...@lists.frameusers.com] On Behalf Of Jeremy H.
Griffith
Sent: Friday, May 22, 2009 3:14 PM
To: framers@lists.frameusers.com
Subject: Big and Little Endians [was: Re: Procedure How to Write a
Manual!]

On Fri, 22 May 2009 13:36:11 +, Bodvar Bjorgvinsson
bod...@gmail.com wrote:

> Regarding the endianness, I had a problem some 13 years ago with some
> UNIX software that was supposed to work on Linux. It did not. I sent a
> query to an Icelandic guy on the Basic Linux Training list I
> subscribed to and he came up with a solution. Then he explained to me
> that there was a difference between Linux and UNIX, in that one used
> big endian and the other little endian for the same software code.

In current computer systems, there are two kinds of endianness, called
LSB (Least Significant Byte) first, or little-endian, and MSB (Most
Significant Byte) first, or big-endian.  For any given system, what
determines this is not the operating system (Linux, Windows, etc.), it's
the processor (CPU).  Intel x86 CPUs are LSB first; others, like Sun
SPARC and Motorola 68K, are MSB first.  So Linux on a Sun SPARC would be
MSB first, but on an Intel box it would be LSB first.
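
To see which order a particular machine uses, here is a minimal sketch in Python (my choice of language for illustration; the function name host_byte_order is made up for this example):

```python
import sys

def host_byte_order() -> str:
    """Describe the byte order of the CPU running this script."""
    # sys.byteorder is "little" on LSB-first hosts and "big" on MSB-first hosts.
    return "LSB first" if sys.byteorder == "little" else "MSB first"

print(host_byte_order())
```

The answer depends on the hardware the script runs on, not on the operating system.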

Technically, the difference is indeed *byte* order, not *bit* order
(which is constant).  Suppose you have a hex number 0xABCD.  The most
significant byte is 0xAB; the least significant byte is 0xCD.
Now imagine that you store this number in memory at address 0.  ;-)  You
will get:

Location  SPARC  Intel
0000      0xAB   0xCD
0001      0xCD   0xAB
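
The same layout can be reproduced on any machine with a short Python sketch. It uses the built-in int.to_bytes method; the variable names are only for illustration:

```python
value = 0xABCD

# int.to_bytes lays the number out in the requested byte order,
# whatever the byte order of the machine running the script.
msb_first = value.to_bytes(2, byteorder="big")     # the SPARC column
lsb_first = value.to_bytes(2, byteorder="little")  # the Intel column

print("Location  SPARC  Intel")
for address in range(2):
    print(f"{address:04d}      0x{msb_first[address]:02X}   0x{lsb_first[address]:02X}")
```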

Well-designed programs where portability matters will work with *either*
CPU.  They do this by not caring what the storage order in memory is,
and always accessing multibyte numbers through a set of functions that
work regardless of byte order.
For example, Mif2Go was originally developed on a Sun SPARC system, then
ported to Windows very easily because it followed those design rules.
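
A minimal sketch of what such access functions might look like, in Python. This is not Mif2Go's actual code; the names write_u16_msb and read_u16_msb are hypothetical, and the choice of MSB-first as the stored order is an assumption made for the example:

```python
def write_u16_msb(value: int) -> bytes:
    """Serialize a 16-bit unsigned value, most significant byte first."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    # Shifts and masks act on the numeric value, so the result is the
    # same on any CPU regardless of how it stores integers in memory.
    return bytes([(value >> 8) & 0xFF, value & 0xFF])

def read_u16_msb(data: bytes, offset: int = 0) -> int:
    """Rebuild a 16-bit unsigned value from two bytes stored MSB first."""
    return (data[offset] << 8) | data[offset + 1]

stored = write_u16_msb(0xABCD)
assert read_u16_msb(stored) == 0xABCD
```

Because the rest of the program only ever calls these two functions, the stored byte order is fixed in one place and the host's native order never leaks into the data.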

There's actually a third flavor, but it was used only on the DEC PDP-11.
Since the last of those is probably in the Smithsonian, you won't see it
in current software.  It is the same as Intel for two-byte numbers
(shorts) but switches the byte pairs for 4-byte numbers (longs).  So the
number 0x12345678 is stored as the bytes 0x34, 0x12, 0x78, 0x56.
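
As a rough illustration of that layout, here is a Python sketch; the function name pdp11_long_bytes is invented for this example:

```python
def pdp11_long_bytes(value: int) -> bytes:
    """Lay out a 32-bit value PDP-11 style: high word first, each word LSB first."""
    high_word = (value >> 16) & 0xFFFF
    low_word = value & 0xFFFF
    # Each 16-bit word is little-endian, but the words themselves are
    # stored most significant word first.
    return high_word.to_bytes(2, "little") + low_word.to_bytes(2, "little")

print(pdp11_long_bytes(0x12345678).hex(" "))  # prints: 34 12 78 56
```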

Endianness also affects Unicode, in the UTF-16 and UTF-32 encodings of
it, but *not* in UTF-8.  It is the reason for the BOM (Byte Order Mark),
U+FEFF.  In UTF-16 Big-endian (MSB first), the bytes are 0xFE 0xFF.  In
UTF-16 Little-endian (LSB first), they are 0xFF 0xFE.  UTF-32 adds two
zero bytes, before them for Big (0x00 0x00 0xFE 0xFF) and after them for
Little (0xFF 0xFE 0x00 0x00).
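
These byte sequences can be checked with a few lines of Python, using the standard codec names for the explicit big-endian and little-endian variants (the variable names are only for illustration):

```python
bom = "\ufeff"  # the Byte Order Mark as a single character

# Encode the same character with each encoding and compare the bytes.
encoded = {
    name: bom.encode(name)
    for name in ("utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le", "utf-8")
}

for name, raw in encoded.items():
    print(f"{name:10} {raw.hex(' ')}")
```

UTF-8 yields the same three bytes (ef bb bf) on every machine, because it is defined as a sequence of single bytes and has no byte order to vary.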

The Unicode BOM may also be used as an encoding
signature, but I digress...   ;-)  Good thing
it's Friday, eh?
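
For the curious, using the BOM as an encoding signature could look something like this Python sketch. The BOM constants come from the standard codecs module; the function name sniff_encoding and the signature table are made up for this example:

```python
import codecs

# Longer signatures are listed first, because the UTF-32 LE BOM
# (ff fe 00 00) begins with the UTF-16 LE BOM (ff fe).
_SIGNATURES = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

def sniff_encoding(data: bytes):
    """Guess the encoding from a leading BOM, or return None if there is none."""
    for signature, name in _SIGNATURES:
        if data.startswith(signature):
            return name
    return None

print(sniff_encoding("\ufeffhi".encode("utf-16-le")))
```

This is only a guess: UTF-16 LE text that starts with a BOM followed by a NUL character has the same first four bytes as the UTF-32 LE BOM, so a real detector would need more context.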

HTH!

-- Jeremy H. Griffith, at Omni Systems Inc.
  jer...@omsys.com  http://www.omsys.com/
___


You are currently subscribed to Framers as
jeff.coatswo...@jonassoftware.com.

Send list messages to fram...@lists.frameusers.com.

To unsubscribe send a blank email to
framers-unsubscr...@lists.frameusers.com
or visit
http://lists.frameusers.com/mailman/options/framers/jeff.coatsworth%40jonassoftware.com

Send administrative questions to listad...@frameusers.com. Visit
http://www.frameusers.com/ for more resources and info.


Re: Big and Little Endians [was: Re: Procedure How to Write a Manual!]

2009-05-22 Thread Bodvar Bjorgvinsson
Now I finally understand and, consequently, revoke my warning
against using 'endianness' in a manual. :D

What? I understand it, why shouldn't everyone else? :-/

Böðvar
-- Enlightened on a Friday

-- 
Life is not only a game--it is also a dance on roses.
--Fleksnes (Rolv Wesenlund)