Re: PerlIO_write Question

Tassilo von Parseval Fri, 14 Oct 2005 00:40:15 -0700

On Fri, Oct 14, 2005 at 08:56:15AM +0200 Reinhard Pagitsch wrote:
> Tassilo von Parseval wrote:
> >But this is then byte-order dependant: It will write the
> >least-significant byte on little-endian and most significant byte on
> >big-endian.
> 
> Do you have more informations about big and little endian? Maybe some 
> links? But not too theoretical, more practical.


I can't remember where I learned about it, but it's simple enough to
explain in a few lines.

Consider a decimal number always consisting of 4 digits, such as 1234.
The maths behind that is

    1234 = 4*10^0 + 3*10^1 + 2*10^2 + 1*10^3

This is big-endian "digit"-order because the big (that is, most
significant) digit comes first. Little-endian would be 4321, because the
little (least significant) digit comes first.

With computers the digits are actually bytes, and the math becomes:

    01 02 03 04 = 4*256^0 + 3*256^1 + 2*256^2 + 1*256^3 = 16909060

Again, the above is big-endian. In little-endian, the bytes are simply
stored in reverse order, while the value they represent stays the same:

    04 03 02 01 = 4*256^0 + 3*256^1 + 2*256^2 + 1*256^3 = 16909060
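
The same positional sum can be written as a loop. This is only a sketch:
the names from_big_endian and from_little_endian are made up for this
mail and are not part of any library, and it assumes at most 4 bytes so
the result fits into 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: interpret len bytes as a big-endian number,
 * i.e. the first byte carries the highest power of 256.
 * Assumes len <= 4 so the result fits into a uint32_t. */
uint32_t from_big_endian(const unsigned char *b, size_t len)
{
    uint32_t value = 0;
    size_t i;

    for (i = 0; i < len; i++)
        value = value * 256 + b[i];
    return value;
}

/* Hypothetical helper: the same for little-endian, where the first
 * byte carries 256^0, so the buffer is walked from the back. */
uint32_t from_little_endian(const unsigned char *b, size_t len)
{
    uint32_t value = 0;
    size_t i;

    for (i = len; i > 0; i--)
        value = value * 256 + b[i - 1];
    return value;
}
```

Feeding { 1, 2, 3, 4 } to the first function and { 4, 3, 2, 1 } to the
second yields the same number in both cases.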

In your example you had a 'long', which we assume is 4 bytes (but it
could be 8 bytes as well).

    long what = 1;

Its internal representation as a char buffer is, for big-endian:

    unsigned char bytes[4] = { 0, 0, 0, 1 };

and little-endian:

    unsigned char bytes[4] = { 1, 0, 0, 0 };

Therefore, if you do a 

    Write1Byte((char*)&what);

0 is spit out for big-endian and 1 for little-endian because each writes
the first byte of the character buffer.
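
You can use exactly this effect to find out at runtime what kind of
machine you are on. A minimal sketch, with helper names I made up
(first_byte_of and host_is_little_endian are not standard functions):

```c
#include <string.h>

/* Hypothetical helper: return the first byte of the in-memory
 * representation of a long, which is what a 1-byte write of
 * (char*)&value would emit. */
unsigned char first_byte_of(long value)
{
    unsigned char byte;

    memcpy(&byte, &value, 1);
    return byte;
}

/* Returns 1 if the first byte of the long 1 is 1 (little-endian
 * host), 0 otherwise. */
int host_is_little_endian(void)
{
    return first_byte_of(1L) == 1;
}
```

The memcpy is just a portable way of looking at the first byte; it does
the same thing as the (char*) cast above.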

There are macros to swap the byte order:

    /* n must be an unsigned 32-bit lvalue; it is evaluated several times */
    #define swap32(n) \
        (n) = (((n) & 0xff000000) >> 24) |  \
              (((n) & 0x00ff0000) >> 8)  |  \
              (((n) & 0x0000ff00) << 8)  |  \
              (((n) & 0x000000ff) << 24)

    /* for unsigned 16-bit integers (shorts) */
    #define swap16(n) \
        (n) = (((n) & 0xff00) >> 8) |   \
              (((n) & 0x00ff) << 8)

It should be obvious how the above macros work: a bitmask is used to extract a
certain byte and then it's shifted to the appropriate position.
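
If you would rather not have a macro that assigns to its argument, the
same masking and shifting can be put into ordinary functions. This is a
sketch only; swap32_value and swap16_value are names I picked, and the
fixed-width types assume a compiler that provides <stdint.h>.

```c
#include <stdint.h>

/* Function counterpart of the swap32 macro: returns the swapped value
 * instead of assigning it, and evaluates its argument only once. */
uint32_t swap32_value(uint32_t n)
{
    return ((n & 0xff000000u) >> 24) |
           ((n & 0x00ff0000u) >> 8)  |
           ((n & 0x0000ff00u) << 8)  |
           ((n & 0x000000ffu) << 24);
}

/* Function counterpart of the swap16 macro. */
uint16_t swap16_value(uint16_t n)
{
    return (uint16_t)(((n & 0xff00u) >> 8) | ((n & 0x00ffu) << 8));
}
```

Swapping twice gives back the original value, which makes for an easy
sanity check.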

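You can avoid endian issues by decoding and encoding numbers manually:

    /* unsigned char b[4] contains the bytes, most significant first;
       plain char may be signed and would sign-extend bytes >= 0x80 */
    uint32_t num = (uint32_t)b[3] | ((uint32_t)b[2] << 8) |
                   ((uint32_t)b[1] << 16) | ((uint32_t)b[0] << 24);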

and correspondingly for the other direction:

    b[0] = (num & 0xff000000) >> 24;    /* most significant */
    b[1] = (num & 0x00ff0000) >> 16;
    b[2] = (num & 0x0000ff00) >> 8;
    b[3] =  num & 0x000000ff;           /* least significant */
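
Wrapped into a pair of functions, the two directions look like this. It
is a sketch under the assumption of 32-bit values stored most
significant byte first; store_be32 and load_be32 are hypothetical names,
not libc functions.

```c
#include <stdint.h>

/* Hypothetical helper: write num into b, most significant byte first.
 * The result is the same on big- and little-endian hosts. */
void store_be32(unsigned char b[4], uint32_t num)
{
    b[0] = (unsigned char)((num >> 24) & 0xff);  /* most significant */
    b[1] = (unsigned char)((num >> 16) & 0xff);
    b[2] = (unsigned char)((num >> 8) & 0xff);
    b[3] = (unsigned char)(num & 0xff);          /* least significant */
}

/* Hypothetical helper: read back what store_be32 wrote. */
uint32_t load_be32(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24) |
           ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |
            (uint32_t)b[3];
}
```

Since only shifts and masks on values are involved, and never a cast of
a pointer to the number, the byte order of the machine doesn't matter
here.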

Also, big-endian is often referred to as network byte order. Host byte
order is whatever the machine uses natively, which is little-endian on
x86, for example. The libc contains some conversion functions as well
('h' standing for host, 'n' for network in the functions below):

    #include <netinet/in.h>

    uint32_t htonl(uint32_t hostlong);
    uint16_t htons(uint16_t hostshort);
    uint32_t ntohl(uint32_t netlong);
    uint16_t ntohs(uint16_t netshort);
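
A short sketch of how these might be used to get a number into a byte
buffer in network order and back. The wrapper names to_network_bytes and
from_network_bytes are made up for illustration, and a POSIX system is
assumed, where <arpa/inet.h> declares the conversion functions.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy value into out in network (big-endian)
 * byte order, whatever the host byte order is. */
void to_network_bytes(uint32_t value, unsigned char out[4])
{
    uint32_t net = htonl(value);

    memcpy(out, &net, sizeof net);
}

/* Hypothetical helper: the reverse direction. */
uint32_t from_network_bytes(const unsigned char in[4])
{
    uint32_t net;

    memcpy(&net, in, sizeof net);
    return ntohl(net);
}
```

On a big-endian host htonl() and ntohl() do nothing at all; on a
little-endian host they swap the bytes.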

Tassilo
-- 
use bigint;
$n=71423350343770280161397026330337371139054411854220053437565440;
$m=-8,;;$_=$n&(0xff)<<$m,,$_>>=$m,,print+chr,,while(($m+=8)<=200);
