Jason Stover <[EMAIL PROTECTED]> writes:

> I have a collection of small SAS data sets that I have been using to
> try to decipher the SAS data file format. I made some progress and
> have reached a more difficult task of figuring out how the actual
> numeric values are encoded.

I made a little headway on that fairly quickly.  We can use Perl
to see how numeric values are encoded as IEEE doubles.  The dumps
here come from a little-endian machine, so the least significant
byte is printed first, e.g.:

    [EMAIL PROTECTED]:~(0)$ perl -e 'print pack("d", 8.0)' | hd
    00000000  00 00 00 00 00 00 20 40                           |...... @|
    00000008
    [EMAIL PROTECTED]:~(0)$ perl -e 'print pack("d", 9.0)' | hd
    00000000  00 00 00 00 00 00 22 40                           |......"@|
    00000008
    [EMAIL PROTECTED]:~(0)$ 
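
For anyone without Perl handy, here is a rough Python equivalent of
the two one-liners above.  It is only a sketch: the helper name
double_bytes is made up for illustration, and the "<d" format forces
little-endian byte order to match the dumps shown, whereas Perl's
pack("d") uses whatever the native order is.

```python
import struct


def double_bytes(value):
    """Return the 8 bytes of an IEEE 754 double, little-endian.

    Assumption: "<d" (little-endian) is chosen to match the hex dumps
    in this message; Perl's pack("d") uses the machine's native order.
    """
    return struct.pack("<d", value)


def as_hex(raw):
    """Format bytes the way hd prints them: two hex digits per byte."""
    return " ".join(f"{b:02x}" for b in raw)


for v in (8.0, 9.0):
    print(v, as_hex(double_bytes(v)))
```

The last two bytes are the ones worth grepping for, since the first
six are all zero for these small integer values.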

Then we can hex-dump two files that are supposed to contain those
values, diff the dumps, and grep for the distinctive trailing bytes:

    [EMAIL PROTECTED]:~/tmp/sas(0)$ diff -u <(hd n_cases_1_float_9_lc_a.sas7bdat) <(hd n_cases_1_float_8_lc_a.sas7bdat) |grep '22 40\|20 40'
    -00000490  00 00 00 00 00 00 22 40  61 20 20 20 20 20 20 20  |......"@a       |
    +00000490  00 00 00 00 00 00 20 40  61 20 20 20 20 20 20 20  |...... @a       |

There are other differences between these files, but those really
stick out: at offset 0x490 each file holds exactly the bytes that
Perl produced for its value.
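
The same search can be done without hd and grep by looking for the
packed bytes directly.  This is a minimal sketch, not a parser for
the format: find_double is a hypothetical helper, little-endian
doubles are assumed as above, and the demo runs on a synthetic
buffer that merely mimics the dump (filler up to 0x490, then the
double, then "a" padded with spaces) rather than on a real file.

```python
import struct


def find_double(data, value):
    """Return every offset in data where value appears as a
    little-endian IEEE 754 double (byte order is an assumption)."""
    needle = struct.pack("<d", value)
    offsets = []
    pos = data.find(needle)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(needle, pos + 1)
    return offsets


# Synthetic stand-in for a data file: 0xff filler, the value 9.0,
# then the space-padded "a" seen next to it in the dump.
sample = b"\xff" * 0x490 + struct.pack("<d", 9.0) + b"a" + b" " * 7

print([hex(off) for off in find_double(sample, 9.0)])
print(find_double(sample, 8.0))
```

To try it on a real file, read it with open(path, "rb").read() and
pass the result as data.  Small values such as 0.0 pack to all-zero
bytes and will match in many unrelated places, so values with a
distinctive bit pattern make better probes.
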
-- 
"Then, I came to my senses, and slunk away, hoping no one overheard my
 thinking."
--Steve McAndrewSmith in the Monastery


_______________________________________________
pspp-dev mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/pspp-dev
