On Sun, Apr 24, 2005 at 08:45:10PM -0700, Ben Pfaff wrote:

     John Darrington <[EMAIL PROTECTED]> writes:
     
     > Maybe we should find out exactly what SPSS does.
     
     I think that's the thing to do.  I will try to test it out in the
     next few days.  If you get to it before me, pass along your
     results, and I will do the same.
     
My results are in the brief report attached.  

My conclusion is that SPSS does indeed keep its long-short name map,
and does not allow short names to magically change.  So I think we
should do the same.  I don't think it adds too much extra complexity.
Variables need only have one name (the long one).  The map needs to
be a member of the dictionary.  The only modules that will need to use
it, however, are sfm-read and sfm-write.

I suppose the question still remains about what should happen if the
variables are renamed.  Tom Watson's comments seem to suggest that
SPSS simply ignores the short names and renames only the long ones.
We can probably do better than this.

Another question is the geometry of the long-short name map: should
it be indexed by short name or by long name?  I remember wondering if I
made the right choice when I was implementing it.
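
One way to sidestep the choice of index is to keep both directions, so
that a reader and a writer can each look up whichever name they have.
The Python sketch below is purely illustrative: the class NameMap and
its methods are hypothetical and are not PSPP code, and the rename
behaviour (the short name stays attached to the variable) is only one
possible policy, not something the SPSS tests establish.

```python
class NameMap:
    """Hypothetical long/short name map, owned by a dictionary."""

    def __init__(self):
        self._short_by_long = {}
        self._long_by_short = {}

    def add(self, long_name, short_name):
        # Both names must be unique, otherwise a lookup would be ambiguous.
        if long_name in self._short_by_long:
            raise ValueError("long name already mapped: " + long_name)
        if short_name in self._long_by_short:
            raise ValueError("short name already mapped: " + short_name)
        self._short_by_long[long_name] = short_name
        self._long_by_short[short_name] = long_name

    def short_for(self, long_name):
        return self._short_by_long[long_name]

    def long_for(self, short_name):
        return self._long_by_short[short_name]

    def rename(self, old_long, new_long):
        # Assumed policy: the short name follows the variable on rename.
        if new_long in self._short_by_long:
            raise ValueError("long name already mapped: " + new_long)
        short = self._short_by_long.pop(old_long)
        self._short_by_long[new_long] = short
        self._long_by_short[short] = new_long
```

With two plain dictionaries the cost of keeping both directions is
small, and neither the reading nor the writing side has to scan the map.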
 

Any comments?


J'

-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.


Introduction
------------

Version 12 of SPSS introduced long names for its variables.  In this
version, variable names can be up to 64 bytes long.  Previous versions
limited variable names to a maximum of 8 bytes.  To allow backward
compatibility of system files, the designers chose a scheme whereby
system files are written with the original 8-byte names, plus a map of
short names to long names.

The question arises of whether the mapping between short and long
names persists throughout a session.  If it does not, then a system
file loaded with a particular set of variable names and subsequently
saved may end up with a completely different set of variable names,
or the same set but with a different mapping.

It's considered desirable that PSPP emulate the behaviour of SPSS in
order to ensure compatibility.


Purpose
-------

I wanted to examine the hypothesis:

  "SPSS v12 retains its mapping between long variable names and short
  variable names throughout a session."



Test Environment and Tools
--------------------------

1. A Windoze operating system running SPSS v12.

2. The Emacs text editor.


Method
------


I prepared a syntax file 'write.sps' containing the following:

 DATA LIST LIST /foobarwiz1 * foobarwiz2 *.
 BEGIN DATA.
 1 2
 END DATA.

 ECHO 'State of dictionary prior to writing'.

 DISPLAY DICTIONARY.

 LIST.

 SAVE /OUTFILE='out.sav'.


When run through SPSS v12, this file produced the output shown in
Appendix A ('OUTPUT1.TXT').

It is pertinent that the variables foobarwiz1 and foobarwiz2 have been
allocated indices 1 and 2 respectively.

Next, I examined the 'out.sav' system file using the hexl-mode of
Emacs.  The hex dump of this file is in Appendix B.  SPSS
decided to allocate the short name FOOBARWI to the variable
"foobarwiz1" and the short name FOOBAR_A to the variable "foobarwiz2".
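
In the dump of 'out.sav', the map appears at offset 0x180 as SHORT=long
pairs separated by the byte 09 (a tab, shown as "." in the text
column).  The four little-endian 32-bit words just before it read 7,
13, 1 and 39, and 39 (0x27) matches the length of the map text.  A
minimal Python sketch of decoding that text follows; the function name
is made up for illustration, and ASCII is assumed because this test
file contains nothing else.

```python
def parse_long_name_map(payload: bytes) -> dict:
    """Return {short_name: long_name} from tab-separated SHORT=long pairs."""
    mapping = {}
    for pair in payload.decode("ascii").split("\t"):
        if not pair:
            continue  # tolerate a trailing separator
        short, sep, long_name = pair.partition("=")
        if not sep:
            raise ValueError("malformed pair: " + repr(pair))
        mapping[short] = long_name
    return mapping


# The 39 bytes at offset 0x180 of 'out.sav', copied from Appendix B.
payload = b"FOOBARWI=foobarwiz1\tFOOBAR_A=foobarwiz2"
print(parse_long_name_map(payload))
# prints {'FOOBARWI': 'foobarwiz1', 'FOOBAR_A': 'foobarwiz2'}
```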


I created a verbatim copy of 'out.sav', which I named 'in.sav'.  Then,
using the hexl-mode of Emacs, I modified 'in.sav' as follows: in the
short-long name map I exchanged the strings FOOBARWI and FOOBAR_A,
effectively exchanging the mappings between the two variables.  A
hex dump of the modified 'in.sav' is in Appendix C.
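
The same edit could be scripted rather than made by hand.  The sketch
below is an assumption-laden illustration, not the procedure actually
used: the function name is hypothetical, and the offset and length of
the map must be supplied by the caller (0x180 and 0x27 in the dumps
here).  It touches only the map, leaving the 8-byte names in the
variable records alone, as in Appendix C.

```python
def swap_short_names_in_map(data: bytes, offset: int, length: int,
                            a: bytes, b: bytes) -> bytes:
    """Exchange short names a and b inside the map at data[offset:offset+length]."""
    if len(a) != len(b):
        # Equal lengths keep the file size and all later offsets unchanged.
        raise ValueError("short names must have equal length")
    swap = {a: b, b: a}
    pairs = []
    for pair in data[offset:offset + length].split(b"\t"):
        short, sep, long_name = pair.partition(b"=")
        pairs.append(swap.get(short, short) + sep + long_name)
    swapped = b"\t".join(pairs)
    return data[:offset] + swapped + data[offset + length:]


# In-memory demonstration on the map text from Appendix B.
original = b"FOOBARWI=foobarwiz1\tFOOBAR_A=foobarwiz2"
print(swap_short_names_in_map(original, 0, len(original),
                              b"FOOBARWI", b"FOOBAR_A"))
```

Because both short names are 8 bytes long, the result has the same
length as the input, which is what makes the in-place hex edit safe.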


Now I used the following syntax file to read the modified 'in.sav' and
re-write it to a file 'out2.sav'.

 GET /FILE='in.sav'.

 ECHO 'State of dictionary after reading'.

 DISPLAY DICTIONARY.

 LIST.

 SAVE /OUTFILE='out2.sav'.


The reasoning is that 'in.sav' contains a map which is not the default
mapping.  If the mapping is not preserved during this session, then
'out2.sav' will be written with the default mapping instead of the
mapping presented in 'in.sav'.


The output from this session is shown in Appendix D.  It is not
entirely unexpected that the indices of variables foobarwiz1 and
foobarwiz2 are now reversed relative to the previous session.

The interesting part, however, comes when examining 'out2.sav'.  It
has retained the mapping that existed in 'in.sav': foobarwiz1 has the
short name FOOBAR_A and foobarwiz2 has the short name FOOBARWI.  The
hex dump of 'out2.sav' is presented in Appendix E.
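
The comparison can be made mechanical.  The following Python sketch
uses the map text copied from Appendices B, C and E; the helper name is
hypothetical.  The pairs appear in a different order in 'in.sav' and
'out2.sav', so the maps are compared as dictionaries rather than as
byte strings.

```python
def as_long_to_short(payload: bytes) -> dict:
    """Return {long_name: short_name} from tab-separated SHORT=long pairs."""
    result = {}
    for pair in payload.split(b"\t"):
        short, _, long_name = pair.partition(b"=")
        result[long_name.decode("ascii")] = short.decode("ascii")
    return result


default_map = b"FOOBARWI=foobarwiz1\tFOOBAR_A=foobarwiz2"  # 'out.sav', Appendix B
in_map = b"FOOBAR_A=foobarwiz1\tFOOBARWI=foobarwiz2"       # 'in.sav', Appendix C
out2_map = b"FOOBARWI=foobarwiz2\tFOOBAR_A=foobarwiz1"     # 'out2.sav', Appendix E

# Mapping read from 'in.sav' survives the GET/SAVE round trip.
print(as_long_to_short(in_map) == as_long_to_short(out2_map))       # True
# It was not regenerated as the default mapping.
print(as_long_to_short(default_map) == as_long_to_short(out2_map))  # False
```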



Conclusion
----------

The results suggest that SPSS does indeed persist its long/short
name mappings throughout the duration of an SPSS session.  It does not
appear to generate new mappings every time a system file is written,
but uses existing mappings if available.
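
A writer that follows the same policy might look like the Python sketch
below.  This is an illustration under stated assumptions, not a
description of SPSS internals: the function name is hypothetical, names
are assumed to be ASCII, and the fallback generator (truncate to 8
characters, then add a numeric suffix on collision) is a placeholder.
SPSS's own generator clearly differs, since it produced FOOBAR_A in the
test above.

```python
def short_names_for_writing(long_names, existing):
    """Choose a short name per variable, preferring names already known.

    long_names: long variable names in dictionary order.
    existing:   {long_name: short_name} as read from a system file.
    """
    result = {}
    used = set()
    # First pass: keep every mapping that already exists.
    for name in long_names:
        if name in existing:
            result[name] = existing[name]
            used.add(existing[name])
    # Second pass: invent names only for variables that have none.
    for name in long_names:
        if name in result:
            continue
        candidate = name.upper()[:8]  # placeholder scheme, not SPSS's
        n = 0
        while candidate in used:
            n += 1
            suffix = str(n)
            candidate = name.upper()[:8 - len(suffix)] + suffix
        result[name] = candidate
        used.add(candidate)
    return result
```

Handling the existing mappings first ensures that a newly generated
name can never displace one that was read from a file.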



Appendix A
----------

OUTPUT1.TXT

State of dictionary prior to writing

File Information

Notes
 | -------------------------------------------- | -------------------- | 
 | Output Created                               | 26-APR-2005 07:55:10 | 
 | -------------------------------------------- | -------------------- | 
 | Comments                                     |                      | 
 | ----------- | ------------------------------ | -------------------- | 
 | Input       | Filter                         | <none>               | 
 |             | ------------------------------ | -------------------- | 
 |             | Weight                         | <none>               | 
 |             | ------------------------------ | -------------------- | 
 |             | Split File                     | <none>               | 
 |             | ------------------------------ | -------------------- | 
 |             | N of Rows in Working Data File | 1                    | 
 | ----------- | ------------------------------ | -------------------- | 
 | Syntax                                       | DISPLAY DICTIONARY.  | 
 |                                              |                      | 
 | ----------- | ------------------------------ | -------------------- | 
 | Resources   | Elapsed Time                   | 0:00:00.00           | 
 | ----------- | ------------------------------ | -------------------- | 



List of variables on the working file

Name (Position) Label

foobarwiz1 (1)
    Measurement Level: Scale
    Column Width: 10  Alignment: Right
    Print Format: F8.2
    Write Format: F8.2

foobarwiz2 (2)
    Measurement Level: Scale
    Column Width: 10  Alignment: Right
    Print Format: F8.2
    Write Format: F8.2




List

Notes
 | -------------------------------------------- | -------------------- | 
 | Output Created                               | 26-APR-2005 07:55:10 | 
 | -------------------------------------------- | -------------------- | 
 | Comments                                     |                      | 
 | ----------- | ------------------------------ | -------------------- | 
 | Input       | Filter                         | <none>               | 
 |             | ------------------------------ | -------------------- | 
 |             | Weight                         | <none>               | 
 |             | ------------------------------ | -------------------- | 
 |             | Split File                     | <none>               | 
 |             | ------------------------------ | -------------------- | 
 |             | N of Rows in Working Data File | 1                    | 
 | ----------- | ------------------------------ | -------------------- | 
 | Syntax                                       | LIST.                | 
 |                                              |                      | 
 | ----------- | ------------------------------ | -------------------- | 
 | Resources   | Elapsed Time                   | 0:00:00.00           | 
 | ----------- | ------------------------------ | -------------------- | 

foobarwiz1 foobarwiz2

     1.00       2.00


Number of cases read:  1    Number of cases listed:  1





Appendix B.
----------

'out.sav'

00000000: 2446 4c32 4028 2329 2053 5053 5320 4441  $FL2@(#) SPSS DA
00000010: 5441 2046 494c 4520 4d53 2057 696e 646f  TA FILE MS Windo
00000020: 7773 2052 656c 6561 7365 2031 322e 3020  ws Release 12.0 
00000030: 7370 7373 696f 3332 2e64 6c6c 2020 2020  spssio32.dll    
00000040: 0200 0000 0200 0000 0100 0000 0000 0000  ................
00000050: 0100 0000 0000 0000 0000 5940 3236 2041  ..........Y@26 A
00000060: 7072 2030 3530 373a 3535 3a31 3020 2020  pr 0507:55:10   
00000070: 2020 2020 2020 2020 2020 2020 2020 2020                  
00000080: 2020 2020 2020 2020 2020 2020 2020 2020                  
00000090: 2020 2020 2020 2020 2020 2020 2020 2020                  
000000a0: 2020 2020 2020 2020 2020 2020 2000 0000               ...
000000b0: 0200 0000 0000 0000 0000 0000 0000 0000  ................
000000c0: 0208 0500 0208 0500 464f 4f42 4152 5749  ........FOOBARWI
000000d0: 0200 0000 0000 0000 0000 0000 0000 0000  ................
000000e0: 0208 0500 0208 0500 464f 4f42 4152 5f41  ........FOOBAR_A
000000f0: 0700 0000 0300 0000 0400 0000 0800 0000  ................
00000100: 0c00 0000 0000 0000 0000 0000 d002 0000  ................
00000110: 0100 0000 0100 0000 0200 0000 0200 0000  ................
00000120: 0700 0000 0400 0000 0800 0000 0300 0000  ................
00000130: ffff ffff ffff efff ffff ffff ffff ef7f  ................
00000140: feff ffff ffff efff 0700 0000 0b00 0000  ................
00000150: 0400 0000 0600 0000 0300 0000 0a00 0000  ................
00000160: 0100 0000 0300 0000 0a00 0000 0100 0000  ................
00000170: 0700 0000 0d00 0000 0100 0000 2700 0000  ............'...
00000180: 464f 4f42 4152 5749 3d66 6f6f 6261 7277  FOOBARWI=foobarw
00000190: 697a 3109 464f 4f42 4152 5f41 3d66 6f6f  iz1.FOOBAR_A=foo
000001a0: 6261 7277 697a 32e7 0300 0000 0000 0065  barwiz2........e
000001b0: 66fc 0000 0000 



Appendix C.
-----------

'in.sav'

00000000: 2446 4c32 4028 2329 2053 5053 5320 4441  $FL2@(#) SPSS DA
00000010: 5441 2046 494c 4520 4d53 2057 696e 646f  TA FILE MS Windo
00000020: 7773 2052 656c 6561 7365 2031 322e 3020  ws Release 12.0 
00000030: 7370 7373 696f 3332 2e64 6c6c 2020 2020  spssio32.dll    
00000040: 0200 0000 0200 0000 0100 0000 0000 0000  ................
00000050: 0100 0000 0000 0000 0000 5940 3236 2041  ..........Y@26 A
00000060: 7072 2030 3530 373a 3535 3a31 3020 2020  pr 0507:55:10   
00000070: 2020 2020 2020 2020 2020 2020 2020 2020                  
00000080: 2020 2020 2020 2020 2020 2020 2020 2020                  
00000090: 2020 2020 2020 2020 2020 2020 2020 2020                  
000000a0: 2020 2020 2020 2020 2020 2020 2000 0000               ...
000000b0: 0200 0000 0000 0000 0000 0000 0000 0000  ................
000000c0: 0208 0500 0208 0500 464f 4f42 4152 5749  ........FOOBARWI
000000d0: 0200 0000 0000 0000 0000 0000 0000 0000  ................
000000e0: 0208 0500 0208 0500 464f 4f42 4152 5f41  ........FOOBAR_A
000000f0: 0700 0000 0300 0000 0400 0000 0800 0000  ................
00000100: 0c00 0000 0000 0000 0000 0000 d002 0000  ................
00000110: 0100 0000 0100 0000 0200 0000 0200 0000  ................
00000120: 0700 0000 0400 0000 0800 0000 0300 0000  ................
00000130: ffff ffff ffff efff ffff ffff ffff ef7f  ................
00000140: feff ffff ffff efff 0700 0000 0b00 0000  ................
00000150: 0400 0000 0600 0000 0300 0000 0a00 0000  ................
00000160: 0100 0000 0300 0000 0a00 0000 0100 0000  ................
00000170: 0700 0000 0d00 0000 0100 0000 2700 0000  ............'...
00000180: 464f 4f42 4152 5f41 3d66 6f6f 6261 7277  FOOBAR_A=foobarw
00000190: 697a 3109 464f 4f42 4152 5749 3d66 6f6f  iz1.FOOBARWI=foo
000001a0: 6261 7277 697a 32e7 0300 0000 0000 0065  barwiz2........e
000001b0: 66fc 0000 0000 


Appendix D.
-----------


State of dictionary after reading

File Information

Notes
 | -------------------------- | -------------------- | 
 | Output Created             | 26-APR-2005 08:07:26 | 
 | -------------------------- | -------------------- | 
 | Comments                   |                      | 
 | ----------- | ------------ | -------------------- | 
 | Input       | Data         | Z:\Names\in.sav      | 
 |             | ------------ | -------------------- | 
 |             | Filter       | <none>               | 
 |             | ------------ | -------------------- | 
 |             | Weight       | <none>               | 
 |             | ------------ | -------------------- | 
 |             | Split File   | <none>               | 
 | ----------- | ------------ | -------------------- | 
 | Syntax                     | DISPLAY DICTIONARY.  | 
 |                            |                      | 
 | ----------- | ------------ | -------------------- | 
 | Resources   | Elapsed Time | 0:00:00.00           | 
 | ----------- | ------------ | -------------------- | 



List of variables on the working file

Name (Position) Label

foobarwiz2 (1)
    Measurement Level: Scale
    Column Width: 10  Alignment: Right
    Print Format: F8.2
    Write Format: F8.2

foobarwiz1 (2)
    Measurement Level: Scale
    Column Width: 10  Alignment: Right
    Print Format: F8.2
    Write Format: F8.2



List

Notes
 | -------------------------------------------- | -------------------- | 
 | Output Created                               | 26-APR-2005 08:07:26 | 
 | -------------------------------------------- | -------------------- | 
 | Comments                                     |                      | 
 | ----------- | ------------------------------ | -------------------- | 
 | Input       | Data                           | Z:\Names\in.sav      | 
 |             | ------------------------------ | -------------------- | 
 |             | Filter                         | <none>               | 
 |             | ------------------------------ | -------------------- | 
 |             | Weight                         | <none>               | 
 |             | ------------------------------ | -------------------- | 
 |             | Split File                     | <none>               | 
 |             | ------------------------------ | -------------------- | 
 |             | N of Rows in Working Data File | 1                    | 
 | ----------- | ------------------------------ | -------------------- | 
 | Syntax                                       | LIST.                | 
 |                                              |                      | 
 | ----------- | ------------------------------ | -------------------- | 
 | Resources   | Elapsed Time                   | 0:00:00.00           | 
 | ----------- | ------------------------------ | -------------------- | 

foobarwiz2 foobarwiz1

     1.00       2.00


Number of cases read:  1    Number of cases listed:  1



Appendix E.
-----------

'out2.sav'

00000000: 2446 4c32 4028 2329 2053 5053 5320 4441  $FL2@(#) SPSS DA
00000010: 5441 2046 494c 4520 4d53 2057 696e 646f  TA FILE MS Windo
00000020: 7773 2052 656c 6561 7365 2031 322e 3020  ws Release 12.0 
00000030: 7370 7373 696f 3332 2e64 6c6c 2020 2020  spssio32.dll    
00000040: 0200 0000 0200 0000 0100 0000 0000 0000  ................
00000050: 0100 0000 0000 0000 0000 5940 3236 2041  ..........Y@26 A
00000060: 7072 2030 3530 383a 3037 3a32 3620 2020  pr 0508:07:26   
00000070: 2020 2020 2020 2020 2020 2020 2020 2020                  
00000080: 2020 2020 2020 2020 2020 2020 2020 2020                  
00000090: 2020 2020 2020 2020 2020 2020 2020 2020                  
000000a0: 2020 2020 2020 2020 2020 2020 2000 0000               ...
000000b0: 0200 0000 0000 0000 0000 0000 0000 0000  ................
000000c0: 0208 0500 0208 0500 464f 4f42 4152 5749  ........FOOBARWI
000000d0: 0200 0000 0000 0000 0000 0000 0000 0000  ................
000000e0: 0208 0500 0208 0500 464f 4f42 4152 5f41  ........FOOBAR_A
000000f0: 0700 0000 0300 0000 0400 0000 0800 0000  ................
00000100: 0c00 0000 0000 0000 0000 0000 d002 0000  ................
00000110: 0100 0000 0100 0000 0200 0000 0200 0000  ................
00000120: 0700 0000 0400 0000 0800 0000 0300 0000  ................
00000130: ffff ffff ffff efff ffff ffff ffff ef7f  ................
00000140: feff ffff ffff efff 0700 0000 0b00 0000  ................
00000150: 0400 0000 0600 0000 0300 0000 0a00 0000  ................
00000160: 0100 0000 0300 0000 0a00 0000 0100 0000  ................
00000170: 0700 0000 0d00 0000 0100 0000 2700 0000  ............'...
00000180: 464f 4f42 4152 5749 3d66 6f6f 6261 7277  FOOBARWI=foobarw
00000190: 697a 3209 464f 4f42 4152 5f41 3d66 6f6f  iz2.FOOBAR_A=foo
000001a0: 6261 7277 697a 31e7 0300 0000 0000 0065  barwiz1........e
000001b0: 66fc 0000 0000 


_______________________________________________
pspp-dev mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/pspp-dev
