Greetings all,

In follow-up to this thread (I am copying all participants), I want to provide some additional data.

To review: Peter Flom, the original poster, received the following warning message when using read.spss() to import a .SAV format SPSS data set into R:

Warning message:
c:\NDRI\cvar\data\cvar2rev3.sav: Compression bias (0) is not the usual value of 100.

That warning message is generated in the file sfm-read.c, which is part of the foreign package. The code in that file to read SPSS datasets was provided by Ben Pfaff, who has authored an open source version of SPSS called PSPP (http://www.gnu.org/software/pspp/pspp.html).

The bias setting is part of the routine that transforms data byte codes in compressed .SAV files. This value is stored in the SPSS data file header along with a compression TRUE/FALSE flag. The bias setting is not used in non-compressed .SAV files.

During offlist exchanges with Peter, he indicated that the SPSS data file in question was created with DBMS/Copy rather than with SPSS itself. In this case, a SAS dataset was converted into the SPSS dataset via DBMS/Copy. Peter was then attempting to import the SPSS .SAV file into R using read.spss().

For those unfamiliar, DBMS/Copy (http://www.dataflux.com/dbms/copy.asp) is a file transformation application that takes input files in one format and generates output files in other formats. There is at least one other similar data mapping/transformation application that I am familiar with, called DataJunction (http://pervasive.datajunction.com/djcosmos). DBMS/Copy was originally published by a company called Conceptual, which in 2002 sold the product to SAS. It is now sold via Dataflux, which is a SAS subsidiary.

Last week, I communicated with the Dataflux/SAS tech support folks to try to better understand the etiology of the problem. It turns out that the original author of DBMS/Copy is now employed at SAS and was available to review this issue. The bottom line is that in DBMS/Copy, the default is to generate a non-compressed SPSS format file.
Thus, the author's code sets the bias value to 0 by default. When a user generates a compressed .SAV file, the bias is set to 100. It is unclear at this time whether this was part of any formal SPSS specification. However, from all available documentation, there is no indication that the bias value can otherwise be adjusted by a user, either directly or indirectly. Thus, to my knowledge at this point, it can take only two values, 0 and 100. If that is accurate, it would seem to be redundant with the compression TRUE/FALSE flag.

In the case of SPSS itself, the bias value is set to 100 by default, whether the .SAV file is compressed or not. Therefore, when using read.spss() on a .SAV file that was generated by SPSS natively, the warning that Peter experienced would not be issued.

I hope that this information is of help to folks.

With this confirmation in hand, I would like to reiterate my suggestion to add a note to the help for read.spss(), which could read as follows:

"NOTE: You may receive the following message:

Warning message:
FileName: Compression bias (X) is not the usual value of 100.

where 'FileName' will be the SPSS file that you are reading and 'X' will be a numeric value, possibly 0. This may be the result of reading an UNCOMPRESSED SPSS file that was not generated via SPSS natively (i.e., via a third party application such as DBMS/Copy). As the exact meaning of this cannot be confirmed in all cases, it is recommended that you verify the integrity of your imported SPSS data after using read.spss()."

As an aside, the Dataflux folks indicate that DBMS/Copy, at this time, cannot read SPSS version 11 files. Thus it would seem that there has been some change, of unknown scope, in the native .SAV file structure. Presumably, this could have an impact on read.spss().

Best regards,

Marc Schwartz

P.S. to Thomas: It would seem worth considering forwarding this information to Ben Pfaff. Let me know if you want me to do this or if you would prefer otherwise.
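
For anyone who wants to check a file before importing it, the two header fields discussed above (the compression flag and the bias value) can be read directly from the first bytes of the .SAV file. What follows is only a minimal Python sketch. It assumes the 176-byte header layout described in the PSPP documentation of the system file format, which is not an official SPSS specification. The names read_sav_header, bias_is_unusual, inspect_file and make_header are my own illustrations and are not part of the foreign package or of PSPP.

```python
import struct

# Assumed header layout (from the PSPP system file format notes):
# magic(4) product(60) layout_code(i) case_size(i) compression(i)
# weight_index(i) ncases(i) bias(d) date(9) time(8) label(64) padding(3)
HEADER_SIZE = 176
_FIELDS = "4s60siiiiid9s8s64s3s"


def read_sav_header(raw: bytes) -> dict:
    """Parse the compression flag and bias from the start of a .sav file."""
    if len(raw) < HEADER_SIZE:
        raise ValueError("need at least 176 bytes of header")
    if raw[:4] != b"$FL2":
        raise ValueError("not an SPSS system file (magic is not $FL2)")
    # layout_code should be 2 or 3; use it to guess the byte order.
    for order in ("<", ">"):
        fields = struct.unpack(order + _FIELDS, raw[:HEADER_SIZE])
        if fields[2] in (2, 3):
            break
    else:
        raise ValueError("unrecognized layout code, cannot detect byte order")
    return {
        "product": fields[1].decode("ascii", "replace").strip(),
        "compressed": fields[4] != 0,
        "bias": fields[7],
    }


def bias_is_unusual(header: dict) -> bool:
    """True when the bias differs from 100, the condition the warning reports."""
    return header["bias"] != 100.0


def inspect_file(path: str) -> dict:
    """Read only the header bytes of the file at `path` and parse them."""
    with open(path, "rb") as fh:
        return read_sav_header(fh.read(HEADER_SIZE))


def make_header(compression: int, bias: float, order: str = "<") -> bytes:
    """Build a synthetic header, for illustration and testing only."""
    return struct.pack(
        order + _FIELDS,
        b"$FL2", b"@(#) SPSS DATA FILE".ljust(60),
        2, 1, compression, 0, -1, bias,
        b"01 Jan 03", b"00:00:00", b" " * 64, b"\0" * 3,
    )
```

Calling inspect_file() on a file written the way DBMS/Copy is described above should, if the assumed layout holds, report compressed as False and a bias of 0, whereas a file written by SPSS itself should report a bias of 100 whether it is compressed or not. This only inspects the header and says nothing about whether the data records themselves were read correctly.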
______________________________________________ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help