Hi John

Thanks for your reply!

FLIP seems to truncate the variable names to 8 characters and remove the string 
values, so that's not really an option.

We did try exporting to text, but we need to get this working for our testers, 
and exporting and then running diff is quite complicated for them.  The text 
export also seems to join variable names together and break lines after 76 
characters, which makes it very hard to diff.

Exporting to CSV and comparing manually seems to be the best option at the 
moment, as the expected/allowed differences in values are grouped at the end of 
the variables, which makes other differences easier to spot.

I tried adding DROP to MATCH FILES to omit the 55 variables that are allowed to 
be different, but it crashes, so I just cleared these variables from the input 
files, which makes MATCH FILES work well. The problem is that the actual 
differences are not highlighted and are often very difficult to spot manually, 
and it is a pain to delete 110 variables by hand each time.
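
A minimal sketch of how the manual CSV comparison could be automated, assuming 
both files are exported with a header row of variable names and the cases are 
in the same order in each file. It is written in Python rather than PSPP 
syntax; the function names (diff_rows, diff_csv_files) and the sample variable 
names (id, score, run_date) are made up for illustration, and the ignore set 
stands in for the variables that are allowed to differ. Only the cells that 
differ are reported:

```python
import csv
import io


def diff_rows(reader_a, reader_b, ignore=()):
    """Yield (case, variable, value_a, value_b) for each cell that differs.

    Cases are compared by position, so both exports must list the
    cases in the same order.
    """
    if reader_a.fieldnames != reader_b.fieldnames:
        raise ValueError("the two exports have different variables")
    skip = set(ignore)
    # Variables to compare: everything in the header except the ignored ones.
    names = [n for n in (reader_a.fieldnames or []) if n not in skip]
    rows_a, rows_b = list(reader_a), list(reader_b)
    if len(rows_a) != len(rows_b):
        raise ValueError("the two exports have different numbers of cases")
    for case, (a, b) in enumerate(zip(rows_a, rows_b), start=1):
        for name in names:
            if a[name] != b[name]:
                yield case, name, a[name], b[name]


def diff_csv_files(path_a, path_b, ignore=()):
    """Compare two CSV files on disk and return the list of differences."""
    with open(path_a, newline="") as fa, open(path_b, newline="") as fb:
        return list(diff_rows(csv.DictReader(fa), csv.DictReader(fb), ignore))


if __name__ == "__main__":
    # Tiny in-memory stand-ins for the two exports (made-up data).
    export_a = "id,score,run_date\n1,10,2010-07-19\n2,20,2010-07-19\n"
    export_b = "id,score,run_date\n1,10,2010-08-02\n2,25,2010-08-02\n"
    readers = (csv.DictReader(io.StringIO(t)) for t in (export_a, export_b))
    for case, name, a, b in diff_rows(*readers, ignore={"run_date"}):
        print(f"case {case}: {name}: {a!r} != {b!r}")
```

With the sample data this reports only case 2, variable score; the run_date 
difference is skipped because it is in the ignore set. For real exports, 
diff_csv_files("f1.csv", "f2.csv", ignore=...) would do the same on files (the 
file names are placeholders).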

It is also a pain that copy and paste don't seem to work for us in the syntax 
editor of psppire 0.7.4 on Windows.

I was thinking about trying to use the SPSS Identify Duplicate Cases option, 
but I can't find any reference to this being available in PSPP.

Anyway, thanks very much for your suggestions!

ciao
mich


On 03/08/2010, at 10:30 PM, John Darrington wrote:

> Well I would seriously consider Ben's suggestion of exporting to text and
> using the POSIX diff utility.
> 
> Another possibility, which may be of use since you have a lot of variables
> but only a few cases, is to use the FLIP command.  Then you will have a lot
> of cases but fewer variables, which will make it feasible to calculate the
> difference between them with a command like
> COMPUTE diff_X = x_1 - x_2.
> Then any non-zero value highlights a difference in the input.
> 
> J'
> 
> On Mon, Aug 02, 2010 at 04:43:57PM +1000, Michelle Parker wrote:
>     Hi John
> 
>     Thanks, this works and is great!
> 
>     But I'm finding each file has some allowed differences, e.g. dates,
>     times, durations, so every file will be found in this list.
> 
>     Since probably 20% of the values are different between files, it would
>     be much easier just to list the values when they are different.
> 
>     Can I highlight every individual difference?
> 
>     Non-different values could be empty or even spaces to make the important
>     differences easier to spot in the output.
> 
>     What do you think?
> 
>     Much appreciated!
> 
>     thanks
>     mich
> 
>     On 20/07/2010, at 1:31 AM, John Darrington wrote:
> 
>> One way to do this is as follows:
>> 
>> MATCH FILES
>>       /FILE='f1.sav' /IN=file1 /SORT
>>       /FILE='f2.sav' /IN=file2 /SORT
>>       /BY ALL
>>       .
>> 
>> SELECT IF file2=0 OR file1=0.
>> 
>> 
>> LIST.
>> 
>> This will show a list of all the cases which don't match.  And you get two
>> extra variables, file1 and file2, showing where those cases came from.
>> 
>> J'
>> 
>> 
>> On Mon, Jul 19, 2010 at 02:09:42PM +1000, Michelle Parker wrote:
>>    Hi Michel
>> 
>>    Thanks for getting back to me.
>> 
>>    The files have 730 variables, types and lengths are identical.
>>    There are 13 cases in each file.
>> 
>>    Some of the cases may have different values (e.g. date/times), but in
>>    general they should be the same between files. Specifically, I need to
>>    know if there are any differences.
>> 
>>    thanks!
>>    mich
>> 
>> 
>> 
>> 
>> 
>>    On 19/07/2010, at 12:48 PM, Michel Boaventura wrote:
>> 
>>> Hello Michelle,
>>> 
>>> Would you like to compare the variables or the cases in the files? If the
>>> variables, does it matter if they have the same name but differ in type,
>>> length, etc.?
>>> 
>>> Regards,
>>> 
>>> Michel
>>> 
>>> _______________________________________________
>>> Pspp-users mailing list
>>> [email protected]
>>> http://lists.gnu.org/mailman/listinfo/pspp-users
>> 
>>    ---------------------------------------
>>    Michelle Parker
>>    Web Objectives Pty Ltd
>>    33 Ridge St
>>    Gordon, NSW, 2072
>>    Australia 
>>    Phone: (02) 9499 3166
>>    Fax: (02) 9499 3166
>>    Mobile : 0412 064 123
>>    [email protected]
>>    ---------------------------------------
>> 
>> 
>> -- 
>> PGP Public key ID: 1024D/2DE827B3 
>> fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
>> See http://pgp.mit.edu or any PGP keyserver for public key.
>> 
>> 



