Hi John,

I tested this command with pseudo-random real numbers drawn from a uniform
distribution with parameters 0 and 1, and compared the group numbers assigned
to the observations. I did not save the test results to a file. After
re-implementing the code as you described, I can prepare a test sheet if you
want.
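
One caveat when comparing group numbers between two implementations: the
labels themselves are arbitrary, so the same partition can come back with the
groups numbered differently. Below is a minimal sketch of a label-independent
comparison, assuming 0-based group numbers stored in plain int arrays. The
names same_partition and MAX_GROUPS are hypothetical and are not part of PSPP
or SPSS.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_GROUPS 64   /* arbitrary upper bound chosen for this sketch */

/* Returns true if labelings A and B (each of length N, with labels in
   0..K-1) describe the same partition of the observations, i.e. they
   differ at most by a renaming of the groups. */
bool same_partition(const int *a, const int *b, size_t n, int k)
{
    int a_to_b[MAX_GROUPS];
    int b_to_a[MAX_GROUPS];

    if (k <= 0 || k > MAX_GROUPS)
        return false;
    for (int g = 0; g < k; g++)
        a_to_b[g] = b_to_a[g] = -1;   /* -1 means "not mapped yet" */

    for (size_t i = 0; i < n; i++) {
        int ga = a[i];
        int gb = b[i];

        if (ga < 0 || ga >= k || gb < 0 || gb >= k)
            return false;             /* label out of range */
        if (a_to_b[ga] == -1 && b_to_a[gb] == -1) {
            a_to_b[ga] = gb;          /* first time we see this pair */
            b_to_a[gb] = ga;
        } else if (a_to_b[ga] != gb || b_to_a[gb] != ga) {
            return false;             /* mapping is not one-to-one */
        }
    }
    return true;
}
```

With 1-based group numbers, subtract 1 from each label before calling it.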

Data should be standardized before any cluster analysis, so I will
re-implement the code to standardize each column of the data matrix.

Best. 

Mehmet Hakan Satman
http://www.mhsatman.com


--- On Sun, 3/13/11, [email protected] <[email protected]> wrote:

From: [email protected] <[email protected]>
Subject: pspp-dev Digest, Vol 87, Issue 7
To: [email protected]
Date: Sunday, March 13, 2011, 6:00 PM


Message: 1
Date: Sun, 13 Mar 2011 14:36:06 +0000
From: John Darrington <[email protected]>
Subject: Re: K-Means Clustering
To: Mehmet Hakan Satman <[email protected]>
Cc: [email protected], John Darrington <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset="us-ascii"

Hi Mehmet,

Thanks for this.  It seems to be basically working.  There are a number of
improvements that can be made however.

1. It'll be more consistent with the rest of PSPP if you call the new file
   "quick-cluster.c", with a hyphen.

2. Instead of editing the Makefile, add the name of the new file to the
   manifest in src/language/stats/automake.mk

3. Can you remove the line UNIMPL_CMD ("QUICK CLUSTER", "Fast clustering")
   from command.def instead of commenting it out?



 Now the "quick cluster" command can parse these options in the pspp command
 line:
     
     quick cluster /VARIABLES=x y z /GROUPS=5 /MAXITER=100.

This is different to the syntax in the SPSS documentation which expects:

   QUICK CLUSTER x y z 
      /CRITERIA = CLUSTER(5) MXITER (100).

where the /CRITERIA subcommand and each part thereof is optional.  You can see
an example of how to implement a /CRITERIA subcommand in
src/language/stats/factor.c; in fact, you may be able to copy much of that
parser's code.

Avoid using atoi in the parser.  Instead of
    groups = atoi (lex_tokcstr (lexer));
write:
    lex_force_int (lexer);
    groups = lex_integer (lexer);
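
To illustrate why atoi is a poor fit for a parser: it returns 0 for
non-numeric input and has no way to report an error, so a typo such as
/GROUPS=abc cannot be told apart from /GROUPS=0. The plain-C sketch below
shows the kind of checking that is needed. Note that parse_int_checked is a
hypothetical helper written only for illustration and is not PSPP code; inside
PSPP, the lexer calls above are the right tool.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not part of PSPP.  Parses TEXT as a base-10 int.
   Returns true and stores the result in *VALUE only if the whole string
   is a valid integer that fits in an int; otherwise returns false and
   leaves *VALUE untouched. */
bool parse_int_checked(const char *text, int *value)
{
    char *end;
    long parsed;

    errno = 0;
    parsed = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return false;   /* no digits at all, or trailing garbage */
    if (errno == ERANGE || parsed < INT_MIN || parsed > INT_MAX)
        return false;   /* value does not fit in an int */

    *value = (int) parsed;
    return true;
}
```

A caller can then reject bad input explicitly instead of silently clustering
into 0 groups.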
     

  I think a small developer PDF document does not satisfy the needs of
  implementing something in PSPP.

You're  right.  The developer documentation is woefully incomplete.

You mentioned earlier that you had tested the results against SPSS.  Do you
have the results from these tests, and the test data that you used?  I would
be interested to see this.
     

Best regards,

John

-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.


------------------------------

_______________________________________________
pspp-dev mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/pspp-dev


End of pspp-dev Digest, Vol 87, Issue 7
***************************************



      