Hi John,

I tested this command with pseudo-random real numbers drawn from a uniform distribution with parameters 0 and 1, and compared the group numbers assigned to the observations. I do not have the test results saved as a file. After re-implementing the code as you described, I can prepare a test sheet if you want.
Data should be standardized before any cluster analysis, so I will re-implement the code to standardize each column of the data matrix.

Best,
Mehmet Hakan Satman
http://www.mhsatman.com

--- On Sun, 3/13/11, [email protected] <[email protected]> wrote:

From: [email protected] <[email protected]>
Subject: pspp-dev Digest, Vol 87, Issue 7
To: [email protected]
Date: Sunday, March 13, 2011, 6:00 PM

Today's Topics:

   1. Re: K-Means Clustering (John Darrington)

----------------------------------------------------------------------

Message: 1
Date: Sun, 13 Mar 2011 14:36:06 +0000
From: John Darrington <[email protected]>
Subject: Re: K-Means Clustering
To: Mehmet Hakan Satman <[email protected]>
Cc: [email protected], John Darrington <[email protected]>
Message-ID: <[email protected]>

Hi Mehmet,

Thanks for this. It seems to be basically working. There are a number of improvements that can be made, however.

1. It'll be more consistent with the rest of PSPP if you call the new file "quick-cluster.c", with a hyphen.

2. Instead of editing the Makefile, add the name of the new file to the manifest in src/language/stats/automake.mk

3. Can you remove the line
      UNIMPL_CMD ("QUICK CLUSTER", "Fast clustering")
   from command.def instead of commenting it out.

Now the "quick cluster" command can parse these options in the pspp command line:

   quick cluster /VARIABLES=x y z /GROUPS=5 /MAXITER=100.

This is different to the syntax in the SPSS documentation, which expects:

   QUICK CLUSTER x y z /CRITERIA = CLUSTER(5) MXITER (100).

where the /CRITERIA subcommand and each part thereof is optional. You can see an example of how to implement a /CRITERIA subcommand in src/language/stats/factor.c - in fact, you may be able to copy much of that parser's code.

Avoid using atoi in the parser. Instead of

   groups=atoi(lex_tokcstr(lexer));

write:

   lex_force_int (lexer);
   groups = lex_integer (lexer);

> i think a small development pdf documentation does not satisfies the needs of implementing something in PSPP.

You're right. The developer documentation is woefully incomplete.

You mentioned earlier that you had tested the results against spss. Do you have the results from these tests, and the test data that you used? I would be interested to see this.

Best regards,

John

--
PGP Public key ID: 1024D/2DE827B3
fingerprint = 8797 A26D 0854 2EAB 0285 A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.
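For context on the atoi advice above: atoi has no way to report an error, so a token such as "abc" silently becomes 0 and "5abc" becomes 5. The sketch below shows the same concern in plain standard C. It is only an illustration and does not use the PSPP lexer; parse_int_strict is a hypothetical helper name invented here, not a PSPP function. Inside PSPP the calls John names (lex_force_int and lex_integer) are the ones to use.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper, not part of PSPP: parse TEXT as a base-10 int.
   Returns true and stores the value in *OUT only if the whole string
   is a valid integer that fits in an int.  On failure *OUT is left
   unchanged. */
static bool
parse_int_strict (const char *text, int *out)
{
  char *end;
  long value;

  assert (text != NULL && out != NULL);

  errno = 0;
  value = strtol (text, &end, 10);

  if (end == text)              /* No digits were consumed. */
    return false;
  if (*end != '\0')             /* Trailing characters, as in "5abc". */
    return false;
  if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
    return false;               /* Value does not fit in an int. */

  *out = (int) value;
  return true;
}
```

A caller would check the return value before using the number, for example rejecting the GROUPS setting when parse_int_strict (token, &groups) returns false, instead of continuing with whatever atoi happened to produce.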
_______________________________________________ pspp-dev mailing list [email protected] http://lists.gnu.org/mailman/listinfo/pspp-dev
