All:

I have a customer database of close to 5 million customers. Each customer has several categorical variables (such as Customer Type and Country of Origin) and several numeric range variables (such as Daily Transaction Amount and Daily Transaction Count). I need to segment these customers into groups or clusters whose members share common characteristics.
For example, if I have the data set

Id  Ctry  CustomerType  DailyTransactionAmt
 1  IQ    CType 1        2000
 2  IQ    CType 1        3000
 3  IQ    CType 1        4000
 4  IQ    CType 1        3000
 5  IQ    CType 1       10000
 6  IQ    CType 1       11000
 7  IQ    CType 1       12000
 8  IQ    CType 1       11000
 9  IN    CType 1       10000
10  IN    CType 1       15000
11  IN    CType 1       55000
12  IN    CType 1       60000
13  IN    CType 1       70000
14  IQ    CType 2       85000
15  IQ    CType 2       75000
16  IQ    CType 2       90000
17  IQ    CType 2       10000
18  IQ    CType 2        3500
19  IQ    CType 2        3000
20  IQ    CType 2        4000
21  IQ    CType 2        4000
22  IN    CType 2        1100
23  IN    CType 2        1000

I need an output like

CType1 -- IQ -- (2000 <= amt <= 4000)     [Members: 1,2,3,4]
CType1 -- IQ -- (10000 <= amt <= 12000)   [Members: 5,6,7,8]
CType1 -- IN -- (10000 <= amt <= 15000)   [Members: 9,10]
CType1 -- IN -- (55000 <= amt <= 70000)   [Members: 11,12,13]
CType2 -- IQ -- (75000 <= amt <= 100000)  [Members: 14,15,16,17]
CType2 -- IQ -- (3000 <= amt <= 40000)    [Members: 18,19,20,21]
CType2 -- IN -- (1000 <= amt <= 1100)     [Members: 22,23]

Please note that I don't know the number of clusters beforehand. I am new to this area and am reading up on different material, and I would appreciate any suggestions you can provide.

Thanks,
Satish

=================================================================
Instructions for joining and leaving this list, remarks about the problem of INAPPROPRIATE MESSAGES, and archives are available at: http://jse.stat.ncsu.edu/
=================================================================
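
A minimal sketch of the kind of grouping requested above, offered as one possible starting point rather than a recommended method. It first groups customers by (customer type, country), then sorts each group's amounts and starts a new cluster wherever the next amount is more than a fixed multiple of the previous one, so the number of clusters does not have to be known in advance. The names `segment` and `summarize` and the `max_ratio=2.0` threshold are assumptions made for illustration, and only a subset of the example rows (Ids 1-13, 22 and 23) is used.

```python
from collections import defaultdict

# Subset of the example rows: (id, country, customer type, daily transaction amount).
ROWS = [
    (1, "IQ", "CType 1", 2000), (2, "IQ", "CType 1", 3000),
    (3, "IQ", "CType 1", 4000), (4, "IQ", "CType 1", 3000),
    (5, "IQ", "CType 1", 10000), (6, "IQ", "CType 1", 11000),
    (7, "IQ", "CType 1", 12000), (8, "IQ", "CType 1", 11000),
    (9, "IN", "CType 1", 10000), (10, "IN", "CType 1", 15000),
    (11, "IN", "CType 1", 55000), (12, "IN", "CType 1", 60000),
    (13, "IN", "CType 1", 70000),
    (22, "IN", "CType 2", 1100), (23, "IN", "CType 2", 1000),
]


def summarize(ctype, country, members):
    """Turn a list of (amount, id) pairs into one output record."""
    amounts = [amount for amount, _ in members]
    ids = sorted(cid for _, cid in members)
    return (ctype, country, min(amounts), max(amounts), ids)


def segment(rows, max_ratio=2.0):
    """Group by (customer type, country), then split on large relative gaps.

    max_ratio is a hypothetical tuning knob: a new cluster starts when an
    amount is more than max_ratio times the previous (sorted) amount.
    """
    groups = defaultdict(list)
    for cid, country, ctype, amount in rows:
        groups[(ctype, country)].append((amount, cid))

    segments = []
    for (ctype, country), members in sorted(groups.items()):
        members.sort()  # ascending by amount, then by id
        current = [members[0]]
        for prev, item in zip(members, members[1:]):
            if item[0] > max_ratio * prev[0]:
                # Large jump: close the current cluster and start a new one.
                segments.append(summarize(ctype, country, current))
                current = []
            current.append(item)
        segments.append(summarize(ctype, country, current))
    return segments


for ctype, country, low, high, ids in segment(ROWS):
    members = ",".join(str(i) for i in ids)
    print(f"{ctype} -- {country} -- ({low} <= amt <= {high}) [Members: {members}]")
```

On these rows the sketch produces the same member groups as the requested output for CType 1 and for CType 2 in IN. A different `max_ratio` would give coarser or finer groups, and the approach as written handles only one range variable at a time.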
