Your question sounds like association rule mining on frequent itemsets. Check the arules package.

Weidong Gu

On Tue, Sep 27, 2011 at 11:13 PM, pip56789 <pd...@virginia.edu> wrote:
> Hi,
>
> I have a few methodological and implementation questions for ya'll. Thank
> you in advance for your help. I have a dataset that reflects people's
> preference choices. I want to see if there's any kind of clustering effect
> among certain preference choices (e.g. do people who pick choice A also pick
> choice D).
>
> I have a data set that has one record per user ID, per preference choice.
> It's a "long" form of a data set that looks like this:
>
> ID | Page
> 123 | Choice A
> 123 | Choice B
> 456 | Choice A
> 456 | Choice B
> ...
>
> I thought that I should do the following
>
> 1. Make the data set "wide", counting the observations so the data looks
> like this:
> ID | Count of Preference A | Count of Preference B
> 123 | 1 | 1
> ...
>
> Using
> table1 <- dcast(data,ID ~ Page,fun.aggregate=length,value_var='Page' )
>
> 2. Create a correlation matrix of preferences
> cor(table2[,-1])
>
> How would I restrict my correlation to show preferences that met a minimum
> sample threshold? Can you confirm if the two following commands do the same
> thing? What would I do from here (or am I taking the wrong approach)
> table1 <- dcast(data,Page ~ Page,fun.aggregate=length,value_var='Page' )
> table2 <- with(data, table(Page,Page))
>
>
> many thanks,
> Peter
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Data-transformation-cleaning-tp3849889p3849889.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
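
For readers who want to see the idea behind the frequent-itemset suggestion, here is a minimal base-R sketch for item pairs only. It is not the arules implementation and not necessarily what either poster had in mind. The toy data, the object names (prefs, incidence, cooc) and the threshold min_users are all made up for illustration.

```r
# Toy long-format data in the shape described in the question (values are invented).
prefs <- data.frame(
  ID   = c(123, 123, 456, 456, 789, 789, 789),
  Page = c("Choice A", "Choice B", "Choice A", "Choice B",
           "Choice A", "Choice C", "Choice D"),
  stringsAsFactors = FALSE
)

# Binary ID x Page incidence matrix: 1 if the user picked the page at least once.
counts <- table(prefs$ID, prefs$Page)
incidence <- matrix(as.numeric(counts > 0),
                    nrow = nrow(counts),
                    dimnames = dimnames(counts))

# Hypothetical minimum sample threshold: keep pages picked by >= min_users users.
min_users <- 2
keep <- colSums(incidence) >= min_users
incidence <- incidence[, keep, drop = FALSE]

# cooc[i, j] = number of users who picked both page i and page j.
cooc <- crossprod(incidence)

# support[i, j]    = share of all users who picked both pages.
# confidence[i, j] = share of users who picked page i that also picked page j.
support    <- cooc / nrow(incidence)
confidence <- cooc / diag(cooc)

print(cooc)
print(round(confidence, 2))
```

Filtering columns on colSums() is one way to apply a minimum sample threshold before looking at any pairwise measure. The apriori() function in arules generalizes the same support and confidence idea to itemsets larger than pairs.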