Re: [R] pairs plots in R
If you want to do efficient exploratory data analysis on this kind of dataset, then interactive graphics with parallel coordinate plots (ipcp in iplots) should help. Of course, it depends what you mean by large. It might be worth looking at the book Graphics of Large Datasets for some ideas.

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute, University of Augsburg,
86135 Augsburg, Germany
Tel: +49 821 5982218

From: Sharma, Dhruv [EMAIL PROTECTED]
Date: 19 October 2008 10:58:53 pm GMT+02:00
To: r-help@r-project.org
Subject: [R] pairs plots in R

> Hi,
> is there a way to take a data frame with 100+ columns and large data set to do efficient exploratory analysis in R with pairs?
> I find using pairs on the whole matrix is slow and the resulting matrix is tiny.
> Also the variable of interest for me is a binary var Y or N.
> Is there an efficient way to graphically view many variable relationships that does not look teeny?
> I could do pairs 10 at a time but this seems too brute force.
>
> thanks
> Dhruv

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
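The parallel coordinate idea above can also be tried statically in base R. The sketch below is not the interactive ipcp from iplots. It is a hand-rolled approximation built on matplot(), run on simulated data. The names myBigData, myBinaryVar, rescale01 and parcoord_block are all hypothetical, and the row sample of 300 and the block of 12 columns are arbitrary choices.

```r
set.seed(1)

# Hypothetical stand-in data: 2000 rows, 100 numeric columns, binary Y/N outcome
n <- 2000
p <- 100
myBigData <- as.data.frame(matrix(rnorm(n * p), nrow = n))
myBinaryVar <- factor(sample(c("N", "Y"), n, replace = TRUE))

# Rescale a column to [0, 1] so all axes share one vertical scale
rescale01 <- function(v) {
  r <- range(v, na.rm = TRUE)
  if (r[1] == r[2]) return(rep(0.5, length(v)))  # constant column
  (v - r[1]) / (r[2] - r[1])
}

# Static parallel coordinate plot for one block of columns, on a row sample
parcoord_block <- function(dat, group, cols, n_rows = 300) {
  rows <- sample(nrow(dat), min(n_rows, nrow(dat)))
  scaled <- sapply(dat[rows, cols, drop = FALSE], rescale01)
  line_cols <- c("steelblue", "tomato")[as.integer(group[rows])]
  # matplot() draws one line per matrix column, so transpose: one line per row
  matplot(t(scaled), type = "l", lty = 1, col = line_cols,
          xaxt = "n", xlab = "", ylab = "rescaled value")
  axis(1, at = seq_along(cols), labels = names(dat)[cols], las = 2)
  legend("topright", legend = levels(group),
         col = c("steelblue", "tomato"), lty = 1, bg = "white")
  invisible(scaled)
}

scaled <- parcoord_block(myBigData, myBinaryVar, cols = 1:12)
```

Sampling rows keeps the lines legible on a large data set, and calling parcoord_block() on successive blocks of columns covers all 100 variables without shrinking any one plot.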
Re: [R] pairs plots in R
Thanks Felix.

Regards,
Dhruv

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Felix Andrews
Sent: Sunday, October 19, 2008 11:37 PM
To: Sharma, Dhruv
Cc: r-help@r-project.org
Subject: Re: [R] pairs plots in R

> One idea: if the primary variable of interest is categorical (binary), I would rather look at univariate plots for each of your 100 variables, grouped by the primary one, e.g.
>
>     library(latticeExtra)
>     marginal.plot(~ myBigData, data = myBigData, groups = myBinaryVar,
>                   auto.key = TRUE, layout = c(4, 4))
>
> (This is a convenient interface to lattice::densityplot and lattice::dotplot.)
>
> If you view 16 such densityplots per page, that still gives you 7 pages. You could use playwith() (from the playwith package) to scroll through the pages.
>
> -Felix
Re: [R] pairs plots in R
Thanks Antony.

Regards,
Dhruv

From: Antony Unwin [mailto:[EMAIL PROTECTED]]
Sent: Monday, October 20, 2008 7:00 AM
To: r-help@r-project.org
Cc: Sharma, Dhruv
Subject: Re: [R] pairs plots in R

> If you want to do efficient exploratory data analysis on this kind of dataset, then interactive graphics with parallel coordinate plots (ipcp in iplots) should help. Of course, it depends what you mean by large. It might be worth looking at the book Graphics of Large Datasets for some ideas.
>
> Antony Unwin
> Professor of Computer-Oriented Statistics and Data Analysis,
> Mathematics Institute, University of Augsburg,
> 86135 Augsburg, Germany
> Tel: +49 821 5982218
Re: [R] pairs plots in R
One idea: if the primary variable of interest is categorical (binary), I would rather look at univariate plots for each of your 100 variables, grouped by the primary one, e.g.

    library(latticeExtra)
    marginal.plot(~ myBigData, data = myBigData, groups = myBinaryVar,
                  auto.key = TRUE, layout = c(4, 4))

(This is a convenient interface to lattice::densityplot and lattice::dotplot.)

If you view 16 such densityplots per page, that still gives you 7 pages. You could use playwith() (from the playwith package) to scroll through the pages.

-Felix

2008/10/20 Sharma, Dhruv [EMAIL PROTECTED]:
> Hi,
> is there a way to take a data frame with 100+ columns and large data set to do efficient exploratory analysis in R with pairs?
> I find using pairs on the whole matrix is slow and the resulting matrix is tiny.
> Also the variable of interest for me is a binary var Y or N.
> Is there an efficient way to graphically view many variable relationships that does not look teeny?
> I could do pairs 10 at a time but this seems too brute force.
>
> thanks
> Dhruv

--
Felix Andrews / 安福立
http://www.neurofractal.org/felix/
3358 543D AAC6 22C2 D336 80D9 360B 72DD 3E4C F5D8
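For anyone without latticeExtra installed, the paging arithmetic and the grouped density plots can be roughly imitated in base R. This is only a sketch on simulated data, not a reimplementation of marginal.plot(). The names myBigData, myBinaryVar, pages and plot_page are hypothetical, and the 4 x 4 grid simply mirrors layout = c(4, 4) above.

```r
set.seed(2)

# Hypothetical stand-in data: 1000 rows, 100 numeric columns, binary Y/N outcome
n <- 1000
p <- 100
myBigData <- as.data.frame(matrix(rnorm(n * p), nrow = n))
myBinaryVar <- factor(sample(c("N", "Y"), n, replace = TRUE))

# Split the column indices into pages of 16 (a 4 x 4 grid per page)
per_page <- 16
pages <- split(seq_len(p), ceiling(seq_len(p) / per_page))
length(pages)  # 7 pages for 100 variables; the last page holds 4

# Draw one page: a density curve per group for each variable on that page
plot_page <- function(dat, group, cols) {
  old <- par(mfrow = c(4, 4), mar = c(2, 2, 2, 1))
  on.exit(par(old))
  for (j in cols) {
    dens <- lapply(split(dat[[j]], group), density)
    xlim <- range(sapply(dens, function(d) range(d$x)))
    ylim <- c(0, max(sapply(dens, function(d) max(d$y))))
    plot(0, 0, type = "n", xlim = xlim, ylim = ylim,
         main = names(dat)[j], xlab = "", ylab = "")
    for (k in seq_along(dens)) lines(dens[[k]], col = k)
  }
  # Legend goes in the last panel drawn on the page
  legend("topright", legend = names(dens),
         col = seq_along(dens), lty = 1, bty = "n")
}

plot_page(myBigData, myBinaryVar, pages[[1]])
```

Looping plot_page() over pages inside a pdf() device would give one page per block of 16 variables, which can then be scrolled in any PDF viewer.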