Re: [R] help on hmisc
On 05/07/2010 10:12 AM, nvanzuy...@gmail.com wrote:
> Hi,
>
> I thought I would just jump in on this as I am running an i7 as well. I use hmisc for the doBy functions and it would make a huge difference particularly with large data sets to run this on 64bit windows. I'm not sure how to compile from source and usually use the install.packages option. At the moment I have two versions of R installed and switch between them depending on what I'm working with. Having an hmisc package for 64bit windows would really help.
>
> Thanks
> Natalie

Natalie, you must be thinking of another package. doBy is not in Hmisc. summarize, mApply, etc., are in Hmisc. You might look at the data.table package too.

Frank

--
Frank E Harrell Jr
Professor and Chairman
School of Medicine
Department of Biostatistics
Vanderbilt University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] help on hmisc
On 08.05.2010 15:04, Frank E Harrell Jr wrote:
> Natalie, you must be thinking of another package. doBy is not in Hmisc. summarize, mApply, etc., are in Hmisc. You might look at the data.table package too.

Additionally, if you install the 64-bit version of R-2.11.0, you can simply run install.packages("Hmisc") and you have it, given that you are really talking about Hmisc.
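To make the install advice concrete, here is a minimal sketch in base R of checking which build is running before installing. The helper name `is_64bit` is an illustration and not part of R or Hmisc, and the install call is guarded so that it only runs in an interactive session where the package is missing.

```r
# Report whether the running R session is a 64-bit build.
# .Machine$sizeof.pointer is 8 on 64-bit builds and 4 on 32-bit builds.
is_64bit <- function() {
  .Machine$sizeof.pointer == 8
}

cat("Architecture:", R.version$arch, "\n")
cat("64-bit session:", is_64bit(), "\n")

# Install the binary package from the selected mirror.
# The package name is case-sensitive and must be quoted.
# Guarded (an assumption of this sketch) so that scripts do not download anything.
if (interactive() && !requireNamespace("Hmisc", quietly = TRUE)) {
  install.packages("Hmisc")
}
```

If the session reports a 32-bit build on a 64-bit machine, the 32-bit R was started, and installing a package will not change that.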
Re: [R] help on hmisc
Thanks very much, it was Hmisc that I was looking for. Originally I would get a message saying that Hmisc was not available for 64bit when I tried downloading it through R.

Thanks very much,
Natalie

2010/5/8 Uwe Ligges lig...@statistik.tu-dortmund.de
> Additionally, if you install the 64-bit version of R-2.11.0, you can simply run install.packages("Hmisc") and you have it, given that you are really talking about Hmisc.
Re: [R] help on hmisc
Puzzling question. You install R, you click on install packages, you select a mirror, you select Hmisc, and done. There is a 64bit version of R, but a 32bit version runs smoothly on 64bit Windows 7 as well. If you love the command line, look at ?install.packages.

I can't see why you would want to compile an R package yourself. So in case you have a specific problem, a bit more information would come in handy.

Cheers
Joris

On Fri, May 7, 2010 at 3:30 AM, zach Li zach...@hotmail.com wrote:
> Does anyone know where I can find information on compiling Hmisc on Windows, especially 64-bit Windows?
>
> thanks,

--
Joris Meys
Statistical Consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
Coupure Links 653
B-9000 Gent
tel : +32 9 264 59 87
joris.m...@ugent.be
Re: [R] help on hmisc
Zach,

The R gurus will correct me if I'm wrong, but as far as my very limited experience goes, the 64bit version only gives you an advantage when throwing around huge datasets or doing very memory-intensive tasks. For most of the things I do with R, there is no difference at all.

Now the difference between an old x86 and a new quad-core i7, that's another story...

Cheers
Joris

On Fri, May 7, 2010 at 2:32 PM, zach Li zach...@hotmail.com wrote:
> Thanks Joris, the reason I am looking for the instructions is that I hope 64 bit Hmisc will run better (faster) than 32 bit in a 64 bit environment.
>
> Regards,
> Zach.
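The memory point can be checked with some rough arithmetic. The sketch below is an illustration only: the data set size is hypothetical, `approx_bytes` is a made-up helper, and the 4 GiB figure is the theoretical ceiling of a 32-bit address space (the usable amount is lower in practice).

```r
# Approximate memory for a table of doubles: 8 bytes per cell.
approx_bytes <- function(n_rows, n_cols) {
  as.numeric(n_rows) * as.numeric(n_cols) * 8
}

# Hypothetical data set: 50 million rows, 20 numeric columns.
bytes <- approx_bytes(5e7, 20)
gib <- bytes / 1024^3
cat(sprintf("Roughly %.1f GiB for a single copy of the data\n", gib))

# Theoretical ceiling of a 32-bit address space.
address_limit_gib <- 4
cat("Fits in a 32-bit address space:", gib < address_limit_gib, "\n")

# Cross-check the 8-bytes-per-value rule on a small real object.
x <- numeric(1e5)
print(object.size(x), units = "Kb")
```

Data that fits comfortably in memory either way gains nothing from this, which matches the observation above that most everyday work shows no difference.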
Re: [R] help on hmisc
Hi,

I thought I would just jump in on this as I am running an i7 as well. I use hmisc for the doBy functions and it would make a huge difference particularly with large data sets to run this on 64bit windows. I'm not sure how to compile from source and usually use the install.packages option. At the moment I have two versions of R installed and switch between them depending on what I'm working with. Having an hmisc package for 64bit windows would really help.

Thanks
Natalie

On May 7, 2010 1:52pm, Joris Meys jorism...@gmail.com wrote:
> Zach,
>
> The R-gurus will correct me when I'm wrong, but as far as my very limited experience goes, the 64bit version only gives you an advantage when throwing around huge datasets or doing very memory-intensive tasks. For most of the things I do with R, there is no difference at all.
[R] help on hmisc
Does anyone know where I can find information on compiling Hmisc on Windows, especially 64-bit Windows?

thanks,
[R] Help with Hmisc, cut2, split and quantile
Hello,

I have a set of data with two columns: Target and Actual. A sample file, Sample_table.txt (http://n4.nabble.com/file/n1584647/Sample_table.txt), is attached, but the data looks like this:

    Actual    Target
    -0.125    0.016124906
     0.135    0.120799865
     ...       ...
     ...       ...

I want to be able to break the data into tables based on quantiles in the Target column. I can see (using cut2, and also quantile) how to get the barrier points between the different quantiles, and I can see how I would achieve this if I was just looking to split up a vector. However, I am trying to break up the whole table based on those quantiles, not just the vector.

The following code shows me the ranges for the deciles of the Target data:

    library(Hmisc)
    read_data = read.table("C:/Sample table.txt", header = TRUE)
    table(cut2(read_data$Target, g = 10))

However, I would like to be able to break the table into ten separate tables, each with both Actual and Target data, based on the Target data deciles:

    top_decile = ...(top decile of read_data, based on Target data)
    next_decile = ...and so on...
    bottom_decile = ...

That way I could manipulate the deciles, graph them separately (and together) and so on, just as easily as I can the whole table. I'm sure this must be simple, but I can't see the way forward. I have also looked at split() and quantile() but have not been able to get them to achieve what I am after.

Can anybody see a simple way forward on this?

Thanks,
Guy
Re: [R] Help with Hmisc, cut2, split and quantile
On 2010-03-08 8:47, Guy Green wrote:
> However I would like to be able to break the table into ten separate tables, each with both Actual and Target data, based on the Target data deciles:
>
>     top_decile = ...(top decile of read_data, based on Target data)
>     next_decile = ...and so on...
>     bottom_decile = ...

I would just add a factor variable indicating to which decile a particular observation belongs:

    dat$DEC <- with(dat, cut(Target, breaks=10, labels=1:10))

If you really want to have separate data frames you can then split on the decile:

    L <- split(dat, dat$DEC)

 -Peter Ehlers

--
Peter Ehlers
University of Calgary
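For readers without the attached file, the two steps above can be tried on simulated data. This is a sketch under assumptions: the data frame is a random stand-in for the poster's 63-row table, not the real sample.

```r
# Simulated stand-in for the 63-row sample table (not the real data).
set.seed(1)
dat <- data.frame(Actual = rnorm(63), Target = rnorm(63))

# Step 1: a factor marking which of 10 bins each row falls into.
dat$DEC <- with(dat, cut(Target, breaks = 10, labels = 1:10))

# Step 2: one data frame per bin, each keeping all columns.
L <- split(dat, dat$DEC)

length(L)        # number of list elements, one per factor level
sapply(L, nrow)  # rows per bin
head(L[["5"]])   # a single bin, addressed by its label
```

Note that `cut()` with `breaks = 10` makes ten intervals of equal width over the range of `Target`, so the row counts per bin are generally unequal.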
Re: [R] Help with Hmisc, cut2, split and quantile
try

    as.numeric(read_data$DEC)

this should turn it into a numeric variable that you can work with

hth
David Freedman
CDC, Atlanta

Guy Green wrote:
> However this time, while the data is broken into even data frames, the labels for the separate data frames are unuseable
>
> Could anyone suggest a way of rearranging this to make the labels useable again?
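A small base-R sketch of what `as.numeric()` does to a factor may help here. The vectors are made up for illustration. The key point is that the result is the position of each level, not the text of the label.

```r
# A factor with interval labels, like the ones cut() and cut2() produce.
dec <- cut(c(0.05, 0.55, 0.95), breaks = c(0, 0.5, 0.9, 1))
levels(dec)      # the interval labels
as.numeric(dec)  # 1 2 3: the bin index of each value

# Caveat: the codes are level positions, not the label values.
f <- factor(c("10", "20", "30"))
as.numeric(f)                # 1 2 3
as.numeric(as.character(f))  # 10 20 30
```

For decile bins this is exactly what is wanted, since the level position is the decile number.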
Re: [R] Help with Hmisc, cut2, split and quantile
Hi Peter and others,

Thanks (Peter) - that gets me really close to what I was hoping for. The one problem I have is that the cut approach breaks the data into intervals based on the absolute value of the Target data, rather than their frequency. In other words, if the data ranged from 0 to 50, the data would be separated into 0-5, 5-10 and so on, regardless of the frequency within those categories. However, I want to get the data into deciles. The code that does the equal-width split (incorporating Peter's) is:

    read_data = read.table("C:/Sample table.txt", header = TRUE)
    read_data$DEC <- with(read_data, cut(Target, breaks=10, labels=1:10))
    L <- split(read_data, read_data$DEC)

This means that I can get separate data frames, such as L$'10', which comes out tidy, but only containing 2 data items (the sample has 63 rows, so each decile should have 6+ data items):

       Actual    Target DEC
    9   0.572 0.3778386  10
    31  0.299 0.3546606  10

If I try to adjust this to get deciles using cut2(), I can break the data into deciles as follows:

    read_data = read.table("C:/Sample table.txt", header = TRUE)
    read_data$DEC <- with(read_data, cut2(read_data$Target, g=10), labels=1:10)
    L <- split(read_data, read_data$DEC)

However, this time, while the data is broken into even data frames, the labels for the separate data frames are unusable, e.g.:

    $`[ 0.26477, 0.37784]`
       Actual    Target                 DEC
    6   0.243 0.2650960 [ 0.26477, 0.37784]
    9   0.572 0.3778386 [ 0.26477, 0.37784]
    10 -0.049 0.3212681 [ 0.26477, 0.37784]
    15  0.780 0.2778518 [ 0.26477, 0.37784]
    31  0.299 0.3546606 [ 0.26477, 0.37784]
    33  0.105 0.2647676 [ 0.26477, 0.37784]

Could anyone suggest a way of rearranging this to make the labels usable again? Sample data is reattached: Sample_table.txt (http://n4.nabble.com/file/n1585427/Sample_table.txt).

Thanks,
Guy
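The difference between equal-width bins and deciles described above can be seen side by side in base R. This is only a sketch: the data are simulated and skewed on purpose, and `quantile()` is used here as a base-R stand-in for the break points that `cut2(..., g = 10)` would choose, which may not be identical.

```r
# Skewed simulated data, 63 values like the sample (not the real file).
set.seed(42)
target <- rexp(63)

# Equal-width bins: ten intervals of the same length, uneven counts.
equal_width <- cut(target, breaks = 10)
table(equal_width)

# Decile bins: break points taken from the data's own quantiles.
brks <- quantile(target, probs = 0:10 / 10)
deciles <- cut(target, breaks = brks, labels = 1:10, include.lowest = TRUE)
table(deciles)
```

`include.lowest = TRUE` matters in the second call: without it the smallest value sits exactly on the first break and is returned as `NA`.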
Re: [R] Help with Hmisc, cut2, split and quantile
On 2010-03-08 18:00, Guy Green wrote:
> However this time, while the data is broken into even data frames, the labels for the separate data frames are unuseable
>
> Could anyone suggest a way of rearranging this to make the labels useable again?

I think that the easiest way would be to relabel the levels of DEC:

    read_data$DEC <- factor(read_data$DEC, labels = 1:10)

or, since I would prefer letters as factor levels:

    read_data$DEC <- factor(read_data$DEC, labels = LETTERS[1:10])

Another way would be to use cut2() with onlycuts=TRUE to get the breaks and then use these with cut() as in my original post:

    brks <- cut2(read_data$Target, g=10, onlycuts=TRUE)
    read_data$DEC <- with(read_data, cut(Target, breaks=brks, labels=1:10, include.lowest=TRUE))

But I still don't see why you want a list of separate data frames. For most analyses, it's more convenient to just use the factor variable to subset the data as needed.

 -Peter Ehlers

--
Peter Ehlers
University of Calgary
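The closing suggestion, working from the factor instead of a list of data frames, might look like the sketch below. Everything here is an assumption for illustration: the data are simulated, and `quantile()` replaces `cut2(..., onlycuts = TRUE)` so that the example runs without Hmisc.

```r
# Simulated stand-in for the sample table (not the real data).
set.seed(7)
read_data <- data.frame(Actual = rnorm(63), Target = runif(63))

# Decile factor with letter labels, "A" = lowest decile, "J" = highest.
brks <- quantile(read_data$Target, probs = 0:10 / 10)
read_data$DEC <- cut(read_data$Target, breaks = brks,
                     labels = LETTERS[1:10], include.lowest = TRUE)

# Per-decile summary in one call, no separate data frames needed.
decile_means <- tapply(read_data$Actual, read_data$DEC, mean)
decile_means

# Pull out a single decile only when it is actually needed.
top_decile <- subset(read_data, DEC == "J")
nrow(top_decile)
```

The same factor also works directly in plotting and modelling calls, for example `boxplot(Actual ~ DEC, data = read_data)`.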