[R] How to programme R to randomly replace some X values with Outliers
Dear experts,

I am a beginner in R. I am looking for guidance on how to program R to randomly replace 5 observations of the explanatory variable X with outliers drawn from U(15,20), in a sample of size n=100. The replacement is subject to y 15. The ultimate goal of my study is to compare the standard deviation of y with and without outliers, averaged over 1000 simulations.

Info:
X ~ U(0,10)
Y = 6 + 2X + norm(0,1)

Thank you.
Hock Ann

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] How many R packages are not free?
On Sat, 2 Oct 2010, Peter Dalgaard wrote:

> On 10/02/2010 07:38 PM, Spencer Graves wrote:
>> Is there anything on CRAN that is NOT free? I assumed that CRAN had a policy of not accepting anything that could not be freely distributed, but I could not find any such statement in a quick search. The code by Uwe identified 52 packages with file LICENCE or file LICENSE, plus others with combinations of something like GPL with file LICENCE or file LICENSE.
>
> I believe the CRAN policy is just that: Freely redistributable. Free Software usually means something else: Free usage and modification.

I am sure that is the intention, but a few packages have changed their licence terms since they were accepted. 'mclust' and 'optmatch' are two, and they are not currently 'freely redistributable'.

> One main issue is code licensed free for non-commercial usage or academic usage, which are not Free Software. However, CRAN being a repository with many academic users, it does serve a purpose to distribute them for research purposes.
>
> [Long discussion omitted of whether that sort of license was ever a good idea]

I've not seen in this thread any mention of the License/FOSS filter and the option 'checkPackageLicense'; see the help on available.packages() and options() respectively. These do enable people to work within the subset of packages 'known to be Free or Open Source'.

> --
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Phone: (+45)38153501
> Email: pd@cbs.dk Priv: pda...@gmail.com

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] How to programme R to randomly replace some X values with Outliers
Dear Hock Ann,

I am not sure of all your requirements, but this should at least get you started. I show it by hand and also wrapped up in a function. In the function I made two density plots that I thought might be interesting to you; you can just delete those two lines of code if you do not want them. The code follows below.

Cheers,

Josh

## X is pseudo-random numbers from the uniform distribution
## ~U(0, 10)
X <- runif(n = 100, min = 0, max = 10)

## We can check that X is about what we would expect
## The mean should be (1/2) * (10 + 0)
mean(x = X)
## The variance should be (1/12) * ((10 - 0)^2)
var(x = X)

## I am assuming norm(0, 1) to be representing the standard normal distribution
## Create a vector of these numbers to be used in the formula for Y
## this is important since you will be creating Y twice
## (with X and then X replaced with some outliers)
## You do not want R to regenerate the random normal values and change things that way
Z <- rnorm(n = 100, mean = 0, sd = 1)

## Create Y from your formula, where Z = norm(0, 1)
## Y = 6 + 2X + norm(0, 1)
Y <- 6 + 2 * X + Z

## Now I am using sample() to randomly select some values
## between 1 and the length of X, these will be the positions
## of the elements of X to be replaced
toreplace <- sample(x = seq_along(X), size = 5, replace = FALSE)

## Now replace the X values
X[toreplace] <- runif(n = 5, min = 15, max = 20)

## Create Ynew based off updated X
Ynew <- 6 + 2 * X + Z

## Calculate the standard deviations of Y and Ynew
## and store in a named vector called results
results <- c(SD_Y = sd(Y), SD_Ynew = sd(Ynew))

## print the results vector to screen to look at it
results

## Now if you wanted to do this many times
## and potentially change a few values easily
## we can put it in a function
## n is the number in each sample
## a and b are the min and max of the uniform distribution for X
## a.outlier and b.outlier are the same but for the outliers
## nreplace is how many values of X you want to replace
## reps is how many times you want to run it
## I have written the values to default to what you said in your email
## but obviously it would be easy to change any one of them

mysampler <- function(n = 100, a = 0, b = 10,
  a.outlier = 15, b.outlier = 20, nreplace = 5, reps = 1000) {
  if(any(c(n, nreplace, reps) < 1)) {
    stop("n, nreplace, and reps must all be at least 1")
  }
  results <- matrix(0, nrow = reps, ncol = 2,
    dimnames = list(NULL, c("SD_Y", "SD_Ynew")))
  for(i in 1:reps) {
    X <- runif(n = n, min = a, max = b)
    Z <- rnorm(n = n, mean = 0, sd = 1)
    Y <- 6 + 2 * X + Z
    toreplace <- sample(x = seq_along(X), size = nreplace, replace = FALSE)
    X[toreplace] <- runif(n = nreplace, min = a.outlier, max = b.outlier)
    Ynew <- 6 + 2 * X + Z
    results[i, ] <- c(sd(Y), sd(Ynew))
  }
  dev.new()
  par(mfrow = c(2, 1))
  plot(density(results[, "SD_Y"]), xlim = range(results))
  plot(density(results[, "SD_Ynew"]), xlim = range(results))
  return(results)
}

## You might find the following documentation helpful
?runif # generate random values from uniform
?rnorm # from normal
?"for" # to do your simulation

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
Re: [R] How to programme R to randomly replace some X values with Outliers
N <- 100
Nrep <- 5
X <- runif(N, 0, 10)
Y <- 6 + 2*X + rnorm(N, 0, 1)
X[ sample(which(Y 15), Nrep) ] <- runif(Nrep, 15, 20)

Hope this helps,

Michael
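Putting the pieces of this thread together, the whole simulation might be sketched as below. This is only an illustration under stated assumptions: the comparison in the condition on y did not survive in the posts above, so `eligible` is a hypothetical argument that defaults to y > 15 and should be replaced by whatever rule is actually intended. The name `sim_once` is likewise made up.

```r
set.seed(2010)  # only to make the sketch reproducible

## One simulated data set: returns sd(Y) before and after contaminating X.
## 'eligible' is an ASSUMPTION (the original condition on y is unknown).
sim_once <- function(n = 100, nreplace = 5,
                     eligible = function(y) y > 15) {
  x <- runif(n, min = 0, max = 10)
  z <- rnorm(n)                        # same noise is reused for both Y's
  y <- 6 + 2 * x + z
  idx <- which(eligible(y))            # positions allowed to be replaced
  k <- min(nreplace, length(idx))
  pick <- idx[sample.int(length(idx), k)]
  x_out <- x
  x_out[pick] <- runif(k, min = 15, max = 20)
  y_out <- 6 + 2 * x_out + z
  c(SD_Y = sd(y), SD_Ynew = sd(y_out))
}

res <- replicate(1000, sim_once())     # 2 x 1000 matrix, one column per run
rowMeans(res)                          # average SD without / with outliers
```

Indexing `idx` through `sample.int()` rather than calling `sample(idx, k)` directly sidesteps the case where `idx` has length one and `sample()` would silently draw from `1:idx`.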
[R] Modifying a data.frame
Hello list members,

I have a problem with modifying a data.frame. As an example, take a data.frame called ex:

    ex<-data.frame(id=c(1,2,3,4,5,6),obs=c(14,9,20,36,55,47),eff=c("A","A","B","C","C","C"))

After that I would like to modify the object ex with the following short script:

    for (i in ex) {
      if(ex[i,3]=="A"|| ex[i,3]=="C"){
        ex[i,4]<-"-"
      } else {
        ex[i,4]<-10
      }
    }

This script produces an error message:

    Error in if (ex[i, 3] == "A" || ex[i, 3] == "C") { :
      missing value where TRUE/FALSE needed

Why doesn't this script work properly?

Thanks a lot for your hints

Beat
[R] Deducer and Contingency Tables
Is there a way to use Deducer to analyze contingency table data that is only available in a row-by-column summary form (not a data frame)?

Thanks.

Jim Watkins
[R] Non-Parametric Adventures in R
I just started using R and I'm having all sorts of fun trying different things. I'm going to document the different things I'm doing here as a kind of case study. I'm hoping that I'll get help from the community so that I can use R properly.

Anyway, in this study I have demographic data, drug usage data, and side effect data. All of this is loaded into a csv file. I'm using Rweb as an interface, so I had to modify the cgi-bin code slightly, but it works pretty well. I'm looking for frequency counts, some summary data for columns where it makes sense, plots and X-squared tests. My data frame is named X since that's what Rweb names it.

1) I was thinking I'd have to go through each nominal variable (e.g. table(X$race)), but I think I have it figured out now. summary(X) is nice, but I need to recode nominal data with labels so the results are meaningful.

2) I had an issue with multiple plots overwriting each other, and I managed to bypass that with:

    par(mfrow=c(2,1))

I think I have to update it to correspond to the number of plots. There's probably a better way to do this.

    barplot(table(X$race))

prints out a barplot, so that's great.

3) I was able to code my data so it shows up in tables better with

    X$race <- factor(X$race, levels = c(0,2), labels = c("African American","White,Non-Hispanic"))

4) The coding for all of my drug variables is identical, and I'd like to create a loop that goes through and labels them accordingly. I'm not having much success with this yet, but here's what I'm trying:

    X[1,] <- factor(X[1,], levels = c(0,1,2,3,4,5), labels = c("none","last week","last 3 month","last year","regular use at least 3 months","unknown length of usage"))

I know I would need to replace the [1,] with something that gives me the column, but I'm not sure what to put syntactically at the moment.

5) I had more success creating new variables based on the old ones, so I end up with yes/no answers to drug usage:

    for (i in 24:56) {
      X[,i+173] <- ifelse(X[,i] > 0,c(1),c(0))
    }

I'd like to have been able to make a new variable name based on the old variable name (i.e. dropping _when from the end of each and replacing it with _yn).

6) I'm able to make a cross-tabulated table and perform an X-squared test just fine with my recoded variable:

    table(X$race,X[,197])
    prop.test(table(X$race,X[,197]))

but I would like to be able to do so with all of my drugs, and I can't seem to make that work:

    for (i in 197:229) {
      table(X$race,X[,i])
      prop.test(table(X$race,X[,i]))
    }

Thanks for reading over this; I do appreciate any help. I understand that there's an R way of doing things, and I look forward to learning the method.
Re: [R] Problem with ggplot2 - Boxplot
Thanks a lot Hadley, this worked.

Regards,
Raoul
Re: [R] Output Graphics GIF
On Mon, Sep 27, 2010 at 11:31 AM, Tal Galili tal.gal...@gmail.com wrote:

> I am guessing you are saving the plot using the menu system. If that is the case, have a look at:
> ?pdf
> ?png
> Generally, I like saving my graphics to pdf since it is vectorized.
>
> Cheers,
> Tal
>
> Contact Details:
> Contact me: tal.gal...@gmail.com | 972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English)
>
> On Mon, Sep 27, 2010 at 2:39 PM, Nilza BARROS nilzabar...@gmail.com wrote:
>> Dear R users,
>> How can I produce graphics in GIF format? What I have been doing is creating the graphics as *.ps or *.eps and then converting them with CONVERT (from ImageMagick), but the output quality is not good. Since these graphics will be used by other users, they must have better image quality.
>> I really appreciate any help,
>> --
>> Best regards,
>> Nilza Barros

--
Best regards,
Nilza Barros
[R] Help with panel.text in Lattice - Putting labels for co-oridnates in a plot
Hi,

I am trying to create a Lattice dotplot of the data below. I need to put a label next to each of the coordinates on the plot. I have managed to get only one label displayed, as I don't completely understand the panel.text function. Can someone please help me?

    # SubReason is a text field that I need to see the volumes for (Vols)
    dotplot(DU_Summary_plotdata$SubReason ~ DU_Summary_plotdata$Vols,
      horiz=TRUE, main="Top Sub-Reasons - Volumes (90% of Volumes)",
      family="serif", font=2, xlab="Volumes", ylab="Sub-Reasons",
      labels=DU_Summary_plotdata$Vols, pch=, cex=1.5,
      panel = function(x, y, ...) {
        panel.dotplot(x, y, ...)
        panel.text(1, 2, labels = DU_Summary_plotdata$Vols, pos = 4)
      })

The dataset DU_Summary_plotdata is made up of:

    SubReason<-c("SR_1", "SR_2", "SR_3", "SR_4", "SR_5", "SR_6", "SR_7", "SR_8")
    Vols<-c(33827,17757,11404,5999,5305,3515,3051,1924)

Thanks,
Raoul
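One possible approach, sketched here rather than taken from a reply in the thread: inside the panel function, `x` and `y` already hold the coordinates of every point, so passing them to `panel.text()` (instead of the fixed position 1, 2) places one label per point. The data frame name `plotdata` and the `xlim` value are assumptions made for this example.

```r
library(lattice)

## Same values as in the question, in a small data frame
plotdata <- data.frame(
  SubReason = factor(paste0("SR_", 1:8)),
  Vols = c(33827, 17757, 11404, 5999, 5305, 3515, 3051, 1924)
)

p <- dotplot(SubReason ~ Vols, data = plotdata,
             xlab = "Volumes", ylab = "Sub-Reasons",
             xlim = c(0, 40000),        # leave room for labels on the right
             panel = function(x, y, ...) {
               panel.dotplot(x, y, ...)
               ## x, y: coordinates of all points drawn in this panel
               panel.text(x, y, labels = x, pos = 4)
             })
print(p)
```

Using `data = plotdata` in the formula interface also avoids repeating the data frame name for every variable.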
Re: [R] Non-Parametric Adventures in R
Dear Jamesp,

This might be a better fit for a blog than for the R-help mailing list. I'd suggest opening a blog on wordpress.com (it takes less than 4 minutes). It now has syntax highlighting for R code:
http://www.r-statistics.com/2010/09/r-syntax-highlighting-for-bloggers-on-wordpress-com/

I also compiled a list of tips for the R blogger (http://r-bloggers.com/) in this post:
http://www.r-statistics.com/2010/07/blogging-about-r-presentation-and-audio/

Cheers,
Tal
Re: [R] Tinn R
On 2 October 2010 19:21, Tal Galili tal.gal...@gmail.com wrote:

> Hi Raphael,
> Why won't you try notepad++ with npptor? It does almost everything tinnR does.

While alternatives to popular Windows editors are being mentioned here, I feel that Gvim (http://www.vim.org/) along with Vim-R-plugin2 (http://www.vim.org/scripts/script.php?script_id=2628) should be cited. The Vim-R-plugin developer recently added Windows support to a lean cross-platform package that works very well.

Philippe
[R] Include externally generated pdf in output (without Sweave)
Dear useRs,

I generated a simple image-based report using the sequence:

    pdf()
    plot(.)
    textplot( for short texts, from gplots)
    dev.off()

Is there an easy way to include a single pdf page from an external file (not R generated)?

Note: for final reports I know how to use Sweave, but I am looking for a quick solution with less overhead. Something like textplot() for pdf.

Dieter
Re: [R] Modifying a data.frame
You should examine what is being looped over when you use a for loop with the "i in dataframe" syntax:

    j<-1; for(i in ex){ cat('step', j, i, sep=" ", fill=TRUE); j<-j+1}

As you can see, i is set to each column of ex in turn, one column per step of the for loop. Instead, it seems that you want to step over every row, which is the change made in the first line here:

    for (i in 1:dim(ex)[1]) {
      if(ex[i,3]=="A"|| ex[i,3]=="C"){
        ex[i,4]<- "-"
      } else {
        ex[i,4]<-10
      }
    }

1:dim(ex)[1] is then a vector of row index values that is looped over.

A more R-ish version of this might be:

    ex[,4] <- ifelse(ex$eff == 'A' | ex$eff == 'C', "-", 10)

I'm not sure this is the case, but if "-" is supposed to represent missingness, missing values are represented by `NA`s in R:

    ex[,4] <- ifelse(ex$eff == 'A' | ex$eff == 'C', NA, 10)

See ?NA for more info. Note: the marks around NA above are not single quotes but back-ticks.

Hope that helps,
Jeff.
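The points made in this reply can be checked with a short, self-contained sketch. The column name `new` is made up for the example, and `%in%` is used here as a compact stand-in for the two `==` comparisons.

```r
ex <- data.frame(id  = 1:6,
                 obs = c(14, 9, 20, 36, 55, 47),
                 eff = c("A", "A", "B", "C", "C", "C"))

## Looping over a data frame visits its columns, not its rows
for (column in ex) print(head(column, 3))

## Vectorised alternative: NA for groups A and C, 10 otherwise
ex$new <- ifelse(ex$eff %in% c("A", "C"), NA, 10)
ex
```

Keeping the new column numeric (NA and 10) rather than mixing "-" with numbers avoids turning the whole column into character strings.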
[R] Ranked Set Sampling
Dear R Users,

Are you aware of any package that computes ranked set samples? If you have code that you are willing to share, I will acknowledge it in my work.

Thanks much,
Ahmed
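For readers unfamiliar with the method, a balanced ranked set sample can be sketched in a few lines of base R. This is an illustration only, not a reference to any existing package: the function name `rss_sample` and the population are made up, and ranking is done on the measured values themselves, which is an idealisation (in practice units are ranked by judgement or by a cheap auxiliary variable).

```r
set.seed(1)

## Balanced ranked set sample: set size m, repeated for a number of cycles.
## ASSUMPTION: perfect ranking on the variable of interest.
rss_sample <- function(population, m = 3, cycles = 4) {
  out <- numeric(0)
  for (r in seq_len(cycles)) {
    for (i in seq_len(m)) {
      set <- sample(population, m)     # draw a fresh set of m units
      out <- c(out, sort(set)[i])      # keep only the i-th ranked unit
    }
  }
  out
}

pop <- rnorm(1000, mean = 50, sd = 10) # made-up population
s <- rss_sample(pop, m = 3, cycles = 4)
length(s)                              # m * cycles measured units
mean(s)                                # RSS estimate of the population mean
```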
Re: [R] Non-Parametric Adventures in R
> 4) The coding for all of my drug variables is identical, and I'd like to create a loop that goes through and labels accordingly. I'm not having good success with this yet, but here's what I'm trying.
>
>     X[1,] <- factor(X[1,], levels = c(0,1,2,3,4,5), labels = c("none","last week","last 3 month","last year","regular use at least 3 months","unknown length of usage"))
>
> I know I would need to replace the [1,] with something that gives me the column, but I'm not sure what to put syntactically at the moment.

[I assume you meant X[,1] there]

Well, a for loop like in 5) is not out of reach; you just need to figure out what to loop over. It's probably neatest to do it by name, but you could also do it by number (and that may be more convenient if the drug variables are listed sequentially).

    drugvar <- c(5,7,9,13)
    --OR--
    drugvar <- c("aspirin","warfarin", "heroin", "nicotine")

In either case,

    mylabels <- c("none","last week","last 3 month","last year","regular use at least 3 months","unknown length of usage")
    for (i in drugvar) X[[i]] <- factor(X[[i]], levels = 0:5, labels = mylabels)

(Or X[,i]; the double-bracket form extracts the column as a vector, which is what factor() needs.)

Or, using a more advanced idiom:

    X[drugvar] <- lapply(X[drugvar], factor, levels=0:5, labels=mylabels)

> 5) I had more success creating new variables based on the old ones. So I end up with yes/no answers to drug usage
>
>     for (i in 24:56) { X[,i+173] <- ifelse(X[,i] > 0,c(1),c(0)) }

(Don't use c(0). Not that it is that harmful, it is just unnecessary and labels yourself as a newbie...)

I'd write the ifelse() bit as as.numeric(X[,i] > 0), and the whole thing is very close to

    X <- cbind(X, 1 * (X[24:56] > 0))

except for colnames issues.

> I'd like to have been able to make a new variable name based off of the old variable name (i.e. dropping _when from the end of each and replace it with _yn)

sub() is your friend:

    Z <- as.data.frame(1 * (X[24:56] > 0))
    names(Z) <- sub("_when$", "_yn", names(Z))
    X <- cbind(X, Z)

> 6) I'm able to make a cross-tabulated table and perform a X-squared test just fine with my recoded variable
>
>     table(X$race,X[,197])
>     prop.test(table(X$race,X[,197]))
>
> but I would like to be able to do so with all of my drugs, although I can't seem to make that work
>
>     for (i in 197:229) { table(X$race,X[,i]) prop.test(table(X$race,X[,i])) }

That's basically fine, just remember to print() the results when they are generated in a loop.

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
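The recoding and renaming steps discussed in this reply can be tried on a toy data frame. Everything about the data here is invented for illustration (column names, values, and the use of only two drug variables); only the `lapply()`/`factor()` and `sub()` idioms come from the thread.

```r
## Toy data standing in for X; all column names and values are made up
X <- data.frame(race = c(0, 2, 2, 0),
                aspirin_when = c(0, 1, 5, 3),
                heroin_when  = c(0, 0, 2, 0))

drugvar  <- c("aspirin_when", "heroin_when")
mylabels <- c("none", "last week", "last 3 month", "last year",
              "regular use at least 3 months", "unknown length of usage")

## Yes/no indicators first, while the codes are still numeric
Z <- as.data.frame(1 * (X[drugvar] > 0))
names(Z) <- sub("_when$", "_yn", names(Z))

## Then attach labels to the original codes, one factor per column
X[drugvar] <- lapply(X[drugvar], factor, levels = 0:5, labels = mylabels)

X <- cbind(X, Z)
str(X)
```

The order matters in this sketch: once a column has been turned into a factor, a comparison such as `> 0` is no longer meaningful, so the indicators are computed before the labels are applied.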
Re: [R] Non-Parametric Adventures in R
Jamesp james.jrp...@gmail.com [Sat, Oct 02, 2010 at 11:27:09PM CEST]:
[...]
> 1) I was thinking I'd have to go through each nominal variable (i.e. table(X$race) ), but I think I have it figured out now. summary(X) is nice, but I need to recode nominal data with labels so the results are meaningful.

Labels are not a concept which comes with R-base. You may want to try the Hmisc package and the label and describe functions. Unfortunately, reporting functions in R-base make no use of labels.

> 2) I had an issue with multiple plots overwriting each other, and I managed to bypass that with: par(mfrow=c(2,1)) I have to update it to correspond to the number of plots I think. There's probably a better way to do this.

Try for example

    pdf("yourfilename.pdf")
    ... plotting routines ...
    dev.off()

R does not provide a graphics browser by itself, only one graphic window, so you may want to use the capabilities of external programs such as your favourite pdf viewer.

> barplot(table(X$race)) prints out a barplot so that's great

plot(table(numeric variable)) draws barplots with a scaled x axis, which I think is even greater when looking at integer random variables.

> 3) I was able to code my data so it shows up in tables better with
>     X$race <- factor(X$race, levels = c(0,2), labels = c("African American","White,Non-Hispanic"))
>
> 4) The coding for all of my drug variables is identical, and I'd like to create a loop that goes through and labels accordingly

Cycle over the column names, one example:

    x <- data.frame(replicate(8, sample(as.factor(c("Black", "Asian", "White", "Hispanic", "Native")), 20, replace=TRUE)), stringsAsFactors = TRUE)
    for (col in c("X2", "X3", "X4")) {
      levels(x[[col]])[c(2, 5)] <- c("African American", "White, non-Hispanic")
    }

Generally, the use of loops is not encouraged. Here it is a simple thing to do, as you need the modification of x as a side effect.

> 5) I had more success creating new variables based on the old ones. So I end up with yes/no answers to drug usage
>     for (i in 24:56) { X[,i+173] <- ifelse(X[,i] > 0,c(1),c(0)) }
> I'd like to have been able to make a new variable name based off of the old variable name (i.e. dropping _when from the end of each and replace it with _yn)

Untested, but along these lines (please provide a small data example with your questions so they can be addressed more directly):

    for (col in grep("_when$", colnames(X), value = TRUE)) {
      X[, sub("_when$", "_yn", col)] <- ifelse(X[, col] > 0, 1, 0)
    }

if you insist on coding your _yn variables as numeric. In R, the data type boolean exists, so it would be more idiomatic to simply have X[, col] > 0 without the ifelse() construct.

> 6) I'm able to make a cross-tabulated table and perform a X-squared test just fine with my recoded variable
>     table(X$race,X[,197])
>     prop.test(table(X$race,X[,197]))
> but I would like to be able to do so with all of my drugs, although I can't seem to make that work
>     for (i in 197:229) { table(X$race,X[,i]) prop.test(table(X$race,X[,i])) }

In my toy example:

    apply(x[, -1], 2, function(vec) fisher.test(table(x[, 1], vec)))

Note the non-use of a loop here, the upside being that a list of test results is returned (which you'd have to build yourself if using a loop). I couldn't apply a prop test here as I didn't have vectors of trials and successes, and I wonder how you got them out of your table() function.

If you don't understand a command, type ?commandname. If you have any further questions after reading up on the descriptions, feel free to post them here, but please provide toy examples of your own.

--
Johannes Hüsing
mailto:johan...@huesing.name
http://derwisch.wikidot.com
There is something fascinating about science. One gets such wholesale returns of conjecture from such a trifling investment of fact. (Mark
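Both remarks about item 6 (nothing is printed automatically inside a loop, and an apply-style call returns a list of results) can be seen in a small sketch. The data frame, its column names and the choice of `chisq.test()` are assumptions made for this example; `prop.test()` or `fisher.test()` from the posts could be substituted.

```r
set.seed(42)

## Toy data; 'race' and the *_yn columns are invented for illustration
X <- data.frame(race = sample(c("A", "B"), 200, replace = TRUE),
                aspirin_yn = rbinom(200, 1, 0.5),
                heroin_yn  = rbinom(200, 1, 0.2))

yn_cols <- grep("_yn$", names(X), value = TRUE)

## Inside a loop nothing is auto-printed, so call print() explicitly
for (col in yn_cols) {
  print(table(X$race, X[[col]]))
}

## Or keep the test objects in a named list for later inspection
tests <- lapply(X[yn_cols], function(v) chisq.test(table(X$race, v)))
sapply(tests, function(t) t$p.value)
```

Storing the `htest` objects rather than only printing them makes it easy to pull out p-values or statistics for all drugs at once afterwards.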
Re: [R] Output Graphics GIF
> Date: Sat, 2 Oct 2010 23:59:50 -0300
> From: nilzabar...@gmail.com
> To: tal.gal...@gmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Output Graphics GIF
>
> On Mon, Sep 27, 2010 at 11:31 AM, Tal Galili wrote:
>> I am guessing you are saving the plot using the menu system. If that is the case, have a look at:
>> ?pdf
>> ?png
>> Generally, I like saving my graphics to pdf since it is vectorized.

btw, is SVG supported at all? Now that you mention it, that could be a good option for some plots. I just used pdf earlier for testing, but if you just have a simple plot as a picture then an image format should be a better choice. I've always complained about the cost-benefit of pdf compared to alternatives, but if used properly it can be a good choice in some cases (I think I tried to explain some objections I had to pdf files on the itext mailing list, a package which may be of interest to the other poster interested in manipulating pdf files). Use a format beneficial for the type of data you have.

>> Cheers,
>> Tal
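As a rough illustration of the devices mentioned in this thread: base R's grDevices provides `png()` for raster output and `svg()` for vector output (the latter needs cairo support in the R build). The file names, sizes and resolution below are arbitrary choices for the example, and the capability checks are there because not every installation supports both devices. A high-resolution PNG written this way could then be converted to GIF with an external tool such as ImageMagick, as the original poster already does for PostScript files.

```r
out_png <- file.path(tempdir(), "example.png")   # hypothetical file names
out_svg <- file.path(tempdir(), "example.svg")

if (capabilities("png")) {
  ## Raster output: raise 'res' together with the pixel size for sharper images
  png(out_png, width = 1600, height = 1200, res = 200)
  plot(1:10, main = "PNG device")
  dev.off()
}

if (capabilities("cairo")) {
  ## Vector output: scales without loss, like pdf()
  svg(out_svg, width = 7, height = 5)
  plot(1:10, main = "SVG device")
  dev.off()
}
```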
Re: [R] Include externally generated pdf in output (without Sweave)
Hello Dieter,

Looking at this thread (from 2005):
http://tolstoy.newcastle.edu.au/R/help/05/10/14320.html
it seems you can't read a pdf file into R (at least not then; I hope there has been an update since).

BUT you could potentially read an image file (for example, a tiff) using something like read.picture {SoPhy}, and then write that into the PDF you are creating in R.

The best thing would be a function that reads a vector file into R (and not only the pixels). Maybe such a function exists; you should check.

Either way, great question.

Tal
Re: [R] Deducer and Contingency Tables
Hi Jim, It might also be worth asking this in the Deducer Google group: http://groups.google.com/group/deducer?pli=1 Best, Tal Contact Details:--- Contact me: tal.gal...@gmail.com | 972-52-7275845 Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) | www.r-statistics.com (English) -- On Sun, Oct 3, 2010 at 1:40 AM, Jim Watkins cjwat...@gwi.net wrote: Is there a way to use Deducer to analyze contingency table data that is only available in a row-by-column summary form (not a data frame)? Thanks. Jim Watkins
[R] Package for converting R datasets into SQL Server (create table and insert statements)?
Hi, R contains many good datasets which would be valuable on other platforms as well. My intention is to use R datasets on SQL Server as sample tables. Is there a package that would automatically convert a dataset's schema into a SQL Server CREATE TABLE statement (and INSERT INTO statements)? For example,

  str(cars)
  'data.frame': 50 obs. of 2 variables:
   $ speed: num 4 4 7 7 8 9 10 10 10 11 ...
   $ dist : num 2 10 4 22 16 10 18 26 34 17 ...

would become

  create table dbo.cars (
    id int identity(1,1) not null,
    speed int not null,
    dist int not null,
    constraint PK_id primary key clustered (id ASC) on [PRIMARY]
  )
  insert into dbo.cars values (N'4', N'2'), (N'4', N'10'), (N'7', N'4'), etc.

-J
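One way to approach the request above without a dedicated package is to build the statements as strings in base R. The following is only a minimal sketch: the helper names sql_type() and df_to_sql(), the type mapping, and the constraint name are assumptions made up for illustration, not functions from any package, and the sketch assumes there are no missing values. Whole-number numeric columns are mapped to int to mirror the example in the question.

```r
# Hypothetical helper: map an R column to a SQL Server column type.
sql_type <- function(x) {
  if (is.factor(x) || is.character(x)) return("nvarchar(255)")
  if (is.logical(x)) return("bit")
  if (is.numeric(x) && all(x == round(x))) return("int")
  "float"
}

# Hypothetical helper: quote the values of one column for an INSERT.
quote_value <- function(x) {
  if (is.logical(x)) return(as.character(as.integer(x)))
  if (is.numeric(x)) return(as.character(x))
  paste0("N'", gsub("'", "''", as.character(x), fixed = TRUE), "'")
}

# Hypothetical helper: return the CREATE TABLE statement followed by
# one INSERT statement per row, all as character strings.
df_to_sql <- function(df, table, schema = "dbo") {
  stopifnot(!any(is.na(df)))  # assumption: no missing values
  cols <- paste0("  ", names(df), " ", vapply(df, sql_type, character(1)),
                 " not null", collapse = ",\n")
  create <- paste0("create table ", schema, ".", table, " (\n",
                   "  id int identity(1,1) not null,\n", cols, ",\n",
                   "  constraint PK_", table, "_id primary key clustered (id ASC)\n)")
  vals <- do.call(paste, c(unname(lapply(df, quote_value)), sep = ", "))
  insert <- paste0("insert into ", schema, ".", table, " (",
                   paste(names(df), collapse = ", "), ") values (", vals, ");")
  c(create, insert)
}

stmts <- df_to_sql(head(cars, 3), table = "cars")
cat(stmts, sep = "\n")
```

If a live database connection is available, it may be simpler to let a package such as RODBC (sqlSave) or DBI (dbWriteTable) create and fill the table directly instead of generating scripts.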
Re: [R] Include externally generated pdf in output (without Sweave)
Hi, Check the grImport package (I think it has a vignette, perhaps on Paul Murrell's homepage). HTH, baptiste
Re: [R] Tinn R
try eclipse,dude http://www.walware.de/goto/statet Eclipse Plug-In for R: StatET Homepage R Project www.r-project.org Homepage Eclipse www.eclipse.org This is an Eclipse plug-in, supporting you to write R scripts and documentations. R is a language and environment for statistical computing and graphics. The Eclipse Project provides a kind of universal tool platform - an open extensible IDE for anything and nothing in particular. R, the Eclipse IDE, and StatET are open source software, available for many operating systems. Ajay Websites- http://decisionstats.com http://dudeofdata.com Linkedin- www.linkedin.com/in/ajayohri On Sun, Oct 3, 2010 at 2:52 PM, Philippe Glaziou glaz...@gmail.com wrote: On 2 October 2010 19:21, Tal Galili tal.gal...@gmail.com wrote: Hi Raphael, Why won't you try notepad++ with npptor ? It does almost everything tinnR does. While alternatives to popular windows editors are being mentioned here, I feel like Gvim (http://www.vim.org/) along Vim-R-plugin2 (http://www.vim.org/scripts/script.php?script_id=2628) should be cited. The Vim-R-plugin developer recently added windows support to a lean cross-platform package that works really very well. Philippe
Re: [R] Include externally generated pdf in output (without Sweave)
Tal Galili wrote: You could potentially read an image file (like, for example, tiff) using something like read.picture {SoPhy} Thanks to you and Baptiste Auguie. It looks like the best way would be to import the picture as pixel graphics with read.picture. It might be possible to read ps with grImport (Murrell's package, suggested by Baptiste), but I have always found the postscript detour messy. Dieter -- View this message in context: http://r.789695.n4.nabble.com/Include-externally-generated-pdf-in-output-without-Sweave-tp2953057p2953182.html
Re: [R] Interpreting the example given by Frank Harrell in the predict.lrm {Design} help
Thanks Frank and Greg, This makes a lot more sense to me now. I appreciate that you are both very busy, but I was wondering if I could trouble you for one last piece of advice, as my data is a little complicated for a first effort at R, let alone modelling!

The response is on a scale from 1 to 6 which indicates extinction risk (1 being least concern and 6 being critical), hence the ordinal model. The factors (6) are categorical: FRUIT TYPE (fleshy/dry), HABITAT (terrestrial, aquatic, epiphyte, etc.) and so on. I am asking the question: how do different combinations of factors affect extinction risk?

Based on what you have both said, I have called predict(model1, type="fitted"). Would this be the best way of predicting the probability of falling into each response category?

          y>=2        y>=3         y>=4         y>=5         y>=6
1  0.502220616 0.410236021 0.2892270912 0.2191420568 0.1774250519
2  0.745221699 0.668501579 0.5412223837 0.4486151612 0.3847379442
3  0.720381333 0.639796647 0.5095814746 0.4174618165 0.3551631876
4  0.752321112 0.676811675 0.5505781183 0.4579680710 0.3937100283
5  0.824388319 0.763956402 0.6543788296 0.5663098186 0.5008981585
6  0.824388319 0.763956402 0.6543788296 0.5663098186 0.5008981585
7  0.824388319 0.763956402 0.6543788296 0.5663098186 0.5008981585
8  0.824388319 0.763956402 0.6543788296 0.5663098186 0.5008981585
9  0.526291649 0.433739868 0.3094355120 0.2360800803 0.1919312111
10 0.751069260 0.675342871 0.5489179746 0.4563036514 0.3921102030

I have 100 species for which I have the factors and whose response I want to predict. Should I do the above with the newdata argument and present the probabilities as above, rather than trying to classify them? I tried polr, and that classified each response as either 1 or 6 (i.e. no 2, 3, 4, 5), as did calling predict(model1, type="fitted.ind"), where the probabilities of being 1 or 6 far outweigh those of 2, 3, 4, 5 (below). This may just be because my model is not powerful enough to discriminate effectively, as I know that is incorrect (Brier score 2.01, AUC 66.9)?

   EXTINCTION=1 EXTINCTION=2 EXTINCTION=3 EXTINCTION=4 EXTINCTION=5 EXTINCTION=6
1     0.4977794 0.0919845942  0.121008930  0.070085034 0.0417170048 0.1774250519
2     0.2547783 0.0767201200  0.127279196  0.092607223 0.0638772170 0.3847379442
3     0.2796187 0.0805846862  0.130215173  0.092119658 0.0622986289 0.3551631876
4     0.2476789 0.0755094367  0.126233557  0.092610047 0.0642580427 0.3937100283
5     0.1756117 0.0604319173  0.109577572  0.088069011 0.0654116601 0.5008981585
6     0.1756117 0.0604319173  0.109577572  0.088069011 0.0654116601 0.5008981585
7     0.1756117 0.0604319173  0.109577572  0.088069011 0.0654116601 0.5008981585
8     0.1756117 0.0604319173  0.109577572  0.088069011 0.0654116601 0.5008981585
9     0.4737084 0.0925517814  0.124304356  0.073355432 0.0441488692 0.1919312111
10    0.2489307 0.0757263892  0.126424896  0.092614323 0.0641934484 0.3921102030

Thanks very much for any advice given, John

On 1 Oct 2010, at 23:13, Frank Harrell wrote: Well put Greg. The job of the statistician is to produce good estimates (probabilities in this case). Those cannot be translated into action without subject-specific utility functions. Classification during the analysis or publication stage is not necessary. Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Frank-Harrell-in-the-predict-lrm-Design-help-tp2883311p2951976.html
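The two tables in the message above appear to be related by simple differencing: the first reads as cumulative probabilities P(Y >= j) and the second as individual probabilities P(Y = j). A small base-R sketch, using only the numbers posted for species 1 and 2 (the object names are made up for illustration):

```r
# Cumulative probabilities P(Y >= j), j = 2..6, copied from rows 1 and 2 above.
cum <- rbind(
  c(0.502220616, 0.410236021, 0.2892270912, 0.2191420568, 0.1774250519),
  c(0.745221699, 0.668501579, 0.5412223837, 0.4486151612, 0.3847379442)
)

# P(Y = j) = P(Y >= j) - P(Y >= j + 1), with P(Y >= 1) = 1 and P(Y >= 7) = 0.
upper <- cbind(1, cum)
lower <- cbind(cum, 0)
ind <- upper - lower
colnames(ind) <- paste0("EXTINCTION=", 1:6)

print(round(ind, 7))     # agrees with the type="fitted.ind" rows above
print(rowSums(ind))      # each row sums to 1

# One possible numeric summary that avoids classification:
# the expected category score per species.
expected_score <- as.vector(ind %*% 1:6)
print(expected_score)
```

Because the middle categories are differences of neighbouring cumulative probabilities, they can each be small even when, taken together, they carry substantial probability. That may be why picking the single most likely category returns only 1 or 6 here, and it is one more reason to report the probabilities themselves.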
[R] A problem about nomogram--thank you for you help
Dear professor: I am a urologist, and I am developing a nomogram for bladder tumours. Now I have a problem with this. I obtained the following result by analysing the dataset exp11.sav with multinomial logistic regression in SPSS 17.0 (the Sig. is high, that is fine, it is just experimental data).

Parameter Estimates

Y(a)  Parameter    B       Std. Error  Wald   df  Sig.  Exp(B)  95% Confidence Interval for Exp(B)
                                                                Lower Bound  Upper Bound
1     Intercept   -1.338   .595        5.059  1   .024
      T.Grade       .559   .319        3.076  1   .079  1.749   .936         3.265
      Sex           .920   .553        2.766  1   .096  2.511   .849         7.428
      Smoking      -.896   .474        3.580  1   .058   .408   .161         1.033
a. The reference category is: 0.

After that, I want to develop the nomogram in R. I load the package rms and run:

T.Grade <- factor(0:3, labels=c("G0", "G1", "G2", "G3"))
Sex <- factor(0:1, labels=c("F", "M"))
Smoking <- factor(0:1, labels=c("No", "yes"))
L <- 0.559T.Grade-0.896Smoking+0.92Sex-1.338   # error (Error: attempt to apply non-function)

R reports an error on the last line. Can you tell me where the mistake is, and how to get the correct equation? Thank you for your help! And I am sorry about my poor English! Truly yours
Re: [R] A problem about nomogram--thank you for you help
Firstly, `*` is the multiplication operator in R. Secondly, you'll need to convert your factors to numerics: L <- 0.559*as.numeric(T.Grade) - 0.896*as.numeric(Smoking) + 0.92*as.numeric(Sex) - 1.338 Cheers, Jeff.
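One caveat when converting factors this way: as.numeric() on a factor returns the internal level codes, which start at 1, not the 0-based values passed to factor(). The sketch below shows the difference and evaluates the linear predictor for one made-up patient. It assumes that the SPSS model used the 0-based codings (0..3 for T.Grade, 0/1 for Sex and Smoking), that T.Grade entered as a single numeric covariate, and that the outcome has two categories; none of this is confirmed in the thread.

```r
T.Grade <- factor(0:3, labels = c("G0", "G1", "G2", "G3"))
Sex     <- factor(0:1, labels = c("F", "M"))
Smoking <- factor(0:1, labels = c("No", "yes"))

print(as.numeric(T.Grade))      # internal codes start at 1
print(as.numeric(T.Grade) - 1)  # back to the assumed 0-based coding

# Hypothetical patient: grade G2, male, smoker.
g <- as.numeric(factor("G2",  levels = levels(T.Grade))) - 1
s <- as.numeric(factor("M",   levels = levels(Sex)))     - 1
k <- as.numeric(factor("yes", levels = levels(Smoking))) - 1

# Linear predictor using the coefficients from the SPSS table.
L <- 0.559 * g - 0.896 * k + 0.92 * s - 1.338
p <- plogis(L)   # inverse logit, assuming a two-category outcome
print(c(L = L, p = p))
```

For an actual nomogram it is probably more convenient to refit the model in R, so that the plotting function can work from the fitted object rather than from hand-copied coefficients.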
Re: [R] Interpreting the example given by Frank Harrell in the predict.lrm {Design} help
You still seem to be hung up on making arbitrary classifications. Instead, look at tendencies using odds ratios or rank correlation measures. My book Regression Modeling Strategies covers this. Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Frank-Harrell-in-the-predict-lrm-Design-help-tp2883311p2953220.html
Re: [R] A problem about nomogram--thank you for you help
Please take the time to study the subject matter, and note that a nomogram is just a graphical method. It is not a statistical model or a process. Frank - Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/A-problem-about-nomogram-thank-you-for-you-help-tp2953209p2953221.html
Re: [R] tyring to save plots using windoze 7 and cygwin
Date: Sat, 2 Oct 2010 16:35:03 -0700 Subject: Re: [R] tyring to save plots using windoze 7 and cygwin From: jwiley.psych gmail.com To: marchy...@hotmail.com CC: r-help@r-project.org

Hi Mike,

sessionInfo()
R version 2.11.1 (2010-05-31) x86_64-pc-mingw32

sessionInfo()
R version 2.11.1 (2010-05-31) i386-pc-mingw32
locale: [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 [4] LC_NUMERIC=C [5] LC_TIME=English_United States.1252
attached base packages: [1] stats graphics grDevices utils datasets methods base

# Initialize pdf (or whatever) device
pdf("myfile.pdf")
# plot your graph
plot(xyz, main="imp rate", xlab="Time(GMT)", ylab="imp/minute")
# add the grid lines
grid()
# shut the device down
dev.off()

yes, that works fine thanks. I guess it has been a while :) Hopefully now I can find the 3D plotting stuff and other things I need as I seem to recall many years ago it took a while to find... Thanks.

You would use a similar process for postscript(), png(), etc. What version of R are you using? I do not have the save.plot() function (at least in the packages that load by default). You can learn more by poking around the help pages ?dev.copy ?Devices ?windows HTH, Josh

On Sat, Oct 2, 2010 at 3:43 PM, Mike Marchywka wrote: Hi, I'd been using R in the past and recently installed it on a new windoze 7 machine. There have been many issues with compatibility and 32/64 bit apps etc, and I did find on google one isolated complaint that saveplot failed in scripts a long time ago. R seems to work fine except that script-based plot saving as pdf has not worked. I have tried the following, none of which seem to function:

xyz <- read.table("time_frac2")
x=plot(xyz, main="imp rate", xlab="Time(GMT)", ylab="imp/minute")
grid()
dev2bitmap("xxx.pdf", type="pdfwrite")
save.plot(x, file="xxx.pdf", format="pdf")
dev.copy(pdf, "auto_pdf.pdf")
dev.off()
savePlot("./auto_hit_rate.pdf", type="pdf")
q()

Now apparently R does save the plot in a default file Rplots.pdf, which is just fine for my immediate needs, but this may have limitations for future usages. Just curious to know what others may have gotten to work or not work. Thanks. - - - - - - Mike Marchywka | V.P. Technology 415-264-8477 marchy...@phluant.com Online Advertising and Analytics for Mobile http://www.phluant.com

-- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
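For reference, the device-based approach that worked in this thread can be written as a self-contained script. This is a sketch only: the output path and the data are made up so that the example runs anywhere, standing in for the poster's own file.

```r
# Hypothetical output path; any writable location will do.
out <- file.path(tempdir(), "imp_rate.pdf")

# Made-up data standing in for read.table("time_frac2").
set.seed(42)
xyz <- data.frame(time = 1:20, rate = cumsum(rnorm(20)))

pdf(out, width = 7, height = 5)   # open the file device before plotting
plot(xyz, main = "imp rate", xlab = "Time(GMT)", ylab = "imp/minute")
grid()
invisible(dev.off())              # the file is complete only after this call

print(file.exists(out))
```

When a script runs non-interactively and no device has been opened, plot() falls back to a pdf device writing Rplots.pdf, which is the default file mentioned above. savePlot() copies from an on-screen device, so it is generally not what you want in a batch script.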
Re: [R] Ranked Set Sampling
Ahmed Albatineh wrote: Are you aware of any package that calculates Ranked Set Sample? If you have a code that you are willing to share, I will acknowledge that in my work. Thanks much Ahmed I wonder if this is a phrase that is uniformly understood? One possibility is that you are asking to sample elements of a set based on some ranking function. In that case you may need to describe in more detail how you want to handle ties and whether this function is supposed to deal with multivariate strata. (There are many base functions that handle univariate situations and there are packages that provide support for more complex ones.) You are also requested (in the Posting Guide) to provide an example that can be cut and pasted, and desired results against which responders can judge the degree to which their efforts agree with your hopes. -- David. -- View this message in context: http://r.789695.n4.nabble.com/Ranked-Set-Sampling-tp2953108p2953304.html
Re: [R] Help with panel.text in Lattice - Putting labels for co-oridnates in a plot
RaoulD wrote: Hi, I am trying to create a Lattice dotplot that has the following data graphed. I need to put labels for each of the co-ordinates on the plot. I have managed to get only one label displayed as I don't completely understand the panel.text function. Can someone please help me?

# Sub Reason is a text field that I need to see the volumes for (Vols)
dotplot(DU_Summary_plotdata$SubReason ~ DU_Summary_plotdata$Vols, horiz=TRUE, main="Top Sub-Reasons - Volumes (90% of Volumes)", family="serif", font=2, xlab="Volumes", ylab="Sub-Reasons", labels=DU_Summary_plotdata$Vols, pch=, cex=1.5,
  panel = function(x, y, ...) {
    panel.dotplot(x, y, ...)
    panel.text(1, 2, labels=DU_Summary_plotdata$Vols, pos=4)
  })

The dataset DU_Summary_plotdata is made up of:
SubReason <- c(SR_1, SR_2, SR_3, SR_4, SR_5, SR_6, SR_7, SR_8)
Vols <- c(33827, 17757, 11404, 5999, 5305, 3515, 3051, 1924)
Thanks, Raoul

Several problems with your example and explanation:

1) It will not run as posted, since your plot call expects the data to be in a dataframe and you only constructed 2 vectors.

2) Even if you naively put the two vectors in a dataframe, the SubReason column would expect to be able to find objects with those names, so I suspect you should have quoted them. Trying to fix these:

DU_Summary_plotdata <- data.frame(SubReason=c('SR_1', 'SR_2', 'SR_3', 'SR_4', 'SR_5', 'SR_6', 'SR_7', 'SR_8'), Vols=c(33827, 17757, 11404, 5999, 5305, 3515, 3051, 1924))

produced a single text value with a set of eight blue arrows at the indicated horizontal locations.

3) But the panel.text function was only given a single location at which to plot the values. (The interpreter plotted the first value at point (1,2) with left justification.) Trying instead:

dotplot(DU_Summary_plotdata$SubReason ~ DU_Summary_plotdata$Vols, horiz=TRUE, main="Top Sub-Reasons - Volumes (90% of Volumes)", family="serif", font=2, xlab="Volumes", ylab="Sub-Reasons", labels=DU_Summary_plotdata$Vols, pch=, cex=1.5,
  panel = function(x, y, ...) {
    panel.dotplot(x, y, ...)
    panel.text(DU_Summary_plotdata$Vols, 1:length(DU_Summary_plotdata$SubReason), labels=DU_Summary_plotdata$Vols, pos=4)
  })
# you need to offer the x value first and the y-values are ascending integers.

It seems close to what you perhaps wanted (but did not really explain very fully). It has the defect that the right-most value is partly off the plot area. That can be remedied by changing the pos=4 argument to pos=1. -- David. -- View this message in context: http://r.789695.n4.nabble.com/Help-with-panel-text-in-Lattice-Putting-labels-for-co-oridnates-in-a-plot-tp2952919p2953317.html
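A compact variant of the corrected call is sketched below. It uses the data= argument and labels each point from the x and y values that lattice passes to the panel function, instead of reaching back into the data frame. The choices pch = 16 and the widened xlim are assumptions added for illustration (the original pch value is not recoverable from the post), and the widened axis is offered as an alternative to switching to pos=1.

```r
library(lattice)

DU_Summary_plotdata <- data.frame(
  SubReason = factor(paste0("SR_", 1:8)),
  Vols = c(33827, 17757, 11404, 5999, 5305, 3515, 3051, 1924)
)

p <- dotplot(SubReason ~ Vols, data = DU_Summary_plotdata,
             main = "Top Sub-Reasons - Volumes (90% of Volumes)",
             xlab = "Volumes", ylab = "Sub-Reasons",
             pch = 16, cex = 1.5,   # pch = 16 is an assumption
             # extra room on the right so the largest label stays inside
             xlim = c(0, 1.15 * max(DU_Summary_plotdata$Vols)),
             panel = function(x, y, ...) {
               panel.dotplot(x, y, ...)
               # one label per point, placed to the right of the dot
               panel.text(x, as.numeric(y), labels = x, pos = 4)
             })
print(p)
```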
Re: [R] Output Graphics GIF
On Sun, Oct 3, 2010 at 5:47 AM, Mike Marchywka marchy...@hotmail.com wrote: Date: Sat, 2 Oct 2010 23:59:50 -0300 From: nilzabar...@gmail.com To: tal.gal...@gmail.com CC: r-help@r-project.org Subject: Re: [R] Output Graphics GIF On Mon, Sep 27, 2010 at 11:31 AM, Tal Galili wrote: I am guessing you are saving the plot using the menu system. If that is the case, have a look at: ?pdf ?png Generally, I like saving my graphics to pdf since it is vectorized. btw, is SVG supported at all? Now that you mention it, that could be a good option for some plots. I just used pdf earlier for testing, but if you just have a simple plot as a picture then an image format should be a better choice. I've always complained about the cost-benefit of pdf compared to alternatives, but if used properly it can be a good choice in some cases (I think I tried to explain some objections I had to pdf files on the itext mailing list; itext is a package which may be of interest to the other poster interested in manipulating pdf files). Use a format beneficial for the type of data you have. ...and basically never ever use JPEG for your scientific graphs - it's evil! Its driver should be hidden away in some obscure package far far away, because too many people still use it. /Henrik Cheers, Tal
Re: [R] Ranked Set Sampling
On 10/03/2010 06:32 PM, David Winsemius wrote: Ahmed Albatineh wrote: Are you aware of any package that calculates Ranked Set Sample? If you have a code that you are willing to share, I will acknowledge that in my work. Thanks much Ahmed I wonder if this is a phrase that is uniformly understood? One possibility is that you are asking to sample elements of a set based on some ranking function. In that case you may need to describe in more detail how you want to handle ties and whether this function is supposed to deal with multivariate strata. (There are many base functions that handle univariate situations and there are packages that provide support for more complex ones.) You are also requested (in the Posting Guide) to provide an example that can be cut and pasted, and desired results against which responders can judge the degree to which their efforts agree with your hopes. It's a fairly well-defined concept. I.e., you can google for it... On the other hand, the same search points to authors like Jeff Terpstra, who explicitly says that he codes in R, so maybe ask him instead of the world at large? -- Peter Dalgaard Center for Statistics, Copenhagen Business School Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
[R] scatterplot error message
Hi All. I am a new R user. Trying to do a scatterplot. Not sure how to resolve this error message:

A <- subset(ErablesGatineau, station=="A")
B <- subset(ErablesGatineau, station=="B")
plot(diam ~ biom)
abline(lm(diam ~ biom), col = "red")
goodcases <- !(is.na(diam) | is.na(biom))
lines(lowess(diam[goodcases] ~ biom[goodcases]))
library(car)
scatterplot(diam ~ biom, reg.line = lm, smooth = TRUE,
+ labels = FALSE, boxplots = FALSE, span = 0.5, data = A)
Error in `row.names<-.data.frame`(`*tmp*`, value = FALSE) : invalid 'row.names' length
[R] readBin which has two different sizes
Hi, I have a binary file which has the following structure: 1) Some header in the beginning 2) Thousands of 216-byte data sub-grouped into 4 54-byte data structured as 4-byte time stamp (big endian) followed by 50 1-byte (8-bit) samples. So far this is how I am trying:

#Open a connection for binary file
to.read <- file("binary file", "rb")
#Read header
info <- readBin(to.read, character(), 1)
#Read data: byte=1, n=number of data (i know this from header) * 216 (bytes per data)
data <- readBin(to.read, integer(), size=1, n=35269*216, signed=TRUE, endian = "big")

I am able to read the header but obviously having trouble in that last line because my data has two sizes: 4-byte time stamp and one-byte (8-bit) samples. Also, one is unsigned and the other is signed. How do I read these two differently sized, signed data? Would appreciate any help, thanks. -- View this message in context: http://r.789695.n4.nabble.com/readBin-which-has-two-different-sizes-tp2953365p2953365.html
Re: [R] Ranked Set Sampling
This is certainly not my area of expertise, but as Peter mentioned, Jeff Terpstra published this: http://www.jstatsoft.org/v14/i07 which has R code listed as supplements. Joe McKean seems to keep an updated version of that code here: http://www.stat.wmich.edu/red5328/WWest/ And Brent Johnson has extended that code for variable selection/regression: http://userwww.service.emory.edu/~bajohn3/software.html Maybe that's a start; if you find more by following Peter's suggestion of privately contacting authors, please follow up with the list. Cheers, Jeff.
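For readers unfamiliar with the term, here is a minimal base-R sketch of balanced ranked set sampling as it is usually described: for each rank r = 1..k, draw a fresh set of k units, rank them, and measure only the r-th ranked unit; repeat for m cycles. The function name rss_sample() is made up for illustration and is not taken from any of the sources linked in this thread. The sketch assumes perfect ranking (units are ranked on the measured variable itself) and does not stop a unit from being drawn into more than one set.

```r
# Hypothetical helper: balanced ranked set sample with perfect ranking.
# Returns an m x k matrix; column r holds the r-th ranked unit of its set.
rss_sample <- function(population, k, m) {
  out <- matrix(NA_real_, nrow = m, ncol = k)
  for (cycle in seq_len(m)) {
    for (r in seq_len(k)) {
      set <- sample(population, k)    # draw a fresh set of k units
      out[cycle, r] <- sort(set)[r]   # keep only the r-th ranked unit
    }
  }
  out
}

set.seed(1)
pop <- rnorm(10000, mean = 50, sd = 10)   # made-up population
rss <- rss_sample(pop, k = 3, m = 20)     # 3 * 20 = 60 measured units

print(dim(rss))
print(mean(rss))   # ranked-set-sample estimate of the population mean
```

In practice the ranking is normally done by judgment or on a cheap auxiliary variable rather than on the measurement itself, so this is only the idealised case.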
Re: [R] scatterplot error message
On Oct 3, 2010, at 1:58 PM, nisaf wrote: Hi All. I am a new R user. Trying to do a scatterplot. Not sure how to resolve this error message:

A <- subset(ErablesGatineau, station=="A")
B <- subset(ErablesGatineau, station=="B")
plot(diam ~ biom)

Did you also attach either (or both???) of those data.frames?

abline(lm(diam ~ biom), col = "red")
goodcases <- !(is.na(diam) | is.na(biom))
lines(lowess(diam[goodcases] ~ biom[goodcases]))
library(car)
scatterplot(diam ~ biom, reg.line = lm, smooth = TRUE,
+ labels = FALSE, boxplots = FALSE, span = 0.5, data = A)
Error in `row.names<-.data.frame`(`*tmp*`, value = FALSE) : invalid 'row.names' length

Given the fact that you deemed it necessary to pre-qualify your data when you constructed goodcases, the invalid row.names length error suggests that, when using the reg.line argument, you may need to supply scatterplot() with data that has been similarly culled of NA values. ?complete.cases ?na.omit Perhaps try with data=A[goodcases, ] -- David.

David Winsemius, MD West Hartford, CT
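The suggestion to remove incomplete rows first, and to avoid relying on attach(), can be sketched in base R as follows. The data frame here is made up to stand in for the subset A, and the sketch only illustrates the NA-culling step; it does not establish that missing values are what triggered the scatterplot() error.

```r
# Made-up data with missing values, standing in for the subset A.
A <- data.frame(biom = c(1.2, 2.5, NA, 4.1, 5.3, 6.0),
                diam = c(3.1, NA, 5.0, 7.9, 9.8, 11.5))

# Rows where both variables are observed.
goodcases <- complete.cases(A[, c("diam", "biom")])
A_ok <- A[goodcases, ]   # here this keeps the same rows as na.omit(A)

# Passing data= explicitly means nothing needs to be attached.
plot(diam ~ biom, data = A_ok)
abline(lm(diam ~ biom, data = A_ok), col = "red")
lines(lowess(A_ok$biom, A_ok$diam))

print(nrow(A_ok))
```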
Re: [R] readBin which has two different sizes
On Sun, Oct 3, 2010 at 10:59 AM, Ab Hu master.rs...@yahoo.com wrote:

Hi, I have a binary file which has the following structure: 1) Some header in the beginning 2) Thousands of 216-byte data sub-grouped into 4 54-byte data structured as 4-byte time stamp (big endian) followed by 50 1-byte (8-bit) samples. So far this is how I am trying:

#Open a connection for binary file
to.read <- file("binary file", "rb")
#Read header
info <- readBin(to.read, character(), 1)
#Read data: byte=1, n=number of data (i know this from header) * 216 (bytes per data)
data <- readBin(to.read, integer(), size=1, n=35269*216, signed=TRUE, endian = "big")

I am able to read the header but obviously having trouble in that last line because my data has two sizes; 4-byte time stamp and one byte (8-bit) samples. Also, one is unsigned and other is signed.

Outline:
1. Use x <- readBin(..., what="raw", n=35269*(54*4)) to read your raw (byte) data.
2. Turn it into a 54x4x35269 array, e.g. dim(x) <- c(54,4,35269).
3. Extract the 4-byte time stamps by yT <- x[1:4,,,drop=FALSE]; This is of type raw. Use readBin() to parse it, i.e. zT <- readBin(yT, what="integer", size=4, n=4*35269, signed=TRUE, endian="big"). These are your timestamps.
4. Extract the 50 1-byte samples by yS <- x[5:54,,,drop=FALSE]; zS <- readBin(yS, what="integer", size=1, n=50*4*35269, signed=FALSE);
Note that readBin() reads only one value unless n is given, hence the n arguments in steps 3 and 4.
Something like that.

/Henrik

How do i read these two differently sized, signed data? Would appreciate any help, thanks.
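The outline above can be tried on a small in-memory example. This is only a sketch: the three records are synthetic, the helper `make_record` is made up for illustration, and it assumes (as in the outline) that the samples are the unsigned part. The header and the real file are not modelled.

```r
# Build one 54-byte record: 4-byte big-endian time stamp + 50 one-byte samples
make_record <- function(ts, samples) {
  c(writeBin(as.integer(ts), raw(), size = 4, endian = "big"),
    as.raw(samples))
}

# Three synthetic records standing in for the data section of the file
recs <- c(make_record(1000L, 0:49),
          make_record(2000L, 50:99),
          make_record(3000L, 206:255))

# One column per 54-byte record
x <- recs
dim(x) <- c(54, length(recs) / 54)

# Rows 1-4 hold the time stamps; parse them as 4-byte big-endian integers
ts <- readBin(as.vector(x[1:4, ]), what = "integer", size = 4,
              n = ncol(x), endian = "big")

# Rows 5-54 hold the samples; as.integer() on raw gives unsigned values 0..255
samples <- matrix(as.integer(x[5:54, ]), nrow = 50)

# For signed one-byte samples one could instead use
# readBin(as.vector(x[5:54, ]), "integer", size = 1, n = 50 * ncol(x), signed = TRUE)
```

Treating the data as a 54 x (4 * 35269) matrix is equivalent to the 54 x 4 x 35269 array in the outline; the extra dimension only groups records into their 216-byte blocks.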
Re: [R] Ranked Set Sampling
On Oct 3, 2010, at 2:13 PM, Jeffrey Spies wrote:

This is certainly not my area of expertise, but like Peter mentioned, Jeff Terpstra published this: http://www.jstatsoft.org/v14/i07 which has R code listed as supplements. Joe McKean seems to keep an updated version of that code here: http://www.stat.wmich.edu/red5328/WWest/ And Brent Johnson has extended that code for variable selection/regression: http://userwww.service.emory.edu/~bajohn3/software.html Maybe that's a start; if you find more by following Peter's suggestion of privately contacting authors, please follow up with the list.

Yes; the search terms 'ranked set sampling r-project' do produce a very complete book-length piece: Robust Nonparametric Statistical Methods (2010) by Hettmansperger and McKean, which has a section on the question posed at printed pg 53 (pg 64 of the pdf version) and on the second printed page cites R code available at: http://www.stat.wmich.edu/mckean/Rfuncs/ (And I note that Terpstra and McKean are co-authors and at the same institution.)

Cheers, Jeff.

On Sun, Oct 3, 2010 at 1:38 PM, Peter Dalgaard pda...@gmail.com wrote:

On 10/03/2010 06:32 PM, David Winsemius wrote:

Ahmed Albatineh wrote: Are you aware of any package that calculates Ranked Set Sample? If you have a code that you are willing to share, I will acknowledge that in my work. Thanks much Ahmed

I wonder if this is a phrase that is uniformly understood? One possibility is that you are asking to sample elements of a set based on some ranking function. In that case you may need to describe in more detail how you want to handle ties and whether this function is supposed to deal with multivariate strata. (There are many base functions that handle univariate situations and there are packages that provide support for more complex ones.) You are also requested (in the Posting Guide) to provide an example that can be cut and pasted, and desired results against which responders can judge the degree to which their efforts agree with your hopes.

It's a fairly well-defined concept. I.e., you can google for it... On the other hand, the same search points to authors like Jeff Terpstra, who explicitly says that he codes in R, so maybe ask him instead of the world at large?

-- Peter Dalgaard Center for Statistics, Copenhagen Business School Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com

David Winsemius, MD West Hartford, CT
[R] Programmaticly finding number of processors by R code
Dear List,

Sorry if this question seems very basic. Is there a function to programmatically find the number of processors in my system? I want to pass this as a parameter to snow when converting some serial code to parallel code.

Regards, Ajay

Websites- http://decisionstats.com http://dudeofdata.com Linkedin- www.linkedin.com/in/ajayohri
Re: [R] Ranked Set Sampling
On Oct 3, 2010, at 3:01 PM, David Winsemius wrote:

Yes; the search terms 'ranked set sampling r-project' do produce a very complete book-length piece: Robust Nonparametric Statistical Methods (2010) by Hettmansperger and McKean, which has a section on the question posed at printed pg 53 (pg 64 of the pdf version) and on the second printed page cites R code available at: http://www.stat.wmich.edu/mckean/Rfuncs/

And that link does not succeed, but this one does: http://fisher.stat.wmich.edu/joe/Stat666/Rfuncs/

(And I note that Terpstra and McKean are co-authors and at the same institution.)

David Winsemius, MD West Hartford, CT
Re: [R] Programmaticly finding number of processors by R code
Without knowing your OS, there is no way anyone can tell you. And you probably want to know 'cores' rather than CPUs. And for some specific OSes, you will find answers in the archives.

Beware that this is not a well-defined question: are these physical or virtual cores? Also, having them in the system and being allowed to use them are different questions.

Package 'multicore' is one that attempts to do this in its function detectCores (see the source code). And on Sparc Solaris it is pretty useless as it gives virtual CPUs, 8x the number of real CPUs.

On Sun, 3 Oct 2010, Ajay Ohri wrote:

Dear List Sorry if this question seems very basic. Is there a function to programmatically find number of processors in my system? I want to pass this as a parameter to snow in some serial code to parallel code functions Regards Ajay

-- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
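As a hedged sketch for readers on current versions of R: the 'parallel' package, which ships with R since version 2.14.0 and therefore postdates this thread, also provides a detectCores() function. The caveats above still apply: the count may be of logical rather than physical cores, the result can be NA on some platforms, and it says nothing about how many cores you are allowed to use. The variable names below are illustrative only.

```r
library(parallel)

# Number of logical cores (may include hyper-threads / virtual CPUs)
n_logical <- detectCores(logical = TRUE)

# Number of physical cores, where the platform can report it (may be NA)
n_physical <- detectCores(logical = FALSE)

# A conservative worker count: leave one core free, never go below 1,
# and fall back to 1 if detection returned NA
n_workers <- max(1L, n_logical - 1L, na.rm = TRUE)

# The worker count could then be handed to a cluster constructor, e.g.
# cl <- makeCluster(n_workers); ...; stopCluster(cl)   # not run here
```

Leaving one core free is a design choice for interactive machines, not a requirement.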
[R] plyr: a*ply with functions that return matrices-- possible bug in aaply?
I have an application where I have a function to calculate results for a 2-way table or matrix, which returns a matrix with one less row and column. To keep this short, the function below captures the structure:

fun2way <- function(f){
  if (!length(dim(f)) == 2) stop("only for 2-way arrays")
  R <- dim(f)[1]
  C <- dim(f)[2]
  f[1:(R-1), 1:(C-1)]
}

Now, I want to extend this to higher-way arrays, using apply-like methods over the strata (all but the first two dimensions), and returning an array in which the last dimensions correspond to strata. That is, I want to define something like the following using an a*ply method, but aaply gives a result in which the applied .margin(s) do not appear last in the result, contrary to the documentation for ?aaply. I think this is a bug, either in the function or the documentation, but perhaps there's something I misunderstand for this case.

fun <- function(f, stratum=NULL) {
  L <- length(dim(f))
  if (L > 2 && is.null(stratum)) stratum <- 3:L
  if (is.null(stratum)) {
    result <- fun2way(f)
  } else {
    require(plyr)
    result <- aaply(f, stratum, fun2way)   ## order of dimensions screwed up!
  }
  result
}

For example, by hand (or with a loop) I can calculate the pieces and combine them as I want using abind():

# apply separately to strata
t1 <- fun2way(HairEyeColor[,,1])
t2 <- fun2way(HairEyeColor[,,2])
library(abind)
abind(t1, t2, along=3)
, , 1

      Brown Blue Hazel
Black    32   11    10
Brown    53   50    25
Red      10   10     7

, , 2

      Brown Blue Hazel
Black    36    9     5
Brown    66   34    29
Red      16    7     7

alply() gives me what I want, but with the strata as list elements, rather than an array:

library(plyr)
# strata define separate list elements
alply(HairEyeColor, 3, fun2way)
$`1`
       Eye
Hair    Brown Blue Hazel
  Black    32   11    10
  Brown    53   50    25
  Red      10   10     7

$`2`
       Eye
Hair    Brown Blue Hazel
  Black    36    9     5
  Brown    66   34    29
  Red      16    7     7

attr(,"split_type")
[1] "array"
attr(,"split_labels")
     Sex
1   Male
2 Female

However, with aaply(), dim[3] ends up as first dimension, not last:

# dim[3] ends up as first dimension, not last
aaply(HairEyeColor, 3, fun2way)
, , Eye = Brown

        Hair
Sex      Black Brown Red
  Female    36    66  16
  Male      32    53  10

, , Eye = Blue

        Hair
Sex      Black Brown Red
  Female     9    34   7
  Male      11    50  10

, , Eye = Hazel

        Hair
Sex      Black Brown Red
  Female     5    29   7
  Male      10    25   7

str(aaply(as.array(HairEyeColor), 3, fun2way))
 num [1:2, 1:3, 1:3] 36 32 66 53 16 10 9 11 34 50 ...
 - attr(*, "dimnames")=List of 3
  ..$ Sex : chr [1:2] "Female" "Male"
  ..$ Hair: chr [1:3] "Black" "Brown" "Red"
  ..$ Eye : chr [1:3] "Brown" "Blue" "Hazel"

## aaply should return this
aperm(aaply(HairEyeColor, 3, fun2way), c(2,3,1))
, , Sex = Female

       Eye
Hair    Brown Blue Hazel
  Black    36    9     5
  Brown    66   34    29
  Red      16    7     7

, , Sex = Male

       Eye
Hair    Brown Blue Hazel
  Black    32   11    10
  Brown    53   50    25
  Red      10   10     7

On the other hand, aaply() does work as I expect, with an array of size 2 x C x strata:

library(vcd)
fun2way(Employment[,,1])
  <1Mo  1-3Mo 3-12Mo  1-2Yr  2-5Yr
     8     35     70     62     56
fun2way(Employment[,,2])
  <1Mo  1-3Mo 3-12Mo  1-2Yr  2-5Yr
    40     85    181     85    118
aaply(Employment, 3, fun2way)
LayoffCause <1Mo 1-3Mo 3-12Mo 1-2Yr 2-5Yr
   Closure     8    35     70    62    56
   Replaced   40    85    181    85   118

-- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street Web: http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA
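For comparison, here is a minimal base-R sketch of the layout the post is asking for, with the stratum dimension last. It does not use plyr and says nothing about whether aaply's behaviour is a bug. The helper name `strata_last` is made up, it handles only a single stratum dimension (a 3-way array), and `fun2way` is given `drop = FALSE` (a change from the post) so that every piece stays two-dimensional.

```r
# As in the post, but with drop = FALSE so the result is always 2-way
fun2way <- function(f) {
  if (length(dim(f)) != 2) stop("only for 2-way arrays")
  R <- dim(f)[1]
  C <- dim(f)[2]
  f[1:(R - 1), 1:(C - 1), drop = FALSE]
}

# Apply fun2way to each slice of a 3-way array and stack the results
# so that the stratum is the LAST dimension of the output
strata_last <- function(f) {
  stopifnot(length(dim(f)) == 3)
  K <- dim(f)[3]
  pieces <- lapply(seq_len(K), function(k) fun2way(f[, , k]))
  out <- array(unlist(pieces), dim = c(dim(pieces[[1]]), K))
  dimnames(out) <- c(dimnames(pieces[[1]]), dimnames(f)[3])
  out
}

res <- strata_last(HairEyeColor)
dim(res)   # Hair x Eye x Sex, each slice reduced by one row and one column
```

The same result is what the `aperm(..., c(2,3,1))` call in the post produces from the aaply() output, apart from the order of the Sex levels.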
Re: [R] Programmaticly finding number of processors by R code
Windows and Ubuntu Linux are my OSes. The intent is to use the result in the snow makeCluster statement, so I am not sure which I need: cores, CPUs, real or virtual. Basically, I want the maximum number of cluster workers I can create on my machine.

2) If I have a workgroup on Windows, can I detect cores/CPUs on the network using detectCores?

Ajay

Websites- http://decisionstats.com http://dudeofdata.com Linkedin- www.linkedin.com/in/ajayohri

On Mon, Oct 4, 2010 at 1:25 AM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

Without knowing your OS, there is no way anyone can tell you. And you probably want to know 'cores' rather than CPUs. And for some specific OSes, you will find answers in the archives. Beware that this is not a well-defined question: are these physical or virtual cores?, and having them in the system and being allowed to use them are different questions. Package 'multicore' is one that attempts to do this in its function detectCores (see the source code). And on Sparc Solaris it is pretty useless as it gives virtual CPUs, 8x the number of real CPUs.

On Sun, 3 Oct 2010, Ajay Ohri wrote:

Dear List Sorry if this question seems very basic. Is there a function to programmatically find number of processors in my system? I want to pass this as a parameter to snow in some serial code to parallel code functions Regards Ajay

-- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] Johnson Distribution Fit
Hi, I am trying to fit a Johnson SB distribution using the fitdist function in the fitdistrplus library. I have defined the Johnson SB distribution from (http://www.ntrand.com/johnson-sb-distribution/). But it gives me the following errors. Any help would be appreciated.

#xi = xi
#lambda = l
#delta = d
#gamma = g
djohn = function(x,xi,l,d,g) (d/(l*sqrt(2*pi)*((x-xi)/l)*(1-((x-xi)/l))))*exp[-0.5*(g + d*log(((x-xi)/l)/(1-((x-xi)/l))))^2]
pjohn = function(x,xi,l,d,g) pnorm(g + d*log(((x-xi)/l)/(1-((x-xi)/l))))
qjohn = function(p,xi,l,d,g) xi + (l*exp((qnorm(p) - g)/d))/(1 + exp((qnorm(p) - g)/d))
f1c <- fitdist(data2,"john",start=list(xi = 0.5 ,l = 50, d = 1, g = 1))

Error in fitdist(data2, "john", start = list(xi = 0.5, l = 50, d = 1, : the function mle failed to estimate the parameters, with the error code 100
In addition: Warning message: In log(((x)/l)/(1 - ((x)/l))) : NaNs produced

Cheers AG
[R] How to iterate through different arguments?
If I have a model line = lm(y~x1) and I want to use a for loop to change the number of explanatory variables, how would I do this? For example, I want to store the model objects in a list:

model1 = lm(y~x1)
model2 = lm(y~x1+x2)
model3 = lm(y~x1+x2+x3)
model4 = lm(y~x1+x2+x3+x4)
model5 = lm(y~x1+x2+x3+x4+x5)
... model10.

model_function = function(x){ for(i in 1:x) { } }

If x = 1, then the list will only add model1. If x = 2, then the list will add both model1 and model2. If x = 3, then the list will add model1, model2 and model3, and so on. How do I translate this into code?
Re: [R] Scatterplot matrix - Pearson linear correlation and Density Ellipse
Hi, I used pairs.panels() in pkg:psych and it is helpful; it saves time. But if I use this line:

pairs.panels(cfcap[8:11], scale = FALSE, lm=TRUE, ellipses=TRUE, digits = 2)

the results are:
- The upper.panel does not show the Pearson r but the lm data. Furthermore, can I use the pairwise.complete.obs method for the upper.panel? Can it be fixed?
- Can I remove the histograms?
- Can I control the ellipse alpha?

Thanks a lot.
Re: [R] Issues loading rtiff 1.4.1 with R 2.6.2 on Windows
I used rtiff and found a potential problem, which is not strictly rtiff's fault. The more-or-less standard out there is that tiff data should be 8-bits. I needed to be able to read the full 16-bit pixel values from images taken in my lab. After some hacking, I wrote a script which pulled the parameters of interest out of the header (like number of rows and columns), then read the data section out. Perversity being what it is, I don't have a copy of the function where I am right now, but if you would like a copy, please drop me an email. Carl
Re: [R] Issues loading rtiff 1.4.1 with R 2.6.2 on Windows
Have you tried it on a newer version of R? 2.6.2 is pretty old.

On Sun, Oct 3, 2010 at 4:41 PM, Carl Witthoft c...@witthoft.com wrote:

I used rtiff and found a potential problem, which is not strictly rtiff's fault. The more-or-less standard out there is that tiff data should be 8-bits. I needed to be able to read the full 16-bit pixel values from images taken in my lab. After some hacking, I wrote a script which pulled the parameters of interest out of the header (like number of rows and columns), then read the data section out. Perversity being what it is, I don't have a copy of the function where I am right now, but if you would like a copy, please drop me an email. Carl

-- Stephen Sefick, Auburn University, Department of Biological Sciences, 331 Funchess Hall, Auburn, Alabama 36849, sas0...@auburn.edu, http://www.auburn.edu/~sas0025

Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis

A big computer, a complex algorithm and a long time does not equal science. -Robert Gentleman
Re: [R] Tinn R
Raphael, I too had problems setting up Tinn-R 2.3.5.2 with R 2.11.1-x64 in 64-bit Windows 7. The following I had previously written to a colleague to show how I resolved the problems. I'm not sure if any of this will be of help to you, but Step 3 fixed an issue I was having with .trPaths.

Cheers, Jeremy

Jeremy Hetzel Boston University

1) Change permissions for the C:\Users\$USER\AppData\Roaming\Tinn-R directory.
Windows 7 makes this directory read-only by default, but Tinn-R expects to be able to write to it, causing errors. For me, I went to C:\Users\jthetzel\AppData\Roaming, right-clicked the Tinn-R directory, and un-checked the 'Read-only' attribute.

2) Run Tinn-R as administrator.
I found that if I went back to the ...Roaming\Tinn-R directory later, Windows had reset the attribute to 'Read-only'. So, for good measure, I also changed the Tinn-R.exe file under C:\Program Files (x86)\Tinn-R\bin\ to run as administrator. Browse to the directory, right-click the Tinn-R.exe file, select properties, click the Compatibility tab, and check the Run as Administrator box. Having accomplished step 2), step 1) is probably unnecessary. But I include it here just because it's what I did, and Tinn-R magically works again.

3) Change Options > Application > R > Rgui > Type to 'Partial' from 'Whole' in Tinn-R, via the menu bar.
Steps 1) and 2) addressed the problem where Tinn-R would complain about '.trPaths' when sending code to R. However, I had the additional problem of Tinn-R refusing to even attempt to send code to R (the 'R send file' and 'R send selection' icons were dimmed out). Changing this setting to 'Partial' from 'Whole' solved this problem. I don't know why.

The following is for anyone having problems with using Tinn-R with the development R version 2.12.0x64. In the Windows development version I downloaded, both the 32 and 64 bit R libraries were included. As a result, the folder structure has changed, which Tinn-R does not recognize by default.

4) Change Options > Application > R > Rterm.exe search path to wherever your Rterm.exe is located.
For me, it was C:\Program Files\R\R-2.12.0dev\bin\x64\Rterm.exe

5) Change Options > Application > R > Rgui.exe search path to wherever your Rgui.exe is located.
For me, it was C:\Program Files\R\R-2.12.0dev\bin\x64\Rgui.exe

6) Set 'Options > Application > R > Use latest installed version (always)' to 'No' from 'Yes'.
If this is not changed, Tinn-R will try to reconfigure itself every time it is re-opened to use the most recent installed R. Since it is not familiar with the change in folder structure in the development version, it will fail.

7) Copy the ...R-2.12.0dev\etc\Rprofile.site file to ...R-2.12.0dev\bin\etc\Rprofile.site
Tinn-R expects the Rprofile.site file to be in the ..\etc folder, relative to wherever Rgui.exe or Rterm.exe are. However, in the new folder structure, the etc folder is actually located at ..\..\etc, relative to Rgui.exe and Rterm.exe. I could not find anywhere to manually reconfigure the search path for the Rprofile.site file, so I just created a new etc folder where Tinn-R is expecting it. In my case, I created C:\Program Files\R\R-2.12.0dev\bin\etc and copied the Rprofile.site file to it.

For comparison, the contents of my Rprofile.site file appear below. For some reason, every time I re-start Tinn-R, it appends another copy of the Tinn-R configuration lines to the Rprofile.site file. However, this behavior has not caused any problems thus far.
#-- Example of my Rprofile.site file--#
# Things you might want to change
# options(papersize="a4")
# options(editor="notepad")
# options(pager="internal")

# set the default help type
# options(help_type="text")
options(help_type="html")

# set a site library
# .Library.site <- file.path(chartr("\\", "/", R.home()), "site-library")

# set a CRAN mirror
# local({r <- getOption("repos")
#       r["CRAN"] <- "http://my.local.cran"
#       options(repos=r)})

##===
## Tinn-R: necessary packages and functions
## Tinn-R: >= 2.2.0.2 with TinnR package >= 1.0.3
##===
## Set the URL of the preferred repository, below some examples:
options(repos='http://software.rc.fas.harvard.edu/mirrors/R/') # USA
#options(repos='http://cran.ma.imperial.ac.uk/') # UK
#options(repos='http://brieger.esalq.usp.br/CRAN/') # Brazil

library(utils)

## Check necessary packages
necessary <- c('TinnR', 'svSocket')
installed <- necessary %in% installed.packages()[, 'Package']
if (length(necessary[!installed]) >= 1) install.packages(necessary[!installed])

## Load packages
library(TinnR)
library(svSocket)

## Uncomment the two lines below if you want Tinn-R to always start R at start-up
## (Observation: check the path of Tinn-R.exe)
#options(IDE='C:/Tinn-R/bin/Tinn-R.exe')
#trStartIDE()

## Set options
options(use.DDE=T)

## Start DDE
trDDEInstall()
.trPaths <-
Re: [R] Programmaticly finding number of processors by R code
If no-one replies with a better way, here's a way: under POSIX-compliant systems, you can write a small C function and wrap it in an R function. The C program would be something like

#include <unistd.h>

/* Called from R via .C(), so the argument must be a pointer */
void nProcessors(int *n)
{
#ifdef _SC_NPROCESSORS_ONLN
  long nProcessorsOnline = sysconf(_SC_NPROCESSORS_ONLN);
#else
  long nProcessorsOnline = 2; // This is my guess - most computers today have at least 2 cores
#endif
  *n = (int) nProcessorsOnline;
}

You need to compile the function into a dynamic library (shared object on linux) and load the dynamic library with dyn.load before using it. The R wrapper would be a function along the lines of

nProcessors = function() { n = 0; res = .C("nProcessors", n = as.integer(n)); res$n }

I haven't actually tested the R function so please take this with a grain of salt, but I do use the C function in my own C code. You may also want to read this thread about potential pitfalls and limitations, plus another (simpler) way that would work on linux: http://forum.soft32.com/linux2/CPUs-machine-ftopict13343.html

AFAIK, this will not work on Windows because Windows is not POSIX compliant, but I'm not sure.

Peter

On Sun, Oct 3, 2010 at 10:03 AM, Ajay Ohri ohri2...@gmail.com wrote:

Dear List Sorry if this question seems very basic. Is there a function to programmatically find number of processors in my system? I want to pass this as a parameter to snow in some serial code to parallel code functions Regards Ajay
Re: [R] Programmaticly finding number of processors by R code
In Windows try this:

Sys.getenv('NUMBER_OF_PROCESSORS')

On Sun, Oct 3, 2010 at 2:03 PM, Ajay Ohri ohri2...@gmail.com wrote:

Dear List Sorry if this question seems very basic. Is there a function to programmatically find number of processors in my system? I want to pass this as a parameter to snow in some serial code to parallel code functions Regards Ajay Websites- http://decisionstats.com http://dudeofdata.com Linkedin- www.linkedin.com/in/ajayohri

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
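Sys.getenv() returns a character string, and the variable is normally only set on Windows, so a small amount of handling is needed before the value can be used as a worker count. A minimal sketch, with the fallback value of 1 being an assumption made here rather than anything from the thread:

```r
# Read the Windows environment variable; unset = NA marks "not defined"
n_env <- suppressWarnings(
  as.integer(Sys.getenv("NUMBER_OF_PROCESSORS", unset = NA))
)

# Fall back to a single worker when the variable is missing or unusable
n_procs <- if (is.na(n_env) || n_env < 1L) 1L else n_env
```

On Windows this reports logical processors as seen by the operating system, so the earlier caveat about physical versus virtual cores applies here too.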
Re: [R] Johnson Distribution Fit
On Oct 3, 2010, at 3:47 PM, Abey George wrote:

Hi, I am trying to fit a Johnson SB distribution using the fitdist function in the fitdistrplus library. I have defined the Johnson SB distribution from (http://www.ntrand.com/johnson-sb-distribution/). But it gives me the following errors. Any help would be appreciated.

Are you really trying to estimate the bounding values as well as the gamma and delta parameters? Those would seem to be more likely determined by the nature of the problem, e.g., policy limits on the insured sums if this were a financial problem.

#xi = xi
#lambda = l
#delta = d
#gamma = g
djohn = function(x,xi,l,d,g) (d/(l*sqrt(2*pi)*((x-xi)/l)*(1-((x-xi)/l))))*exp[-0.5*(g + d*log(((x-xi)/l)/(1-((x-xi)/l))))^2]

You used exp[ ] where you probably wanted exp().

pjohn = function(x,xi,l,d,g) pnorm(g + d*log(((x-xi)/l)/(1-((x-xi)/l))))
qjohn = function(p,xi,l,d,g) xi + (l*exp((qnorm(p) - g)/d))/(1 + exp((qnorm(p) - g)/d))
f1c <- fitdist(data2,"john",start=list(xi = 0.5 ,l = 50, d = 1, g = 1))

You have not given us the data2 values, so we have no way of checking whether any of them fall outside the range [xi, xi + lambda]. Using your data2 vector, what are the results of:

any(data2 < 0.5 | data2 > 0.5+50) #?

Error in fitdist(data2, "john", start = list(xi = 0.5, l = 50, d = 1, : the function mle failed to estimate the parameters, with the error code 100
In addition: Warning message: In log(((x)/l)/(1 - ((x)/l))) : NaNs produced

Cheers AG

David Winsemius, MD West Hartford, CT
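A hedged sketch of how the three functions might be written so that they use exp() and do not produce NaNs for values outside the support (xi, xi + lambda). The function and argument names follow the post; returning a density of 0 outside the support and clamping in `pjohn` are choices made here, not something the thread prescribes. This does not by itself guarantee that the fit converges, and it assumes d > 0 and l > 0.

```r
# Johnson SB density; zero outside the open interval (xi, xi + l)
djohn <- function(x, xi, l, d, g) {
  z <- (x - xi) / l
  inside <- z > 0 & z < 1
  out <- numeric(length(x))
  zi <- z[inside]
  out[inside] <- d / (l * sqrt(2 * pi) * zi * (1 - zi)) *
    exp(-0.5 * (g + d * log(zi / (1 - zi)))^2)
  out
}

# Distribution function; z is clamped to [0, 1] so the result is 0 or 1
# below or above the support instead of NaN
pjohn <- function(q, xi, l, d, g) {
  z <- pmin(pmax((q - xi) / l, 0), 1)
  pnorm(g + d * log(z / (1 - z)))
}

# Quantile function; plogis(t) equals exp(t) / (1 + exp(t)) but stays
# finite at p = 0 and p = 1
qjohn <- function(p, xi, l, d, g) {
  xi + l * plogis((qnorm(p) - g) / d)
}

# Quick self-consistency checks with the starting values from the post
total <- integrate(function(x) djohn(x, xi = 0.5, l = 50, d = 1, g = 1),
                   lower = 0.5, upper = 50.5)$value
p <- c(0.1, 0.5, 0.9)
round_trip <- pjohn(qjohn(p, 0.5, 50, 1, 1), 0.5, 50, 1, 1)
```

If the bounds are known from the problem, as suggested in the reply, it may be simpler to treat xi and l as fixed constants and estimate only d and g.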
[R] sampling from normal distribution
Hello,

If I want to resample from the tails of a normal distribution, are these commands equivalent?

upper tail: qnorm(runif(n,pnorm(b),1)) if b is an upper tail boundary

or

upper tail: qnorm((1-p)+p(runif(n)) if p is the probability of each interval (the observations are divided into intervals)

Regards
Re: [R] How to iterate through different arguments?
There are several ways to do this. The following is only one of the ways. One of the advantages of this approach is that it allows including both continuous and categorical variables. I'll demonstrate with the iris dataset. Place your variables in a dataframe with the y variable in the first column. Then,

out <- list()
vars <- names(iris)
out[[1]] <- lm(Sepal.Length ~ 1, data = iris)
for (k in 2:length(vars)) {
  out[[k]] <- update(out[[k-1]], as.formula(paste(".~.+", vars[k], sep = "")))
}

On Sun, Oct 3, 2010 at 4:29 PM, lord12 trexi...@yahoo.com wrote:

If I have a model line = lm(y~x1) and I want to use a for loop to change the number of explanatory variables, how would I do this? So for example I want to store the model objects in a list. model1 = lm(y~x1) model2 = lm(y~x1+x2) model3 = lm(y~x1+x2+x3) model4 = lm(y~x1+x2+x3+x4) model5 = lm(y~x1+x2+x3+x4+x5)... model10. model_function = function(x){ for(i in 1:x) { } } If x = 1, then the list will only add model1. If x = 2, then the list will add both model1 and model2. If x = 3, then the list will add model1, model2 and model3, and so on. How do I translate this into code?
Re: [R] readBin which has two different sizes
Thanks! I'll give that a try.
Re: [R] sampling from normal distribution
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of solafah bh
Sent: Sunday, October 03, 2010 3:39 PM
To: R help mailing list
Subject: [R] sampling from normal distribution

> Hello
> If I want to resample from the tails of a normal distribution, are these commands equivalent?
> upper tail: qnorm(runif(n,pnorm(b),1)) if b is an upper tail boundary
> or
> upper tail: qnorm((1-p)+p(runif(n)) if p is the probability of each interval (the observations are divided into intervals)
> Regards

Yes, they are equivalent, although the second formula is missing a closing parenthesis and a multiplication operator. You could also simplify the second formula to

qnorm(1-p*runif(n))

Hope this is helpful,

Dan

Daniel Nordlund
Bothell
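A quick numerical check of the equivalence can be sketched as follows. The boundary b = 1.5, the seed, the sample size and the variable names are illustrative assumptions, not values from the thread; p is taken to be the upper-tail probability 1 - pnorm(b), and the second formula is written with the missing multiplication and parenthesis restored.

```r
# Illustrative values only (assumptions, not from the original posts)
b <- 1.5
p <- 1 - pnorm(b)   # probability mass of the upper tail beyond b
n <- 1000

# Re-seeding makes both formulas consume the same uniform draws
set.seed(1)
x1 <- qnorm(runif(n, pnorm(b), 1))      # first formula
set.seed(1)
x2 <- qnorm((1 - p) + p * runif(n))     # second formula, corrected
set.seed(1)
x3 <- qnorm(1 - p * runif(n))           # simplified form: same distribution,
                                        # but not the same individual values

isTRUE(all.equal(x1, x2))               # the two formulas agree draw for draw
range(x1)                               # every value lies above b
range(x3)                               # likewise for the simplified form
```

The simplified form works because 1 - U is uniform on (0, 1) whenever U is, so 1 - p*U and (1 - p) + p*U have the same distribution.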
Re: [R] How to iterate through different arguments?
On Sun, Oct 3, 2010 at 4:29 PM, lord12 trexi...@yahoo.com wrote:
> If I have a model line = lm(y~x1) and I want to use a for loop to change the number of explanatory variables, how would I do this? So for example I want to store the model objects in a list.
>
> model1 = lm(y~x1)
> model2 = lm(y~x1+x2)
> model3 = lm(y~x1+x2+x3)
> model4 = lm(y~x1+x2+x3+x4)
> model5 = lm(y~x1+x2+x3+x4+x5)... model10.
>
> model_function = function(x){
> for(i in 1:x) {
> }
>
> If x = 1, then the list will only add model1. If x = 2, then the list will add both model1 and model2. If x = 3, then the list will add model1, model2 and model3, and so on. How do I translate this into code?

Here are a couple of approaches. The first one is simpler and may be adequate. The second has the advantage that it writes out the formula in full, fo, which is shown in the output:

lapply(1:4, function(i) lm(y1 ~ ., anscombe[c(1:i, 5)]))

lapply(1:4, function(i) {
  fo <- formula(model.frame(y1 ~ ., anscombe[c(1:i, 5)]))
  do.call("lm", list(fo, quote(anscombe)))
})

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
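For readers who want the model_function(x) shape asked for in the question, here is a minimal sketch. The argument names (n_terms, data, response) and the use of the iris data are assumptions made for illustration; reformulate() from the stats package builds each formula from the predictor names.

```r
# Hypothetical helper: returns a list of nested lm fits, where the i-th
# model uses the first i predictor columns of `data`.
model_function <- function(n_terms, data, response) {
  predictors <- setdiff(names(data), response)
  lapply(seq_len(n_terms), function(i) {
    fo <- reformulate(predictors[1:i], response = response)
    lm(fo, data = data)
  })
}

# Usage with a built-in dataset (an assumption; substitute your own data)
models <- model_function(3, iris, "Sepal.Length")
length(models)         # one fitted model per requested size
formula(models[[3]])   # the largest of the three formulas
```

Because each formula is written out explicitly, the fitted objects print a readable call, which is the same advantage noted for the second approach above.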
Re: [R] Scatterplot matrix - Pearson linear correlation and Density Ellipse
Dear ashz,

Unfortunately, much of what you want is not possible with the current implementation of pairs.panels. Since pairs.panels is adapted from the help file of pairs, you might try pairs and then add in the panel functions that do what you want.

The lm option does what it does because it met a need of mine for a demo of the difference between the regression slopes of X on Y versus Y on X. Your request is very reasonable and I will implement it in the next revision.

To use complete cases rather than pairwise deletion, you can preprocess your data with na.omit, e.g.,

cfc <- na.omit(cfcap[8:11])
pairs.panels(cfc)

At 1:37 PM -0700 10/3/10, ashz wrote:
> Hi,
>
> I used the pairs.panels() in pkg:psych and it is helpful. It saves time.
>
> But if I use this line:
> pairs.panels(cfcap[8:11], scale = FALSE, lm=TRUE, ellipses=TRUE, digits = 2)
>
> The results are:
> - The upper.panel does not show the Pearson r but the lm data. Furthermore, can I use the pairwise.complete.obs method for the upper.panel? Can it be fixed?
> - Can I remove the histograms?

Not as a call, but you can change pairs.panels to draw just the densities by substituting

panel.hist.density <- function(x, ...) {
  usr <- par("usr"); on.exit(par(usr))
  par(usr = c(usr[1:2], 0, 1.5))
  h <- hist(x, plot = FALSE)
  breaks <- h$breaks; nB <- length(breaks)
  y <- h$counts; y <- y/max(y)
  # rect(breaks[-nB], 0, breaks[-1], y, col = hist.col)  # <--- comment this line out
  tryd <- try(d <- density(x, na.rm = TRUE, bw = "nrd", adjust = 1.2), silent = TRUE)
  if (class(tryd) != "try-error") {
    d$y <- d$y/max(d$y)
    lines(d)
  }
}

in place of the current panel.hist.density function.

> - Can I control the ellipse alpha?

Not yet. Good idea.

Your requests are all very reasonable and will be added to my wish list of additions to pairs.panels. This will not happen for several weeks, however.

Bill

> Thanks a lot.
[R] CRAN (and crantastic) updates this week
CRAN (and crantastic) updates this week

New packages

Updated packages

BARD (1.18), BAS (0.92), CollocInfer (0.1.2), CompQuadForm (1.1), CompRandFld (0.2), COUNT (1.1.0), DPpackage (1.1-2), FAiR (0.4-6)

This email provided as a service for the R community by http://crantastic.org.

Like it? Hate it? Please let us know: crana...@gmail.com.
[R] can't find and install reshape2??
Hi everyone,

I'm trying to install reshape2. But when I click on "install package" it's not coming up!?!?! I'm getting reshape, but no reshape2?

I've also tried

download.packages(reshape2, destdir=c:\\)
download.packages(Reshape2, destdir=c:\\)

but no luck!!!

Does anyone have any ideas what could be going on?

Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP development, Data Analysis, Modelling, and Training
(mobile) 0410 689 945
(fax / office) (+618) 8952 7878
ch...@trickysolutions.com.au
Re: [R] short captions for xtable?
A quick update. The package maintainer says that it is indeed impossible. My solution was a bit of a workaround. I used the add.to.row option with \multicolumn to incorporate the source attribution into the table as the last row. Example:

print(xtable(table_name,caption=short_caption),hline.after=NULL,add.to.row=list(pos=list(-1,0,l,l+1),command=c('\\toprule ','\\midrule ','\\midrule ','\\bottomrule \\multicolumn{number_of_columns}{l}{source_attribution}')))
Re: [R] sampling from normal distribution
On 03/10/2010 6:38 PM, solafah bh wrote:
> Hello
> If I want to resample from the tails of a normal distribution, are these commands equivalent?
> upper tail: qnorm(runif(n,pnorm(b),1)) if b is an upper tail boundary
> or
> upper tail: qnorm((1-p)+p(runif(n)) if p is the probability of each interval (the observations are divided into intervals)

You don't say how far up in the tail you are going, but if b is very large, you have to watch out for rounding error. For example, with b=10, pnorm(b) will be exactly equal to 1, and both versions will fail. In general for b > 0 you'll get a bit more accuracy by sampling from the lower tail using -b.

For really extreme cases you will probably need to switch to a log scale. For example, to get a random sample from a normal, conditional on being larger than 20, you'd want something like

n <- 10
logp1 <- pnorm(-20, log.p = TRUE)
logprobs <- log(runif(n)) + logp1
-qnorm(logprobs, log.p = TRUE)

Duncan Murdoch
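The rounding problem and the lower-tail workaround described above can be sketched as follows. The values b = 10 and n = 5 and the seed are illustrative assumptions; the reflection relies only on the symmetry of the normal distribution.

```r
b <- 10
pnorm(b) == 1        # upper-tail mass is lost in double precision,
                     # so runif(n, pnorm(b), 1) has no room to vary

# Sample from the lower tail below -b instead, then flip the sign.
# pnorm(-b) is tiny but still representable, so accuracy is retained.
set.seed(42)
n <- 5
x <- -qnorm(runif(n, 0, pnorm(-b)))
x                    # draws from a standard normal conditional on exceeding b
```

For boundaries so extreme that pnorm(-b) itself underflows, the log-scale version in the message above is the safer route.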
Re: [R] can't find and install reshape2??
The first argument in download.packages should be of type character or a vector of characters. This worked for me:

install.packages('reshape2')

as did:

download.packages('reshape2', '~/Downloads/')

Cheers,

Jeff.

On Sun, Oct 3, 2010 at 8:57 PM, Chris Howden ch...@trickysolutions.com.au wrote:
> Hi everyone,
>
> I’m trying to install reshape2. But when I click on “install package” it’s not coming up!?!?! I’m getting reshape, but no reshape2?
>
> I’ve also tried download.packages(reshape2, destdir=c:\\) download.packages(Reshape2, destdir=c:\\)…but no luck!!!
>
> Does anyone have any ideas what could be going on?
Re: [R] Read file
Hi, Michael

Thank you for your help. I have already done what you said, but I am still having trouble handling my data. I need to split the data according to station. I was able to identify where the station information starts using:

my.data <- file("d2010100100.txt", open = "rt")
indata <- readLines(my.data, n = 2)
i <- grep("^[837]", indata)  # station number
my.data2 <- read.table("d2010100100.txt", fill = TRUE, nrows = 2)
stn <- my.data2$V1[i]

2010 10 01 00
82599 -35.25 -5.91 52 1
1008.0 - 115 3.1 298.6 294.6 64
2010 10 01 00
83649 -40.28 -20.26 4 7
1011.0 - 0 0.0 298.4 296.1 64
1000.0 96 40 5.7 297.9 295.1 32
925.0 782 325 3.1 295.4 294.1 32
850.0 1520 270 4.1 293.8 289.4 32
700.0 3171 240 8.7 284.1 279.1 32
500.0 5890 275 8.2 266.2 262.9 32
400.0 7600 335 9.8 255.4 242.4 32
===

As you can see in the data above, the last field of each station line gives the number of levels (or lines) for that station. I need to capture these lines so that I can feed my database.

By the way, I didn't understand the regular expression you used. I tried to run it but it did not work.

Hope you can help me!

Best Regards,
Nilza

On Sun, Oct 3, 2010 at 2:18 AM, Michael Bedward michael.bedw...@gmail.com wrote:
> Hello Nilza,
>
> If your file is small you can read it into a character vector like this:
>
> indata <- readLines("foo.dat")
>
> If your file is very big you can read it in batches like this...
>
> MAXRECS <- 1000  # for example
> fcon <- file("foo.dat", open = "r")
> indata <- readLines(fcon, n = MAXRECS)
>
> The number of lines read will be given by length(indata).
>
> You can check to see if the end of the file has been read yet with:
> isIncomplete( fcon )
>
> If a leading * character is a flag for the start of a station data block you can find this in the indata vector with grepl...
>
> start.pos <- which(indata, grepl("^\\s*\\*", indata)
>
> When you're finished reading the file...
> close(fcon)
>
> Hope this helps,
>
> Michael
>
> On 3 October 2010 13:31, Nilza BARROS nilzabar...@gmail.com wrote:
>> Dear R-users,
>>
>> I would like to know how I could read a file with different line lengths. I need to read this file and create an output to feed my database. So after reading I'll need to create an output like this:
>>
>> INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20100910,837460, 39,390)
>>
>> I mean, each line should be read. But I don't know how to do this when these lines have different lengths.
>>
>> I really appreciate any help.
>>
>> Thanks.
>>
>> Below the file that should be read
>> ===
>> 2010 10 01 00
>> 83746 -43.25 -22.81 6 51
>> 1012.0 - 320 1.5 299.1 294.4 64
>> 1000.0 114 250 4.1 298.4 294.8 32
>> 925.0 797 0 0.0 293.6 292.9 32
>> 850.0 1524 195 3.1 289.6 288.9 32
>> 700.0 3156 290 11.3 280.1 280.1 32
>> 500.0 5870 280 20.1 266.1 260.1 32
>> 400.0 7570 265 23.7 256.6 222.7 32
>> 300.0 9670 265 28.8 240.2 218.2 32
>> 250.0 10920 280 27.3 230.2 220.2 32
>> 200.0 12390 260 32.4 218.7 206.7 32
>> 176.0 - 255 37.6 -.0 -.0 8
>> 150.0 14180 245 35.5 205.1 196.1 32
>> 100.0 16560 300 17.0 195.2 186.2 32
>> 2010 10 01 00
>> 83768 -51.13 -23.33 569 41
>> 1000.0 79 - -.0 -.0 -.0 32
>> 946.0 - 270 1.0 295.8 292.1 64
>> 925.0 763 15 2.1 296.4 290.4 32
>> 850.0 1497 175 3.6 290.8 288.4 32
>> 700.0 3140 295 9.8 282.9 278.6 32
>> 500.0 5840 285 23.7 267.1 232.1 32
>> 400.0 7550 255 35.5 255.4 231.4 32
>> 300.0 9640 265 37.0 242.2 216.2 32
>>
>> Best Regards,
>>
>> --
>> Abraço,
>> Nilza Barros

--
Abraço,
Nilza Barros
Re: [R] can't find and install reshape2??
On Oct 3, 2010, at 8:57 PM, Chris Howden wrote:

> Hi everyone,
>
> I’m trying to install reshape2.

From which mirror? Was it even present at the mirror at that time? Have you tried another mirror or retried from the same mirror?

> But when I click on “install package” it’s not coming up!?!?! I’m getting reshape, but no reshape2?
>
> I’ve also tried download.packages(reshape2, destdir=c:\\) download.packages(Reshape2, destdir=c:\\)…but no luck!!!

Well, that last one was sure to fail. And the first one should have been invoked with "reshape2" rather than reshape2, if the help page is accurate in saying the first argument needs to be a character vector. In neither case would this have installed the package, however. Can you describe what you meant by "no luck"? Did you actually check those destinations to see if a package was downloaded? Did you get an informative error message that you failed to report?

> Does anyone have any ideas what could be going on?

The main CRAN site package check page says reshape2 is in good shape to be downloaded and installed for all OSes. The usual method is to use install.packages(), so I would think the two leading possibilities are operator error (wrong function) or temporary unavailability at the particular mirror you have set as your default.

> Chris Howden
> Founding Partner
> Tricky Solutions
> Tricky Solutions 4 Tricky Problems
> Evidence Based Strategic Development, IP development, Data Analysis, Modelling, and Training

--
David Winsemius, MD
West Hartford, CT
Re: [R] readBin which has two different sizes
Henrik Bengtsson wrote:
> 1. Use x <- readBin(..., what="raw", n=35269*(54*4)) to read your raw (byte) data.
> 2. Turn it into a 54x4x35269 array, e.g. dim(x) <- c(54,4,35269).
> 3. Extract the 4-byte time stamps by yT <- x[1:4,,,drop=FALSE]; This is of type raw. Use readBin() to parse it, i.e. zT <- readBin(yT, what="integer", size=4, signed=TRUE, endian="big"). There are your timestamps.
> 4. Extract the 50 1-byte samples by yS <- x[5:54,,,drop=FALSE]; zS <- readBin(yS, what="integer", size=1, signed=FALSE);
>
> Something like that.
>
> /Henrik

Tried it and it works great. I had to make just one adjustment, i.e. specify the size of n, which I did using length:

yT <- x[1:4,,,drop=FALSE]
zT <- readBin(yT, what="integer", size=4, n=length(yT), signed=FALSE, endian="big")
yS <- x[5:54,,,drop=FALSE]
zS <- readBin(yS, what="integer", size=1, n=length(yS), signed=TRUE)

Thanks Henrik!
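The steps above can be tried without the original file. The sketch below builds a small synthetic byte stream in memory and parses it the same way. The record layout (a 4-byte big-endian timestamp followed by 50 one-byte samples), the three records, and the simplification to a two-dimensional 54 x n_rec matrix instead of the thread's three-dimensional array are assumptions made for illustration.

```r
# Build synthetic data: n_rec records of 4 + 50 = 54 bytes each (hypothetical layout)
n_rec <- 3
con <- rawConnection(raw(0), "wb")
for (i in seq_len(n_rec)) {
  writeBin(as.integer(1000L + i), con, size = 4, endian = "big")  # timestamp
  writeBin(as.integer(rep(i, 50)), con, size = 1)                 # 50 samples
}
x <- rawConnectionValue(con)
close(con)

# One column per record, one row per byte within the record
dim(x) <- c(54, n_rec)

# Rows 1-4 hold the timestamps; reinterpret those bytes as 4-byte integers
yT <- x[1:4, , drop = FALSE]
zT <- readBin(as.vector(yT), what = "integer", size = 4,
              n = n_rec, endian = "big")

# Rows 5-54 hold the samples; reinterpret as unsigned 1-byte integers
yS <- x[5:54, , drop = FALSE]
zS <- readBin(as.vector(yS), what = "integer", size = 1,
              n = length(yS), signed = FALSE)
zS <- matrix(zS, nrow = 50)   # one column of samples per record
```

Note that n counts values to read, not bytes, which is why the timestamps use n = n_rec while the one-byte samples use n = length(yS).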
Re: [R] Read file
On Oct 3, 2010, at 9:40 PM, Nilza BARROS wrote:

> Hi, Michael
>
> Thank you for your help. I have already done what you said. But I am still facing problems to deal with my data. I need to split the data according to station. I was able to identify where the station information starts using:
>
> my.data <- file("d2010100100.txt", open = "rt")
> indata <- readLines(my.data, n = 2)
> i <- grep("^[837]", indata)  # station number

That would give you the line numbers for any line that had an 8, _or_ a 3, _or_ a 7 as its first digit. Was that your intent? My guess is that you did not really want to use the square brackets and should have been using "^837".

?regex  # Paragraph starting "A character class"

> my.data2 <- read.table("d2010100100.txt", fill = TRUE, nrows = 2)
> stn <- my.data2$V1[i]

That would give you the first column values for the lines you earlier selected.

This does not look like what I would expect as a value for stn. Is that what you wanted us to think this was?

--
David.

> 2010 10 01 00
> 82599 -35.25 -5.91 52 1
> 1008.0 - 115 3.1 298.6 294.6 64
> 2010 10 01 00
> 83649 -40.28 -20.26 4 7
> 1011.0 - 0 0.0 298.4 296.1 64
> 1000.0 96 40 5.7 297.9 295.1 32
> 925.0 782 325 3.1 295.4 294.1 32
> 850.0 1520 270 4.1 293.8 289.4 32
> 700.0 3171 240 8.7 284.1 279.1 32
> 500.0 5890 275 8.2 266.2 262.9 32
> 400.0 7600 335 9.8 255.4 242.4 32
> ===

David Winsemius, MD
West Hartford, CT
[R] How To Extract Row from A Data Frame
I have a data frame that looks like this:

print(df)
  V2     V3     V4     V5     V6     V7     V8     V9    V10    V11    V12
1 FN  8.637 28.890 31.430 31.052 29.878 33.215 32.728 32.187 29.305 31.462
2 FP 19.936 30.284 33.001 35.100 30.238 34.452 35.849 34.185 31.242 35.635
3 TN  0.000 17.190 16.460 21.100 17.960 15.120 17.200 17.190 15.270 15.310
4 TP 22.831 31.246 33.600 35.439 32.073 33.947 35.050 34.472 31.228 33.701

How can I extract rows as specified? For example, I tried this to extract the first row (FN) from V3 to V12:

fn <- df[1,df$V3:df$V12]

But it gives columns that do not start from V3. What's the right way to do it?

- G.V.
Re: [R] How To Extract Row from A Data Frame
?subset

x
  V2     V3     V4     V5     V6     V7     V8     V9    V10    V11    V12
1 FN  8.637 28.890 31.430 31.052 29.878 33.215 32.728 32.187 29.305 31.462
2 FP 19.936 30.284 33.001 35.100 30.238 34.452 35.849 34.185 31.242 35.635
3 TN  0.000 17.190 16.460 21.100 17.960 15.120 17.200 17.190 15.270 15.310
4 TP 22.831 31.246 33.600 35.439 32.073 33.947 35.050 34.472 31.228 33.701

subset(x, select = V3:V12)
      V3     V4     V5     V6     V7     V8     V9    V10    V11    V12
1  8.637 28.890 31.430 31.052 29.878 33.215 32.728 32.187 29.305 31.462
2 19.936 30.284 33.001 35.100 30.238 34.452 35.849 34.185 31.242 35.635
3  0.000 17.190 16.460 21.100 17.960 15.120 17.200 17.190 15.270 15.310
4 22.831 31.246 33.600 35.439 32.073 33.947 35.050 34.472 31.228 33.701

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] How To Extract Row from A Data Frame
Forgot you only wanted the first row:

subset(x[1,], select = V3:V12)
     V3    V4    V5     V6     V7     V8     V9    V10    V11    V12
1 8.637 28.89 31.43 31.052 29.878 33.215 32.728 32.187 29.305 31.462

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
[R] I have aproblem about nomogram--thank you for your help
Dear professor:

I have a problem with the nomogram. I obtained my results by analysing the dataset exp2.sav with multinomial logistic regression in SPSS 17.0, and I want to develop the nomogram in R, like this:

n <- 100
set.seed(10)
T.Grade <- factor(0:3, labels = c("G0", "G1", "G2", "G3"))
Sex <- factor(0:1, labels = c("F", "M"))
Smoking <- factor(0:1, labels = c("No", "yes"))
L <- 0.559*as.numeric(T.Grade) - 0.896*as.numeric(Smoking) + 0.92*as.numeric(Sex) - 1.338
y <- ifelse(runif(n) < plogis(L), 1, 0)
ddist <- datadist(as.numeric(T.Grade,Sex,Smoking))

(load package rms)

ddist <- datadist(as.numeric(T.Grade,Sex,Smoking))
options(datadist='ddist')
f <- lrm(y~as.numeric(T.Grade)+as.numeric(Sex)+as.numeric(Smoking))

Error in model.frame.default(formula = y ~ as.numeric(T.Grade) + as.numeric(Sex) + :
  variable lengths differ (found for 'as.numeric(T.Grade)')

I run into a problem at the last command, and I have tried to solve it in several ways, such as:

asis(x, parms, label, name)
matrx(x, label, name)
pol(x, parms, label, name)
lsp(x, parms, label, name)
rcs(x, parms, label, name)
catg(x, parms, label, name)
scored(x, parms, label, name)
strat(x, label, name)
x1 %ia% x2

but I cannot solve it. Can you tell me how to solve this problem? Thank you.

Truly yours
Re: [R] How To Extract Row from A Data Frame
On Oct 3, 2010, at 10:22 PM, Gundala Viswanath wrote:

> I have a data frame that looks like this:
>
> print(df)
>   V2     V3     V4     V5     V6     V7     V8     V9    V10    V11    V12
> 1 FN  8.637 28.890 31.430 31.052 29.878 33.215 32.728 32.187 29.305 31.462
> 2 FP 19.936 30.284 33.001 35.100 30.238 34.452 35.849 34.185 31.242 35.635
> 3 TN  0.000 17.190 16.460 21.100 17.960 15.120 17.200 17.190 15.270 15.310
> 4 TP 22.831 31.246 33.600 35.439 32.073 33.947 35.050 34.472 31.228 33.701
>
> How can I extract rows as specified, e.g. I tried this to extract the first line (FN) starting from V3 to V12:
>
> fn <- df[1,df$V3:df$V12]
>
> But it gives columns starting not from V3.

The ":" operator only works for numeric values in [,] or []. And even then you would have been passing a very strange argument to ":", since df$V3 is a vector rather than a scalar. But these would also fail:

df[1, V3:V12]
df[1, "V3":"V12"]

These all work:

df[which(df$V2=="FN"), grep("^V3$", names(df)):grep("^V12$", names(df)) ]
df[1, 2:11]
df[1, -1]

# The subset function may have tricked you into believing that my statement about the ":" operator was false, but that function first parses the select= (and subset=) expressions against column names and returns column numbers before passing to ":".

subset(df, V2=="FN", select=-V2)
subset(df, V2=="FN", select=V3:V12)

> What's the right way to do it?
>
> - G.V.

David Winsemius, MD
West Hartford, CT
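The point about ":" needing numbers can be illustrated with a small self-contained sketch. The three-column data frame below is a cut-down, hypothetical stand-in for the poster's df, and match() is used here as one possible way to turn column names into positions.

```r
# Hypothetical miniature of the poster's data frame
df <- data.frame(V2 = c("FN", "FP"),
                 V3 = c(8.637, 19.936),
                 V4 = c(28.890, 30.284),
                 V5 = c(31.430, 33.001))

# Convert the first and last column names to positions, then use ":"
cols <- match("V3", names(df)):match("V5", names(df))

# Select the row by its label in V2 and the columns by position
fn <- df[df$V2 == "FN", cols]
fn
```

This is the same idea as the grep() version above; match() avoids regular expressions when the column names are known exactly.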
Re: [R] Read file
Hi Nilza, Just to add to David's comments, if you are reading in your file with read.table(..., fill=TRUE), and assuming that you haven't yet replace - with NA, you don't need grep. You can just use the number of NAs in each line to locate data blocks. Date records have 3 NAs Location records have 2 NAs Data records have none. my.data2-read.table(d2010100100.txt,fill=TRUE,nrows=2) na.count - apply( my.data2, 1, function(x) sum( is.na(x) ) ) date.recs - which( na.count == 3) num.stns - length(date.recs) stn.data.length - c(diff(date.recs) - 2, nrow(my.data2) - date.recs[num.stns] - 1) Michael On 4 October 2010 13:05, David Winsemius dwinsem...@comcast.net wrote: On Oct 3, 2010, at 9:40 PM, Nilza BARROS wrote: Hi, Michael Thank you for your help. I have already done what you said. But I am still facing problems to deal with my data. I need to split the data according to station.. I was able to identify where the station information start using: my.data-file(d2010100100.txt,open=rt) indata - readLines(my.data, n=2) i-grep(^[837],indata) #station number That would give you the line numbers for any line that had an 8 , _or_ a 3, _or_ a 7 as its first digit. Was that your intent? My guess is that you did not really want to use the square braces and should have been using ^837. ?regex # Paragraph starting A character class my.data2-read.table(d2010100100.txt,fill=TRUE,nrows=2) stn- my.data2$V1[i] That would give you the first column values for the lines you earlier selected. This does not look like what I would expect as a value for stn. Is that what you wanted us to think this was? -- David. 
2010 10 01 00 *82599 -35.25 -5.91 52 1 * 1008.0 - 115 3.1 298.6 294.6 64 2010 10 01 00 *83649 -40.28 -20.26 4 7* 1011.0 - 0 0.0 298.4 296.1 64 1000.0 96 40 5.7 297.9 295.1 32 925.0 782 325 3.1 295.4 294.1 32 850.0 1520 270 4.1 293.8 289.4 32 700.0 3171 240 8.7 284.1 279.1 32 500.0 5890 275 8.2 266.2 262.9 32 400.0 7600 335 9.8 255.4 242.4 32 === As you can see in the data above the line show the number of leves (or lines) for each station. I need to catch these lines so as to be able to feed my database. By the way, I didn't understand the regular expression you've used. I've tried to run it but it did not work. Hope you can help me! Best Regards, Nilza On Sun, Oct 3, 2010 at 2:18 AM, Michael Bedward michael.bedw...@gmail.comwrote: Hello Nilza, If your file is small you can read it into a character vector like this: indata - readLines(foo.dat) If your file is very big you can read it in batches like this... MAXRECS - 1000 # for example fcon - file(foo.dat, open=r) indata - readLines(fcon, n=MAXRECS) The number of lines read will be given by length(indata). You can check to see if the end of the file has been read yet with: isIncomplete( fcon ) If a leading * character is a flag for the start of a station data block you can find this in the indata vector with grepl... start.pos - which(indata, grepl(^\\s*\\*, indata) When you're finished reading the file... close(fcon) Hope this helps, Michael On 3 October 2010 13:31, Nilza BARROS nilzabar...@gmail.com wrote: Dear R-users, I would like to know how could I read a file with different lines lengths. I need read this file and create an output to feed my database. So after reading I'll need create an output like this INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (20100910,837460, 39,390) I mean, each line should be read. But I don`t how to do this when these lines have different lengths I really appreciate any help. Thanks. 
>>>> Below is the file that should be read:
>>>> ===
>>>> *2010 10 01 00
>>>> 83746 -43.25 -22.81 6 51*
>>>> 1012.0 - 320 1.5 299.1 294.4 64
>>>> 1000.0 114 250 4.1 298.4 294.8 32
>>>> 925.0 797 0 0.0 293.6 292.9 32
>>>> 850.0 1524 195 3.1 289.6 288.9 32
>>>> 700.0 3156 290 11.3 280.1 280.1 32
>>>> 500.0 5870 280 20.1 266.1 260.1 32
>>>> 400.0 7570 265 23.7 256.6 222.7 32
>>>> 300.0 9670 265 28.8 240.2 218.2 32
>>>> 250.0 10920 280 27.3 230.2 220.2 32
>>>> 200.0 12390 260 32.4 218.7 206.7 32
>>>> 176.0 - 255 37.6 -.0 -.0 8
>>>> 150.0 14180 245 35.5 205.1 196.1 32
>>>> 100.0 16560 300 17.0 195.2 186.2 32
>>>> *2010 10 01 00
>>>> 83768 -51.13 -23.33 569 41 *
>>>> 1000.0 79 - -.0 -.0 -.0 32
>>>> 946.0 - 270 1.0 295.8 292.1 64
>>>> 925.0 763 15 2.1 296.4 290.4 32
>>>> 850.0 1497 175 3.6 290.8 288.4 32
>>>> 700.0 3140 295 9.8 282.9 278.6 32
>>>> 500.0 5840 285 23.7 267.1 232.1 32
>>>> 400.0 7550 255 35.5 255.4 231.4 32
>>>> 300.0 9640 265 37.0 242.2 216.2 32
>>>>
>>>> Best
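
Putting the pieces of this thread together, the NA-count idea can be carried through to the INSERT statements asked for in the original question. The following is only a sketch under stated assumptions: the sample is the two-station excerpt quoted in this thread with the "*" emphasis markers dropped, DATA is taken to be the first three date fields pasted together (yyyymmdd), VAR1 and VAR2 are assumed to be the first two fields of each data record, and "-" is written out as NULL. None of these mappings is confirmed by the poster, and rowSums(is.na(...)) is used in place of the apply() call, giving the same counts.

```r
# Sample in the layout discussed in the thread (e-mail "*" markers removed).
txt <- c(
  "2010 10 01 00",
  "82599 -35.25 -5.91 52 1",
  "1008.0 - 115 3.1 298.6 294.6 64",
  "2010 10 01 00",
  "83649 -40.28 -20.26 4 7",
  "1011.0 - 0 0.0 298.4 296.1 64",
  "1000.0 96 40 5.7 297.9 295.1 32",
  "925.0 782 325 3.1 295.4 294.1 32",
  "850.0 1520 270 4.1 293.8 289.4 32",
  "700.0 3171 240 8.7 284.1 279.1 32",
  "500.0 5890 275 8.2 266.2 262.9 32",
  "400.0 7600 335 9.8 255.4 242.4 32"
)

# fill=TRUE pads short records with NA, so the NA count identifies the record type:
# 3 NAs = date record, 2 NAs = location record, 0 NAs = data record.
dat <- read.table(text = txt, fill = TRUE, stringsAsFactors = FALSE)
na.count <- rowSums(is.na(dat))
date.recs <- which(na.count == 3)
num.stns <- length(date.recs)
stn.data.length <- c(diff(date.recs) - 2,
                     nrow(dat) - date.recs[num.stns] - 1)

# Take field values from the raw text so that they are written exactly as typed.
fields <- strsplit(trimws(txt), "[[:space:]]+")
sql <- character(0)
for (k in seq_len(num.stns)) {
  date.id <- paste(fields[[date.recs[k]]][1:3], collapse = "")  # assumed yyyymmdd
  station <- fields[[date.recs[k] + 1]][1]
  rows <- date.recs[k] + 1 + seq_len(stn.data.length[k])        # this station's data records
  for (r in rows) {
    v <- fields[[r]][1:2]          # assumed to be VAR1 and VAR2
    v[v == "-"] <- "NULL"          # assumed missing-value marker
    sql <- c(sql, sprintf(
      "INSERT INTO TEMP (DATA,STATION,VAR1,VAR2) VALUES (%s,%s,%s,%s);",
      date.id, station, v[1], v[2]))
  }
}
writeLines(sql)
```

For a real file, read.table("d2010100100.txt", fill=TRUE) and readLines("d2010100100.txt") would replace the in-memory sample.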
Re: [R] I have aproblem about nomogram--thank you for your help
On Oct 3, 2010, at 10:42 PM, 笑啸 wrote:
> Dear professor:
>
> I have a problem with the nomogram. I obtained my result by analysing the dataset exp2.sav with multinomial logistic regression in SPSS 17.0.

That is an inadequate specification of a statistical analysis (although it might pass for such in the typical medical journal).

> and I want to develop the nomogram in R, just like this:

I know of no way of taking a function developed in SPSS/SAS/Stata and simply dropping it into the nomogram function to generate sensible output. There may be such a method that you could piece together by examining the code, but it appears to me that you are not yet ready for that task. nomogram() was clearly developed to be used as part of the rms package rather than as a stand-alone graphical utility.

> n <- 100
> set.seed(10)
> T.Grade <- factor(0:3, labels=c("G0", "G1", "G2", "G3"))
> Sex <- factor(0:1, labels=c("F", "M"))
> Smoking <- factor(0:1, labels=c("No", "yes"))
> L <- 0.559*as.numeric(T.Grade) - 0.896*as.numeric(Smoking) + 0.92*as.numeric(Sex) - 1.338
> L
> [1] -0.755 -0.172 0.363 0.946
> y <- ifelse(runif(n) < plogis(L), 1, 0)

dfr <- data.frame(T.Grade, Sex, Smoking, L, y)   # wrap the vectors into a dataframe.
ddist <- datadist(dfr)
options(datadist='ddist')
f <- lrm(y ~ T.Grade + Sex + Smoking, data=dfr)   # skip the as.numeric()'s

### Gives an error message due to singular X matrix.
f <- lrm(y ~ T.Grade + Sex + Smoking, data=dfr)
singular information matrix in lrm.fit (rank= 5 ).
Offending variable(s):
Sex=M
Error in lrm(y ~ T.Grade + Sex + Smoking, data = dfr) :
  Unable to fit model using “lrm.fit”

# Try instead:
library(rms)   # provides datadist(), lrm() and nomogram()
n <- 100
set.seed(10)
T.Grade <- factor(0:3, labels=c("G0", "G1", "G2", "G3"))
Sex <- factor(sample(0:1, 100, replace=TRUE), labels=c("F", "M"))
Smoking <- factor(sample(0:1, 100, replace=TRUE), labels=c("No", "yes"))
# Build the data frame first (T.Grade is recycled to 100 rows) so that
# with(dfr, ...) can see the columns.
dfr <- data.frame(T.Grade, Sex, Smoking)
dfr$L <- with(dfr, 0.559*as.numeric(T.Grade) - 0.896*as.numeric(Smoking) + 0.92*as.numeric(Sex) - 1.338)
dfr$y <- with(dfr, ifelse(runif(n) < plogis(L), 1, 0))
ddist <- datadist(dfr)
options(datadist='ddist')
f <- lrm(y ~ T.Grade + Sex + Smoking, data=dfr)

# Then follow the example on the help(nomogram) page
nom <- nomogram(f, fun=function(x) 1/(1+exp(-x)),   # or fun=plogis
                fun.at=c(.001, .01, .05, seq(.1, .9, by=.1), .95, .99, .999),
                funlabel="Risk of Death")
plot(nom, xfrac=.45)

Please note: this is _not_ your nomogram function, due to the random aspects of the creation of y, and this is NOT multinomial logistic regression (since you only have a dichotomous outcome). For one possible variant of multinomial logistic regression supported in the rms/Hmisc function suite, you would need to use polr.

-- David.

> Error in model.frame.default(formula = y ~ as.numeric(T.Grade) + as.numeric(Sex) + :
>   variable lengths differ (found for 'as.numeric(T.Grade)')
>
> I encountered a problem in the last program and tried to solve it in several ways, such as:
>
>   asis(x, parms, label, name)
>   matrx(x, label, name)
>   pol(x, parms, label, name)
>   lsp(x, parms, label, name)
>   rcs(x, parms, label, name)
>   catg(x, parms, label, name)
>   scored(x, parms, label, name)
>   strat(x, label, name)
>   x1 %ia% x2
>
> but I could not solve it. Can you tell me how to solve this problem? Thank you.
>
> Truly yours

---
David Winsemius, MD
West Hartford, CT
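
The "variable lengths differ" error quoted in this thread arises because factor(0:3, ...) has one element per level (length 4) while y has n = 100 elements. A minimal sketch of the mismatch and one way around it follows. Note the assumptions: base R's glm() is swapped in for rms::lrm() so that the example runs without extra packages, and both the outcome and the predictor values are simulated at random, so the fit says nothing about the poster's data.

```r
set.seed(10)
n <- 100
y <- rbinom(n, 1, 0.5)   # 100 simulated 0/1 outcomes (illustrative only)

# As posted: one value per level, so the predictor has length 4, not 100.
T.Grade <- factor(0:3, labels = c("G0", "G1", "G2", "G3"))
msg <- tryCatch(
  {
    glm(y ~ T.Grade, family = binomial)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)   # the length-mismatch message; exact wording depends on the locale

# One value per observation: sample n grades, keeping all four levels.
T.Grade <- factor(sample(0:3, n, replace = TRUE), levels = 0:3,
                  labels = c("G0", "G1", "G2", "G3"))
fit <- glm(y ~ T.Grade, family = binomial)
print(length(T.Grade) == length(y))
```

With real data the predictors would come from the dataset itself, one row per patient, which avoids the problem without any sampling.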
Re: [R] Programmaticly finding number of processors by R code
If you have installed multicore (for unix/mac), you can find the number of cores with

  multicore:::detectCores()

On 10/3/10 1:03 PM, Ajay Ohri wrote:
> Dear List
>
> Sorry if this question seems very basic. Is there a function to programmatically find the number of processors in my system? I want to pass this as a parameter to snow when converting some serial code to parallel code.
>
> Regards
>
> Ajay
>
> Websites- http://decisionstats.com http://dudeofdata.com
> Linkedin- www.linkedin.com/in/ajayohri

--
Abhijit Dasgupta, PhD
Director and Principal Statistician
ARAASTAT
Ph: 301.385.3067
E: adasgu...@araastat.com
W: http://www.araastat.com
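
As a cross-platform alternative to the multicore call above, the parallel package (bundled with R from version 2.14.0 onward, which is later than this thread) exports detectCores(). A minimal sketch, assuming parallel is available; the variable names and the choice to leave one core free are illustrative and not something the posters suggested.

```r
library(parallel)

n.cores <- detectCores()             # may be NA if the count cannot be determined
if (is.na(n.cores)) n.cores <- 1L    # fall back to a single core
n.workers <- max(1L, n.cores - 1L)   # leave one core for the rest of the system
print(n.workers)

# The count can then be handed to a cluster constructor, for example:
# cl <- makeCluster(n.workers)
# ... parLapply(cl, ...) ...
# stopCluster(cl)
```

The commented lines use the snow-style interface that parallel also provides, which matches the use case in the question.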