Re: [R] Import excel file
Hello,

According to the manual "R Data Import/Export" (section on Excel), the xlsReadWrite and RODBC packages are for the 32-bit version of R (my tests confirmed this).

I used the XLConnect package, but it returned this error:

Error in new(J("com.miraisolutions.xlconnect.integration.r.RWorkbookWrapper"), :
  error in evaluating the argument 'Class' in selecting a method for function 'new':
Error in .jfindClass(as.character(class)) : class not found

Thanks for your help.

Best regards,
Cyril Hervy

-----Original Message-----
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Sunday, 18 November 2012 19:10
To: Rui Barradas
Cc: Hervy Cyril; r-help@r-project.org
Subject: Re: [R] Import excel file

On 16.11.2012 16:59, Rui Barradas wrote:
> Hello,
> I believe it is, but see package XLConnect. The vignette is very helpful, with lots of examples.

... and many other ways. Please see the manual "R Data Import/Export" that ships with R and has a whole section devoted to Excel.

Best,
Uwe Ligges

> Hope this helps,
> Rui Barradas
>
> On 16-11-2012 15:27, Hervy Cyril wrote:
>> Hello,
>> Is it possible to import an Excel 2000 file (32-bit version) into R 2.15.1 64-bit version?
>> Thanks.
>> Best regards,
>> Cyril Hervy

__
R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] read.table
read.table("data.txt", header=TRUE, colClasses=c("character","character","numeric","character","numeric","numeric"))

    a   b   c d e     f
1 SPX LSZ 100 C 0 34.40
2 SPX LSZ 100 P 0  1.30
3 SPX LSZ 105 C 0 30.30
4 SPX LSZ 105 P 0  1.85
5 SPX LSZ 110 C 0 26.40

This gives the right result! It works with header=TRUE, but not with header=T. I don't know why not.

-- View this message in context: http://r.789695.n4.nabble.com/read-table-tp871880p4650010.html
Sent from the R help mailing list archive at Nabble.com.
[R] Stepwise analysis with fixed variables
Hello,

How can I run a backward stepwise regression with part of the variables fixed, while the others participate in the backward stepwise analysis?

Thank you,
Einat

-- View this message in context: http://r.789695.n4.nabble.com/Stepwise-analysis-with-fixed-variables-tp4650015.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Stepwise analysis with fixed variables
On 19.11.2012 08:49, Einat wrote:
> Hello,
> How can I run a backward stepwise regression with part of the variables fixed, while the others participate in the backward stepwise analysis?
> Thank you,
> Einat

Read ?step, in particular its argument scope, which can be a list with a lower component in which you specify the minimal model.

Uwe Ligges
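A minimal sketch of the scope/lower idea, assuming simulated data: the data frame d, the variables x1 to x4 and the choice of x3 as the fixed term are all made up for illustration and are not from the original question.

```r
# Simulated data (hypothetical): y depends on x1 and x2; x3 and x4 are noise.
set.seed(42)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
d$y <- 1 + 2 * d$x1 + 0.5 * d$x2 + rnorm(n)

# Start from the full model.
full <- lm(y ~ x1 + x2 + x3 + x4, data = d)

# 'lower' is the minimal model: terms listed there can never be dropped,
# so x3 stays fixed while x1, x2 and x4 take part in backward elimination.
reduced <- step(full, scope = list(lower = ~ x3),
                direction = "backward", trace = 0)

kept <- attr(terms(reduced), "term.labels")
print(kept)
```

Whatever the AIC says about x3, it remains in the selected model because it is part of the lower bound; only the other terms are candidates for removal.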
Re: [R] loop to subtract arrays / error
Hello,

Try the following.

Xjj <- matrix(nrow = 101, ncol = 1000)
for (i in 1:dim(Vsimr)[2]) {
  Xjj[, i] <- Vobsr - Vsimr[, i]
}

Hope this helps,

Rui Barradas

On 19-11-2012 01:41, iembry wrote:

Hi everyone,

I am having trouble with creating a loop to subtract arrays. In R, this is what I have done:

Vobsr <- read.csv("Observed_Flow.csv", header = TRUE, sep = ",") # see data below
Vsimr <- read.csv("1000Samples_Vsim.csv", header = TRUE, sep = ",") # see data below
Vobsr <- as.matrix(Vobsr[,-1]) # remove column 1 from analysis thus Vobsr is 101x1 double matrix (column 1 is date information)
Vsimr <- as.matrix(Vsimr[,-1]) # remove column 1 from analysis thus Vsimr is 101x1000 double matrix (column 1 is date information)

Vobsr - Vsimr
Error in Vobsr - Vsimr : non-conformable arrays

Thus I attempted to create the loop below to perform the subtraction operation for each of the 1000 columns.

dim(Vsimr)[2]
[1] 1000

for (i in 1:dim(Vsimr)[2]) {
Xjj <- Vobsr - Vsimr[,i]
}

Xjj is a 101x1 double matrix rather than a 101x1000 double matrix.

How can I subtract each column of Vsimr from the single column of Vobsr over the 1000 columns present? I would like to thank each of you in advance for your assistance.
I am including some of the data from the files that I am operating on below: 1 column of Observed_Flow.csv 81.071 73.187 66.991 62.482 59.662 58.529 59.085 61.328 65.259 70.878 78.184 87.179 97.862 110.23 124.29 140.08 157.57 176.76 197.63 220.18 244.4 270.31 297.88 327.14 358.09 390.71 425.03 461.03 498.72 538.09 579.16 621.91 666.35 712.48 760.29 809.8 860.99 913.87 968.44 1024.7 1082.6 1142.3 1203.6 1266.6 1331.3 1397.7 1465.7 1535.5 1606.9 1680.1 1754.9 1831.4 1907.1 1981.9 2055.9 2129 2201.3 2272.7 2343.3 2413.1 2482 2550.1 2617.3 2683.7 2749.2 2813.9 2877.8 2940.8 3003 3064.3 3124.8 3184.4 3243.2 3301.1 3358.2 3414.5 3469.9 3524.4 3578.2 3631 3683.1 3734.3 3784.6 3834.1 3882.8 3930.6 3977.6 4023.7 4069 4113.4 4157 4199.8 4241.7 4282.7 4323 4362.3 4400.9 4438.6 4475.4 4511.4 4546.6 2 columns of 1000 columns of 1000Samples_Vsim.csv 81.07 81.07 73.19 73.19 65.81 67.16 58.93 63 52.55 60.7 46.68 60.25 41.31 61.67 36.44 64.95 32.08 70.08 28.22 77.08 24.86 85.94 22.01 96.65 19.65 109.23 17.8 123.67 16.46 139.96 15.61 158.12 15.27 178.14 15.43 200.02 16.1 223.75 17.27 249.35 18.94 276.81 21.11 306.13 23.79 337.31 26.97 370.34 30.65 405.24 34.84 442 39.52 480.62 44.71 521.1 50.41 563.44 56.61 607.64 63.31 653.7 70.51 701.62 78.21 751.4 86.42 803.04 95.13 856.53 104.35 911.89 114.06 969.11 124.28 1028.2 135.01 1089.1 146.23 1151.9 157.96 1216.6 170.19 1283.1 182.93 1351.5 196.16 1421.7 209.9 1493.8 224.15 1567.8 238.89 1643.6 254.14 1721.3 269.89 1800.8 286.15 1882.2 302.91 1965.5 320.17 2050.6 337.18 2134.8 353.93 2218.1 370.44 2300.4 386.69 2381.8 402.7 2462.3 418.45 2541.8 433.95 2620.4 449.2 2698.1 464.2 2774.9 478.94 2850.7 493.44 2925.6 507.68 2999.5 521.67 3072.6 535.41 3144.7 548.9 3215.8 562.14 3286.1 575.12 3355.4 587.86 3423.8 600.34 3491.2 612.57 3557.7 624.55 3623.3 636.28 3688 647.76 3751.7 658.98 3814.5 669.96 3876.4 680.68 3937.3 691.15 3997.3 701.37 4056.4 711.34 4114.6 721.06 4171.8 730.52 4228.1 739.74 4283.4 748.7 4337.9 757.41 4391.4 765.87 4443.9 
774.08 4495.6 782.04 4546.3 789.74 4596.1 797.2 4644.9 804.4 4692.8 811.35 4739.8 818.05 4785.9 824.5 4831 830.7 4875.2 836.64 4918.5 842.33 4960.8 847.78 5002.2 852.97 5042.7 857.91 5082.3

-- View this message in context: http://r.789695.n4.nabble.com/loop-to-subtract-arrays-error-tp4650001.html
Sent from the R help mailing list archive at Nabble.com.
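The difference between the original loop and the corrected one can be reproduced on a small scale. This is only a sketch: the 5 x 1 and 5 x 3 random matrices are stand-ins (an assumption) for the 101 x 1 and 101 x 1000 matrices read from the CSV files, and the name Xjj_last is introduced here for illustration.

```r
set.seed(1)
Vobsr <- matrix(rnorm(5), ncol = 1)    # stand-in for the 101 x 1 observed matrix
Vsimr <- matrix(rnorm(15), ncol = 3)   # stand-in for the 101 x 1000 simulated matrix

# Arithmetic on two matrices requires identical dimensions, hence the error.
err <- tryCatch(Vobsr - Vsimr, error = function(e) conditionMessage(e))
print(err)

# Assigning to the whole object replaces it on every pass,
# so only the difference for the last column survives.
for (i in seq_len(ncol(Vsimr))) {
  Xjj_last <- Vobsr - Vsimr[, i]
}
print(dim(Xjj_last))

# Preallocating the result and assigning to column i keeps every column.
Xjj <- matrix(NA_real_, nrow = nrow(Vsimr), ncol = ncol(Vsimr))
for (i in seq_len(ncol(Vsimr))) {
  Xjj[, i] <- Vobsr - Vsimr[, i]
}
print(dim(Xjj))
```

The key change is the left-hand side of the assignment: Xjj[, i] fills one column of an existing matrix, whereas Xjj alone creates a new object each time.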
Re: [R] read.table
On 19.11.2012 08:07, li1127217ye wrote:
> read.table("data.txt", header=TRUE, colClasses=c("character","character","numeric","character","numeric","numeric"))
>
>     a   b   c d e     f
> 1 SPX LSZ 100 C 0 34.40
> 2 SPX LSZ 100 P 0  1.30
> 3 SPX LSZ 105 C 0 30.30
> 4 SPX LSZ 105 P 0  1.85
> 5 SPX LSZ 110 C 0 26.40
>
> This gives the right result! It works with header=TRUE, but not with header=T. I don't know why not.

Because you have a T in your workspace that is not an alias for TRUE?

Uwe Ligges
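The suspected cause can be reproduced in a few lines. This is a sketch under the assumption that a numeric T (here 0) was assigned in the workspace; the thread does not say what the poster's T actually contained, and the two-column text input is made up.

```r
txt <- "a b\n1 2\n3 4"

# In a clean session, T is an ordinary variable in base R holding TRUE.
T_default <- T
rows_clean <- nrow(read.table(text = txt, header = T))

# Unlike TRUE, T can be reassigned; a workspace copy masks the base one.
T <- 0
masked <- T
rows_masked <- nrow(read.table(text = txt, header = T))  # header line read as data

# Removing the workspace copy makes the base definition visible again.
rm(T)
restored <- T

print(c(clean = rows_clean, masked = rows_masked))
```

TRUE is a reserved word and cannot be overwritten, which is why spelling out header=TRUE keeps working regardless of what is in the workspace.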
Re: [R] read.table
Hi

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of li1127217ye
> Sent: Monday, November 19, 2012 8:08 AM
> To: r-help@r-project.org
> Subject: Re: [R] read.table
>
> read.table("data.txt", header=TRUE, colClasses=c("character","character","numeric","character","numeric","numeric"))
>
>     a   b   c d e     f
> 1 SPX LSZ 100 C 0 34.40
> 2 SPX LSZ 100 P 0  1.30
> 3 SPX LSZ 105 C 0 30.30
> 4 SPX LSZ 105 P 0  1.85
> 5 SPX LSZ 110 C 0 26.40
>
> This gives the right result! It works with header=TRUE, but not with header=T. I don't know why not.

You probably have some T variable defined in your workspace. What is the result if you type T in your console?

Regards
Petr
Re: [R] loop to subtract arrays / error
Hello,

Or simpler, since Vobsr only has one column:

Xjj <- as.vector(Vobsr) - Vsimr

Hope this helps,

Rui Barradas

On 19-11-2012 10:05, Rui Barradas wrote:
> Hello,
>
> Try the following.
>
> Xjj <- matrix(nrow = 101, ncol = 1000)
> for (i in 1:dim(Vsimr)[2]) {
>   Xjj[, i] <- Vobsr - Vsimr[, i]
> }
>
> Hope this helps,
>
> Rui Barradas
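Why the one-liner works can be checked against the loop on small stand-in matrices. This is a sketch: the 5 x 1 and 5 x 3 sizes are assumptions replacing 101 x 1 and 101 x 1000, and the sweep() variant is an alternative added here, not something proposed in the thread.

```r
set.seed(1)
Vobsr <- matrix(rnorm(5), ncol = 1)    # stand-in for the 101 x 1 observed matrix
Vsimr <- matrix(rnorm(15), ncol = 3)   # stand-in for the 101 x 1000 simulated matrix

# as.vector() drops the dim attribute. A plain vector is recycled down the
# columns of the matrix (column-major order), so each column of Vsimr is
# subtracted from the same observed values.
Xjj_vec <- as.vector(Vobsr) - Vsimr

# Reference result from the explicit loop.
Xjj_loop <- matrix(NA_real_, nrow = nrow(Vsimr), ncol = ncol(Vsimr))
for (i in seq_len(ncol(Vsimr))) {
  Xjj_loop[, i] <- Vobsr - Vsimr[, i]
}

# The same operation written with sweep() over rows (MARGIN = 1).
Xjj_sweep <- sweep(Vsimr, 1, as.vector(Vobsr),
                   FUN = function(sim, obs) obs - sim)

print(all.equal(Xjj_vec, Xjj_loop))
print(all.equal(Xjj_vec, Xjj_sweep))
```

The recycling shortcut relies on length(Vobsr) being equal to nrow(Vsimr); with any other length the values would be matched to the wrong rows.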
Re: [R] function customization
On 16.11.2012 09:55, maxbre wrote:
> Given my reproducible example:
>
> new.ex <- structure(list(TEC = c(0.21, 0.077, 0.06, 0.033, 0.014, 0.007, 0.21, 0.077, 0.01, 0.033, 0.05, 0.014), LR = c(FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE), group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("1", "2"), class = "factor")), .Names = c("TEC", "LR", "group"), row.names = c(NA, -12L), class = "data.frame")
>
> And these few lines of code:
>
> library(NADA)
> out <- with(new.ex, cenfit(TEC, LR, group))
> out
>
> giving the following output:
>
>         n n.cen median       mean         sd
> group=1 6     2  0.033 0.05827778 0.08357853
> group=2 6     3  0.033 0.0698     0.07925407

The output is produced from the survfit object by survival:::print.survfit; just adapt that code to your needs and add the columns you like.

Uwe Ligges

> I would like to add one more result for each group to the above output, namely “sum”, computed as the product of “n” times “mean”.
>
> This is a slight variation on a question I posted earlier in: http://r.789695.n4.nabble.com/How-to-modify-a-S4-function-in-the-package-NADA-td4649586.html
>
> But in this case I have some problems in modifying the cenfit() function to deal with group as a factor.
>
> My objective is to modify the original function cenfit() so that it also computes “sum” as the product of “n” times “mean”.
>
> For reasons I cannot properly understand, I'm not able to successfully modify my earlier attempt (which did not account for groups):
>
> mycenfit <- function(x) {
>   s = summary(x)
>   c(n = nrow(s), n.cen = nrow(s) - sum(s$n.event), median = median(x), mean = mean(x)[["mean"]], sd = sd(x), sum = mean(x)[["mean"]]*length(x))
> }
>
> How should I change it in order to properly deal with groups?
>
> Thank you for any help
>
> max
>
> -- View this message in context: http://r.789695.n4.nabble.com/function-customization-tp4649711.html
> Sent from the R help mailing list archive at Nabble.com.
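The requested extra column is plain arithmetic once the per-group summary is available as a table. The sketch below assumes the values are typed in by hand from the printout in the question; it does not show how to extract them from the cenfit object, and the name out_tab is made up for illustration.

```r
# Values copied by hand from the cenfit() printout in the question
# (assumption: not extracted from the cenfit object itself).
out_tab <- data.frame(
  n      = c(6, 6),
  n.cen  = c(2, 3),
  median = c(0.033, 0.033),
  mean   = c(0.05827778, 0.0698),
  sd     = c(0.08357853, 0.07925407),
  row.names = c("group=1", "group=2")
)

# "sum" as defined in the question: n times mean, computed per group.
out_tab$sum <- out_tab$n * out_tab$mean
print(out_tab)
```

Because the multiplication is vectorised over rows, the same line handles any number of groups.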
Re: [R] Simulation with cpm package
On 13.11.2012 15:45, Christopher Desjardins wrote:
> Hi,
> I am running the following code based on the cpm vignette's code. I believe the code is syntactically correct but it just seems to hang R. I can get this to run if I set the sims to 100 but with 2000 it just hangs. Any ideas why?

No: Works for me and completes within 90 minutes.

Uwe Ligges

> Thanks,
> Chris
>
> library(cpm)
> cpmTypes <- c("Kolmogorov-Smirnov", "Mann-Whitney", "Cramer-von-Mises")
> changeMagnitudes <- c(1, 2, 4, 5)
> changeLocations <- c(50, 100, 300)
> sims <- 2000
> ARL0 <- 500
> startup <- 20
> results <- list()
> for (cpmType in cpmTypes) {
>   results[[cpmType]] <- matrix(numeric(length(changeMagnitudes) * length(changeLocations)), nrow = length(changeMagnitudes))
>   for (cm in 1:length(changeMagnitudes)) {
>     for (cl in 1:length(changeLocations)) {
>       print(sprintf("cpm:%s magnitude::%s location:%s", cpmType, changeMagnitudes[cm], changeLocations[cl]))
>       temp <- numeric(sims)
>       for (s in 1:sims) {
>         x <- c(rchisq(changeLocations[cl], df=3), rchisq(2000, df=changeMagnitudes[cm]))
>         temp[s] <- detectChangePoint(x, cpmType, ARL0=ARL0, startup=startup)$detectionTime
>       }
>       results[[cpmType]][cm,cl] <- mean(temp[temp > changeLocations[cl]]) - changeLocations[cl]
>     }
>   }
> }
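A back-of-the-envelope calculation suggests why the run looks like a hang. It assumes that the run time scales linearly with the number of detectChangePoint() calls and takes the 90 minutes reported in the reply at face value; other machines will differ.

```r
# Sizes taken from the posted script.
n_types      <- 3      # length(cpmTypes)
n_magnitudes <- 4      # length(changeMagnitudes)
n_locations  <- 3      # length(changeLocations)
sims         <- 2000

# Total number of detectChangePoint() calls in the nested loops.
calls_total <- n_types * n_magnitudes * n_locations * sims

# Assumption: the whole run takes the reported 90 minutes.
secs_per_call <- 90 * 60 / calls_total

# Projected time for the setting that finished (sims = 100), in minutes.
calls_100 <- n_types * n_magnitudes * n_locations * 100
mins_100  <- calls_100 * secs_per_call / 60

print(c(calls_total = calls_total, secs_per_call = secs_per_call,
        mins_100 = mins_100))
```

Under these assumptions each progress message printed by the script is followed by 2000 calls, roughly two and a half minutes of silence, which is easy to mistake for a frozen session.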
Re: [R] Interpretation of davies.test() in segmented package
Dear Greg,

> It does not, however give me Pr(>|t|) for the break point coefficients.

Of course, a test of H_0: breakpoint=0 is meaningless.

> I need to answer the question H0: Beta0=Beta with a certainty metric,

Sorry, what is your Beta0? davies.test() tests the hypothesis H0: leftSlope=rightSlope, which implies diffSlope=0, in which case the breakpoint does not exist.

K in davies.test() is the number of evaluation points used to compute the approximate p-value.

Please contact me off list if you need more details (with detailed questions).

vito

On 16/11/2012 20:57, Greg Cohn wrote:
> My data: I have raw data points that form a logit-style curve, as if they were a time series. That is to say, they form 3 distinct lines with 3 distinct slopes in a backwards-Z pattern. A certain class of my data looks essentially flat to the eye with marginal oscillation. What is important to me is the x value at which the state change is occurring, in other words, the break point.
>
> Use of segmented(): segmented() does a very good job of capturing the breakpoints and fitting three distinct slopes, i.e. linear models. It does not, however give me Pr(>|t|) for the break point coefficients. I need to answer the question H0: Beta0=Beta with a certainty metric, i.e. a probability statistic. This is especially important for my flat-looking data class.
>
> davies.test() question: davies.test() only accepts lm() or glm() objects as input. If I run segmented() to find 1 breakpoint instead of 2, I get a totally bogus answer. Without knowing the breakpoints, how is this test able to assess the proper breakpoint? It appears to give only 1 best breakpoint, which is not consistent with the breakpoints found by segmented(). Also, is K the number of breakpoints or the number of iterations over which it evaluates the breakpoint?
>
> Thanks in advance.
>
> lmfit <- glm(TotRad_KW~HRRPUA_kWm2, data=d1)
> davies.test(lmfit, seg.Z=~HRRPUA_kWm2, k=1000, alternative="less", beta0=0, dispersion=NULL)
>
> Davies' test for a change in the slope
>
> data: Model = gaussian , link = identity
> formula = TotRad_KW ~ HRRPUA_kWm2
> segmented variable = HRRPUA_kWm2
> `Best' at = 561.205, n.points = 1000, p-value < 2.2e-16
> alternative hypothesis: less
>
> segments <- segmented(lmfit, seg.Z=~HRRPUA_kWm2, psi=c(475,550))
> summary(segments)
>
> ***Regression Model with Segmented Relationship(s)***
>
> Call: segmented.glm(obj = lmfit, seg.Z = ~HRRPUA_kWm2, psi = c(475, 550))
>
> Estimated Break-Point(s):
>            Est. St.Err
> psi1.HRRP 430.2  4.087
> psi2.HRRP 484.6  3.077
>
> t value for the gap-variable(s) V: 0 0
>
> Meaningful coefficients of the linear terms:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept) -38.6993   274.7666  -0.141   0.8891
> HRRPUA_kWm2   1.4297     0.7472   1.914   0.0668 .
> U1.HRRP      42.2884     4.7696   8.866       NA
> U2.HRRP     -40.5897     4.7123  -8.614       NA
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> (Dispersion parameter for gaussian family taken to be 6934.706)
>
> Null deviance: 70776718 on 31 degrees of freedom
> Residual deviance: 180302 on 26 degrees of freedom
> AIC: 377.19
>
> Convergence attained in 2 iterations with relative change -1.662839e-14

--
Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240
fax: 091 485726
http://dssm.unipa.it/vmuggeo
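The hypothesis leftSlope=rightSlope and the role of the evaluation points can be illustrated in base R. This is only a sketch on simulated data (all names and values are made up), and it is not how davies.test() computes its p-value: it merely fits a broken-line model at each of K fixed candidate breakpoints and looks at the difference-in-slope coefficient.

```r
set.seed(123)
x <- seq(0, 10, length.out = 100)
# Simulated broken line: slope 0.5 before x = 6, slope 2.5 after (diffSlope = 2).
y <- 1 + 0.5 * x + 2 * pmax(x - 6, 0) + rnorm(100, sd = 0.3)

# For a FIXED breakpoint psi the model is linear; the coefficient on
# pmax(x - psi, 0) is the difference between right and left slope.
fit_at <- function(psi) lm(y ~ x + pmax(x - psi, 0))

diff_slope <- unname(coef(fit_at(6))[3])

# K evaluation points spread over the range of x: one t statistic for
# H0: diffSlope = 0 per candidate breakpoint.
K <- 10
psi_grid <- seq(quantile(x, 0.05, names = FALSE),
                quantile(x, 0.95, names = FALSE), length.out = K)
t_stats <- sapply(psi_grid, function(psi) {
  summary(fit_at(psi))$coefficients[3, "t value"]
})
best_psi <- psi_grid[which.max(abs(t_stats))]

print(round(diff_slope, 2))
print(best_psi)
```

Seen this way, K controls how finely the range of the segmented variable is scanned, and the reported "best" point is just the candidate with the strongest evidence against equal slopes, which need not coincide with breakpoints estimated by segmented() in a model with two breaks.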
Re: [R] Download a file from url
You can use the RCurl package by Duncan Temple Lang for web connections; see its paper: http://www.omegahat.org/RCurl/RCurlJSS.pdf

Its getURL function should solve your problem.

On Sun, Nov 18, 2012 at 2:07 AM, veepsirtt veepsi...@gmail.com wrote:
> Hi R,
> I installed wget and tried to download the file from this URL
> http://nseindia.com/content/equities/scripvol/datafiles/16-11-2012-TO-16-11-2012ACCEQN.csv
> but it fails. How can I get it using wget?
> thanks
> veepsirtt
>
> #Define Working Directory, where files would be saved
> setwd('G:/NIFTY')
> #Define start and end dates, and convert them into date format
> startDate = as.Date("2011-01-05", order="ymd")
> endDate = as.Date("2011-02-01", order="ymd")
> f <- tempfile()
> downloadfilename = paste("ACC", "EQN", sep = "")
> temp = ""
> #Generate URL http://nseindia.com/content/equities/scripvol/datafiles/16-11-2012-TO-16-11-2012ACCEQN.csv
> myURL = paste("http://nseindia.com/content/equities/scripvol/datafiles/", as.character(startDate, "%d-%m-%Y"), "-TO-", as.character(endDate, "%d-%m-%Y"), downloadfilename, ".csv", sep = "")
> download.file(myURL, f, method='wget', extra="-U 'Mozilla/5.0 (X11; Linux) Gecko Firefox/5.0'")
> temp <- read.csv(f, sep = ",")
> head(temp)
>
> -- View this message in context: http://r.789695.n4.nabble.com/Download-a-file-from-url-tp4642985p4649907.html
> Sent from the R help mailing list archive at Nabble.com.
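A sketch of how the URL construction and the suggested getURL call might fit together. The helper names build_url and fetch_csv are made up for illustration; the URL pattern and the user-agent string are taken from the question, and whether the server still serves that path is not known. fetch_csv is defined but deliberately not called here.

```r
base_url <- "http://nseindia.com/content/equities/scripvol/datafiles/"

# Build the file URL from two Date objects (pattern copied from the question).
build_url <- function(start_date, end_date, symbol = "ACC", series = "EQN") {
  paste0(base_url,
         format(start_date, "%d-%m-%Y"), "-TO-",
         format(end_date, "%d-%m-%Y"),
         symbol, series, ".csv")
}

my_url <- build_url(as.Date("2012-11-16"), as.Date("2012-11-16"))
print(my_url)

# Hypothetical helper: fetch the CSV text with RCurl::getURL, passing the
# browser-like user agent from the question, and parse it with read.csv.
fetch_csv <- function(url) {
  if (!requireNamespace("RCurl", quietly = TRUE)) {
    stop("package 'RCurl' is not installed")
  }
  txt <- RCurl::getURL(url,
                       useragent = "Mozilla/5.0 (X11; Linux) Gecko Firefox/5.0")
  read.csv(text = txt)
}
```

Printing the generated URL and comparing it with the one that works in a browser is a quick way to rule out a malformed address before looking at the download method.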
[R] Introductory R course: 6-8.12.2012
Dear list members,

Apologies for cross-posting. There are some places available in an introductory R course. Please find the details of this R training course below. If you have any questions, don't hesitate to contact me.

Best regards,
Pablo

++
Course: Statistical Analysis with R
Where: Linux Hotel, Essen-Horst, Germany
When: 6-7-8 December 2012
Instructor: Dr. Pablo E. Verde
++

*Target audience*

This course is for data analysts who are familiar with classical statistics and statistical software (e.g. SAS, SPSS) and who want to gain a working knowledge of R. It is a 3-day intensive training course with 8 hours per day, including lectures and exercises. The presentation is practical, with many worked examples. Lectures are given in English. Discussions can be in English, German or Spanish.

++
Day 1: Introduction to the R system
* Introduction to R
* Data management with R
* Graphical methods with R
* Classical statistical procedures with R
* Introduction to the R language

Day 2: Statistical modeling with R
* Linear regression and ANOVA
* Generalized linear models: logistic regression, loglinear model, etc.
* Issues in statistical modeling
* Computer simulation and model checking

Day 3: Advanced statistical modeling with R
* Modeling time to event data and survival analysis
* Introduction to mixed linear modeling
* Applications of generalized linear mixed effects models
* Own projects
++

Costs:
Public sector and commercial: 1088,85 Euro (three-day course, incl. full board and VAT); accommodation (on demand) is additional: 63,13 Euro per night in a shared double room (incl. VAT) or 138,03 Euro per night in a single room (incl. VAT).
Student: 870 Euro (three-day course, incl. full board, shared double room [single room on demand] and VAT).

Some of the courses are frequently fully booked. Please note that you may have to try several times before you get a place.

For more information, please contact: i...@linuxhotel.de
++
Re: [R] How to retrieve data from a matrix
Thank you very much. It works!

-- View this message in context: http://r.789695.n4.nabble.com/How-to-retrieve-data-from-a-matrix-tp4649721p4650036.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] loop to subtract arrays / error
Hi Rui Barradas, how are you?

Thank you very much. That worked perfectly.

Irucka Embry

-----Original Message-----
From: Rui Barradas [ruipbarra...@sapo.pt]
Sent: 11/19/2012 4:05:11 AM
To: iruc...@mail2world.com
Cc: r-help@r-project.org
Subject: Re: [R] loop to subtract arrays / error

> Hello,
>
> Try the following.
>
> Xjj <- matrix(nrow = 101, ncol = 1000)
> for (i in 1:dim(Vsimr)[2]) {
>   Xjj[, i] <- Vobsr - Vsimr[, i]
> }
>
> Hope this helps,
>
> Rui Barradas
Re: [R] loop to subtract arrays / error
Hi Rui,

Thank you. That was simple and worked great.

Irucka

-----Original Message-----
From: Rui Barradas [ruipbarra...@sapo.pt]
Sent: 11/19/2012 4:13:24 AM
To: iruc...@mail2world.com
Cc: r-help@r-project.org
Subject: Re: [R] loop to subtract arrays / error

Hello,

Or simpler, since Vobsr only has one column:

    Xjj <- as.vector(Vobsr) - Vsimr

Hope this helps,
Rui Barradas

On 19-11-2012 10:05, Rui Barradas wrote:

Hello,

Try the following.

    Xjj <- matrix(nrow = 101, ncol = 1000)
    for (i in 1:dim(Vsimr)[2]) {
      Xjj[, i] <- Vobsr - Vsimr[, i]
    }

Hope this helps,
Rui Barradas

On 19-11-2012 01:41, iembry wrote:

Hi everyone,

I am having trouble with creating a loop to subtract arrays. In R, this is what I have done:

    Vobsr <- read.csv("Observed_Flow.csv", header = TRUE, sep = ",") # see data below
    Vsimr <- read.csv("1000Samples_Vsim.csv", header = TRUE, sep = ",") # see data below
    Vobsr <- as.matrix(Vobsr[,-1]) # remove column 1 from analysis, thus Vobsr is a 101x1 double matrix (column 1 is date information)
    Vsimr <- as.matrix(Vsimr[,-1]) # remove column 1 from analysis, thus Vsimr is a 101x1000 double matrix (column 1 is date information)
    Vobsr - Vsimr
    Error in Vobsr - Vsimr : non-conformable arrays

Thus I attempted to create the loop below to perform the subtraction operation for each of the 1000 columns.

    dim(Vsimr)[2]
    [1] 1000
    for (i in 1:dim(Vsimr)[2]) {
      Xjj <- Vobsr - Vsimr[,i]
    }

Xjj is a 101x1 double matrix rather than a 101x1000 double matrix.

How can I subtract each column of Vsimr from the single column of Vobsr over the 1000 columns present? I would like to thank each of you in advance for your assistance.
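The two suggestions in this thread can be compared on a small made-up example. This is only a sketch: the 5x1 and 5x3 matrices below are random stand-ins for the 101x1 and 101x1000 matrices in the question, and the `sweep()` variant is an extra alternative that was not proposed in the thread.

```r
set.seed(123)
Vobsr <- matrix(runif(5, 50, 100), ncol = 1)      # stand-in for the 101x1 observed matrix
Vsimr <- matrix(runif(5 * 3, 50, 100), nrow = 5)  # stand-in for the 101x1000 simulated matrix

# Loop version: pre-allocate the result and fill one column per iteration
loop_result <- matrix(nrow = nrow(Vsimr), ncol = ncol(Vsimr))
for (i in seq_len(ncol(Vsimr))) {
  loop_result[, i] <- Vobsr - Vsimr[, i]
}

# Vectorised version: a plain vector is recycled down each column of Vsimr
vec_result <- as.vector(Vobsr) - Vsimr

# sweep() computes Vsimr - Vobsr row by row, so the sign is flipped afterwards
sweep_result <- -sweep(Vsimr, 1, as.vector(Vobsr), FUN = "-")
```

The original loop overwrote `Xjj` on every pass, which is why only a single column survived; indexing the column on the left-hand side or dropping the loop entirely avoids that.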
[R] internal cluster quality indexes
Dear useRs,

I would like to know more about Index.G2 and Index.G3, which calculate the G2 and G3 internal cluster quality indexes. I tried to find material on the internet, but it seems that the file has been removed. Is it better to have higher or lower values of these indexes?

Thanks in advance,
eliza
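One way to build intuition for the direction of such an index is to compute it by hand on a toy example. The sketch below assumes that G2 refers to the Baker and Hubert gamma statistic (concordant minus discordant comparisons of within-cluster and between-cluster distances); that correspondence is an assumption here, the helper `gamma_index()` is hypothetical and not part of any package, and G3 is not covered.

```r
# Hypothetical helper: gamma = (s_plus - s_minus) / (s_plus + s_minus), where
# s_plus counts pairs with a within-cluster distance smaller than a
# between-cluster distance, and s_minus counts the opposite case.
gamma_index <- function(x, cluster) {
  d <- as.matrix(dist(x))
  idx <- which(upper.tri(d), arr.ind = TRUE)        # each pair of points once
  dists <- d[idx]
  same <- cluster[idx[, 1]] == cluster[idx[, 2]]    # TRUE for within-cluster pairs
  diffs <- outer(dists[same], dists[!same], "-")    # within minus between
  s_plus <- sum(diffs < 0)
  s_minus <- sum(diffs > 0)
  (s_plus - s_minus) / (s_plus + s_minus)
}

x <- c(0, 0.1, 0.2, 10, 10.1, 10.2)                 # two obvious groups
good <- gamma_index(x, c(1, 1, 1, 2, 2, 2))         # matches the groups
bad  <- gamma_index(x, c(1, 2, 1, 2, 1, 2))         # mixes the groups
```

Under this definition the well-separated partition scores higher than the mixed one, so larger values indicate a better clustering.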
Re: [R] list.files, recursively
Thanks, Rui and Jim, for your replies. I tried to post this question to r-devel, but its admin told me that the question belongs on r-help. Thanks, Jim, for your suggestion. I have already constructed something similar. I posted my question to suggest modifying the function so that not everybody has to program their own workaround. So, let's see whether this convinces the people responsible for these base functions.

Cheers
Jannis

On 18.11.2012 19:02, Rui Barradas wrote:

Hello,

I believe that's a question for r-devel, but good point. It's documented that in non-recursive calls to list.files subdirectory names are always included. (With a typo, "There always are" instead of "They always are".)

Rui Barradas

On 18-11-2012 17:20, Jannis wrote:

Dear R developers,

As far as I understand the manual of list.files(), there is only a way to exclude directories from the returned vector if you use list.files recursively. In non-recursive mode, there seems to be no way of excluding directories (the include.dirs argument does not seem to have any effect). Would it not be more intuitive and practical to allow switching off directory names in both cases?

Thanks a lot
Jannis
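The kind of workaround mentioned in this thread might look like the sketch below. The helper name `list_files_only()` is made up for illustration; it simply filters the result of a non-recursive `list.files()` call using `file.info()`.

```r
# Hypothetical helper: non-recursive listing with directories filtered out
list_files_only <- function(path = ".", ...) {
  entries <- list.files(path, full.names = TRUE, ...)
  entries[!file.info(entries)$isdir]
}

# Small demonstration in a temporary directory
demo_dir <- tempfile("listfiles_demo")
dir.create(demo_dir)
dir.create(file.path(demo_dir, "subdir"))
invisible(file.create(file.path(demo_dir, c("a.txt", "b.txt"))))

all_entries <- list.files(demo_dir)                  # includes "subdir"
files_only <- basename(list_files_only(demo_dir))    # regular files only
```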
Re: [R] function customization
Thank you all for the great help, in particular Dennis Murphy. In order to close the thread, I'm posting here the final solution to my question:

    new.ex <- structure(list(TEC = c(0.21, 0.077, 0.06, 0.033, 0.014, 0.007, 0.21, 0.077, 0.01, 0.033, 0.05, 0.014),
                             LR = c(FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE),
                             group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L),
                                               .Label = c("1", "2"), class = "factor")),
                        .Names = c("TEC", "LR", "group"),
                        row.names = c(NA, -12L), class = "data.frame")

    library(NADA)
    out <- with(new.ex, cenfit(TEC, LR, group))

    # method 1
    res <- as.data.frame(do.call(rbind, mean(out)))
    res$n <- out@survfit$n # see str(out) to discover why
    res$sum <- with(res, n * mean)
    res

    # method 2
    library(plyr)
    res2 <- ldply(mean(out), rbind)
    names(res2)[-1] <- names(mean(out)[[1]])
    res2 <- mutate(res2, n = out@survfit$n, sum = mean * n)
    res2

-- View this message in context: http://r.789695.n4.nabble.com/function-customization-tp4649711p4650041.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] loop to subtract arrays / error
Hi Arun,

Thanks for your assistance. That worked as well.

Irucka

-----Original Message-----
From: arun [smartpink...@yahoo.com]
Sent: 11/19/2012 7:22:09 AM
To: iruc...@mail2world.com
Cc: r-help@r-project.org
Subject: Re: [R] loop to subtract arrays / error

[snip]
Re: [R] Calculateing means
I have a data matrix with 570 columns containing 95 samples with 6 replicates each. How can I calculate the mean of the replicates for the 95 samples?

Write a function that calculates the sample means for a vector of 95 observations and then use apply() to apply that function to the whole matrix.

S
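A minimal sketch of that suggestion follows. It assumes that the 6 replicates of each sample sit in adjacent columns (columns 1 to 6 for sample 1, 7 to 12 for sample 2, and so on), which the question does not state; the matrix `m` and the helper `rep_means()` are made up for illustration.

```r
set.seed(1)
n_samples <- 95
n_rep <- 6
m <- matrix(rnorm(10 * n_samples * n_rep), nrow = 10)   # 10 rows x 570 columns of toy data

# For one row: reshape the 570 values into a 6 x 95 matrix (one column per
# sample) and average each column
rep_means <- function(v, n_rep = 6) colMeans(matrix(v, nrow = n_rep))

# apply() over rows returns samples in rows, so transpose to get 10 x 95
means <- t(apply(m, 1, rep_means, n_rep = n_rep))
```

If the replicates are interleaved differently, the reshaping step would need to change accordingly.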
Re: [R] simple linear regression with proportion data
-----Original Message-----

Can I use simple linear regression when I have proportion data for both dependent and independent variables? Or should I use beta regression analysis? Or any suggestion?

The distribution of the independent variable is irrelevant (in some circumstances it matters whether it is measured without error or not).

Agreed that if you just want a line that goes somewhere near the data you can do pretty much anything. But don't those circumstances you referred to include 'any time you want an unbiased estimate of the slope or a reliable standard error on coefficients'?

S
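The point about measurement error in the independent variable can be illustrated with a small simulation. Everything below is invented for illustration (the true slope of 0.5, the noise levels, the sample size); it only shows the general attenuation effect and is not a recommendation for or against beta regression.

```r
set.seed(2012)
n <- 5000
x_true <- runif(n, 0.2, 0.8)                      # predictor measured without error
y <- 0.1 + 0.5 * x_true + rnorm(n, sd = 0.05)     # response with true slope 0.5
x_obs <- x_true + rnorm(n, sd = 0.1)              # same predictor, measured with error

slope_true <- coef(lm(y ~ x_true))[["x_true"]]    # close to 0.5
slope_obs  <- coef(lm(y ~ x_obs))[["x_obs"]]      # pulled towards zero
```

With these settings the slope estimated from the noisy predictor is systematically smaller than the one estimated from the error-free predictor, which is the bias the reply alludes to.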
[R] Classification methods - which one?
Dear all,

I searched for some classification methods and I have no clue whether I picked the right ones. My problem: I have a matrix with 17000 rows and 33 columns (genes and patients). The patients are grouped into 3 diseases. Now I want to classify the patients, and I also want to know which rows are more helpful for the classification than others. I tried SVM and random forest. Do you think these are the right classification methods? Maybe there are some hints you can give me. I am more familiar with the Bioconductor packages.

Furthermore: this was not my field of study in the past, but I want to understand it and I am willing to work my way into it. It would be amazing if one of the (more) mathematical people could give me a hint.

Thanks and all the best,
Peter

PS: I can upload my underlying data if somebody is interested.
Re: [R] side by side boxplots
I want to plot the results of multiple paired comparisons in a side-by-side graph, in opposite directions, with the standard errors. Would you please give me some help on how to do this? Thanks.

Charlie. lbp
Re: [R] additive interaction for a dichotomous dependent variable (i.e. risk difference)
Hi,

No, I never got any response. I used an article by Knol to solve the issue. I hope this is useful for you too.

Best regards,
Wouter

Knol, M.J., van, d.T., Grobbee, D.E., Numans, M.E., Geerlings, M.I., 2007. Estimating interaction on an additive scale between continuous determinants in a logistic regression model. International Journal of Epidemiology 36, -1118.

Wouter Peyrot
Psychiatry resident
GGZ inGeest, research department
Location Valeriuskliniek, Valeriusplein 9, Postbus 74077, 1070 BB Amsterdam
T (020) 788 5425
www.ggzingeest.nl
GGZ inGeest, partner of VUmc

From: sarahw [via R] [mailto:ml-node+s789695n464968...@n4.nabble.com]
Sent: Friday, 16 November 2012 2:39
To: Peyrot, Wouter
Subject: Re: additive interaction for a dichotomous dependent variable (i.e. risk difference)

Did you ever get a response to this or resolve this yourself? Many thanks!

-- View this message in context: http://r.789695.n4.nabble.com/additive-interaction-for-a-dichotomous-dependent-variable-i-e-risk-difference-tp4635842p4650017.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Stepwise analysis with fixed variables
Thank you for the quick reply. Two more questions:

1. For example, if this is my code:

    RegModel = lm(glucose~sex+BMI+height+weight+education+ses, weights=w_without_non_response)
    summary(RegModel)
    step(RegModel, direction="backward", scope=list(lower=?, upper=?))

and I want the sex and height variables to be fixed, but the rest of the variables to go into the backward analysis, how should I write the scope argument?

2. How can I add an alpha level to the step function as a criterion for the backward regression analysis?

Thank you very much,
Einat

-- View this message in context: http://r.789695.n4.nabble.com/Stepwise-analysis-with-fixed-variables-tp4650015p4650030.html Sent from the R help mailing list archive at Nabble.com.
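A possible shape for the `scope` argument is sketched below on the built-in mtcars data, since the glucose data set is not available; `cyl` and `hp` play the role of the fixed variables. `step()` selects by AIC rather than by p-values, so the `k` setting shown is only a rough way to mimic an alpha level for single-degree-of-freedom terms, not an exact equivalent.

```r
# Full model; cyl and hp stand in for the variables that must stay in the model
full <- lm(mpg ~ cyl + disp + hp + wt + qsec + am, data = mtcars)

# Penalty per parameter chosen so that dropping a 1-df term corresponds
# approximately to a likelihood-ratio test at level alpha
alpha <- 0.05
k_alpha <- qchisq(alpha, df = 1, lower.tail = FALSE)   # about 3.84

reduced <- step(full, direction = "backward",
                scope = list(lower = ~ cyl + hp,       # never dropped
                             upper = formula(full)),   # largest model considered
                k = k_alpha, trace = 0)

kept <- attr(terms(reduced), "term.labels")
```

Terms named in `lower` are excluded from the candidates for removal, so they remain in the final model whatever the criterion says about them.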
[R] RQuantlib - Convertible Bond Pricing
Hi everyone,

I’m working on my Master’s degree thesis about the pricing of convertible bonds (C.B.), and I am trying to do that with R. I read the paper “RQuantLib: Interfacing QuantLib from R” and now I’m comparing several market prices (taken from Bloomberg or the Deutsche Bank database) with the R output. Could you help me to understand the parameters within these functions? Let me show you one of the problems that I met in the attached files. As you can see in the images, just shifting the Conv.Ratio from 1 to 10 flattens all the curves in the plot.

First http://imageshack.us/photo/my-images/19/cr10plot.png/
Second http://imageshack.us/photo/my-images/641/cr1plot.png/
Script http://imageshack.us/photo/my-images/801/cr1m.png/

Best regards.
Gabriele Carrarini
BTG Pactual London
Berkeley Square House

-- View this message in context: http://r.789695.n4.nabble.com/RQuantlib-Convertible-Bond-Pricing-tp4650027.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] R-Square in WLS
Hi Peter,

Why are you involving the -1 with this concept? Can you explain more, please?

Cheers

Date: Sun, 18 Nov 2012 23:28:26 -0800
From: ml-node+s789695n4650012...@n4.nabble.com
To: frespi...@hotmail.com
Subject: Re: R-Square in WLS

On Nov 18, 2012, at 21:32, Thomas Lumley wrote:

On Fri, Nov 16, 2012 at 4:48 PM, frespider [hidden email] wrote:

Hi, I am fitting a weighted least squares regression and trying to compute SSE, SST and SSReg, but I am not getting SST = SSReg + SSE and I don't know what I am coding wrong. Can you help please?

For a start, you need to replace your mu and muZ by weighted means.

The -1 in the model formulas also suggests that there will be problems even in the non-weighted case. The addition formula for SSDs works for successive model reductions, so it is required that the span of the design matrix X contains the vector of all ones.

-thomas

[snip]

    ## Y = Log(Z) Scale
    Yhat <- X%*%bhat                 # predicted values
    mu <- mean(Y)
    To <- Y - mu
    Er <- Y - Yhat
    Re <- Yhat - mu
    lgSST <- sum(Weights*(To)^2)     # log SST
    lgSSE <- sum(Weights*(Er)^2)     # log SSE
    lgSSR <- sum(Weights*(Re)^2)     # log SSR
    lgRsq <- lgSSR/lgSST

    ### Z Scale ##
    Z <- exp(Y)
    muZ <- mean(Z)
    Zhat <- exp(Yhat+0.5*Sigma2)
    ToZ <- Z - muZ
    ErZ <- Z - Zhat
    ReZ <- Zhat - muZ
    SST <- sum(Weights*(ToZ)^2)      # SST
    SSE <- sum(Weights*(ErZ)^2)      # SSE
    SSR <- sum(Weights*(ReZ)^2)      # SSR
    Rsq <- SSR/SST

I don't understand what is wrong with the code. The regression sum of squares plus the error sum of squares does not add up to the total sum of squares in either the Y scale or the Z scale. Y is normally distributed and Z is log-normally distributed. Where is the error? Also, is there a way to calculate the weighted sum of squares?

-thomas

-- Thomas Lumley, Professor of Biostatistics, University of Auckland

-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School, Solbjerg Plads 3, 2000 Frederiksberg, Denmark. Phone: (+45)38153501 Email: [hidden email] Priv: [hidden email]

-- View this message in context: http://r.789695.n4.nabble.com/R-Square-in-WLS-tp4649693p4650032.html Sent from the R help mailing list archive at Nabble.com.
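The role of the intercept in the decomposition can be checked numerically. The sketch below uses simulated data and a hypothetical helper `ss_parts()`; it covers only the Y scale with a weighted mean, not the back-transformed Z scale from the question.

```r
set.seed(42)
n <- 50
x <- runif(n, 1, 10)
w <- runif(n, 0.5, 2)                              # regression weights
y <- 3 + 2 * x + rnorm(n, sd = 1 / sqrt(w))

# Weighted sums of squares around the weighted mean of y
ss_parts <- function(fit, y, w) {
  yhat <- fitted(fit)
  mu_w <- weighted.mean(y, w)
  c(SST = sum(w * (y - mu_w)^2),
    SSR = sum(w * (yhat - mu_w)^2),
    SSE = sum(w * (y - yhat)^2))
}

with_int <- ss_parts(lm(y ~ x, weights = w), y, w)       # intercept included
no_int   <- ss_parts(lm(y ~ x - 1, weights = w), y, w)   # the -1 removes it

gap_with_int <- with_int[["SST"]] - (with_int[["SSR"]] + with_int[["SSE"]])
gap_no_int   <- no_int[["SST"]] - (no_int[["SSR"]] + no_int[["SSE"]])
```

With the intercept in the model the weighted residuals sum to zero, so the cross term vanishes and the gap is zero up to rounding; with `- 1` that no longer holds and the three quantities stop adding up.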
[R] lattice: defining grouping variable only for the upper/lower panel with splom
Using the mtcars dataset, how can I define the grouping variable so that it is valid only for the upper or the lower panel? The following doesn't work:

    # Code start
    library(lattice)
    splom(~data.frame(mpg, disp, hp, drat, wt, qsec), data=mtcars, pscales=0,
          auto.key=list(columns=3),
          upper.panel = function(...){
            panel.grid(...)
            panel.xyplot(groups=cyl, ...)
          }
    )
    # Code end

-- View this message in context: http://r.789695.n4.nabble.com/lattice-defining-grouping-variable-only-for-the-upper-lower-panel-with-splom-tp4650033.html Sent from the R help mailing list archive at Nabble.com.
[R] Biologist R learner
I am a biologist and a beginner. Please help me to solve this, anyone. It's my homework and I don't have a clue about R yet.

Write a function which does the following tasks:
(a) Calculates the minimum and maximum value of a given argument x.
(b) If x is positive, some new vector gets the value TRUE, and FALSE otherwise.
(c) Creates a vector where the i:th and (i-1):th values of x are always summed. The first value of the new vector has the same value as the first component of x.

Apply the created function to some vector x to show that the function works.

-- View this message in context: http://r.789695.n4.nabble.com/Biologist-R-learner-tp4650044.html Sent from the R help mailing list archive at Nabble.com.
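One possible reading of the three tasks is sketched below. The function name `describe_vector()` and the choice to return a list are made up here; the exercise does not prescribe either.

```r
describe_vector <- function(x) {
  stopifnot(is.numeric(x), length(x) >= 1)
  list(
    range = c(min = min(x), max = max(x)),        # (a) minimum and maximum
    positive = x > 0,                             # (b) TRUE where x is positive
    pair_sums = c(x[1], x[-1] + x[-length(x)])    # (c) x[i] + x[i-1], first value kept
  )
}

res <- describe_vector(c(3, -1, 4, 0, 5))
res$pair_sums   # 3 2 3 4 5
```

Dropping the first and the last element and adding the two shortened vectors gives all neighbouring sums at once, without an explicit loop.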
Re: [R] loop to subtract arrays / error
Hi,

Maybe this helps:

    Vobsr <- read.table(text="
    81.071
    73.187
    66.991
    62.482
    59.662
    58.529
    59.085
    61.328
    65.259
    70.878
    ", sep="", header=FALSE)

    Vsimr <- read.table(text="
    81.07 81.07
    73.19 73.19
    65.81 67.16
    58.93 63
    52.55 60.7
    46.68 60.25
    41.31 61.67
    36.44 64.95
    32.08 70.08
    28.22 77.08
    ", sep="", header=FALSE)

    Vsimr1 <- as.matrix(Vsimr)
    sapply(split(Vsimr1, col(Vsimr1)), function(x) x - as.matrix(Vobsr))
    #            1      2
    #  [1,]  -0.001 -0.001
    #  [2,]   0.003  0.003
    #  [3,]  -1.181  0.169
    #  [4,]  -3.552  0.518
    #  [5,]  -7.112  1.038
    #  [6,] -11.849  1.721
    #  [7,] -17.775  2.585
    #  [8,] -24.888  3.622
    #  [9,] -33.179  4.821
    # [10,] -42.658  6.202

    # if you are using data.frame()
    res2 <- sapply(Vsimr, function(x) x - Vobsr[,1])
    head(res2, 3)
    #          V1     V2
    # [1,] -0.001 -0.001
    # [2,]  0.003  0.003
    # [3,] -1.181  0.169

    # or just
    res3 <- Vsimr - Vobsr[,1]
    head(res3, 3)
    #       V1     V2
    # 1 -0.001 -0.001
    # 2  0.003  0.003
    # 3 -1.181  0.169

A.K.

----- Original Message -----
From: iembry iruc...@mail2world.com
To: r-help@r-project.org
Cc:
Sent: Sunday, November 18, 2012 8:41 PM
Subject: [R] loop to subtract arrays / error

[snip]
[R] Biologist R learner problem!!!help pls
Use the built-in dataset called iris in this task.
(a) Calculate the result of the following formula separately in every species for all of the numerical variables: log(x)/x.
(b) Calculate the trimmed mean for each of the numerical variables using the apply function. Choose your own trimming percentage.

-- View this message in context: http://r.789695.n4.nabble.com/Biologist-R-learner-problem-help-pls-tp4650045.html Sent from the R help mailing list archive at Nabble.com.
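A sketch of one way to approach both parts with base R is given below. The 10% trimming fraction is an arbitrary choice, and returning one data frame per species for part (a) is just one interpretation of "separately in every species".

```r
num_vars <- iris[, 1:4]                       # the four numerical variables

# (a) log(x)/x for every numerical variable, computed species by species;
#     the result is a list with one 50 x 4 data frame per species
by_species <- lapply(split(num_vars, iris$Species), function(d) log(d) / d)

# (b) trimmed mean of each numerical variable, trimming 10% from each end
trimmed <- apply(num_vars, 2, mean, trim = 0.1)
```

Extra arguments given to `apply()` after the function, such as `trim = 0.1`, are passed on to `mean()` for every column.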
[R] help on matrix column removal based on another matrix results
Hi everyone,

Now I am trying to finish writing the code (I had asked for assistance on subtracting arrays). This is what I am running in R:

    source("/home/ie/Documents/TTU/GA_Research/GLUE/R-Project/R_GLUE_Example/NSEr.R")

    NSEr <- function (obs, sim) {
      Xjjh <- (as.vector(obs) - sim)^2
      Xjjhs <- apply(Xjjh, 2, sum)
      Yii <- (obs - mean(obs))^2
      Yiis <- apply(Yii, 2, sum)
      NSEr <- 1 - (Xjjhs/Yiis)
      NSEr
    }

    Vsim <- read.csv("1000Samples_Vsim.csv", header = TRUE, sep = ",")
    Vsim <- as.matrix(Vsim[,-1]) # remove column 1 from analysis
    Vobs <- read.csv("Observed_Flow.csv", header = TRUE, sep = ",")
    Vobs <- as.matrix(Vobs[,-1]) # remove column 1 from analysis
    NSEr <- NSEr(Vobs, Vsim); write.table(NSEr, "NSEr.csv", sep = ",")
    NSErr <- t(matrix(NSEr))

    ## select the behavioural simulations and discard the rest
    Vsim <- Vsim[NSErr > 0.6]
    write.table(Vsim, "Vsim.csv", sep = ",")

Vsim becomes numeric[42016] rather than a double matrix of 101x416. What is the proper way to remove the columns in Vsim where the NSEr for that column is less than 0.6? I am trying to make Vsim a double matrix of 101x416. Thank you again.

Below is the rest of the code in R:

    ## normalise Qsim and compute the quantiles
    NSEr <- NSEr[NSEr > 0.6]
    write.table(NSEr, "NSEr_great_0.6.csv", sep = ",")
    NSEr <- NSEr - 0.6
    write.table(NSEr, "NSEr_minus0.6.csv", sep = ",")
    NSEr <- NSEr/sum(NSEr)
    write.table(NSEr, "NSEr_normalized.csv", sep = ",")
    #NSEr = sum(NSEr)
    limits <- apply(Vsim, 1, wtd.quantile, weights = NSEr, probs = c(0.05,0.95), normwt=F)

-- View this message in context: http://r.789695.n4.nabble.com/help-on-matrix-column-removal-based-on-another-matrix-results-tp4650043.html Sent from the R help mailing list archive at Nabble.com.
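The difference between the two kinds of indexing can be seen on a small made-up example. The 5x8 matrix and the efficiency values below are invented stand-ins for the 101x1000 matrix and the NSEr vector in the question.

```r
set.seed(1)
Vsim <- matrix(runif(5 * 8), nrow = 5, ncol = 8)          # stand-in for the 101x1000 matrix
NSEr <- c(0.9, 0.2, 0.7, 0.5, 0.65, 0.1, 0.61, 0.3)       # one efficiency per column

keep <- NSEr > 0.6                       # logical vector, one entry per column
Vsim_kept <- Vsim[, keep, drop = FALSE]  # index the column dimension: still a matrix
NSEr_kept <- NSEr[keep]                  # matching efficiencies for the kept columns

# What the original code did: a single logical index is recycled over all
# elements, so the result is a plain vector instead of a matrix
flat <- Vsim[t(matrix(NSEr)) > 0.6]
```

Putting the condition after the comma selects whole columns, and `drop = FALSE` keeps the matrix shape even if only one column passes the threshold.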
[R] R SNA: Creating a adjacency matrix containing all actors but only values of a subset
My problem is the following: I am using the R SNA package for social network analysis. Let's say my starting point is an edgelist with the following characteristics. Every row contains a firm name, the ID of a project the firm is involved in, and further characteristics, let's say the project's year. Firms can be in several projects, and one project can consist of a cooperation of more than one firm. Example:

Name  Project  Year
AA    1        2003
AB    1        2003
AB    2        2003
AB    3        2004
AC    2        2003
AC    4        2005

For the network analysis I need an adjacency matrix with all firms as row and column headers, which I construct as follows:

grants.edgelist <- read.csv("00-composed.csv", header = TRUE, sep = ";", quote = "\"", dec = ",", fill = TRUE, comment.char = "")
grants.2mode <- table(grants.edgelist) # cross tabulate -> 2-mode sociomatrix
grants.adj <- grants.2mode %*% t(grants.2mode) # adjacency matrix as product of the 2-mode sociomatrix

Now my problem: I want to run a netlm regression on the adjacency matrix, where I test how the network in one given year explains the network in the next year. To do that, I wanted to subset grants.edgelist into sets for (let's say) 2003 and 2005 only. However, I found that not all firms are in projects every year, and therefore the corresponding adjacency matrices have different rows and columns.

Now my question: how can I obtain an adjacency matrix containing all firms in the row and column headers, but with their intersections set to zero except for the year I want to observe? I hope it is clear what I mean.

Thank you very much in advance. This problem is driving me crazy today!

Best wishes
Daniel

-- View this message in context: http://r.789695.n4.nabble.com/R-SNA-Creating-a-adjacency-matrix-containing-all-actors-but-only-values-of-a-subset-tp4650046.html Sent from the R help mailing list archive at Nabble.com.
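One possible approach to the question above, offered as a sketch rather than a confirmed answer from the thread: if Name and Project are stored as factors before subsetting, the subset keeps all factor levels, so table() still reports every firm (with zero counts where a firm has no project that year). The names `edges` and `adj_for_year` are hypothetical, and the data is the small example from the post, not the real file.

```r
# Example edgelist from the post (hypothetical stand-in for 00-composed.csv)
edges <- data.frame(
  Name    = c("AA", "AB", "AB", "AB", "AC", "AC"),
  Project = c(1, 1, 2, 3, 2, 4),
  Year    = c(2003, 2003, 2003, 2004, 2003, 2005)
)

# Fix the full set of firms and projects as factor levels up front
edges$Name    <- factor(edges$Name)
edges$Project <- factor(edges$Project)

# Adjacency matrix for one year; rows/columns always cover all firms
adj_for_year <- function(edges, year) {
  sub <- edges[edges$Year == year, ]                  # unused levels are kept
  two_mode <- unclass(table(sub$Name, sub$Project))   # firm x project counts
  tcrossprod(two_mode)                                # two_mode %*% t(two_mode)
}

adj_2003 <- adj_for_year(edges, 2003)
adj_2005 <- adj_for_year(edges, 2005)
```

Both matrices have identical row and column names, so they can be compared cell by cell. Note that the diagonal holds each firm's own project count; whether to zero it before a regression is a modelling decision left open here.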
Re: [R] Stepwise regression scope: all interacting terms (.^2)
David, thanks for the feedback! Steve, thanks for the direction! I have heard and read some about Dr. Harrell's work but somehow had missed the term "penalized logistic regression". That was helpful for finding more specific sources to follow Dr. Harrell's (and others') suggestions. I may have more questions in the near future.

On Nov 16, 2012, at 3:32 PM, Steve Lianoglou wrote:

> Hi Mark,
>
> To put some context to David's response below, you can search the list archives for times when people ask about stepwise regression. You can get started here:
>
> http://search.gmane.org/search.php?group=gmane.comp.lang.r.general&query=stepwise+penalized
>
> The long and short of it is that you are almost always encouraged to use some regularization/penalized model instead of this stepwise approach. Frank Harrell, in particular, is generally quite vocal against stepwise regression -- I'm actually surprised he hasn't chimed in by now, but maybe he's getting a bit tired of fighting the good fight -- or, it's close to the holiday and he's taking a break ;-)
>
> Anyway ... HTH,
> -steve
>
> On Fri, Nov 16, 2012 at 4:13 PM, David Winsemius dwinsem...@comcast.net wrote:
>
>> On Nov 16, 2012, at 12:16 PM, Mark Ebbert wrote:
>>
>>> I haven't heard anything on this question. Is there something fundamentally wrong with my question? Any feedback is appreciated.
>>
>> Perhaps failure to read this sig at the bottom of every posted message to rhelp? PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
>>
>>> Mark
>>>
>>> On Nov 15, 2012, at 8:13 AM, Mark T. W. Ebbert wrote:
>>>
>>>> Dear Gurus,
>>>>
>>>> Thank you in advance for your assistance. I'm trying to understand scope better when performing stepwise regression using step. From the help page of step: "If scope is a single formula, it specifies the upper component, and the lower model is empty." I have a model with a binary response variable and 10 predictor variables. When I perform stepwise regression I define scope=.^2 to allow interactions between all terms.
>>
>> I generally avoid answering questions about stepwise regression, because most of them do not include sufficient background material to justify that strategy. Yours certainly did not.
>>
>>>> But I am missing something. When I perform stepwise regression (both directions) on the main model (y~x1+x2+…+x10) the method returns quickly with an answer; however, when I define all interactions in the main model (y~x1+x2+…+x10+x1:x2+x1:x3+…) and then perform stepwise regression (backward only) it runs so long I have to kill it. So here's my question: what is the difference between scope=.^2 on the additive (proper term?) model and defining all interactions and doing backward regression? My understanding is that .^2 is supposed to allow all interactions!
>>
>> Well, I would have guessed all two-way interactions (all 45 of them in your case) would be included and then successively reduced until you got to your specified (arbitrary and most likely incorrectly set) endpoint. I think the help page Details section is unclear on this point. I do not think that the 120 potential three-way interactions are part of the scope in that instance, but it should be easy enough for you to test that possibility.
>>
>> --
>> David Winsemius, MD
>> Alameda, CA, USA
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
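The counts mentioned in the reply (45 two-way and 120 three-way interactions for 10 predictors) can be checked by expanding the formula with terms(). The sketch below uses simulated data; the names `d`, `fit0`, and `fit_step` are made up for illustration, and the step() call only shows how a `.^2` scope is passed, not a recommended modelling strategy.

```r
set.seed(1)

# Simulated stand-in: 10 predictors x1..x10 and a binary response y
d <- as.data.frame(matrix(rnorm(200 * 10), ncol = 10))
names(d) <- paste0("x", 1:10)
d$y <- rbinom(200, 1, 0.5)

# Expand the formulas and count the terms they generate
two_way   <- attr(terms(y ~ .^2, data = d), "term.labels")
three_way <- attr(terms(y ~ .^3, data = d), "term.labels")
n_two   <- sum(grepl(":", two_way))               # two-way interactions only
n_three <- length(three_way) - length(two_way)    # extra terms added by .^3

# Stepwise search starting from the additive model, with all two-way
# interactions as the upper scope (illustration only)
fit0     <- glm(y ~ ., data = d, family = binomial)
fit_step <- step(fit0, scope = ~ .^2, direction = "both", trace = 0)
```

Starting from the additive model, each step fits only the models one term away from the current one, whereas a backward search from the full interaction model begins with 55 terms and has to refit many large models, which is one plausible reason for the difference in run time described in the question.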
[R] Question about Package 'sampleSection' for IRT model
Dear All,

I am a Ph.D. student at Chulalongkorn University in Thailand. I want to use the package 'sampleSection' to estimate missing data which are generated under an IRT model (3-PL):

n <- 500                     ## number of examinees
I <- 20                      ## number of items
num.imp <- 5                 ## number of imputations
p.missing <- c(0.09, 0.01)   # prob of missing
theta <- sort(rnorm(n,0,1))  # ability
a <- rnorm(I,0.5,0.1)        # discrimination
b <- rnorm(I,0,1)            # difficulty
c <- runif(I,0,0.25)         # guess

Only item 1 has missing data. If the response to item 1 was a 1 (correct), the probability of missing was 1%. If the response was a 0 (incorrect), the probability of missing was 9%. Thus, the probability of missing was linked to the response to the item itself (an unknown characteristic in real missing-data situations).

I don't know how to apply the 'Heckman-style selection models' functions in this case, because all my variables are unobserved. Could you please tell me how to estimate the data in my situation?

I am looking forward to your advice.

Sincerely yours,
Kamontip Srihaset
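For readers who want to reproduce the setup, here is a minimal sketch of the data-generating step only. It assumes the usual 3-PL response function, P(correct) = c + (1 - c) / (1 + exp(-a (theta - b))), which the post does not spell out, and it renames the guessing parameter to `guess` to avoid masking c(). It does not address the estimation question.

```r
set.seed(2012)

n <- 500                       # number of examinees
I <- 20                        # number of items
p.missing <- c(0.09, 0.01)     # P(missing | incorrect), P(missing | correct)

theta <- sort(rnorm(n, 0, 1))  # ability
a     <- rnorm(I, 0.5, 0.1)    # discrimination
b     <- rnorm(I, 0, 1)        # difficulty
guess <- runif(I, 0, 0.25)     # guessing parameter ("c" in the post)

# n x I matrix of a_i * (theta_j - b_i)
eta <- sweep(outer(theta, b, "-"), 2, a, "*")

# Assumed 3-PL probability of a correct response
p_correct <- sweep(sweep(plogis(eta), 2, 1 - guess, "*"), 2, guess, "+")

# Complete response matrix (rows = examinees, columns = items)
resp <- matrix(rbinom(n * I, 1, p_correct), nrow = n, ncol = I)

# Missingness on item 1 depends on the response itself
p_miss   <- ifelse(resp[, 1] == 1, p.missing[2], p.missing[1])
resp_obs <- resp
resp_obs[runif(n) < p_miss, 1] <- NA
```

Because the missingness probability depends on the unobserved value of the response, this is a missing-not-at-random mechanism, which is the situation selection models are designed for.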
[R] Ben Bolker's 'emdbook' package, rbetabinom
Hello,

I am using the rbetabinom function (to generate beta-binomial random variables) available in the emdbook package written by Professor Ben Bolker for my research study. I have no questions about using this function. However, I am looking for the theoretical method/algorithm behind rbetabinom. Morris (1997), American Naturalist 150:299-327 is given as the reference in the package, but I couldn't find any theoretical method for generating beta-binomial random variables in this article.

I would kindly request that you suggest a journal paper or document describing the method of generating beta-binomial random variables used to write the function rbetabinom.

Thank you very much.

Sincerely,
Arun.

-- View this message in context: http://r.789695.n4.nabble.com/Ben-Bolker-s-emdbook-Package-rbetabinom-tp4650047.html Sent from the R help mailing list archive at Nabble.com.
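As background, a beta-binomial variate is by definition a binomial count whose success probability is itself drawn from a beta distribution, so a generator can be written as a two-stage draw. The sketch below illustrates that textbook construction; `rbetabinom_sketch` is a hypothetical name, and this is not claimed to be the code inside emdbook.

```r
# Two-stage (compound) sampler: p ~ Beta(shape1, shape2), then X ~ Binomial(size, p)
rbetabinom_sketch <- function(n, size, shape1, shape2) {
  p <- rbeta(n, shape1, shape2)      # one success probability per draw
  rbinom(n, size = size, prob = p)   # binomial count given that probability
}

set.seed(10)
x <- rbetabinom_sketch(1000, size = 20, shape1 = 2, shape2 = 3)
mean(x)   # should be near size * shape1 / (shape1 + shape2) = 8
```

The sample variance of `x` will exceed that of a plain binomial with the same mean, which is the overdispersion the beta-binomial is used to model.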
Re: [R] RMySQL install on windows
Sorry for not taking care of this... If anyone would like to take over maintainership of RMySQL I'm sure the R community would greatly appreciate it. I just don't have the time these days.

Jeff
Re: [R] help on matrix column removal based on another matrix results
Hello,

Try

Vsim[] <- Vsim[NSErr > 0.6, ]

Hope this helps,
Rui Barradas
Re: [R] Biologist R learner
On 19-11-2012, at 15:32, andrew wrote:

> I am a biologist and a beginner, please help me to solve this. Please, anyone... it's my homework and I don't have a clue about R yet.
>
> Write a function which does the following tasks:
> (a) Calculates the minimum and maximum value of a given argument x.
> (b) If x is positive, some new vector gets the value of TRUE, and FALSE otherwise.
> (c) Creates a vector where the i:th and (i-1):th values of x are always summed. The first value of the new vector has the same value as the first component of x.
> Use the created function on some vector x to show that the function works.

Please read the posting guide: http://www.r-project.org/posting-guide.html

R-help is not for homework. Read the introductory manual "An Introduction to R" (links are on the page for the posting guide).

Berend
Re: [R] Biologist R learner problem!!!help pls
On 19-11-2012, at 15:48, Anna23 wrote:

> Q: Use the built-in dataset called iris in this task. (a) Calculate the result of the following formula separately for every species, for all of the numerical variables: log(x)/x. (b) Calculate the trimmed mean of each of the numerical variables using the apply function. Choose your own trimming percentage.

Please read the posting guide: http://www.r-project.org/posting-guide.html

R-help is not for homework. Read the introductory manual "An Introduction to R" (links are on the page for the posting guide).

Berend
Re: [R] help on matrix column removal based on another matrix results
Sorry, the comma is in the wrong place, it should be

Vsim[] <- Vsim[ , NSErr > 0.6]

Rui Barradas

On 19-11-2012 16:18, Rui Barradas wrote:

> Hello,
> Try
> Vsim[] <- Vsim[NSErr > 0.6, ]
> Hope this helps,
> Rui Barradas
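To make the column selection concrete, here is a small sketch on made-up data. `Vsim` and `NSEr` below are stand-ins with 5 columns and hypothetical scores, not the poster's files. With a single index, `Vsim[NSErr > 0.6]` treats the matrix as one long vector and recycles the logical index over all elements, which is consistent with the 101 x 416 = 42016 values reported in the question. Indexing the column position keeps the matrix shape. Plain assignment is used here, because assigning into `Vsim[]` keeps the original dimensions.

```r
set.seed(1)

# Stand-in data: 101 time steps x 5 simulations (the real Vsim has 1000 columns)
Vsim <- matrix(rnorm(101 * 5), nrow = 101, ncol = 5)
NSEr <- c(0.72, 0.41, 0.65, 0.10, 0.90)   # hypothetical efficiency scores

# One logical flag per column
keep <- as.vector(NSEr) > 0.6

# Index the column position; drop = FALSE keeps a matrix even if one column survives
Vsim_kept <- Vsim[, keep, drop = FALSE]
dim(Vsim_kept)   # 101 rows, 3 columns
```

The same `keep` vector can then be used to subset the scores, so that the weights passed on to later steps line up with the retained columns.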
Re: [R] manipulating longitudinal data in r
At 17:13 18/11/2012, Jeff Newmiller wrote:

> Michael, this comment doesn't seem appropriate to the question, since the sample data is a ragged array that requires the addition of NAs to fit into a wide format.

It does now but we do not know whether that was how it started life. My remarks were intended as a general comment, not necessarily a solution to the OP's specific problem.

> Jeff Newmiller, Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> DCN: jdnew...@dcn.davis.ca.us
> Sent from my phone. Please excuse my brevity.
>
> Michael Dewey i...@aghmed.fsnet.co.uk wrote:
>
>> At 08:56 17/11/2012, Kemi Racheal wrote:
>>
>>> Dear list member,
>>> I have the following data example
>>>
>>> ke <- data.frame(patid=c(1,1,1,2,3,3),a=c(1,2,2,1,1,2))
>>>
>>> I want to add another variable b, such that the max of 'a' by id is returned, i.e. data ke becomes
>>>
>>> ke <- data.frame(patid=c(1,1,1,2,3,3),a=c(1,2,2,1,1,2),b=c(2,2,2,1,2,2))
>>>
>>> Any help will be appreciated.
>>
>> Dear Kemi
>> It is often easier to do some sorts of manipulations on the wide format of the data. I appreciate that you can always do it both ways.
>>
>>> Oluwakemi

Michael Dewey i...@aghmed.fsnet.co.uk http://www.aghmed.fsnet.co.uk/home.html
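For the long format shown in the original question, one way to attach the per-patient maximum is base R's ave(), which applies a function within groups and returns a vector aligned with the input rows. This is a sketch of one option, not an answer given in the thread.

```r
ke <- data.frame(patid = c(1, 1, 1, 2, 3, 3),
                 a     = c(1, 2, 2, 1, 1, 2))

# ave() computes max(a) within each patid and repeats it for every row of that patid
ke$b <- ave(ke$a, ke$patid, FUN = max)
ke$b   # 2 2 2 1 2 2
```

This avoids reshaping to wide format, so it also works when patients have different numbers of rows.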
Re: [R] survfit number of variables != number of variable names
I can't reproduce the problem. Tell us what version of R and what version of the survival package you are using, and create a reproducible example. I don't know if some variables are numeric and some are factors, how/where the surv object was defined, etc.

Terry Therneau

On 11/17/2012 05:00 AM, r-help-requ...@r-project.org wrote:

> This works ok:
>
> cox = coxph(surv ~ bucket*(today + accor + both) + activity, data = data)
> fit = survfit(cox, newdata=data[1:100,])
>
> but using strata leads to problems:
>
> cox.s = coxph(surv ~ bucket*(today + accor + both) + strata(activity), data = data)
> fit.s = survfit(cox.s, newdata=data[1:100,])
> Error in model.frame.default(data = data[1:100, ], formula = ~bucket + : number of variables != number of variable names
>
> Note that the following gives rise to the same error:
>
> fit.s = survfit(cox.s, newdata=data)
> Error in model.frame.default(data = data, formula = ~bucket + today + : number of variables != number of variable names
>
> but if I use data implicitly, all is working fine:
>
> fit.s = survfit(cox.s)
>
> Any idea on how I could solve this?
> Best, and thank you,
> ge
[R] expand time period
I'd like to expand the following data to build a daily time series. It should cover from '2012-07-01' to '2012-10-06', with the values I actually have being the mean from one point measurement to another. Does anyone have a clue how to perform this task?

structure(list(
Date.beg = structure(c(15635, 15617, 15615, 15610, 15609, 15605, 15604, 15601, 15593, 15593, 15586, 15581, 15580, 15577, 15572, 15565, 15552, 15540, 15530, 15516), class = "Date"),
Date.end = structure(c(15619, 15619, 15616, 15615, 15610, 15607, 15604, 15602, 15595, 15594, 15587, 15582, 15581, 15579, 15572, 15567, 15554, 15541, 15533, 15517), class = "Date"),
Pollster = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 4L, 1L, 2L, 1L, 1L, 2L, 1L, 4L, 1L, 2L, 2L, 1L, 3L, 1L), .Label = c("Datafolha", "Ibope", "Veritá", "Vox Populi"), class = "factor"),
Serra.PSDB = c(24, 22, 23, 19, 22, 17, 17, 21, 19, 20, 21, 20, 22, 22, 27, 26, 26, 30, 31.4, 31),
Russomanno.PRB = c(23, 22, 25, 27, 30, 34, 34, 35, 35, 32, 35, 31, 31, 31, 31, 26, 25, 26, 17.7, 24),
Haddad.PT = c(20, 22, 19, 18, 18, 18, 17, 15, 15, 17, 16, 16, 14, 14, 8, 9, 6, 7, 9.5, 6),
Chalita.PMDB = c(11, 11, 11, 10, 9, 7, 5, 8, 6, 8, 7, 5, 7, 5, 6, 5, 5, 6, 4.3, 6),
Others = c(8, 7, 7, 4, 7, 6, 4, 5, 6, 7, 9, 5, 8, 6, 10, 12, 15, 15, 8.6, 16),
Null = c(8, 8, 8, NA, 8, 10, 10, NA, 13, 9, 8, 12, 10, NA, 10, 12, 14, 11, 13.1, 11),
Undecided = c(6, 8, 6, NA, 6, 8, 13, NA, 6, 7, 4, 9, 7, 13, 6, 10, 9, 6, 13.3, 5),
Round = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L), .Label = c("year before", "off campaign", "first", "second"), class = "factor"),
Stage = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("0", "1", "2"), class = "factor"),
Serra.Haddad = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("", "1", "0"), class = "factor"),
N = c(3959, 1204, 2099, 1204, 1799, 1204, 2000, 1802, 1001, 1221, 1078, 1001, 1069, 1200, 1077, 805, 805, 1075, 1331, 1081),
Err = c(2, 3, 2, 3, 2, 3, 2.2, 2, 3, 3, 3, 3, 3, 2.8, 3, 3, 3, 3, 2.7, 3)),
.Names = c("Date.beg", "Date.end", "Pollster", "Serra.PSDB", "Russomanno.PRB", "Haddad.PT", "Chalita.PMDB", "Others", "Null", "Undecided", "Round", "Stage", "Serra.Haddad", "N", "Err"),
row.names = c("6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25"),
class = "data.frame")
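The request is open to interpretation. One reading is that each day between two polls should receive a value interpolated linearly between the neighbouring measurements. Under that assumption, a daily series can be sketched with seq() and approx(). The tiny `polls` data frame below is made up for illustration and stands in for one candidate column of the real data.

```r
# Hypothetical mini example: three poll dates and one candidate's share
polls <- data.frame(
  date  = as.Date(c("2012-07-01", "2012-07-05", "2012-07-08")),
  share = c(30, 26, 27)
)

# One row per calendar day across the covered period
days <- seq(min(polls$date), max(polls$date), by = "day")

# Linear interpolation between consecutive measurements
daily <- data.frame(
  date  = days,
  share = approx(x = as.numeric(polls$date), y = polls$share,
                 xout = as.numeric(days))$y
)
```

With the real data one would first pick a single date per poll (for example the midpoint of Date.beg and Date.end) and decide how to combine polls that fall on the same day, since approx() handles tied x values by averaging them by default.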
[R] How to subset my data and at the same time keep the balance?
Hi guys,

I have a dataset with 1000 rows. For my analysis, I need to use 70% of the data to fit my model and then use the remaining 30% to test it. Could anybody kindly help me with this?

Cheers
Re: [R] How to subset my data and at the same time keep the balance?
Hello,

See the following example.

x <- matrix(rnorm(2000), ncol = 2)
idx <- sample(nrow(x), 0.7*nrow(x))
x2 <- x[idx, ]
nrow(x2) # 700
x3 <- x[-idx, ]
nrow(x3) # 300

Hope this helps,
Rui Barradas

On 19-11-2012 17:16, Eddie Smith wrote:

> Hi guys, I have 1000 rows of a dataset. In my analysis, I need 70% of the data, run my analysis and then use the remaining 30% to test my model. Could anybody kindly help me on this? Cheers
Re: [R] How to subset my data and at the same time keep the balance?
I'm not sure what you mean by balance, but you can use sample() to randomly order the values 1:1000, then use the first 700 as row indices for the first set, and the last 300 as the test set.

Sarah

On Mon, Nov 19, 2012 at 12:16 PM, Eddie Smith eddie...@gmail.com wrote:

> Hi guys, I have 1000 rows of a dataset. In my analysis, I need 70% of the data, run my analysis and then use the remaining 30% to test my model. Could anybody kindly help me on this? Cheers

--
Sarah Goslee
http://www.functionaldiversity.org
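If "balance" in the subject line means keeping the same class proportions in both parts (an assumption, since the question does not say), the random split can be done within each class. The sketch below uses a made-up data frame `dat` with a hypothetical `group` column.

```r
set.seed(42)

# Hypothetical data: 1000 rows, two classes in an 80/20 ratio
dat <- data.frame(x     = rnorm(1000),
                  group = rep(c("a", "b"), times = c(800, 200)))

# Sample 70% of the row indices within each class
# (each class needs more than one row, otherwise sample() treats
#  a single index i as the range 1:i)
by_group  <- split(seq_len(nrow(dat)), dat$group)
train_idx <- unlist(lapply(by_group, function(i) sample(i, round(0.7 * length(i)))),
                    use.names = FALSE)

train <- dat[train_idx, ]    # 700 rows, same 80/20 class ratio
test  <- dat[-train_idx, ]   # remaining 300 rows
```

A plain sample() over all rows, as suggested in the replies, gives the same proportions on average; the stratified version makes them exact, which matters mostly when a class is small.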
[R] how to get bootstrap estimates
Hello all,

could you explain to me how to get bootstrap estimates if I have the following data (scor):

     mec  vec  alg  ana  sta
 1    77   82   67   67   81
 2    63   78   80   70   81
 3    75   73   71   66   81
 4    55   72   63   70   68
 5    63   63   65   70   63
 6    53   61   72   64   73
 7    51   67   65   65   68
 8    59   70   68   62   56
 9    62   60   58   62   70
10    64   72   60   62   45
11    52   64   60   63   54
12    55   67   59   62   44
13    50   50   64   55   63
14    65   63   58   56   37
15    31   55   60   57   73
16    60   64   56   54   40
17    44   69   53   53   53
18    42   69   61   55   45
19    62   46   61   57   45
20    31   49   62   63   62
21    44   61   52   62   46
22    49   41   61   49   64
23    12   58   61   63   67
24    49   53   49   62   47
25    54   49   56   47   53
26    54   53   46   59   44
27    44   56   55   61   36
28    18   44   50   57   81
29    46   52   65   50   35
30    32   45   49   57   64
31    30   69   50   52   45
32    46   49   53   59   37
33    40   27   54   61   61
34    31   42   48   54   68
35    36   59   51   45   51
36    56   40   56   54   35
37    46   56   57   49   32
38    45   42   55   56   40
39    42   60   54   49   33
40    40   63   53   54   25
41    23   55   59   53   44
42    48   48   49   51   37
43    41   63   49   46   34
44    46   52   53   41   40
45    46   61   46   38   41
46    40   57   51   52   31
47    49   49   45   48   39
48    22   58   53   56   41
49    35   60   47   54   33
50    48   56   49   42   32
51    31   57   50   54   34
52    17   53   57   43   51
53    49   57   47   39   26
54    59   50   47   15   46
55    37   56   49   28   45
56    40   43   48   21   61
57    35   35   41   51   50
58    38   44   54   47   24
59    43   43   38   34   49
60    39   46   46   32   43
61    62   44   36   22   42
62    48   38   41   44   33
63    34   42   50   47   29
64    18   51   40   56   30
65    35   36   46   48   29
66    59   53   37   22   19
67    41   41   43   30   33
68    31   52   37   27   40
69    17   51   52   35   31
70    34   30   50   47   36
71    46   40   47   29   17
72    10   46   36   47   39
73    46   37   45   15   30
74    30   34   43   46   18
75    13   51   50   25   31
76    49   50   38   23    9
77    18   32   31   45   40
78     8   42   48   26   40
79    23   38   36   48   15
80    30   24   43   33   25
81     3    9   51   47   40
82     7   51   43   17   22
83    15   40   43   23   18
84    15   38   39   28   17
85     5   30   44   36   18
86    12   30   32   35   21
87     5   26   15   20   20
88     0   40   21    9   14

and I have to estimate the errors of the following correlations:

ro 12 = ro (mec,vec)
ro 34 = ro (alg,ana)
ro 35 = ro (alg,sta)
ro 45 = ro (ana,sta)
ro 14 = ro (mec,ana)

Thank you,
Tania
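A nonparametric bootstrap for a correlation resamples whole rows (students) with replacement, recomputes the correlation on each resample, and uses the spread of those values as a standard error. The sketch below does this in base R for ro(mec, vec), using only the first 10 rows of the table above to keep it short; `scor10` and `boot_cor_se` are hypothetical names, and the reading of the question as "bootstrap standard errors of these correlations" is an assumption.

```r
set.seed(123)

# First 10 rows of the mec and vec columns from the table above
scor10 <- data.frame(
  mec = c(77, 63, 75, 55, 63, 53, 51, 59, 62, 64),
  vec = c(82, 78, 73, 72, 63, 61, 67, 70, 60, 72)
)

# Bootstrap standard error of cor(x, y): resample row indices with replacement
boot_cor_se <- function(x, y, B = 1000) {
  n <- length(x)
  r_star <- replicate(B, {
    i <- sample.int(n, n, replace = TRUE)   # same rows for x and y
    cor(x[i], y[i])
  })
  list(estimate   = cor(x, y),
       se         = sd(r_star, na.rm = TRUE),  # NA only if a resample is constant
       replicates = r_star)
}

res <- boot_cor_se(scor10$mec, scor10$vec)
res$estimate
res$se
```

The other pairs (alg/ana, alg/sta, ana/sta, mec/ana) follow by passing the corresponding columns of the full 88-row data. The boot package provides boot() for the same resampling idea with more options.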
Re: [R] How to subset my data and at the same time keep the balance?
Hi,

Maybe this helps:

dat1 <- read.table(text="
V1 V2
1 5 10
2 6 3
3 8 4
4 9 20
5 15 30
6 25 40
7 2 4
8 3 1
9 1 5
10 8 10
", header=TRUE)
dat2 <- dat1[sample(NROW(dat1), NROW(dat1)*(1-0.3)),] # 70% of data
dat2$newcol <- TRUE
dat1$newcol1 <- TRUE
dat4 <- merge(dat1, dat2, by=c("V1","V2"), all=TRUE)
dat5 <- dat4[is.na(dat4$newcol),][,1:2] # remaining 30%
dat5
#  V1 V2
#2  2  4
#4  5 10
#8  9 20

A.K.

----- Original Message -----
From: Eddie Smith eddie...@gmail.com
To: r-help@r-project.org
Sent: Monday, November 19, 2012 12:16 PM
Subject: [R] How to subset my data and at the same time keep the balance?

> Hi guys, I have 1000 rows of a dataset. In my analysis, I need 70% of the data, run my analysis and then use the remaining 30% to test my model. Could anybody kindly help me on this? Cheers
[R] generated list element names
How can I create lists with element names created on the fly?

--8<---------------cut here---------------start------------->8---
> list (foo = 10)
$foo
[1] 10
> list ("foo" = 10)
$foo
[1] 10
> list (paste("f","oo",sep="") = 10)
Error: unexpected '=' in "list (paste("f","oo",sep="") ="
--8<---------------cut here---------------end--------------->8---

I understand that tags in list() are not evaluated, but is there a more elegant way than

--8<---------------cut here---------------start------------->8---
> z <- list(10)
> names(z) <- paste("f","oo",sep="")
> z
$foo
[1] 10
--8<---------------cut here---------------end--------------->8---

thanks!

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://www.memritv.org http://thereligionofpeace.com http://truepeace.org
Unix roulette: `dd if=/dev/urandom of=/dev/kmem bs=1 count=1 seek=$RANDOM`
Re: [R] generated list element names
How about this (if you don't like writing two lines, encapsulate it in a function):

> x <- list(10)
> names(x) <- paste('f', 'oo', sep = '')
> str(x)
List of 1
 $ foo: num 10

On Mon, Nov 19, 2012 at 1:07 PM, Sam Steingold s...@gnu.org wrote:

> How can I create lists with element names created on the fly?

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
Re: [R] additive interaction for a dichotomous dependent variable (i.e. risk difference)
On Nov 19, 2012, at 1:57 AM, wouterjohannes wrote:

> No, I never got any response. I used an article by Knol to solve the issue. I hope this is useful for you too.
>
> Best regards,
> Wouter
>
> Knol, M.J., van, d.T., Grobbee, D.E., Numans, M.E., Geerlings, M.I., 2007. Estimating interaction on an additive scale between continuous determinants in a logistic regression model. International Journal of Epidemiology 36, -1118.

I'm a bit surprised that you did not use the binomial family with a probit link. The help page for ?family mentions the probit link as appropriate for the normal CDF. It would seem possible to construct competing models, both with binomial errors, and compare the deviance.

If you wanted to do this via embedding each model in a higher-order model where the degree of multiplicativity was represented by a parameter to be estimated, I remember that Breslow offered GLIM code to perform that test in one of his two volumes. My copies are temporarily in storage or I would have given a more specific citation.

N. E. Breslow and N. E. Day (1980) Statistical Methods in Cancer Research. Volume I - The Analysis of Case-Control Studies
N. E. Breslow and N. E. Day (1987) Statistical Methods in Cancer Research. Volume II - The Analysis of Cohort Studies

Wait ... I can do better. The entire chapter that I was remembering appears to now be accessible online from the publisher's website:

http://www.iarc.fr/en/publications/pdfs-online/stat/sp82/SP82_vol2-4.pdf

> From: sarahw [via R] [mailto:ml-node+s789695n464968...@n4.nabble.com]
> Sent: Friday, 16 November 2012 2:39
> To: Peyrot, Wouter
> Subject: Re: additive interaction for a dichotomous dependent variable (i.e. risk difference)
>
> Did you ever get a response to this or resolve this yourself? Many thanks!
>
> If you reply to this email, your message will be added to the discussion below: http://r.789695.n4.nabble.com/additive-interaction-for-a-dichotomous-dependent-variable-i-e-risk-difference-tp4635842p4649687.html

David Winsemius, MD
Alameda, CA, USA
Re: [R] generated list element names
* jim holtman wubyg...@tznvy.pbz [2012-11-19 13:14:05 -0500]:

> How about this (if you don't like writing two lines, encapsulate it in a function):
>
> x <- list(10)
> names(x) <- paste('f', 'oo', sep = '')
> str(x)
> List of 1
>  $ foo: num 10

I am sorry, how is this different from my second snippet (except that you use x and I use z, and you use single quotes in paste and I use double quotes)?

> On Mon, Nov 19, 2012 at 1:07 PM, Sam Steingold s...@gnu.org wrote:
>> How can I create lists with element names created on the fly?
>> --8<---------------cut here---------------start------------->8---
>> list ("foo" = 10)
>> $foo
>> [1] 10
>>
>> list (foo = 10)
>> $foo
>> [1] 10
>>
>> list (paste("f","oo",sep="") = 10)
>> Error: unexpected '=' in "list (paste("f","oo",sep="") ="
>> --8<---------------cut here---------------end--------------->8---
>> I understand that tags in list() are not evaluated, but is there a more elegant way than
>> --8<---------------cut here---------------start------------->8---
>> z <- list(10)
>> names(z) <- paste("f","oo",sep="")
>> z
>> $foo
>> [1] 10
>> --8<---------------cut here---------------end--------------->8---
>> thanks!

-- Sam Steingold (http://sds.podval.org/) on Ubuntu 12.04 (precise) X 11.0.11103000
http://www.childpsy.net/ http://memri.org http://truepeace.org http://ffii.org http://think-israel.org http://jihadwatch.org http://palestinefacts.org
The only time you have too much fuel is when you're on fire.
Re: [R] generated list element names
If you have a list and want to add a new (or replace a) named component use

myList[[compName]] <- compValue

as in

myList <- list()
compName <- "Incr"
compValue <- function(x) x + 1
myList[[compName]] <- compValue

If you want to make a new list-with-names from scratch try

structure(list(1, "cat", function(x)x+1), names=c("One","Pet","Increment"))

(structure() is a general way to make an object and add attributes to it in one statement.)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
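For the original one-element case, the approaches discussed in this thread can be put side by side. This is only a sketch; `setNames()` is a base R (stats) helper that the thread itself does not mention, and the object names `a`, `b` and `d` are arbitrary.

```r
# Build the element name at run time, as in the original question.
nm <- paste("f", "oo", sep = "")

# 1. setNames(): create the list and attach the name in one expression.
a <- setNames(list(10), nm)

# 2. structure(): the same idea, setting the names attribute directly.
b <- structure(list(10), names = nm)

# 3. [[<- on an existing list, useful when adding components one at a time.
d <- list()
d[[nm]] <- 10

str(a)
```

All three produce the same one-element list named `foo`, so the choice is mostly a matter of whether the list already exists.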
Re: [R] survfit number of variables != number of variable names
Hi Terry,

I attached a small data set to this email. This is what I get (I restricted the formula to avoid NA's):

surv = with(small, Surv(time=absence, event=(censored==FALSE)))
(cox.s = coxph(surv ~ bucket*(today) + strata(activity), data = small))

Call:
coxph(formula = surv ~ bucket * (today) + strata(activity), data = small)

                       coef exp(coef) se(coef)      z    p
bucket575            0.4526     1.572    0.740  0.612 0.54
todayTRUE           -0.0886     0.915    0.676 -0.131 0.90
bucket575:todayTRUE -0.1670     0.846    0.794 -0.210 0.83

Likelihood ratio test=2.32 on 3 df, p=0.509 n= 100, number of events= 100

fit = survfit(cox.s, newdata=small[1:50,])
Error in model.frame.default(data = small[1:50, ], formula = ~bucket + :
  number of variables != number of variable names

also:

R.version
               _
platform       x86_64-redhat-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status
major          2
minor          15.1
year           2012
month          06
day            22
svn rev        59600
language       R
version.string R version 2.15.1 (2012-06-22)
nickname       Roasted Marshmallows

package 'survival' is version 2.36-14

and finally, variable absence is numeric, bucket and activity are factors, and all other variables are logical. I tested the same formula without 'strata' and I had no problem.

Best and thank you,
ge

On 11/19/2012 09:01 AM, Terry Therneau wrote:
> I can't reproduce the problem. Tell us what version of R and what version of the survival package. Create a reproducible example. I don't know if some variables are numeric and some are factors, how/where the surv object was defined, etc.
>
> Terry Therneau
>
> On 11/17/2012 05:00 AM, r-help-requ...@r-project.org wrote:
>> This works ok:
>>
>> cox = coxph(surv ~ bucket*(today + accor + both) + activity, data = data)
>> fit = survfit(cox, newdata=data[1:100,])
>>
>> but using strata leads to problems:
>>
>> cox.s = coxph(surv ~ bucket*(today + accor + both) + strata(activity), data = data)
>> fit.s = survfit(cox.s, newdata=data[1:100,])
>> Error in model.frame.default(data = data[1:100, ], formula = ~bucket + :
>>   number of variables != number of variable names
>>
>> Note that the following give rise to the same error:
>>
>> fit.s = survfit(cox.s, newdata=data)
>> Error in model.frame.default(data = data, formula = ~bucket + today + :
>>   number of variables != number of variable names
>>
>> but if I use data implicitly, all is working fine:
>>
>> fit.s = survfit(cox.s)
>>
>> Any idea on how I could solve this?
>> Best, and thank you, ge
Re: [R] How to subset my data and at the same time keep the balance?
Thanks a lot! I got some ideas from all the replies and here is the final one.

newdata
select <- sample(nrow(newdata), nrow(newdata) * .7)
data70 <- newdata[select,] # select
write.csv(data70, "data70.csv", row.names=FALSE)
data30 <- newdata[-select,] # testing
write.csv(data30, "data30.csv", row.names=FALSE)

Cheers
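The code above draws a simple random 70% sample, which keeps group proportions only approximately. If "balance" means that some grouping column should have the same share in both parts, one possible sketch is to sample within each group. The data frame and the column name `group` below are invented for illustration; the thread does not say which variable should stay balanced.

```r
# Hypothetical data: 80 rows in group "a", 20 rows in group "b".
set.seed(1)
newdata <- data.frame(id = 1:100,
                      group = rep(c("a", "b"), times = c(80, 20)))

# Row numbers of newdata, split by group.
rows_by_group <- split(seq_len(nrow(newdata)), newdata$group)

# Take 70% of the rows inside every group. Indexing with sample.int()
# avoids the sample(x, n) surprise when a group has a single row.
select <- unlist(lapply(rows_by_group, function(rows) {
  rows[sample.int(length(rows), round(0.7 * length(rows)))]
}), use.names = FALSE)

data70 <- newdata[select, ]   # 70% part
data30 <- newdata[-select, ]  # remaining 30%

print(table(data70$group))
print(table(data30$group))
```

Both parts then contain the groups in the same 4:1 ratio as the full data, up to rounding within each group.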
Re: [R] lattice: defining grouping variable only for the upper/lower panel with splom
On Mon, Nov 19, 2012 at 5:42 AM, AnjaM a.miren...@gmail.com wrote:

> Using the mtcars dataset, how to define the grouping variable to be valid only for the upper or lower panel? The following doesn't work:
>
> # Code start

Almost:

splom(~data.frame(mpg, disp, hp, drat, wt, qsec), data=mtcars, pscales=0,
      auto.key=list(columns=3),
      upper.panel = function(...,groups){
          panel.grid(...)
          panel.xyplot(groups=mtcars$cyl,...)
      }
)

> # Code end
[R] Help vectorizing data generation for IRT graded response model
Hello,

I have code that will generate data for a 5 category IRT graded response model. However, the code could be improved through vectorizing. Here is the code below:

###Inputs
N <- 100 #Number of people taking test
n <- 10 #Number of items
nCat <- 5 #number of categories

###Generate Item parameters for 10 items
a.1 <- rlnorm(n, .25, .5)
b.1 <- matrix(0, n, (nCat - 1))
theta <- rnorm(N, mean = 0, sd = 1)

###Generate threshold parameters for b.1 GRM with 10 items; there are 4 thresholds for 5 categories
###Using this method is for b dist as N(0,1) since b.1[,1] mean is -.6
###and b.1[,2] to b.1[,4] are all .2
b.1[, 1] <- rnorm(n, -.6, 1)
for(j in 1:n) {
  b.1[j, 2] <- b.1[j,1] + runif(1, .5, .9)
  b.1[j, 3] <- b.1[j,2] + runif(1, .5, .9)
  b.1[j, 4] <- b.1[j,3] + runif(1, .5, .9)
}

###This code simulates participants taking a test and generates the 5 category item responses
p <- array(0,c(N,n,nCat))
pstar <- array(1,c(N,n,nCat))
u <- matrix(0,N,n)
for (i in 1:N) {
  for (j in 1:n) {
    #Draw a random number to determine categories
    r <- runif(1, 0, 1)
    for (k in 2:nCat) {
      pstar[i, j, k] <- 1 / (1 + exp(-a.1[j] * (theta[i] - b.1[j, (k-1)])))
      p[i,j,(k-1)] <- pstar[i, j, (k-1)] - pstar[i, j, k]
    }
    p[i, j, nCat] <- pstar[i, j, 5] #probability of last category or higher is that category
    if (r <= p[i, j, 1]) {
      u[i,j] <- 1
    } else if (r <= p[i,j,1] + p[i,j,2]) {
      u[i,j] <- 2
    } else if (r <= p[i,j,1] + p[i,j,2] + p[i,j,3]) {
      u[i,j] <- 3
    } else if (r <= p[i,j,1] + p[i,j,2] + p[i,j,3] + p[i,j,4]) {
      u[i,j] <- 4
    } else if (r <= 1) {
      u[i,j] <- 5
    }
  }
}

Obviously, that is really long and hairy. I am wondering what would be the best way to write this more compactly. In conjunction with a colleague, I have a solution for a binary IRT model with response categories 0,1. Here is that code:

N <- 100
n <- 10
a.1 <- rlnorm(n, .25, .5)
b.1 <- rnorm(n, 0, 1)
theta <- rnorm(N, 0, 1)

###Function to generate 2PL data
dichEq <- function(a.1, b.1, theta) {
  1/(1 + exp(-a.1 * (theta - b.1)))
}
probVal <- mapply(FUN = function(x,y) dichEq(x,y,theta), a.1, b.1)
u <- apply(probVal, 2, function(x) rbinom(length(x), 1, x))

Can someone provide some guidance on how to generalize this to the ordered category case?

Thanks,
Jared
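One way the ordered-category case might be vectorized is sketched below. It is not taken from the thread: it assumes the same parameter distributions as the loop version, reuses the names `a.1`, `b.1`, `theta`, `pstar` and `u`, and relies on the fact that, for increasing thresholds, the response category equals 1 plus the number of cumulative probabilities that a single uniform draw falls below.

```r
set.seed(42)
N <- 100; n <- 10; nCat <- 5
a.1   <- rlnorm(n, .25, .5)
theta <- rnorm(N)

# Thresholds: first column ~ N(-.6, 1); each later column adds a U(.5, .9)
# step, so every row of b.1 is strictly increasing.
steps <- matrix(runif(n * (nCat - 2), .5, .9), n, nCat - 2)
b.1   <- t(apply(cbind(rnorm(n, -.6, 1), steps), 1, cumsum))

# pstar[i, j, k] = P(person i responds above category k on item j),
# for k = 1, ..., nCat - 1. outer() builds theta[i] - b.1[j, k] and
# sweep() multiplies column j by the discrimination a.1[j].
pstar <- vapply(seq_len(nCat - 1), function(k) {
  plogis(sweep(outer(theta, b.1[, k], "-"), 2, a.1, "*"))
}, matrix(0, N, n))

# One uniform draw per person-item pair, recycled across the threshold
# slices; the response is 1 plus the number of thresholds passed.
r <- runif(N * n)
u <- 1 + rowSums(r < pstar, dims = 2)

print(table(u))
```

Since `pstar` decreases in `k`, the event `r < pstar[i, j, k]` has probability equal to the chance of exceeding category `k`, which reproduces the category probabilities of the nested-loop version without any explicit loop over persons or items.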
Re: [R] generated list element names
I missed the last snippet; just saw the first. So you have your solution. If you want a function, try:

f.newList <- function(x,name){.x <- list(x);names(.x) <- name;.x}
f.newList(10, paste('f', 'oo', sep = ''))
$foo
[1] 10

On Mon, Nov 19, 2012 at 1:32 PM, Sam Steingold s...@gnu.org wrote:
> I am sorry, how is this different from my second snippet (except that you use x and I use z, and you use single quotes in paste and I use double quotes)?

-- Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
[R] Plot Area Dimensions
Dear colleagues,

I wish to create a figure with 6 plots arranged vertically with no spacing between them, as they all have a common x-axis. However, using the code below I'm unable to get the plot area the same size for each plot. The bottom plot with the x-axis label is smaller than the others, as is the top plot, which has larger margins. How can I get the plot region the same size for all 6 plots, whilst still having a large enough margin for the x-axis label on the bottom plot?

y<-rnorm(1:100)
x<-rnorm(1:100)
par(mfrow=c(6,1))
par(mar=c(0,5,2,5))
plot(y~x, xlab="", xaxt="n", ylab="y")
par(mar=c(0,5,0,5))
plot(y~x, xlab="", xaxt="n", ylab="y")
plot(y~x, xlab="", xaxt="n", ylab="y")
plot(y~x, xlab="", xaxt="n", ylab="y")
plot(y~x, xlab="", xaxt="n", ylab="y")
par(mar=c(4,5,0,5))
plot(y~x, xlab="x", ylab="y")

Regards
Richard
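A base-graphics sketch of one common way around this: give every panel the same `mar` with zero top and bottom margins, and reserve space for the shared x-axis and title in the outer margin (`oma`) instead. The margin sizes are illustrative assumptions, and the null PDF device is used only so the sketch runs without writing a file.

```r
set.seed(1)
x <- rnorm(100)
y <- rnorm(100)

pdf(file = NULL)                 # null device: nothing is written to disk
op <- par(mfrow = c(6, 1),
          mar = c(0, 5, 0, 5),   # identical inner margins for all panels
          oma = c(4, 0, 2, 0))   # outer margin holds the x-axis and title

heights <- numeric(6)
for (i in 1:6) {
  # Only the bottom panel draws its x-axis; the axis labels fall into
  # the outer margin because the inner bottom margin is zero.
  plot(x, y, xlab = "", ylab = "y", xaxt = if (i < 6) "n" else "s")
  heights[i] <- par("pin")[2]    # plot-region height in inches
}
mtext("x", side = 1, outer = TRUE, line = 2.5)

par(op)
dev.off()
print(heights)
```

Because the six panels share one `mar` setting, their plot regions come out the same height; `heights` records this so it can be checked rather than judged by eye.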
Re: [R] survfit number of variables != number of variable names
Hi - I've seen a similar issue going on with survfit when using strata in the model, although I get a different error message from ge. If it helps to track down the problem (rather than confusing things further), here is some code that should reproduce the issue I've seen. I'm running R 2.15.2, with survival version 2.36-14.

Cheers,
Carina

library(survival)

# use aml dataset from survival and create 3 imaginary possible strata -
# strat3 has 3 levels, strat4 has 4 levels, strat5 has 5 levels
aml$strat3<-as.factor(rep(c(1:3),length=nrow(aml)))
aml$strat4<-as.factor(rep(c(1:4),length=nrow(aml)))
aml$strat5<-as.factor(rep(c(1:5),length=nrow(aml)))

# create a counting process format dataset from aml - call this aml2
aml2<-survSplit(aml,cut=c(5,10,50),end="time",start="start",event="status",episode="i",id="indiv")

# create a dataset of 4 'new' observations - call this aml.new
aml.new<-aml[1:4,]
aml.new$time<-c(30,50,70,100)
aml.new$status<-1
aml.new$x[1:4]<-c(rep("Maintained",2),rep("Nonmaintained",2))
aml.new$strat3[1:4]<-1
aml.new$strat4[1:4]<-1
aml.new$strat5[1:4]<-1

# create a counting process format dataset from aml.new - call this aml2.new
aml2.new<-survSplit(aml.new,cut=c(5,10,50),end="time",start="start",event="status",episode="i",id="indiv")

# First a model using no strata - survfit works fine on new dataset
myModel<-coxph(Surv(start, time, status) ~ x,data = aml2)
plot(survfit(myModel,aml2.new,id=indiv))

# Now a model using strata = strat4 (which has 4 levels) - survfit again works fine
# on new dataset (which has 4 new individuals)
myModel<-coxph(Surv(start, time, status) ~ x+strata(strat4),data = aml2)
plot(survfit(myModel,aml2.new,id=indiv))

# Now a model using strata = strat3 (which has 3 levels) - survfit works here too
myModel<-coxph(Surv(start, time, status) ~ x+strata(strat3),data = aml2)
plot(survfit(myModel,aml2.new,id=indiv))

# Now a model using strata = strat5 (which has 5 levels) - survfit now does not work, with error saying
# Error in survfitcoxph.fit(y, x, wt, x2, risk, newrisk, strata, se.fit, :
#   'names' attribute [5] must be the same length as the vector [4]
myModel<-coxph(Surv(start, time, status) ~ x+strata(strat5),data = aml2)
plot(survfit(myModel,aml2.new,id=indiv))

# Now recreate aml.new but with 3 rather than 4 'new' observations
rm(aml.new)
aml.new<-aml[1:3,]
aml.new$time<-c(30,50,70)
aml.new$status<-1
aml.new$x[1:3]<-c(rep("Maintained",2),rep("Nonmaintained",1))
aml.new$strat3[1:3]<-1
aml.new$strat4[1:3]<-1
aml.new$strat5[1:3]<-1

# create a counting process format dataset from aml.new
aml2.new<-survSplit(aml.new,cut=c(5,10,50),end="time",start="start",event="status",episode="i",id="indiv")

# Survfit on model using strat3 still works
myModel<-coxph(Surv(start, time, status) ~ x+strata(strat3),data = aml2)
plot(survfit(myModel,aml2.new,id=indiv))

# But Survfit on model using strat4 doesn't work now
myModel<-coxph(Surv(start, time, status) ~ x+strata(strat4),data = aml2)
plot(survfit(myModel,aml2.new,id=indiv))

# Survfit on strat5 model doesn't work either
myModel<-coxph(Surv(start, time, status) ~ x+strata(strat5),data = aml2)
plot(survfit(myModel,aml2.new,id=indiv))

On 19 November 2012 17:01, Terry Therneau thern...@mayo.edu wrote:
> I can't reproduce the problem. Tell us what version of R and what version of the survival package. Create a reproducible example. I don't know if some variables are numeric and some are factors, how/where the surv object was defined, etc.
Re: [R] Biologist R learner
Please take the advice of Berend if you ever want to get help here. Also, you will need to do some basic and initial work yourself: read what he suggested, use Google to search keywords, and see great help like this:
http://cran.r-project.org/doc/contrib/Short-refcard.pdf

Also, use the built-in help pages by typing a question mark, like this:

?min
?max
?ifelse
?cumsum

Good luck; don't let a steep learning curve stop you from learning R.

Anna23 wrote
> I am a biologist and a beginner, please help me to solve this. Please, anyone; it's my homework and I don't have a clue about R yet.
>
> Write a function which does the following tasks:
> (a) Calculates minimum and maximum value of a given argument x.
> (b) If x is positive, some new vector gets the value of TRUE, and FALSE otherwise.
> (c) Creates a vector where the i:th and (i-1):th values of x are always summed. First value of the new vector has the same value as the first component of x.
> Use the created function on some vector x to show that the function works.
Re: [R] survfit number of variables != number of variable names
Hi!

In answer to:

> I noticed that you were using what might be called an externally created Surv object. I have a memory that Terry Therneau has criticized that practice. I cannot remember if it was in exactly this situation, but I might ask if setting up the model as:
>
> cox = coxph(Surv(stime, event) ~ bucket*(today + accor + both) + activity, data = data)
>
> ... might give the survival machinery a better handle on where everything might be found.

I tried to create the Surv object internally but I face the same issue:

(cox.s = coxph(Surv(time=absence, event=(censored==FALSE)) ~ bucket*(today) + strata(activity), data = small))

Call:
coxph(formula = Surv(time = absence, event = (censored == FALSE)) ~ bucket * (today) + strata(activity), data = small)

                       coef exp(coef) se(coef)      z    p
bucket575            0.4526     1.572    0.740  0.612 0.54
todayTRUE           -0.0886     0.915    0.676 -0.131 0.90
bucket575:todayTRUE -0.1670     0.846    0.794 -0.210 0.83

Likelihood ratio test=2.32 on 3 df, p=0.509 n= 100, number of events= 100

fit = survfit(cox.s, newdata=small[1:50,])
Error in model.frame.default(data = small[1:50, ], formula = ~bucket + :
  number of variables != number of variable names

Best, and thank you for the suggestion.
ge
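For readers who want a self-contained version of the pattern being discussed (Surv() built inside the formula, a strata() term, then survfit() with newdata), here is a sketch on the `lung` data shipped with the survival package. It is not ge's data and is not offered as a fix for the error reported with survival 2.36-14; it only shows the intended usage, and its behaviour may differ between package versions.

```r
library(survival)

# Surv() is created inside the formula, so coxph() and survfit() can both
# find every variable in the data argument. sex is used as a stratum.
fit <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)

# Predicted survival curves for the first five subjects of the same data.
sf <- survfit(fit, newdata = lung[1:5, ])

print(class(sf))
```

If this runs cleanly while the original call fails on the same installation, the difference between the two model formulas or data sets would be the next thing to compare.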
[R] xts plot behavior
Hi,

I have a problem with plot.xts. I am trying to subset some data in an xts time series. Subsetting works for more than one event, but I get nothing back if I try to select a single event.

I'm happy for every hint! Thanks!
Re: [R] generated list element names
On Nov 19, 2012, at 10:46 AM, William Dunlap wrote:

> If you have a list and want to add a new (or replace a) named component use
> myList[[compName]] <- compValue
> as in
> myList <- list()
> compName <- "Incr"
> compValue <- function(x) x + 1
> myList[[compName]] <- compValue
> If you want to make a new list-with-names from scratch try
> structure(list(1, "cat", function(x)x+1), names=c("One","Pet","Increment"))
> (structure() is a general way to make an object and add attributes to it in one statement.)

I'm guessing that Sam wanted to see:

myList <- list()
myList[[ paste0("fo", "o") ]] <- 10
myList
$foo
[1] 10

Or:

structure(list(10), names=paste0("fo", "o") )
$foo
[1] 10

(At least that's my guess from the context of the question.)

David Winsemius, MD Alameda, CA, USA
Re: [R] Classification methods - which one?
Dear Max,

First: thanks a lot for your suggestion and the frank words about these methods in real life. I guess that's my problem.

Regarding my analysis: yes, that's the problem. I am forced to do this analysis, since there is no time to start with other methods. So you suggest Linear Discriminant Analysis. Is there a particular package you recommend?

I tried Nearest Shrunken Centroids with the package PAMR (http://www-stat.stanford.edu/~tibs/PAM/Rdist/doc/readme.html). The example works fine, but I guess I have too many rows (in this case genes) for the analysis.

My main problem is that I cannot reduce the number of genes, because some of the bosses want to compare the output of classification methods with a rule-based algorithm which works with all genes (after P/A calls and an alternative CDF) on the array. So a reduction of the 17 000 genes is only possible in a limited way (around 7000 genes after some pre-processing steps).

I am more than happy for all tips and suggestions.

Best
Peter

On 19.11.2012 at 16:36, Max Kuhn mxk...@gmail.com wrote:

> My suggestion is not to do any predictive modeling. Basically, the data doesn't support a sensible and reproducible model. Yes, the literature is saturated with this type of analysis, but almost none of the examples have any utility in real life.
>
> Stick to differential expression analysis, investigate the results statistically and biologically, then design a prospective experiment with a specific set of genes and a more refined measurement system.
>
> If you are doing this analysis to learn something from the data (as opposed to generating accurate predictions), a predictive model is one of the worst ways of going about it.
>
> If you are coerced to do this analysis, stick to linear methods (regularized LDA, nearest shrunken centroids, etc.) that are less likely to over-fit, and bias yourself towards those that have embedded feature selection.
>
> Max
>
> On Mon, Nov 19, 2012 at 10:16 AM, Peter Kupfer peter.kup...@me.com wrote:
>> Dear all,
>> I searched for some classification methods and I have no clue if I took the right ones.
>> My problem: I have a matrix with 17000 rows and 33 columns (genes and patients). The patients are grouped into 3 diseases. Now I want to classify the patients, and of course I want to know which rows are more helpful for the classification than others.
>> I tried SVM and random forest. Do you think these are the right classification methods? Maybe there are some hints you can give me. I am more familiar with the Bioconductor packages.
>> Furthermore: this is/was not my field of study in the past, but I want to understand it and I am willing to deal with this field. It would be amazing if one of the (more) mathematical people could give me a hint.
>> Thanks and all the best
>> Peter
>> PS: I can upload my underlying data if somebody is interested
Re: [R] simple linear regression with proportion data
On 12-11-19 10:18 AM, S Ellison wrote:

> -----Original Message-----
> Can I use simple linear regression when I have proportion data for both dependent and independent variables? Or, should I use beta regression analysis? Or any suggestion?
>
> The distribution of the independent variable is irrelevant (in some circumstances it matters whether it is measured without error or not).
>
> Agreed that if you just want a line that goes somewhere near the data you can do pretty much anything. But don't those circumstances you referred to include 'any time you want an unbiased estimate of the slope or a reliable standard error on coefficients'?

Did you see that I wrote distribution of the **independent** variable above?

Ben
Re: [R] Plot Area Dimensions
I think this task would be easier in lattice library(lattice) xyplot(y + y + y + y + y + y ~ x, outer=TRUE, layout=c(1,6), strip=FALSE, strip.left=TRUE, ylab=6 copies of the Y variable, main=put an interesting title here) Six different y variables instead of six copies of the same would give a more interesting plot: tmp - data.frame(matrix(rnorm(700), 100, 7, dimnames=list(1:100, c(x,y1,y2,y3,y4,y5,y6 xyplot(y1 + y2 + y3 + y4 + y5 + y6 ~ x, data=tmp, outer=TRUE, layout=c(1,6), strip=FALSE, strip.left=TRUE, ylab=6 levels of the y response, main=put an interesting title here) Rich On Mon, Nov 19, 2012 at 10:51 AM, Richard James richard.j.coo...@uea.ac.ukwrote: Dear colleagues, I wish to create a figure with 6 plots arranged vertically with no spacing between them as they all have a common x-axis. However, using the code below I'm unable to get the plot area the same size for each plot. The bottom plot with the x-axis label is smaller than the others, as is the top plot which has larger margins. How can I get the plot region the same size for all 6 plots, whislt still having a large enough margin for the x-axis label on the bottom plot? y-rnorm(1:100) x-rnorm(1:100) par(mfrow=c(6,1)) par(mar=c(0,5,2,5)) plot(y~x, xlab=, xaxt=n, ylab=y) par(mar=c(0,5,0,5)) plot(y~x, xlab=, xaxt=n, ylab=y) plot(y~x, xlab=, xaxt=n, ylab=y) plot(y~x, xlab=, xaxt=n, ylab=y) plot(y~x, xlab=, xaxt=n, ylab=y) par(mar=c(4,5,0,5)) plot(y~x, xlab=x, ylab=y) Regards Richard -- View this message in context: http://r.789695.n4.nabble.com/Plot-Area-Dimensions-tp4650051.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] Error in Sweave but not underlying script
No, that did not resolve the issue, but thanks for the suggestion.

Daniel Bush | School Finance Consultant
School Financial Services | Wis. Dept. of Public Instruction
daniel.bush -at- dpi.wi.gov | 608-267-9212

-----Original Message-----
From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
Sent: Friday, November 16, 2012 1:51 PM
To: Bush, Daniel P. DPI
Cc: 'r-help@r-project.org'
Subject: Re: [R] Error in Sweave but not underlying script

On 16/11/2012 2:26 PM, Bush, Daniel P. DPI wrote:
> I'm trying to use Sweave to create a dynamic report of a variety of financial data checks. I have an .R code file to pull the data from a database, manipulate and filter it, and create individual data frames for each test. My Sweave .RNW document then calls that file with source() to generate the data for the report. The .R file works fine on its own, but when I run it from within the Sweave document I get the following error message:
>
>     Error in .subset(x, j) : only 0's may be mixed with negative subscripts
>
> Again, the .R code works perfectly well on its own--I only get the error when calling it through Sweave. Is there some quirk to Sweave that certain functions don't work properly?

No, it's a pretty standard evaluation environment. However, it may be running R without some functions that exist in your workspace when you source the script within an R session.

It's also possible (but doesn't seem likely) that RStudio is causing some problems; you could try running

    R CMD Sweave yourdoc.Rnw

from the command line, outside of RStudio, to see if that makes a difference.

Duncan Murdoch

> I am using the built-in Sweave function within RStudio 0.97.168.
>
> DB
>
> Daniel Bush
> School Finance Consultant
> School Financial Services
> Wisconsin Department of Public Instruction
> PO Box 7841 | Madison, WI 53707-7841
> daniel.bush -at- dpi.wi.gov | sfs.dpi.wi.gov
> Ph: 608-267-9212 | Fax: 608-266-2840
Re: [R] Error in Sweave but not underlying script
On 19/11/2012 3:27 PM, Bush, Daniel P. DPI wrote:
> No, that did not resolve the issue, but thanks for the suggestion.

Here's another possibility: your Sweave session may not be setting the same option defaults in startup code as your regular session. This one just bit me: I normally work with options(stringsAsFactors = FALSE), because I don't like to have strings automatically converted to factors. However, this option wasn't set when running Sweave externally.

Duncan Murdoch
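To illustrate the kind of session-dependent behaviour described above, here is a minimal sketch. The data frames and column names are hypothetical, and this is not a diagnosis of the specific error in the thread; it only shows how the same code can give different results depending on whether strings were converted to factors, and how passing stringsAsFactors explicitly removes the dependence on session options.

```r
# Hypothetical data: the same columns built with and without factor conversion.
raw_chr <- data.frame(id = c("a", "b", "c"),
                      amount = c("10", "20", "30"),
                      stringsAsFactors = FALSE)
raw_fac <- data.frame(id = c("a", "b", "c"),
                      amount = c("10", "20", "30"),
                      stringsAsFactors = TRUE)

# Character column: as.numeric() parses the text.
amount_chr <- as.numeric(raw_chr$amount)

# Factor column: as.numeric() returns the internal level codes instead.
amount_fac <- as.numeric(raw_fac$amount)

print(amount_chr)
print(amount_fac)
```

Setting the argument in the script itself (rather than relying on a startup file) means the script should behave the same when sourced interactively and when run through Sweave.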
Re: [R] aggregate() runs out of memory
Thanks Steve, what is the analogue of .N for min and max? i.e., what is the data.table's version of

    aggregate(infl$delay,by=list(infl$share.id),FUN=min)
    aggregate(infl$delay,by=list(infl$share.id),FUN=max)

thanks!
Sam.

On Fri, Sep 14, 2012 at 3:40 PM, Steve Lianoglou mailinglist.honey...@gmail.com wrote:
> Hi,
>
> On Fri, Sep 14, 2012 at 3:26 PM, Sam Steingold s...@gnu.org wrote:
>> I have a large data.frame Z (2,424,185,944 bytes, 10,256,441 rows, 17 columns). I want to get the result of
>>
>>     table(aggregate(Z$V1, FUN = length, by = list(id=Z$V2))$x)
>>
>> alas, aggregate has been running for ~30 minute, RSS is 14G, VIRT is 24.3G, and no end in sight. both V1 and V2 are characters (not factors). Is there anything I could do to speed this up? Thanks.
>
> You might find you'll get a lot of mileage out of data.table when working with such large data.frames ...
>
> To get something close to what you're after, you can try:
>
>     R> library(data.table)
>     R> Z <- as.data.table(Z)
>     R> setkeyv(Z, 'V2')
>     R> agg <- Z[, list(count=.N), by='V2']
>
> From here you might
>
>     R> tab1 <- table(agg$count)
>
> I think that'll get you where you want to be ... I'm ashamed to say that I haven't really done much w/ aggregate since I mostly have used plyr and data.table like stuff, so I might be missing your end goal -- providing a reproducible example with a small data.frame from you can help here (for me at least).
>
> HTH,
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact

--
Sam Steingold http://sds.podval.org http://www.childpsy.net/
Re: [R] aggregate() runs out of memory
Hi,

On Mon, Nov 19, 2012 at 1:25 PM, Sam Steingold s...@gnu.org wrote:
> Thanks Steve, what is the analogue of .N for min and max? i.e., what is the data.table's version of
>
>     aggregate(infl$delay,by=list(infl$share.id),FUN=min)
>     aggregate(infl$delay,by=list(infl$share.id),FUN=max)
>
> thanks!

It would be helpful if I could see a bit of your table (like `head(infl)`, if it's not too big), but anyway: there is no real analogue of min/max -- you just use them.

For instance, if you want the min and max of `delay` within each group defined by `share.id`, and let's assume `infl` is a data.frame, you can do something like so (note that the result of as.data.table() has to be assigned back to `infl`):

    R> infl <- as.data.table(infl)
    R> setkey(infl, share.id)
    R> result <- infl[, list(min=min(delay), max=max(delay)), by=share.id]

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
[R] Find the number of intersections according factor levels
Dear all, I have two vectors x and y, both of which are of length 1000. I created a factor as.factor(rep(1:100,each=10)). Now I want to count, for each factor level, the number of intersections between x and y. In other words, I would like to count the number of intersections between x[1:10] and y[1:10], the number of intersections between x[11:20] and y[11:20], the number of intersections between x[21:30] and y[21:30] and so on. I know I could use the match function or the %in% function in a loop, but would rather use some simpler way to do this, for example using tapply. Can anyone give a hint? Thanks a lot. Hanna
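One possible approach to the question above, as a minimal sketch in base R: split both vectors by the factor and compare the pieces pairwise with mapply(). The helper name count_common is hypothetical, and "number of intersections" is taken here to mean the number of distinct values shared by the two pieces; if repeated matches should be counted instead, sum(a %in% b) could replace length(intersect(a, b)).

```r
# Count, for each level of f, how many distinct values x and y share.
# count_common is an illustrative name, not an existing R function.
count_common <- function(x, y, f) {
  mapply(function(a, b) length(intersect(a, b)),
         split(x, f), split(y, f))
}

# Data shaped like the question: 1000 values, 100 groups of 10.
set.seed(1)
x <- sample(1:50, 1000, replace = TRUE)
y <- sample(1:50, 1000, replace = TRUE)
f <- as.factor(rep(1:100, each = 10))
n_common <- count_common(x, y, f)

# Small hand-checkable case: group 1 shares {2, 3}, group 2 shares {6}.
small <- count_common(c(1, 2, 3, 4, 5, 6),
                      c(3, 2, 9, 7, 8, 6),
                      factor(rep(1:2, each = 3)))
print(small)
```

split() keeps the groups in level order, so the result is a named vector with one count per factor level.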
[R] Is it possible to be sponsored by R?
Hi the list,

I am a member of the organizing committee of the French Statistics Association (SFdS)'s conference. We are looking for sponsors. Some software vendors (SAS, RITME, ...) are represented. Do you know if there is any possibility of being sponsored by R (or by an association close to R)? Do you think I can ask the R Foundation?

Sincerely
Christophe

--
Christophe Genolini
Maître de conférences en bio-statistique
Vice président Communication interne et animation du campus
Université Paris Ouest Nanterre La Défense
Re: [R] Plot Area Dimensions
You can also use layout() with base graphics. This example sets up a column of 14 strips and allocates 3 strips to the top and bottom graphs and 2 strips to the four middle graphs. Using Richard's tmp dataframe:

    layout(matrix(c(1, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 6), 14, 1))
    layout.show(6)
    par(mar=c(0,5,2,5))
    plot(y1~x, tmp, xlab="", xaxt="n", ylab="y")
    par(mar=c(0,5,0,5))
    plot(y2~x, tmp, xlab="", xaxt="n", ylab="y")
    plot(y3~x, tmp, xlab="", xaxt="n", ylab="y")
    plot(y4~x, tmp, xlab="", xaxt="n", ylab="y")
    plot(y5~x, tmp, xlab="", xaxt="n", ylab="y")
    par(mar=c(4,5,0,5))
    plot(y6~x, tmp, xlab="x", ylab="y")

-------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352
[R] scatterplot3d and box3d
I created a 3d scatter plot and am trying to change the color of outer box lines with box3d. Anybody can help me to figure out how to do this? My example is:

    library(scatterplot3d)
    x=seq(1:6)
    y=seq(7:12)
    z=x*2
    scatterplot3d(x, y,z)

Thanks.
Xin
[R] Coefficient of Variation, NA, Aggregate
Hello helpers,

I have a two part issue.

FIRSTLY, I am attempting to write a function for coefficient of variation, using

    co.var <- function(rowleyi) ( 100*sd(rowleyi)/mean(rowleyi) )
    #where rowleyi is my data set, which has multiple columns and rows of data.

This is not working because some of my columns have NAs. When I try to use

    co.var(rowleyi$TL, na.rm=TRUE)
    #where TL is one of my column names

it gives me an error message:

    Error in co.var(rowleyi$TL, na.rm = TRUE) : unused argument(s) (na.rm = TRUE)

I do not know what this means. How can I get this function to work?

SECONDLY, how can I then get that function to work within an aggregate? Do I still use

    aggregate(. ~ subspecies, data = rowleyi, CV, na.rm=TRUE)
    #where subspecies is the header for rows?

This has worked for mean, std.error, sd, etc.

Thank you!
Amanda Jones
[R] Help: Meta-analysis with metacor
Trying to do a meta-analysis of correlations in R using the meta package; have tried several things and keep getting a similar error. Can anyone help explain the error?

    cor <- c(-0.3018, 0.667, -3.8002, -0.607, -0.4885, -3.8002, -0.0701,
             0.1348, -0.9505, -0.5709, -0.6127, -1.2419, -0.1511, -0.1054)
    n <- c(3,4,3,3,3,3,16,36,30,9,3,3,30,4)
    library(meta)
    metacor(cor, n, data=NULL, subset=NULL, sm="ZCOR", level=0.95,
            level.comb=level, comb.fixed=TRUE, comb.random=TRUE, hakn=NULL,
            method.tau="DL", tau.preset=NULL, TE.tau=NULL, method.bias="linreg",
            title="title", complab="comparison", outclab="outcome")

    Error in data.frame(subset = NULL, comb.fixed = TRUE, comb.random = TRUE, :
      arguments imply differing number of rows: 0, 1

Thanks,
Catherine
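Independently of the error message, the posted inputs are worth checking before any pooling. The sketch below uses only base R and is not a diagnosis of the metacor() error; it applies the standard Fisher z transformation (z = atanh(r), with sampling variance 1/(n - 3)) that a "ZCOR"-type analysis is built on. The variable names r, valid, and usable are illustrative. Correlations must lie strictly between -1 and 1, and a study with n = 3 has an undefined (infinite) variance for z.

```r
# Inputs copied from the post above (renamed r to avoid masking stats::cor).
r <- c(-0.3018, 0.667, -3.8002, -0.607, -0.4885, -3.8002, -0.0701,
       0.1348, -0.9505, -0.5709, -0.6127, -1.2419, -0.1511, -0.1054)
n <- c(3, 4, 3, 3, 3, 3, 16, 36, 30, 9, 3, 3, 30, 4)

# A correlation coefficient must satisfy |r| < 1 for atanh(r) to be finite.
valid <- abs(r) < 1

# The variance of Fisher's z is 1 / (n - 3), so n must exceed 3.
usable <- valid & n > 3

# Fisher z and its standard error for the usable studies only.
z  <- atanh(r[usable])
se <- 1 / sqrt(n[usable] - 3)

print(which(!valid))   # positions of values outside (-1, 1)
print(sum(usable))     # number of studies that can be transformed
```

If the values outside (-1, 1) are some other statistic rather than correlations, they would need to be converted or excluded before a correlation meta-analysis makes sense.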
[R] Performing gage R&R study in R w/more than 2 factors
Hi everyone,

I'm fairly new to R, and I don't have a background in statistics, so please bear with me. ;-)

I'm dealing with 2^k factorial designs, and I was just wondering if there's any way to analyze more than two factors of a gage R&R study in R. For example, Minitab has an expanded gage R&R function that lets you include up to eight additional factors besides the usual two that are present in gage studies (parts and operators). If I wanted to include n additional random factors, is there a package or built-in functionality that will allow me to do that?

I've been experimenting with the SixSigma package, and that has a ss.rr method which works great, as long as your experiment only contains two factors. I've also been using lmer from lme4 to fit a linear model of my experiment, but the standard deviations generated by lmer don't match what I'm seeing in Minitab. Since all my factors are random, the formula I'm using looks like this:

    vals ~ 1 + (1|f1) + (1|f2) + (1|f3) + (1|f1:f2) + (1|f1:f3) + (1|f2:f3)

What am I doing wrong, and how can I fix it?

Thanks,
Matt
Re: [R] Import excel file
Hi Cyril,

please let me know the following details:
- sessionInfo() output from R
- Version of XLConnect you are using
- Version of rJava you are using
- Version of Java you are using (complete output of java -version on the command line)
- Value of the JAVA_HOME environment variable if it is set (e.g. output of Sys.getenv("JAVA_HOME") in R)

Best regards,
Martin

View this message in context: http://r.789695.n4.nabble.com/Import-excel-file-tp4649755p4650087.html
Re: [R] Is it possible to be sponsored by R?
Christophe Genolini cgenolin at u-paris10.fr writes:
> Hi the list, I am a member of the organizing comity of the French Statistics Association (SFdS)'s conference. We are looking for sponsors. Some software (SAS, RITME, ...) are represented. Do you know if there is any possibility to be sponsored by R (or by an association close to R)? Do you think I can ask to the R fondation? Sincerely Christophe

I rather doubt it, but it depends what you're looking for. If you want financial support, I suspect the answer is almost definitely not; if you simply want some kind of symbolic support, then it's very probably not.

The board of the R foundation is listed at http://www.r-project.org/foundation/board.html (with only a postal contact address ...). You might post to r-de...@r-project.org, which is read by some R-Core members (which overlaps with the membership of the R Foundation).

These are only my guesses, certainly not definitive.

Ben Bolker
Re: [R] Ben Bolker's 'emdbook' Package, rbetabinom
arun4 arun.ganesh2012 at gmail.com writes:
> I am using the rbetabinom function (to generate beta-binomial random variables) available in the emdbook package written by Professor Ben Bolker for my research study. I have no questions about this function. However, I am looking for the theoretical method/algorithm behind rbetabinom. Morris (1997), American Naturalist 150:299-327 is given as the reference in the package, but I couldn't find any theoretical methods to generate beta-binomial random variables in this article. I would kindly request that you suggest any journal paper or document to study the methods of generating beta-binomial random variables used to write the function rbetabinom.

I answered this off-list (suggesting that the OP look at the source code of the function, which is very simple).

The book gives other references to the beta-binomial, including Crowder (1978), Reeve and Murdoch (1985), and Hatfield (1996), although the latter two are ecological rather than statistical examples and I doubt any of them discusses random deviate generation.
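For readers who want the idea in code, a beta-binomial deviate is commonly generated as a compound draw: first a success probability from a beta distribution, then a binomial count with that probability. The sketch below uses only base R; the function name rbetabinom_sketch and the shape1/shape2 parameterization are assumptions made for illustration, and this is not the emdbook source (which should be consulted directly for its own parameterization).

```r
# Compound (two-stage) generation of beta-binomial deviates.
# rbetabinom_sketch is a hypothetical helper, not emdbook::rbetabinom.
rbetabinom_sketch <- function(n, size, shape1, shape2) {
  p <- rbeta(n, shape1, shape2)    # stage 1: per-draw success probability
  rbinom(n, size = size, prob = p) # stage 2: binomial count given p
}

set.seed(42)
draws <- rbetabinom_sketch(10000, size = 10, shape1 = 2, shape2 = 3)

# The theoretical mean is size * shape1 / (shape1 + shape2) = 4 here.
print(mean(draws))
```

Because each draw gets its own probability, the counts are more variable than plain binomial counts with the same mean, which is the overdispersion the beta-binomial is used to model.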
Re: [R] Coefficient of Variation, NA, Aggregate
Hi,

For the first part, maybe this helps:

    set.seed(5)
    mat1 <- matrix(sample(c(1:9,NA),20,replace=TRUE),ncol=5)
    rowleyi <- data.frame(mat1)
    co.var <- function(x) 100*(sd(x,na.rm=TRUE)/mean(x,na.rm=TRUE))
    apply(rowleyi,2,function(x) co.var(x))
    #      X1       X2       X3       X4       X5
    #53.29387 49.53113 45.82576 35.35534 34.99271
    #or
    sapply(rowleyi,function(x) co.var(x))
    #      X1       X2       X3       X4       X5
    #53.29387 49.53113 45.82576 35.35534 34.99271

A.K.
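As a complement to the reply above, here is a minimal sketch of why the original call failed and one way to keep the na.rm = TRUE calling style. The "unused argument" error arises because the original co.var() has a single formal argument, so na.rm has nowhere to go. The small rowleyi data frame below is a hypothetical stand-in for the poster's data, and using na.action = na.pass in aggregate() is a design choice assumed here so that rows with NA reach the function instead of being dropped beforehand.

```r
# Coefficient of variation in percent; na.rm is forwarded to sd() and mean().
co.var <- function(x, na.rm = FALSE) {
  100 * sd(x, na.rm = na.rm) / mean(x, na.rm = na.rm)
}

# Hypothetical stand-in for the poster's data set.
rowleyi <- data.frame(
  subspecies = c("A", "A", "A", "B", "B", "B"),
  TL = c(1, 2, 3, 2, NA, 4)
)

# Now accepted, because co.var() has an na.rm argument.
cv_all <- co.var(rowleyi$TL, na.rm = TRUE)

# na.action = na.pass keeps rows containing NA; na.rm = TRUE is then
# passed through '...' to co.var() for each group.
cv_by_group <- aggregate(. ~ subspecies, data = rowleyi, FUN = co.var,
                         na.rm = TRUE, na.action = na.pass)
print(cv_by_group)
```

With the default na.action (na.omit), the formula method of aggregate() drops every row that has an NA in any column before grouping, which can silently discard data from the other columns.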
Re: [R] aggregate() runs out of memory
On Nov 19, 2012, at 1:25 PM, Sam Steingold wrote:
> Thanks Steve, what is the analogue of .N for min and max?

?seq

> i.e., what is the data.table's version of
>
>     aggregate(infl$delay,by=list(infl$share.id),FUN=min)
>     aggregate(infl$delay,by=list(infl$share.id),FUN=max)

    DT[, list( max(v)), by=x]
       x V1
    1: a  3
    2: b  6
    3: c  9

> thanks!
> Sam.

--
David Winsemius, MD
Alameda, CA, USA
Re: [R] scatterplot3d and box3d
On 12-11-19 12:20 PM, x...@genome.wustl.edu wrote:
> I created a 3d scatter plot and am trying to change the color of outer box lines with box3d. Anybody can help me to figure out how to do this? My example is:
>
>     library(scatterplot3d)
>     x=seq(1:6)
>     y=seq(7:12)
>     z=x*2
>     scatterplot3d(x, y,z)

See ?scatterplot3d.

Duncan Murdoch
Re: [R] scatterplot3d and box3d
xgao at genome.wustl.edu writes:
> I created a 3d scatter plot and am trying to change the color of outer box lines with box3d. Anybody can help me to figure out how to do this? My example is:
>
>     library(scatterplot3d)
>     x=seq(1:6)
>     y=seq(7:12)
>     z=x*2
>     scatterplot3d(x, y,z)

This is not going to be possible in the narrow sense: box3d is from the rgl package, which uses a completely different graphics device/protocol. You can combine plot3d with box3d (or rgl.bbox).

By the way, seq(7:12) probably doesn't do what you think it does!
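The closing remark about seq(7:12) can be checked with a short base R sketch. When seq() is given a single vector of length greater than one, it behaves like seq_along() and returns the positions 1, 2, ..., rather than the values. The variable names below are illustrative.

```r
# seq() on a vector of length > 1 returns its index positions.
y_as_posted <- seq(7:12)   # positions of the six elements of 7:12

# Either of these gives the values 7 through 12.
y_colon <- 7:12
y_seq   <- seq(7, 12)

print(y_as_posted)
print(y_colon)
```

In the posted example seq(1:6) happens to equal 1:6, which is why only the y vector is affected.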
Re: [R] Plot Area Dimensions
The key is to not change the margins: set them once and stick with those margins. The next question then becomes "how do I leave area at the top/bottom for the title and common axis?", to which the answer is "set outer margins at the beginning".

Modifying your code:

    y <- rnorm(1:100)
    x <- rnorm(1:100)
    par(mfrow=c(6,1), mar=c(0,5,0,5), oma=c(4,0,2,0))
    plot(y~x, xlab="", xaxt="n", ylab="y")
    plot(y~x, xlab="", xaxt="n", ylab="y")
    plot(y~x, xlab="", xaxt="n", ylab="y")
    plot(y~x, xlab="", xaxt="n", ylab="y")
    plot(y~x, xlab="", xaxt="n", ylab="y")
    plot(y~x, xlab="x", ylab="y")

--
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com
Re: [R] Coefficient of Variation, NA, Aggregate
Hi,

Your example dataset is in unreadable format. You could use dput().

    set.seed(5)
    mat1 <- matrix(sample(c(1:9,NA),20,replace=TRUE),ncol=5)
    rowleyi <- data.frame(mat1)
    co.var <- function(x) 100*(sd(x,na.rm=TRUE)/mean(x,na.rm=TRUE))
    rowleyi <- data.frame(subspecies=rep(LETTERS[1:2],2),rowleyi)
    with(rowleyi,aggregate(cbind(X1,X2,X3,X4,X5),by=list(subspecies),function(x) co.var(x)))
    #  Group.1       X1        X2       X3       X4       X5
    #1       A       NA 70.710678       NA 20.20305 28.28427
    #2       B 56.56854  8.318903 60.60915 47.14045  0.0

With your aggregate():

    aggregate(.~subspecies,data=rowleyi,co.var)
    #  subspecies       X1       X2       X3       X4 X5
    #1          B 56.56854 8.318903 60.60915 47.14045  0

A.K.
Re: [R] Coefficient of Variation, NA, Aggregate
Fantastic, thank you! On Mon, Nov 19, 2012 at 3:44 PM, arun smartpink...@yahoo.com wrote: HI, Your example dataset is in unreadable format. You could use dput(). set.seed(5) mat1-matrix(sample(c(1:9,NA),20,replace=TRUE),ncol=5) rowleyi-data.frame(mat1) co.var-function(x) 100*(sd(x,na.rm=TRUE)/mean(x,na.rm=TRUE)) rowleyi-data.frame(subspecies=rep(LETTERS[1:2],2),rowleyi) with(rowleyi,aggregate(cbind(X1,X2,X3,X4,X5),by=list(subspecies),function(x) co.var(x))) Group.1 X1X2 X3 X4 X5 1 A NA 70.710678 NA 20.20305 28.28427 2 B 56.56854 8.318903 60.60915 47.14045 0.0 With your aggregate() aggregate(.~subspecies,data=rowleyi,co.var) # subspecies X1 X2 X3 X4 X5 #1 B 56.56854 8.318903 60.60915 47.14045 0 A.K. - Original Message - From: Amanda Jones akjone...@gmail.com To: r-help@r-project.org Cc: Sent: Monday, November 19, 2012 4:01 PM Subject: [R] Coefficient of Variation, NA, Aggregate Hello helpers, I have a two part issue. FIRSTLY, I am attempting to write a function for coefficient of variation, using co.var - function(rowleyi) ( 100*sd(rowleyi)/mean(rowleyi) ) #where rowleyi is my data set, which has multiple columns and rows of data. This is not working because some of my columns have NAs. When I try to use co.var(rowleyi$TL, na.rm=TRUE) #where TL is one of my column names, it gives me an error message: Error in co.var(rowleyi$TL, na.rm = TRUE) : unused argument(s) (na.rm = TRUE) I do not know what this means. How can I get this function to work? SECONDLY, how can I then get that function to work within an aggragate? Do I still use aggregate(. ~ subspecies, data = rowleyi, CV, na.rm=TRUE) #where subspecies is the header for rows? This has worked for mean, std.error, sd, etc. Thank you! Amanda Jones __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] Coefficient of Variation, NA, Aggregate
HI,
No problem. You got two NAs in the previous example. According to the documentation for cv() (coefficient of variation) in the raster package (http://hosho.ees.hokudai.ac.jp/~kubo/Rdoc/library/raster/html/cv.html):

Compute the coefficient of variation (expressed as a percentage). If there is only a single value, sd is NA and cv returns NA if aszero=FALSE (the default). However, if (aszero=TRUE), cv returns 0.

If I use another example:

    set.seed(5)
    mat1 <- matrix(sample(c(1:10, NA), 30, replace = TRUE), ncol = 5)
    rowleyi <- data.frame(mat1)
    co.var <- function(x) 100 * (sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE))
    rowleyi <- data.frame(subspecies = rep(LETTERS[1:2], 3), rowleyi)
    with(rowleyi, aggregate(cbind(X1, X2, X3, X4, X5), by = list(subspecies), function(x) co.var(x)))
    #   Group.1       X1       X2       X3       X4       X5
    # 1       A 28.28427 28.28427     25.0 52.67827 57.73503
    # 2       B 34.64102 61.97443 52.67827 51.50788       NA

Other way:

    do.call(cbind,
            lapply(lapply(lapply(rowleyi[, -1],
                                 function(x) data.frame(subspecies = rowleyi[, 1], x)),
                          function(x) x[complete.cases(x), ]),
                   function(x) aggregate(. ~ subspecies, data = x, co.var)))
    #   X1.subspecies     X1.x X2.subspecies     X2.x X3.subspecies     X3.x
    # 1             A 28.28427             A 28.28427             A     25.0
    # 2             B 34.64102             B 61.97443             B 52.67827
    #   X4.subspecies     X4.x X5.subspecies     X5.x
    # 1             A 52.67827             A 57.73503
    # 2             B 51.50788             B       NA

A.K.

----- Original Message -----
From: Amanda Jones akjone...@gmail.com
To: arun smartpink...@yahoo.com
Cc: R help r-help@r-project.org
Sent: Monday, November 19, 2012 5:50 PM
Subject: Re: [R] Coefficient of Variation, NA, Aggregate

Fantastic, thank you!
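A note on why the formula call aggregate(. ~ subspecies, ...) in the first example returned only the B row: the formula method of aggregate() uses na.action = na.omit by default, so any row with an NA in any column is dropped before the function is applied. The sketch below illustrates this on made-up data; the data frame toy and the result names are hypothetical and not from the thread, and passing na.action = na.pass is one possible way around it, not necessarily the one the posters would choose.

```r
# Same helper as in the thread: CV in percent, ignoring NAs within a column
co.var <- function(x) 100 * sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)

# Hypothetical toy data: every row of subspecies A has an NA in one column
toy <- data.frame(
  subspecies = rep(c("A", "B"), each = 4),
  X1 = c(2, NA, 4, NA, 2, 4, 6, 8),
  X2 = c(NA, 3, NA, 5, 6, 10, 8, 12)
)

# Default (na.action = na.omit): all rows of A are dropped, so A disappears
res_default <- aggregate(. ~ subspecies, data = toy, FUN = co.var)

# na.pass keeps incomplete rows; co.var then removes NAs column by column
res_pass <- aggregate(. ~ subspecies, data = toy, FUN = co.var,
                      na.action = na.pass)

print(res_default)
print(res_pass)
```

With na.pass each column is summarised from its own non-missing values, which is the same effect as the per-column complete.cases() approach shown above, in a single call.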
Re: [R] Performing gage R&R study in R w/more than 2 factors
I believe that you need to consult a local statistician, as there are likely too many statistical issues here that you do not fully understand. Alternatively, try posting to a statistical list like stats.stackexchange.com, as I think most of your issues are primarily statistical, not R related.

Cheers,
Bert

On Mon, Nov 19, 2012 at 11:12 AM, Matt Jacob m...@jacobmail.org wrote:

Hi everyone,

I'm fairly new to R, and I don't have a background in statistics, so please bear with me. ;-)

I'm dealing with 2^k factorial designs, and I was just wondering if there's any way to analyze more than two factors of a gage R&R study in R. For example, Minitab has an expanded gage R&R function that lets you include up to eight additional factors besides the usual two that are present in gage studies (parts and operators). If I wanted to include n additional random factors, is there a package or built-in functionality that will allow me to do that?

I've been experimenting with the SixSigma package, and that has a ss.rr method which works great---as long as your experiment only contains two factors. I've also been using lmer from lme4 to fit a linear model of my experiment, but the standard deviations generated by lmer don't match what I'm seeing in Minitab. Since all my factors are random, the formula I'm using looks like this:

    vals ~ 1 + (1|f1) + (1|f2) + (1|f3) + (1|f1:f2) + (1|f1:f3) + (1|f2:f3)

What am I doing wrong, and how can I fix it?

Thanks,
Matt
--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] xts plot behavior
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

RMW

On Mon, Nov 19, 2012 at 5:27 PM, swiss_guy steven.stut...@swissglobal-am.com wrote:

Hi,

I have a problem with plot.xts. I am trying to subset some data in an xts time series.

Subsetting works for more than one event.

But I get nothing if I try to get a single event.

I'm happy for every hint! Thanks!

--
View this message in context: http://r.789695.n4.nabble.com/xts-plot-behavior-tp4650069.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Coefficient of Variation, NA, Aggregate
Hello,

Just a note: you can (should?) have an argument na.rm in your function definition with a small modification, like this:

    co.var <- function(x, na.rm = TRUE) 100 * (sd(x, na.rm = na.rm) / mean(x, na.rm = na.rm))

Then you can choose to use the default TRUE or not.

Hope this helps,
Rui Barradas

On 19-11-2012 22:50, Amanda Jones wrote:

Fantastic, thank you!
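To make the na.rm point concrete, here is a minimal sketch. The names co.var.old and vals are made up for illustration; co.var follows the definition suggested above. The original one-argument function fails with "unused argument" because it has no parameter called na.rm to receive the value, whereas the revised version accepts it and passes it on to sd() and mean().

```r
# Original form from the question: only one formal argument, so na.rm is rejected
co.var.old <- function(x) 100 * sd(x) / mean(x)

# Revised form: na.rm is a formal argument and is forwarded to sd() and mean()
co.var <- function(x, na.rm = TRUE) {
  100 * sd(x, na.rm = na.rm) / mean(x, na.rm = na.rm)
}

vals <- c(2, 4, NA, 6)  # hypothetical data with one missing value

# Calling the old function with na.rm raises an "unused argument" error
err <- tryCatch(co.var.old(vals, na.rm = TRUE), error = function(e) e)
print(inherits(err, "error"))

# sd(c(2, 4, 6)) is 2 and mean(c(2, 4, 6)) is 4, so the CV is 50 percent
print(co.var(vals))

# With na.rm = FALSE the missing value propagates and the result is NA
print(co.var(vals, na.rm = FALSE))
```

Because na.rm is now a named argument, it can also be supplied through the ... of functions such as sapply(), for example sapply(df, co.var, na.rm = TRUE) for a data frame df of numeric columns.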
[R] knitr
Hello,

I am an intro-level R and ggplot2 user and am looking for resources to teach myself dynamic report generation in R using knitr. Any advice would be highly appreciated.

Thanks,
Pradip
[R] github
Hello,

I would like to learn how to set up a GitHub repository and upload/update files, and am looking for a "GitHub for Dummies". Any help will be appreciated.

Thanks,
Pradip
Re: [R] knitr
Hello,

Why don't you search on the Internet?

Regards,
Pascal

On 20/11/2012 10:57, Muhuri, Pradip (SAMHSA/CBHSQ) wrote:

Hello,

I am an Intro-level R and ggplot2 user and looking for resources to self teach dynamic report generation in R using knitr. Any advice would be highly appreciated.

Thanks,
Pradip
Re: [R] github
On Nov 19, 2012, at 6:07 PM, Muhuri, Pradip (SAMHSA/CBHSQ) wrote:

Hello,

I would like to learn how to set up Github/repository and upload/update files and am looking for Github for Dummies. Any help will be appreciated.

Wrong list for this question.

--
David Winsemius, MD
Alameda, CA, USA
Re: [R] survfit number of variables != number of variable names
On Nov 19, 2012, at 5:33 PM, Georges Dupret wrote:

Hi David,

Sorry for the signature files... this is automatic. I should disable that. Please find in attachment a copy of small.csv.gz

I found it but I suspect nobody else will. I think Terry Therneau already got a copy when you attached it earlier. But the rest of Rhelp did not, since .gz files will get scrubbed by the list-serv.

Best,
ge

On 11/19/2012 02:37 PM, David Winsemius wrote:
On Nov 19, 2012, at 2:23 PM, David Winsemius wrote:
On Nov 19, 2012, at 11:07 AM, Georges Dupret wrote:

Hi!

In answer to:

I noticed that you were using what might be called an externally created Surv object. I have a memory that Terry Therneau has criticized that practice. I cannot remember if it was in exactly this situation but I might ask if setting up the model as:

    cox = coxph(Surv(stime, event) ~ bucket*(today + accor + both) + activity, data = data)

... might give the survival machinery a better handle on where everything might be found.

I tried to create the Surv object internally but I face the same issue:

    (cox.s = coxph(Surv(time=absence, event=(censored==FALSE)) ~ bucket*(today) + strata(activity), data = small))
    Call:
    coxph(formula = Surv(time = absence, event = (censored == FALSE)) ~ bucket * (today) + strata(activity), data = small)

All of your 'censored' were FALSE so all of your events were TRUE. My guess is that you are having problems because you end up with different model designs in the different strata:

    with( small, table(activity, today))
                    today
    activity         FALSE TRUE
      (100,121]          1   13
      (121,149]          2    8
      (149,196]          0    4
      (196,1.33e+03]     1    8
      (30,42]            1    8
      (42,55]            4   12
      (55,68]            2    9
      (68,83]            2    9
      (83,100]           2    6
      [11,30]            0    8

I do not think it matters that your levels for the factor variable will not be in the expected order:

    table(small$activity)
         (100,121]      (121,149]      (149,196] (196,1.33e+03]        (30,42]        (42,55]        (55,68]        (68,83]
                14             10              4              9              9             16             11             11
          (83,100]        [11,30]
                 8              8

But I do also wonder if the small numbers in each strata might be causing problems. Is it really needed to stratify so finely?

-- David.

                           coef exp(coef) se(coef)      z    p
    bucket575            0.4526     1.572    0.740  0.612 0.54
    todayTRUE           -0.0886     0.915    0.676 -0.131 0.90
    bucket575:todayTRUE -0.1670     0.846    0.794 -0.210 0.83

    Likelihood ratio test=2.32 on 3 df, p=0.509 n= 100, number of events= 100

    fit = survfit(cox.s, newdata=small[1:50,])
    Error in model.frame.default(data = small[1:50, ], formula = ~bucket + : number of variables != number of variable names

OK. Thanks for doing that. You might want to know that the only attachment that made it through to the emailing list was a file named small.csv.gz.sig That's not a format that my system knows how to decompress ( I tried downloading GnuPG and compiling it but

(hit sent button too soon. ) was unable to figure out how to decompress with GnuPG either. (It's hard to imagine this needed to be encrypted.)

small.csv.gz

David Winsemius, MD
Alameda, CA, USA
[R] method show
Hello the list,

As a simple example:

    rm(list = ls())
    setClass("tre", representation(x = "numeric"))
    setMethod("show", "tre", def = function(object) cat(object@x))
    [1] "show"
    setMethod("summary", "tre", def = function(object) cat("This is a tre of value ", object@x, "\n"))
    Creating a generic function for 'summary' from package 'base' in the global environment
    [1] "summary"
    ls()
    [1] "summary"

R copies the generic summary into the current environment. I understand it as: if you want R to create a specific summary method for objects of class tre, R needs the generic in the same environment (R_GlobalEnv). In fact, summary is listed by ls() in my local environment.

Why does R not do the same with the generic show? I know that the generic show is from package methods while summary is from package base. Does it matter somehow? Can anyone point me to the right reference?

Thanks in advance for your help,
Andrea
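One way to look at the difference, offered as a sketch rather than an authoritative answer: show is already a formal (S4) generic in package methods, so setMethod() only has to register a method for it, while summary in base is an ordinary function that dispatches S3-style, so setMethod() first has to create an S4 generic from it and stores that new generic in the environment where the call is made. The checks below assume a fresh session in which no attached package has already defined an S4 generic for summary; the variable names are made up for illustration.

```r
library(methods)

# 'show' in package methods is already a formal S4 generic ...
show_is_s4 <- is(methods::show, "standardGeneric")

# ... while base::summary is a plain function that dispatches via UseMethod()
summary_is_s4 <- is(base::summary, "standardGeneric")

setClass("tre", representation(x = "numeric"))

# Only a method is registered here; no object named 'show' is created
setMethod("show", "tre", function(object) cat(object@x, "\n"))

# Here R must first create an S4 generic from base::summary, and it stores
# that generic in the environment of the call (the global environment)
setMethod("summary", "tre",
          function(object, ...) cat("This is a tre of value", object@x, "\n"))

# TRUE for summary, FALSE for show when run at top level
summary_in_global <- exists("summary", envir = globalenv(), inherits = FALSE)
show_in_global <- exists("show", envir = globalenv(), inherits = FALSE)

print(c(show_is_s4, summary_is_s4, summary_in_global, show_in_global))
summary(new("tre", x = 3))
```

The method for show is still recorded in the global environment, but in a hidden methods table rather than as a visible function, which can be seen with ls(all.names = TRUE). The help pages ?setMethod and ?setGeneric describe this behaviour.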