Re: [R] Discrepancies in the estimates of Partial least square (PLS) in SAS and R
rakeshnb rakeshn...@gmail.com writes: I have been using R and SAS for the past 6 months, and I found an interesting thing while doing PLS in R and SAS: when we use the NOSCALE option in SAS and scale=FALSE in R, the estimates match, but if we use the scaling option in both SAS and R, the estimates differ to a great extent. You can try this with any data set; you will get very different estimates when using the scaling option. Can anyone help me with this issue?

My guess is that they use different scalings, which of course will give different results. However, since you don't say anything about which R package you use for PLSR (and since I don't have access to SAS), I can only guess. :) -- Regards, Bjørn-Helge Mevik

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Validation of logistic models in R 2.12
Hi everyone, I am trying to validate a logistic model built in R. My version of R is 2.12 and I cannot install ROCR. I have gone to the point where I have the predicted values using the code:

pred1 <- predict(trainlogit1, testdata_1, type = "response")

How do I proceed from here? Is there another way in which I can plot lift charts? My model output is:

Call:
glm(formula = Attrition_ind ~ Time.in.comp + UV_LTIA_Base + as.factor(new_hire_ind) + as.factor(promotion_ind) + as.factor(Time.in.comp...5.years) + as.factor(Change.in.Job.Code) + Positioning_num + as.factor(Below.Guideline) + as.factor(Above.Guideline) + as.factor(Drop.in.Rating.in.2010) + as.factor(Increase.in.Rating.in.2010) + as.factor(job_band), family = binomial(link = logit), data = traindata_1)

Deviance Residuals:
    Min      1Q  Median      3Q     Max
-0.9604 -0.4888 -0.4221 -0.3514  2.9813

Coefficients:
                                        Estimate Std. Error z value Pr(>|z|)
(Intercept)                            -1.425494   0.115476 -12.345  < 2e-16 ***
Time.in.comp                           -0.022690   0.007753  -2.927 0.003425 **
UV_LTIA_Base                           -0.960578   0.210247  -4.569 4.91e-06 ***
as.factor(new_hire_ind)Y               -0.446913   0.145942  -3.062 0.002197 **
as.factor(promotion_ind)Y              -0.739080   0.195584  -3.779 0.000158 ***
as.factor(Time.in.comp...5.years)Y     -0.248776   0.118014  -2.108 0.035029 *
as.factor(Change.in.Job.Code)Y         -0.244962   0.118475  -2.068 0.038675 *
Positioning_num                        -1.987576   0.529700  -3.752 0.000175 ***
as.factor(Below.Guideline)Y            -0.622856   0.171635  -3.629 0.000285 ***
as.factor(Above.Guideline)Y            -0.446067   0.252602  -1.766 0.077414 .
as.factor(Drop.in.Rating.in.2010)Y      0.292605   0.174543   1.676 0.093659 .
as.factor(Increase.in.Rating.in.2010)Y -0.315004   0.101213  -3.112 0.001856 **
as.factor(job_band)40                   0.391219   0.108038   3.621 0.000293 ***
as.factor(job_band)45                   1.228778   0.213261   5.762 8.32e-09 ***
as.factor(job_band)50                   1.603452   0.434487   3.690 0.000224 ***
as.factor(job_band)60                   2.578905   0.559087   4.613 3.97e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 4737.8 on 7473 degrees of freedom
Residual deviance: 4607.0 on 7458 degrees of freedom
(250 observations deleted due to missingness)
AIC: 4639

Number of Fisher Scoring iterations: 5

Could you please help? How can I understand if my model is a good fit or not? Regards, Doy
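Since ROCR is unavailable, a lift chart can be built with base R alone. A minimal sketch; the two vectors below are made-up stand-ins for the poster's `pred1` and the observed `Attrition_ind` values:

```r
# A base-R lift-chart sketch, assuming no ROCR: `pred1` stands in for the
# predicted probabilities and `actual` for the observed 0/1 outcomes of the
# test set (both vectors here are invented for illustration).
pred1  <- c(0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05)
actual <- c(1,   1,   0,   1,   0,   0,   0,   0)

ord     <- order(pred1, decreasing = TRUE)     # rank cases by predicted risk
cum_pos <- cumsum(actual[ord]) / sum(actual)   # cumulative share of events captured
cum_pop <- seq_along(actual) / length(actual)  # cumulative share of population
lift    <- cum_pos / cum_pop                   # lift at each depth

plot(cum_pop, cum_pos, type = "l",
     xlab = "Proportion of sample (ranked by score)",
     ylab = "Proportion of attriters captured")
abline(0, 1, lty = 2)                          # baseline: random ordering
```

The gains curve's distance above the diagonal gives a quick visual check of fit; for a numeric check, the Hosmer-Lemeshow test or residual deviance comparisons are the usual companions.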
Re: [R] take data from a file to another according to their correlation coefficient
Hi Rui, it's me again. I have another question about the process.all function you explained to me. But as you have already helped me a lot, and as I promised I wouldn't disturb you again, I want to ask first whether you would accept to help me one more time before I explain my problem more precisely (it is about adding an automatic linear regression in order to have more realistic data for filling the gaps). I wrote you a personal message (I don't know if you got it), because I would like to send you a present from the Alps to thank you for all the help you gave me, and maybe the new help (and so I would need your home or work postal address). If you agree, let me know and send me your address by mail. I'll explain in a new post what my boss now wants me to add to your function (this function is quite tricky for me to understand with my limited knowledge + Google + R help). -- View this message in context: http://r.789695.n4.nabble.com/take-data-from-a-file-to-another-according-to-their-correlation-coefficient-tp4580054p4605385.html Sent from the R help mailing list archive at Nabble.com.
[R] warning with glm.predict, wrong number of data rows
Hi, I split a data set into two partitions (80 and 42), use the first as the training set in glm and the second as the test set in predict. But when I call predict, I get a warning message:

> s = sample(1:122)
> glm.my.data = glm(my.data.class[s[1:80]] ~ t(my.data)[s[1:80], 1:60], family = binomial)
> pred.my.data = predict(glm.gse13355, as.data.frame(t(my.data)[s[81:122], 1:60]), type = "response")
Warning message:
'newdata' had 42 rows but variable(s) found have 80 rows
> length(pred.my.data)
[1] 80

Thanks, Carol
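For what it's worth, this warning usually means the model formula referenced free-standing objects (like `t(my.data)[s[1:80], ]`) rather than columns of a data frame, so `predict()` cannot match anything in `newdata` and silently reuses the training rows. A sketch of the data-frame idiom that avoids this, using simulated stand-in data (the original `my.data` is not available here):

```r
# Sketch: keep predictors as columns of a data frame so predict() can
# match newdata by column name (simulated stand-in data; names invented).
set.seed(1)
full  <- data.frame(class = rbinom(122, 1, 0.5), matrix(rnorm(122 * 5), 122))
s     <- sample(1:122)
train <- full[s[1:80], ]
test  <- full[s[81:122], ]

fit  <- glm(class ~ ., data = train, family = binomial)
pred <- predict(fit, newdata = test, type = "response")
length(pred)  # 42, matching the test partition
```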
[R] deparse(substitute(x)) on an object with S3 class
Dear list, can someone explain to me why deparse(substitute(x)) does not seem to work when x is of a user-defined S3 class? In my actual problem, my print method is part of a package, and the method is registered in the NAMESPACE, if that should make a difference.

print.testclass <- function(x, ...) {
  xname <- deparse(substitute(x))
  cat("Your object name is", xname, "\n")
}
testlist <- list()
testlist[[1]] <- 1:10
class(testlist) <- "testclass"

# This does not work as expected:
> testlist
Your object name is structure(list(1:10), class = "testclass")

# But this does:
> testlist2 <- unclass(testlist)
> print.testclass(testlist2)
Your object name is testlist2

thanks, Remko Duursma -- View this message in context: http://r.789695.n4.nabble.com/deparse-substitute-x-on-an-object-with-S3-class-tp4605592.html Sent from the R help mailing list archive at Nabble.com.
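The behaviour is not specific to S3 classes: autoprinting at the prompt evaluates the object first and then calls `print()` on the *value*, so `substitute(x)` inside the method sees the value's deparsed structure rather than a symbol. Calling the generic explicitly preserves the name. A small sketch with the same toy class:

```r
# When print() is called explicitly, UseMethod() passes the original
# argument expression through to the method, so substitute(x) is the
# symbol `testlist`; autoprinting instead hands print() the bare value.
print.testclass <- function(x, ...) {
  xname <- deparse(substitute(x))
  cat("Your object name is", xname, "\n")
}
testlist <- structure(list(1:10), class = "testclass")
print(testlist)  # prints: Your object name is testlist
```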
[R] [R-pkgs] New version of the knitr package (0.5)
The knitr package version 0.5 is on CRAN now. It has gone through extensive development in the past few months, and about 200 issues were solved (https://github.com/yihui/knitr/issues) thanks to the feedback of users, which greatly improved the quality and usefulness of this package. For a complete list of changes, see https://github.com/yihui/knitr/blob/master/NEWS The most notable new features are: - chunk options can be arbitrary valid R code, e.g. echo=!TRUE or results=ifelse(x, 'asis', 'markup'); this makes a document really programmable, and the syntax is also consistent with normal R code; http://yihui.name/knitr/demo/sweave/ - the listings package is supported via render_listings(), and the styles are based on Sweavel.sty (courtesy of Frank Harrell); http://yihui.name/knitr/demo/listings/ - for HTML/markdown documents, R plots can be automatically uploaded to Imgur to make sure the output is self-contained (no need to copy images when publishing the output); http://yihui.name/knitr/demo/upload/ - arbitrary recursive references of code chunks by label (e.g. chunk A can call chunk B, which in turn calls C); http://yihui.name/knitr/demo/reference/ - for cached chunks, their dependencies can be automatically figured out by analyzing the R code for global and local variables (see the 'autodep' option) - new chunk options fig.cap and fig.scap to create figure environments with captions in LaTeX - Sweave concordance was implemented and integrated with RStudio so that error navigation and PDF/Rnw sync become easy - more comprehensive support for markdown and reStructuredText; markdown is also well supported in RStudio (preview version); see http://yihui.name/en/2012/05/how-to-make-html5-slides-with-knitr/ - other languages like Python and Awk can also be supported, e.g.
https://github.com/yihui/knitr/blob/master/inst/examples/knitr-lang.Rmd Suggestions and questions are welcome; please join the mailing list https://groups.google.com/group/knitr or file issues on GitHub. Thanks! Regards, Yihui -- Yihui Xie xieyi...@gmail.com Phone: 515-294-2465 Web: http://yihui.name Department of Statistics, Iowa State University 2215 Snedecor Hall, Ames, IA ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] Discrepancies in the estimates of Partial least square (PLS) in SAS and R
I am using the pls package, but how is the scaling done in R? -- View this message in context: http://r.789695.n4.nabble.com/Discrepancies-in-the-estimates-of-Partial-least-square-PLS-in-SAS-and-R-tp4605165p4605332.html Sent from the R help mailing list archive at Nabble.com.
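For reference (this is an editor's note, not the original reply): to my knowledge, scale = TRUE in the pls package divides each mean-centred predictor by its sample standard deviation (n - 1 denominator), which is the same unit-variance convention as base R's scale(). A quick sketch to see what that convention does, useful when checking whether a discrepancy with SAS comes from a different divisor or centring:

```r
# Sketch of unit-variance scaling: divide each centred column by its
# n-1 standard deviation; equivalent to scale()'s default behaviour.
set.seed(42)
X  <- matrix(rnorm(40), 10, 4)
Xs <- scale(X, center = TRUE, scale = apply(X, 2, sd))  # explicit sd divisor
all.equal(Xs, scale(X), check.attributes = FALSE)       # TRUE
```

If SAS scales with an n denominator, or also scales the response, coefficients reported on the original scale will differ; comparing the actual divisors used by each program is usually the quickest diagnostic.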
[R] Simple plot loop
Trying to plot multiple lines from a simple matrix with headers (eight observations per column). I will be doing a number of these, with varying numbers of columns, and do not want to enter the header names for each one (I got frustrated and just wrote them out, which did work). Data reads fine, the first plot is fine, but when I use the code at the bottom for a "for i" loop it tells me that x and y do not match. One other issue is that I would prefer not to specify the first column either, but when I enter plot(MONTH, Data2[2]) it also does not plot. Should this not call the second column? Thank you very much for any comments. I know this is simple, but I appreciate any assistance. Cheers, Ben

####################
# LOAD DATA FROM CSV
library(zoo)
setwd("/Users/benjaminneal/Documents/All Panama/1110_Panama/CNAT_Segmenting")
Data <- read.csv("120503_CNAT_Summary.csv", header = TRUE)

# fill in missing data from last observation
Data2 <- na.locf(Data)
attach(Data2)

# PLOT ALL ON ONE CHART
plot(MONTH, T102, type = "o", ann = FALSE, ylim = c(1, 100), pch = 22, lty = 2, col = "red")
title(main = "5 m and 10 m Colpophyllia natans colonies over time",
      ylab = "% live coral / colony", xlab = "Months",
      col.main = "black", font.main = 4)
lines(MONTH, T162, type = "o", pch = 22, lty = 2, col = "red")
lines(MONTH, T231, type = "o", pch = 22, lty = 2, col = "green")
lines(MONTH, T250, type = "o", pch = 22, lty = 2, col = "green")
## (many other similar lines here, with entered column headers ... up to 75)
lines(MONTH, T373, type = "o", pch = 22, lty = 2, col = "blue")
lines(MONTH, T374, type = "o", pch = 22, lty = 2, col = "blue")
lines(MONTH, T377, type = "o", pch = 22, lty = 2, col = "blue")

# Tried to add lines another way with a for loop, but this is the part not working
for (i in 2:length(Data2)) {
  lines(MONTH, i, type = "o", pch = 22, lty = 2, col = "blue")
}
####################
[R] MLE for estimating the parameters of the TVECM in R
Dear Mr. Matthieu Stigler, I am so excited about your package 'tsDyn'. First, let me introduce myself: I am a student at Gadjah Mada University, Indonesia. I am a new user of R and am applying it to a bivariate (interest rate and inflation) threshold vector error correction model. I am writing my final examination thesis on the threshold vector error correction model, and to estimate the parameters I use as reference the paper "Testing for two-regime threshold cointegration in vector error correction models" by Hansen and Seo (2002). I have derived the MLE, and it succeeded: I now have A1(hat) and A2(hat) from the MLE, and gamma(hat) and beta(hat) from a grid search over the MLE. My problem: when I use the function HanSeo_TVECM() in R, it does not run; it only estimates the linear cointegration (VECM). And if I use package tsDyn version 0-8.1, the function HanSeo_TVECM() is not available; there is the function TVECM(), but it uses CLS to estimate the parameters. Would the MLE and CLS estimation result in the same relative values? Can you help me, sir, with the function HanSeo_TVECM()?
thanks a lot

Output from R, function HanSeo_TVECM:

HanSeo_TVECM(data, lag = 2, trim = 0.05, gn = 300, bn = 300)

### Linear VECM estimates (by Johansen MLE)
Cointegrating vector: 1 -8.434287
Standard error of the cointegrating value: 2.616888

Parameters
                      ECM    Intercept     bi -1      inf -1       bi -2        inf -2
Equation bi  -0.008627598 -0.006055985  0.758366  0.02512693  -0.1240975  -0.007603223
Equation inf  0.131374112 -0.355994435  3.971587  0.20726565  -2.7661240  -0.224955971

Standard errors with Eicker-White covariance matrix estimator
                     ECM   Intercept       bi -1      inf -1       bi -2       inf -2
Equation bi  0.004081985  0.01663758  0.08914615  0.02456267  0.08221155  0.01799844
Equation inf 0.034760850  0.08915086  1.66241171  0.19362012  0.93278372  0.06298056

Negative LL -183.1256  AIC -159.1256  BIC -160.4877

Error in solve.default(t(zzj) %*% zzj) : system is computationally singular: reciprocal condition number = 1.15007e-020
[R] [R-pkgs] bcrm package update
Dear all, Version 0.3 of the bcrm package is now available on CRAN. The package allows users to fit Bayesian Continual Reassessment Method Phase I trial designs. This version has the following new developments from version 0.1: * Stopping rules have been added, allowing stopping to be based on a maximum sample size, the maximum number to be treated at the final MTD estimate, the precision of the MTD estimate, and a minimum sample size. * Implementation of escalation based on posterior toxicity intervals using loss functions. * Posterior summaries after each recruited cohort can now be plotted using the 'each' argument of plot.bcrm. * When simulating, operating characteristics are also now presented by true regions of toxicity risk. * Simulations now run faster, as they use information from identical previous simulations to choose the next dose. This is only implemented if nsims <= 1000; otherwise the computation time to search previous simulations becomes unmanageable. * Plot and print commands now refer to actual dose labels when they have been given by the user. * Output from simulations can now be plotted as histograms using the function plot.bcrm.sim. Best wishes Mike -- --- Dr Michael Sweeting MRC Biostatistics Unit Institute of Public Health Robinson Way Cambridge CB2 0SR Tel: 01223 768257 Fax: 01223 760729
[R] `mapply(function(x) function() x, c("a", "b"))$a()' returns `"b"'
As the title says, I want to apply a function (which itself returns a function) to a list (or vector) and get a list (or vector) of generated functions as the return value, but I get an unexpected result. Does anyone have an idea about the reason for this phenomenon and a correct way to implement this? Thanks very much :) -- Using GPG/PGP? Please get my current public key (ID: 0xAEF6A134, valid from 2010 to 2013) from a key server.
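A sketch of the usual remedy when a batch of generated closures all appear to share one value: assuming the cause is lazy evaluation of the closed-over argument, calling `force()` inside the factory makes each closure capture its own copy:

```r
# force() evaluates the argument immediately in the factory's call frame,
# so each generated function closes over its own value rather than a
# still-unevaluated, shared promise.
make_const <- function(x) {
  force(x)        # evaluate x now
  function() x
}
fns <- sapply(c("a", "b"), make_const)  # named list of zero-argument functions
fns$a()  # "a"
fns$b()  # "b"
```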
Re: [R] uneven vector length issue with read.zoo?
On Wed, May 2, 2012 at 3:55 PM, knavero knav...@gmail.com wrote: I truncated and simplified my code and the read-in data that I'm working with to isolate the issue. Here are the read-in data and R script respectively: http://r.789695.n4.nabble.com/file/n4604287/test.csv http://pastebin.com/rCdaDqPm Here is the terminal/R shell output that I hope the above replicates on your screen:

> source("elecLoad.r", echo = TRUE)
> # Load packages
> library(zoo)
> library(chron)
> # Initial assignments for format (fmt), timezone (TZ), and user-defined chron function (chr)
> fmt = "%m/%d/%y %I:%M %p"
> TZ = "PDT"
> chr = function(x) as.chron(x, fmt)
> # Read in data as zoo object using relevant arguments in read.zoo()
> # for details of arguments, see Kevin Navero or see ?read.zoo and ?read.table
... [TRUNCATED]
Error in read.zoo("http://dl.dropbox.com/u/41922443/test.csv", skip = 1, :
  index has bad entries at data rows: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28

I was hoping that the "NULL" in colClasses would have taken care of this uneven vector length issue; however, that was not the case. Any ideas? Thanks in advance. Sorry if my post didn't follow the forum rules exactly. I tried to make small-scale reproducible code and whatnot. I'm still a bit of a noob here and there.

Try this, using the same library statements, fmt and chr from your post:

URL <- "http://dl.dropbox.com/u/41922443/test.csv"
DF1 <- read.table(URL, skip = 1, header = TRUE, sep = ",", fill = TRUE, as.is = TRUE)
DF2 <- na.omit(DF1[1:2])
z <- read.zoo(DF2, FUN = chr)

-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Re: [R] Simple plot loop
On 05/03/2012 05:50 PM, Ben Neal wrote: [original message quoted above; snipped]

Hi Ben, I think what you may want in your loop is this:

for (column in names(Data2)[2:length(Data2)])
  lines(MONTH, Data2[[column]], type = "o", pch = 22, lty = 2, col = "blue")

But if you want the first lines to keep their colors, you'll probably have to build a vector of colors:

colorvec <- rep("blue", length(Data2))
colorvec[2:3] <- "red"    # e.g. T102 and T162
colorvec[4:5] <- "green"  # e.g. T231 and T250

and change the above to:

columnnames <- names(Data2)
for (column in 2:length(Data2))
  lines(MONTH, Data2[[columnnames[column]]], type = "o", pch = 22, lty = 2, col = colorvec[column])

Jim
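As an aside, the loop can be avoided entirely with `matplot()`, which draws one line per column in a single call. A sketch with made-up stand-in data (the column names mirror the post, the values are invented):

```r
# matplot() plots each column of the y matrix as its own series, recycling
# col/lty/pch across columns, so no explicit loop is needed.
Data2 <- data.frame(MONTH = 1:8,
                    T102  = runif(8, 1, 100),
                    T162  = runif(8, 1, 100),
                    T231  = runif(8, 1, 100))
matplot(Data2$MONTH, as.matrix(Data2[-1]), type = "o", pch = 22, lty = 2,
        col = c("red", "red", "green"), ylim = c(1, 100),
        xlab = "Months", ylab = "% live coral / colony")
```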
Re: [R] select month data in ts objects
On Wed, May 2, 2012 at 4:27 PM, S. Georgakarakos strat...@aegean.gr wrote: In a time series ts object, like z1.ts below:

z1 <- array(1:235)
z1.ts <- ts(z1, frequency = 12)

I would like to select only a certain month, for instance the February data. If I transform the data to a matrix, I have the problem that 235 is not a multiple of 12. I do not want to cut or add data, or program a loop to pick out the correct data. I am wondering if there is an easier way to select month data in a ts object.

Use cycle():

tt <- ts(1:235, frequency = 12)
ts(tt[cycle(tt) == 2])

-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
[R] Help with getting values from string
Hi All, I have a doubt. I am using macros, and I try to pass a value to a macro by concatenating a bunch of strings, but it does not seem to work. Please help. I have written down my code and the error message; please tell me how to pass the value that a string points to. Thanks in advance.

# macro defined
machist_occ_kgfs <- defmacro(a, qnu_occ, b, qnl_occ, expr = {
  with(subset(an_ind_data_fin,
              income_source == a & region_id == b &
              normalised_income > qnl_occ & normalised_income < qnu_occ),
       hist(normalised_income, main = paste(a, b, sep = " ")))
})

# macro called
machist_occ_kgfs(occ, paste("qnu", ri, occ, collapse = "", sep = ""), ri, paste("qnl", ri, occ, collapse = "", sep = ""))

Error in hist.default(normalised_income, main = paste(occ, ri, sep = " ")) :
  hist.default: pretty() error, breaks=
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf

The thing is, paste("qnu", ri, occ, collapse = "", sep = "") returns the value "qnu1Business__Others", but the variable qnu1Business__Others contains an integer value which I need passed on to the macro. I hope I have made myself clear. Thanks in advance for your help. -- View this message in context: http://r.789695.n4.nabble.com/Help-with-getting-values-from-string-tp4605632.html Sent from the R help mailing list archive at Nabble.com.
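The step the post seems to be missing is `get()`: `paste()` only builds the *name* of a variable, while `get()` looks up the value bound to that name. A sketch with an invented variable mirroring the post:

```r
# paste() yields the string "qnu1Business__Others"; get() then retrieves
# the value stored under that name (the variable here is invented).
qnu1Business__Others <- 42
varname <- paste("qnu", 1, "Business__Others", sep = "")
get(varname)  # 42
```

So wrapping the `paste()` result in `get()` in the macro call should hand the integer value, rather than the string, to the macro.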
[R] breaking a label in two lines when using expression()
I am making an xyplot and the y label is too long and needs to be on two rows, but when I break it there is a huge gap between the last text string and the expression, and I can't get rid of it. Any ideas?

Data:

resp.week.mean.rate <- structure(list(Temp = c(8L, 8L, 8L, 8L, 8L, 8L, 12L, 12L, 12L, 12L, 12L, 12L), CO2 = c(380L, 380L, 380L, 750L, 750L, 750L, 380L, 380L, 380L, 750L, 750L, 750L), Treat = structure(c(3L, 3L, 3L, 4L, 4L, 4L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("12-380", "12-750", "8-380", "8-750"), class = "factor"), Week = c(1L, 3L, 8L, 1L, 3L, 8L, 1L, 3L, 8L, 1L, 3L, 8L), Mean.Rate = c(2.125909389, 1.905870003, 1.417687602, 3.110439984, 2.31043989, 1.849232493, 2.546747098, 3.290235064, 3.000717599, 2.694901409, 3.852590547, 2.964084249), lower = c(1.846641409, 1.44072624, 1.185304427, 2.56408099, 2.02644683, 1.606374443, 2.253928482, 2.759177284, 2.49014747, 2.168437604, 3.075977559, 2.438453415), upper = c(2.405177369, 2.371013766, 1.650070777, 3.656798978, 2.59443295, 2.092090543, 2.839565714, 3.821292844, 3.511287728, 3.221365214, 4.629203535, 3.489715083), fTemp = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("8", "12"), class = "factor"), fCO2 = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("380", "750"), class = "factor"), fTreat = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L), .Label = c("8-380", "8-750", "12-380", "12-750"), class = c("ordered", "factor")), fWeek = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("1", "3", "8"), class = "factor")), .Names = c("Temp", "CO2", "Treat", "Week", "Mean.Rate", "lower", "upper", "fTemp", "fCO2", "fTreat", "fWeek"), row.names = c(NA, -12L), class = "data.frame")

xyplot(cbind(Mean.Rate, lower, upper) ~ fWeek | fTreat, resp.week.mean.rate,
       as.table = TRUE,
       xlab = "Week",
       ylab = expression(Mean ~ Respiration ~ Rate ~ (mu * mol %.% L^-1 %.% g(AFDM)^-1)),
       scales = list(alternating = FALSE, tick.number = 10, tck = c(-1, 0)),
       layout = c(4, 1), ylim = 1:5,
       auto.key = list(title = "Treatment", lines = TRUE, cex.title = 1, columns = 2),
       panel = function(x, y, ...) {
         panel.errbars(x, y, make.grid = "none", ewidth = 0.2, type = "p", ...)
         panel.loess(x[resp.week.mean.rate$Treat == "8-380"], y[resp.week.mean.rate$Treat == "8-380"], span = 5, degree = 1, lwd = 2, ...)
         panel.loess(x[resp.week.mean.rate$Treat == "8-750"], y[resp.week.mean.rate$Treat == "8-750"], span = 5, degree = 1, lwd = 2, ...)
         panel.loess(x[resp.week.mean.rate$Treat == "12-380"], y[resp.week.mean.rate$Treat == "12-380"], span = 5, degree = 1, lwd = 2, ...)
         panel.loess(x[resp.week.mean.rate$Treat == "12-750"], y[resp.week.mean.rate$Treat == "12-750"], span = 5, degree = 1, lwd = 2, ...)
       })

Beatriz de Francisco Mora
PhD Student
The Scottish Association for Marine Science, Scottish Marine Institute, Oban PA37 1QA
Tel: 06131 559000 (switchboard) Fax: 01631 559001
E: beatriz.defranci...@sams.ac.uk
http://www.smi.ac.uk/beatriz-de-franciso
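One common trick for a two-line `expression()` label is `atop()`, which stacks its two arguments without the extra gap. A sketch with a simplified stand-in for the real label (the units here are illustrative):

```r
# atop(top, bottom) stacks two plotmath expressions vertically, so the
# text line and the units line become one two-row label.
lab <- expression(atop("Mean Respiration Rate",
                       (mu * mol %.% L^-1 %.% g(AFDM)^-1)))
plot(1:10, 1:10, ylab = lab)  # the same expression works for ylab= in xyplot()
```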
Re: [R] [R-sig-hpc] Quickest way to make a large empty file on disk?
Jonathan,

On some filesystems (e.g. NTFS, see below) it is possible to create 'sparse' memory-mapped files, i.e. reserving the space without the cost of actually writing initial values. Package 'ff' does this automatically and also allows accessing the file in parallel. Check the example below and see how big-file creation is immediate.

Jens Oehlschlägel

> library(ff)
> library(snowfall)
> ncpus <- 2
> n <- 1e8
> system.time(
+   x <- ff(vmode = "double", length = n, filename = "c:/Temp/x.ff")
+ )
   user  system elapsed
   0.01    0.00    0.02
> # check finalizer; with an explicit filename we should have a 'close' finalizer
> finalizer(x)
[1] "close"
> # if not, set it to 'close' in order to not let slaves delete x on slave shutdown
> finalizer(x) <- "close"
> sfInit(parallel = TRUE, cpus = ncpus, type = "SOCK")
R Version: R version 2.15.0 (2012-03-30)
snowfall 1.84 initialized (using snow 0.3-9): parallel execution on 2 CPUs.
> sfLibrary(ff)
Library ff loaded.
Library ff loaded in cluster.
Warning message:
In library(package = ff, character.only = TRUE, pos = 2, warn.conflicts = TRUE, :
  'keep.source' is deprecated and will be ignored
> sfExport("x")  # note: do not export the same ff multiple times
> # explicitly opening avoids a gc problem
> sfClusterEval(open(x, caching = "mmeachflush"))  # 'mmeachflush' instead of 'mmnoflush' is a bit slower but prevents OS write storms when the file is larger than RAM
[[1]]
[1] TRUE
[[2]]
[1] TRUE
> system.time(
+   sfLapply(chunk(x, length = ncpus), function(i) {
+     x[i] <- runif(sum(i))
+     invisible()
+   })
+ )
   user  system elapsed
   0.00    0.00   30.78
> system.time(
+   s <- sfLapply(chunk(x, length = ncpus), function(i) quantile(x[i], c(0.05, 0.95)))
+ )
   user  system elapsed
   0.00    0.00    4.38
> # for completeness
> sfClusterEval(close(x))
[[1]]
[1] TRUE
[[2]]
[1] TRUE
> csummary(s)
              5%  95%
Min.     0.04998 0.95
1st Qu.  0.04999 0.95
Median   0.05001 0.95
Mean     0.05001 0.95
3rd Qu.  0.05002 0.95
Max.     0.05003 0.95
> # stop slaves
> sfStop()
Stopping cluster
> # with the 'close' finalizer we are responsible for deleting the file explicitly (unless we want to keep it)
> delete(x)
[1] TRUE
> # remove R-side metadata
> rm(x)
> # truly free memory
> gc()

Sent: Thursday, 3 May 2012, 00:23
From: Jonathan Greenberg j...@illinois.edu
To: r-help r-help@r-project.org, r-sig-...@r-project.org
Subject: [R-sig-hpc] Quickest way to make a large empty file on disk?

R-helpers: What would be the absolute fastest way to make a large empty file (e.g. filled with all zeroes) on disk, given a byte size and a given number of empty values? I know I can use writeBin, but the object in this case may be far too large to store in main memory. I'm asking because I'm going to use this file in conjunction with mmap to do parallel writes to it. Say I want to create a blank file of 10,000 floating point numbers. Thanks! --j -- Jonathan A. Greenberg, PhD, Assistant Professor, Department of Geography and Geographic Information Science, University of Illinois at Urbana-Champaign, 607 South Mathews Avenue, MC 150, Urbana, IL 61801. Phone: 415-763-5476. AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007. http://www.geog.illinois.edu/people/JonathanGreenberg.html
Re: [R] R help!
So long as x is a character vector, I tend to use the following for this problem.

x <- c("12/31/11 23:45", "1/1/12 2:15")
x.split <- strsplit(x, " ")
x.date <- sapply(x.split, function(y) return(y[1]))
x.time <- sapply(x.split, function(y) if (length(y) > 1) return(y[2]) else NA)
x.date
[1] "12/31/11" "1/1/12"
x.time
[1] "23:45" "2:15"

Benjamin Nutter | Biostatistician | Quantitative Health Sciences Cleveland Clinic | 9500 Euclid Ave. | Cleveland, OH 44195 | (216) 445-1365

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Alex Roth Sent: Wednesday, May 02, 2012 4:01 PM To: r-help@r-project.org Subject: [R] R help!

Hello there, I was wondering if you could help me with a quick R issue. I have a data set where one of the columns has both date and time in it, e.g. 12/31/11 23:45 in one cell. I want to use R to split this column into two new columns: date and time. One of the problems with splitting here is that when the dates go into single digits there are no 0's in front of months January-September (e.g., January is represented by 1 as opposed to 01), so every entry is a different length. Therefore, splitting by the space is the only option, I think. Here's the coding I've developed thus far:

z$dt <- z$Date # time and date is all under z$Date
foo <- strsplit(" ", z$dt) # attempted split based on the space

And then if that were to work, I would proceed to use the coding:

foo2 <- matrix(unlist(foo), ncol = 2, byrow=TRUE)
z$Date <- foo[ ,1]
z$Time <- foo[ ,2]

However, foo <- strsplit(" ", z$dt) isn't working. Do you know what the problem is? If you could respond soon, that would be greatly appreciated! Thanks so much! Alex __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with 'nls' fitting logistic model (5PL)
Bert, Thank you for your thoughts. I can assure you I have plotted the data back and forth many times, and that overfitting has nothing to do with it. This is not a _statistical_ problem, but a _technical_ problem. Something that works well in ANY reliable statistical software doesn't work in R with 'nls'. This just makes me think that 'nls' is a dysfunctional piece of junk that can't handle even a very simple problem. Not to mention that 'predict()' for 'nls' doesn't give you any sort of confidence interval. I wonder if there's another package in R that could be used to fit 5P-logistic model. Any idea? In the end I may attempt to write a fitting function myself, but that would be the last resort. -- Michal J. Figurski, PhD HUP, Pathology Laboratory Medicine Biomarker Research Laboratory 3400 Spruce St. 7 Maloney S Philadelphia, PA 19104 tel. (215) 662-3413 On 5/2/2012 3:47 PM, Bert Gunter wrote: Plot the data. You're clearly overfitting. (If you don't know what this means or why it causes the problems you see, try a statistical help list or consult your local statistician). -- Bert On Wed, May 2, 2012 at 12:32 PM, Michal Figurski figur...@mail.med.upenn.edu wrote: Dear R-Helpers, I'm working with immunoassay data and 5PL logistic model. I wanted to experiment with different forms of weighting and parameter selection, which is not possible in instrument software, so I turned to R. I am using R 2.14.2 under Win7 64bit, and the 'nls' library to fit the model - I started with the same model and weighting type (1/y) as in the instrument to see if I'll get similar results. However, in some instances I don't get any results - just errors. Here is an example calibration data, representative of my experiment. 
The instrument software had no problem fitting it:

x <- structure(list(SPL = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 7L, 7L), .Label = c("St1", "St2", "St3", "St4", "St5", "St6", "St7"), class = "factor"), MFI = c(10755.5, 9839, 5142.5, 4857, 1510.5, 1505, 502.5, 451, 215, 195.5, 58, 57, 15, 15), nom = c(206, 206, 125, 125, 68, 68, 38, 38, 24, 24, 13, 13, 6.5, 6.5), weights = c(0.0013946353028683, 0.00152454517735542, 0.00291686922702965, 0.00308832612723904, 0.0099304865938431, 0.00996677740863787, 0.0298507462686567, 0.0332594235033259, 0.0697674418604651, 0.0767263427109974, 0.258620689655172, 0.263157894736842, 1, 1)), .Names = c("SPL", "MFI", "nom", "weights"), row.names = c(NA, -14L), class = "data.frame")

And here is the nls fit:

fit <- nls(MFI ~ a + b/((1+(nom/c)^d)^f), data=x, weights=x$weights, start=c(a=100, b=1, c=100, d=-1, f=1))

I've tried every possible combination of starting values, including the values fitted by the instrument software, to no avail. I've probably seen all possible error messages from 'nls' trying to fit this. If anyone has an idea why it's not working, let me know. Best regards, -- Michal J. Figurski, PhD HUP, Pathology Laboratory Medicine Biomarker Research Laboratory 3400 Spruce St. 7 Maloney S Philadelphia, PA 19104 tel. (215) 662-3413 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
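Two alternatives that are often more forgiving than nls for 5PL curves are the contributed packages minpack.lm and drc. Neither is mentioned in the thread, so treat this as a suggestion rather than the poster's method; a sketch using the data above:

```r
# Levenberg-Marquardt often converges where nls's default Gauss-Newton
# fails; nlsLM() is a drop-in replacement for nls(). Starting values
# near the data range (e.g. a near min(MFI), b near max(MFI)) also help.
library(minpack.lm)
fit <- nlsLM(MFI ~ a + b/((1 + (nom/c)^d)^f), data = x,
             weights = x$weights,
             start = c(a = 15, b = 10000, c = 100, d = -1, f = 1))

# Alternatively, drc has a built-in 5-parameter log-logistic model
# with self-chosen starting values.
library(drc)
fit2 <- drm(MFI ~ nom, data = x, weights = x$weights, fct = LL.5())
```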
Re: [R] Two ecdf with log-scales
I've done it this way before:

eX <- ecdf(x1)  # distribution 1
eY <- ecdf(x2)  # distribution 2
par(mar=c(5,5,2,1), xlog=TRUE)
plot(eX, do.points=FALSE, verticals=TRUE, col="black", xlab="xlabel", xlim=c(1,10), ylab="ylabel", lty=1, cex.lab=1.5, cex.axis=1.5, main="", lwd=3, log="x")
plot(eY, do.points=FALSE, verticals=TRUE, col="blue", add=TRUE, xlim=c(1,10), main="")

Warning: It makes a stair-step that may be difficult to see unless you use color. I had to change how the ecdf was plotted when I made b/w figures for my publication so that different dashed lines were distinct. HTH, -Steve

-----Original Message----- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Wednesday, May 02, 2012 10:17 AM To: Johannes Radinger Cc: R-help@r-project.org Subject: Re: [R] Two ecdf with log-scales

On May 2, 2012, at 6:14 AM, Johannes Radinger wrote:

Hi, I want to plot empirical cumulative density functions for two variables in one plot. For better visualizing the differences in the two cumulative curves I'd like to log-scale the axis. So far I found 3 possible functions to plot an ecdf:

1) ecdf() from the package 'stats'. I don't know how to successfully set the log scales? Combining two plots is not a problem:
plot(ecdf(x1))
lines(ecdf(x2), col.h="red")

2) gx.ecdf() from package 'rgr'. It is easily possible to plot log scales, but I don't know how to plot two densities?
gx.ecdf(x1, log=TRUE, ifqs=TRUE)

3) Ecdf() from package 'Hmisc'. No log option directly available and here I also don't know how to 'stack' two plots...
Ecdf(x1, what="F")

Probably there are many more solutions (e.g. ggplot etc.)... Has anyone faced a similar task and found a simple solution? Any suggestions are welcome!

Have you searched the Archives? I seem to remember that log(0) was a barrier to persons attempting this in the past. (ISTR a posting in the last few weeks.) Maybe you could also provide a test data object that has the same range as your x1 and x2 variables. and provide commented, minimal, self-contained, reproducible code.
-- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R help!
On May 3, 2012, at 14:05 , Nutter, Benjamin wrote:

So long as x is a character vector, I tend to use the following for this problem.
x <- c("12/31/11 23:45", "1/1/12 2:15")
x.split <- strsplit(x, " ")
x.date <- sapply(x.split, function(y) return(y[1]))
x.time <- sapply(x.split, function(y) if (length(y) > 1) return(y[2]) else NA)
x.date
[1] "12/31/11" "1/1/12"
x.time
[1] "23:45" "2:15"
Benjamin Nutter | Biostatistician | Quantitative Health Sciences Cleveland Clinic | 9500 Euclid Ave. | Cleveland, OH 44195 | (216) 445-1365

I think this can be simplified to

sapply(x.split, `[`, 1)
[1] "12/31/11" "1/1/12"
sapply(x.split, `[`, 2)
[1] "23:45" "2:15"

It's a bit inefficient, though. Other ideas:

sub(" .*$", "", x)
[1] "12/31/11" "1/1/12"
sub("^.* ", "", x)
[1] "23:45" "2:15"
read.table(text=x)
        V1    V2
1 12/31/11 23:45
2   1/1/12  2:15

-- Peter Dalgaard, Professor Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Creating a survival object with and without an event indicator
When reading about Surv() I saw the following statement: "Although unusual, the event indicator can be omitted, in which case all subjects are assumed to have an event." So I tried the following:

1. survobj <- Surv(mydata$Time)

vs.

2. survobj <- Surv(mydata$Time, mydata$Event)

where mydata$Event is a column of all ones. I did not get the same answer when I ran survreg(survobj~shift, data=mydata) on each. The second case actually failed to converge. I then tried a column of all zeros just to make sure I wasn't misreading something, but still got the warning message "failed to converge". Can anyone explain what exactly R is doing for case 1? -- View this message in context: http://r.789695.n4.nabble.com/Creating-a-survival-object-with-and-without-an-event-indicator-tp4605945.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] estimation problem
Dear List-members, I have a problem where I have to estimate a mean, or a sum, of a population which for some reason contains a huge number of zeros. I cannot give real data, but I constructed a toy example as follows:

N1 <- 10
N2 <- 3000
x1 <- rep(0, N1)
x2 <- rnorm(N2, 300, 100)
x <- c(x1, x2)
n <- 1000
x_sample <- sample(x, n, replace=FALSE)

I want to estimate the sum of x based on x_sample, not knowing N1 and N2 but only their sum (N). The sample mean has a huge standard deviation, so I am looking for a better estimator. I was thinking about trimmed (or left-trimmed, as my numbers are all positive) means or something similar, but if I calculate a trimmed mean I do not know N2 to multiply with. Do you have any idea, or could you give me some insight? Thanks a lot: Daniel __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
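For reference, the baseline estimator the poster describes (scaling the sample mean up to the population) can be written down directly from the toy example above. This sketch only restates the naive estimator and its standard error; it is not a variance-reduced alternative:

```r
# Estimate the population total from a simple random sample:
# total ~ N * sample mean, with SE ~ N * sd(sample)/sqrt(n)
# (finite-population correction omitted for simplicity).
N <- length(x)
total.hat <- N * mean(x_sample)
se.hat <- N * sd(x_sample) / sqrt(n)
c(estimate = total.hat, se = se.hat, truth = sum(x))
```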
[R] DEADLINE FAST APPROACHING for useR! 2012
DEADLINE FAST APPROACHING – 8th Annual International R User Conference useR! 2012, Nashville, Tennessee USA Registration Deadlines: Early Registration: Passed Regular Registration: Mar 1- May 12 Late Registration: May 13 – June 4 On-Site Registration: June 12 – June 15 Please note: Nashville is offering several large entertainment events the month of June, and hotels are quickly selling out. It's imperative that you make your hotel accommodations for the conference as soon as possible. Please join us at the 8th Annual International R User Conference useR! 2012 in Nashville, Tennessee. A draft program schedule, and other conference details are available. Please visit http://biostat.mc.vanderbilt.edu/UseR-2012 -- Frank E Harrell Jr Professor and Chairman School of Medicine Department of Biostatistics Vanderbilt University ___ r-annou...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-announce __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R help!
Hi, I would convert it to a proper date format and then you can extract anything.

dat <- strptime("12/31/11 23:45", format="%m/%d/%y %H:%M")
as.Date(dat)
[1] "2011-12-31"
format(dat, "%H:%M")
[1] "23:45"

Regards Petr

Hello there, I was wondering if you could help me with a quick R issue. I have a data set where one of the columns has both date and time in it, e.g. 12/31/11 23:45 in one cell. I want to use R to split this column into two new columns: date and time. One of the problems with splitting here is that when the dates go into single digits there are no 0's in front of months January-September (e.g., January is represented by 1 as opposed to 01), so every entry is a different length. Therefore, splitting by the space is the only option, I think. Here's the coding I've developed thus far:

z$dt <- z$Date # time and date is all under z$Date
foo <- strsplit(" ", z$dt) # attempted split based on the space

And then if that were to work, I would proceed to use the coding:

foo2 <- matrix(unlist(foo), ncol = 2, byrow=TRUE)
z$Date <- foo[ ,1]
z$Time <- foo[ ,2]

However, foo <- strsplit(" ", z$dt) isn't working. Do you know what the problem is? If you could respond soon, that would be greatly appreciated! Thanks so much! Alex __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
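Applied to the poster's whole column rather than a single string, the same idea looks like this (a sketch; z$Date is the combined date-time column from the original message):

```r
# strptime() is vectorized, and %m/%d/%y accepts both "1/1/12" and
# "12/31/11", so the variable-width month problem disappears.
dt <- strptime(z$Date, format = "%m/%d/%y %H:%M")
z$Date <- as.Date(dt)         # date part
z$Time <- format(dt, "%H:%M") # time part, as character
```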
Re: [R] Searching for values
Hey YaoPau, what you need are basically three things: if, for and matrix operations. So, a quick overview of how you could do it (with focus on the idea, not on how the columns and rows are arranged).

# 1) Building the vectors with c()
column1 <- c("p103","p186","p145","p187","p197")
column2 <- c("p666","p77","p198","p674","p302")

# 2) Building the vector with numbers and letters.
# You can use paste here (cat is also a useful function for similar
# purposes). seq is obviously a sequence. sep="" tells R not to put
# anything between the letter and the numbers; try sep=" " to see the
# difference.
longvector <- paste("p", seq(1:700), sep="")

## rep can be used to create an empty vector, not very elegant though
emptyvector <- rep(0, length(longvector))

## cbind for creating a matrix out of two vectors
matrix <- cbind(longvector, emptyvector)

Now the interesting part. More difficult to explain.

## Focus on the middle term. If a term of column1 equals a term in the
## matrix, you want to insert 1; same thing with -1 for column2 below.
## Now you want to perform this for all 700 rows [length(longvector)]
## and for all values in column1 [length(column1)], so you use for loops.

for (i in 1:length(longvector)){
  for (j in 1:length(column1)){
    if (column1[j] == matrix[,1][i]){
      matrix[,2][i] <- 1
    }
    if (column2[j] == matrix[,1][i]){
      matrix[,2][i] <- -1
    }
  }
}

-- View this message in context: http://r.789695.n4.nabble.com/Searching-for-values-tp4605213p4605722.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
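The nested loops above can be collapsed into one vectorized lookup, which is the more idiomatic R solution. A sketch over the same toy objects (the flag values 1/-1/0 are the same assumption as in the loop version):

```r
# %in% tests membership for the whole vector at once; nested ifelse()
# maps the two tests onto a single flag column, 0 where neither matches.
flag <- ifelse(longvector %in% column1, 1,
               ifelse(longvector %in% column2, -1, 0))
result <- cbind(longvector, flag)
```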
Re: [R] How to read ANOVA output
Hi, Take a look at http://www.khanacademy.org/ Look near the bottom of the page and there is a whole section on statistics and 3 videos on ANOVA. It is a very good introduction, and I find it very useful to know how the test works so that I'm not using a black box. Also, I wrote up a little section on ANOVA that you can see at: https://sites.google.com/site/davidsstatistics/using-r/anova-nonparameteric Take Care, David Doyle -- View this message in context: http://r.789695.n4.nabble.com/How-to-read-ANOVA-output-tp2329457p4605879.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problem with running probit
Hi, I am having problems with running a probit regression and don't understand where the problem comes from, since with the original data set I was able to get correct estimates. To that data set I have added extra variables, and upon running the regression I now get multiple estimates of the same predictor:

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.17741      -0.1       0.0       0.1   1.17741

Coefficients:
                   Estimate Std. Error z value Pr(>|z|)
(Intercept)      -6.400e+00  1.070e+04  -0.001    1.000
shares3m103,493  -6.220e-07  1.513e+04   0.000    1.000
shares3m105,188  -6.220e-07  1.513e+04   0.000    1.000
shares3m106,743  -6.220e-07  1.513e+04   0.000    1.000
shares3m107,809  -6.220e-07  1.513e+04   0.000    1.000
shares3m107,844  -6.220e-07  1.513e+04   0.000    1.000
shares3m110,902  -6.220e-07  1.513e+04   0.000    1.000
shares3m111,112  -6.220e-07  1.513e+04   0.000    1.000
shares3m113,173  -6.220e-07  1.513e+04   0.000    1.000
shares3m114,798  -6.220e-07  1.513e+04   0.000    1.000
shares3m117,454  -6.220e-07  1.513e+04   0.000    1.000
shares3m119,446  -6.220e-07  1.513e+04   0.000    1.000
shares3m123,500  -6.220e-07  1.513e+04   0.000    1.000
shares3m126,366  -6.220e-07  1.513e+04   0.000    1.000
shares3m126,383  -6.216e-07  1.513e+04   0.000    1.000
shares3m126,436  -6.220e-07  1.513e+04   0.000    1.000
shares3m128,043  -6.220e-07  1.513e+04   0.000    1.000
shares3m131,975  -6.220e-07  1.513e+04   0.000    1.000
shares3m132,254  -6.220e-07  1.513e+04   0.000    1.000
shares3m133,984  -6.220e-07  1.513e+04   0.000    1.000
...

Anyone knows what the problem is? -- View this message in context: http://r.789695.n4.nabble.com/problem-with-running-probit-tp4605742.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
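Coefficient names like shares3m103,493 suggest the shares3m column was imported as a factor: the comma thousands separators make the values non-numeric on read-in, so glm creates one dummy variable per distinct value. A hedged sketch of the usual repair (df stands in for the poster's actual data frame, which isn't named in the message):

```r
# Convert via as.character() first -- as.numeric() applied directly to a
# factor would return the internal level codes, not the printed values.
df$shares3m <- as.numeric(gsub(",", "", as.character(df$shares3m)))
```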
[R] problems with lme function
I am trying to fit a repeated-measures model, in which:

FA: the difference in the number of scales between the left and right sides for a given trait
trait: each of the measured features, 6 for each individual
condition: the state of condition of each individual
individual: each of the specimens studied

When fitting the model, the following happens:

lm.FA <- lme(FA ~ rasgo*condicion, random = ~1|individuo)
Error in na.fail.default(list(FA = c(-5, -5, -4, -3, -3, -3, -3, -3, -3, :
  missing values in object

I thought it might be because some values of FA are negative, so I introduced the absolute value of FA into the model, but I still get the same error:

absFA <- abs(FA)
lm.FA <- lme(absFA ~ rasgo*condicion, random = ~1|individuo)
Error in na.fail.default(list(absFA = c(5, 5, 4, 3, 3, 3, 3, 3, 3, 3, :
  missing values in object

What am I doing wrong? Thank you very much in advance for your attention - Mario Garrido Escudero PhD student Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola Universidad de Salamanca -- View this message in context: http://r.789695.n4.nabble.com/problems-with-lme-function-tp4605998.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
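The error is unrelated to negative values: lme's default na.action is na.fail, which stops whenever the model variables contain NAs. A sketch of the common workaround, assuming the incomplete rows can simply be dropped:

```r
library(nlme)
# na.omit drops rows with missing values instead of stopping with an error
lm.FA <- lme(FA ~ rasgo * condicion, random = ~ 1 | individuo,
             na.action = na.omit)
```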
[R] Very small random effect estimation in lmer but not in stata xtmixed
Hi folks, I am using the lmer function (in the lme4 library) to analyse some data where individuals are clustered into sets (using the SetID variable) with a single fixed effect (cc = 0 or 1). The lmer model and output is shown below. Whilst the fixed effects are consistent with Stata (using xtmixed, see below), the std dev of the random effect for SetID is very, very small (3.5803e-05) compared to Stata's (see below, 1.002). Any ideas why this should be happening, please?

LMER MODEL

summary(lmer(AnxietyScore ~ cc + (1|SetID), data=mydf))
Linear mixed model fit by REML
Formula: AnxietyScore ~ cc + (1 | SetID)
   Data: mydf
   AIC   BIC logLik deviance REMLdev
 493.4 503.4 -242.7    486.6   485.4
Random effects:
 Groups   Name        Variance   Std.Dev.
 SetID    (Intercept) 1.2819e-09 3.5803e-05
 Residual             1.3352e+01 3.6540e+00
Number of obs: 90, groups: SetID, 33

Fixed effects:
            Estimate Std. Error t value
(Intercept)   3.1064     0.5330   5.828
cc            2.3122     0.7711   2.999

Correlation of Fixed Effects:
   (Intr)
cc -0.691

STATA XTMIXED

xtmixed anxietyscore cc || setid:, reml

Mixed-effects REML regression           Number of obs    = 90
Group variable: setid                   Number of groups = 33
Log restricted-likelihood = -242.48259  Prob > chi2      = 0.0023

anxietyscore |    Coef.  Std. Err.     z   P>|z|  [95% Conf. Interval]
          cc | 2.289007  .7492766   3.05  0.002   .8204519   3.757562
       _cons | 3.116074  .5464282   5.70  0.000   2.045094   4.187053

Random-effects Parameters |  Estimate  Std. Err.  [95% Conf. Interval]
setid: Identity           |
                sd(_cons) |  1.002484   .797775    .2107137   4.769382
             sd(Residual) |  3.515888  .3281988    2.928045    4.22175

with thanks, best wishes, M [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] conducting GAM-GEE within gamm4?
Dear R-help users, I am trying to analyze some visual transect data of organisms to generate a habitat distribution model. Once organisms are sighted, they are followed as point data is collected at a given time interval. Because of the autocorrelation among these follows, I wish to utilize a GAM-GEE approach similar to that of Pirotta et al. 2011, using packages 'yags' and 'splines' (http://www.int-res.com/abstracts/meps/v436/p257-272/). Their R scripts are shown here (http://www.int-res.com/articles/suppl/m436p257_supp/m436p257_supp1-code.r). I have used this code with limited success and multiple issues of models failing to converge. Below is the structure of my data:

str(dat2)
'data.frame': 10792 obs. of 4 variables:
 $ dist_slag       : num 26475 26340 25886 25400 24934 ...
 $ Depth           : num -10.1 -10.5 -16.6 -22.2 -29.7 ...
 $ dolphin_presence: int 0 0 0 0 0 0 0 0 0 0 ...
 $ block           : int 1 1 1 1 1 1 1 1 1 1 ...

head(dat2)
  dist_slag    Depth dolphin_presence block
1  26475.47 -10.0934                0     1
2  26340.47 -10.4870                0     1
3  25886.33 -16.5752                0     1
4  25399.88 -22.2474                0     1
5  24934.29 -29.6797                0     1
6  24519.90 -26.2370                0     1

Here is the summary of my block variable (indicating the number of groups within which autocorrelation exists):

summary(dat2$block)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   1.00   39.00   76.00   73.52  111.00  148.00

However, I would like to use the package 'gamm4', as I am more familiar with Professor Simon Wood's packages and functions, and it appears gamm4 might be the most appropriate. It is important to note that the models have a binary response (organism presence or absence along a transect), which is why I think gamm4 is more appropriate than gamm.
In the gamm help, it provides the following example for autocorrelation within factors:

## more complicated autocorrelation example - AR errors
## only within groups defined by `fac'
e <- rnorm(n, 0, sig)
for (i in 2:n) e[i] <- 0.6*e[i-1]*(fac[i-1]==fac[i]) + e[i]
y <- f + e
b <- gamm(y~s(x,k=20), correlation=corAR1(form=~1|fac))

Following this example, the following is the code I used for my dataset:

b <- gamm4(dolphin_presence~s(dist_slag)+s(Depth), random=(form=~1|block), family=binomial(), data=dat)

However, by examining the output (summary(b$gam) and specifically summary(b$mer)), I am either unsure of how to interpret the results, or do not believe that the autocorrelation within the group is being taken into consideration.

summary(b$gam)

Family: binomial
Link function: logit
Formula: dolphin_presence ~ s(dist_slag) + s(Depth)

Parametric coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -13.968      5.145  -2.715  0.00663 **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
               edf Ref.df Chi.sq  p-value
s(dist_slag) 4.943  4.943  70.67 6.85e-14 ***
s(Depth)     6.869  6.869 115.59  < 2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) = 0.317   glmer.ML score = 10504
Scale est. = 1       n = 10792

summary(b$mer)

Generalized linear mixed model fit by the Laplace approximation
   AIC   BIC logLik deviance
 10514 10551  -5252    10504
Random effects:
 Groups Name         Variance Std.Dev.
 Xr     s(dist_slag) 1611344  1269.39
 Xr.0   s(Depth)       98622   314.04
Number of obs: 10792, groups: Xr, 8; Xr.0, 8

Fixed effects:
                 Estimate Std. Error z value Pr(>|z|)
X(Intercept)      -13.968      5.145  -2.715  0.00663 **
Xs(dist_slag)Fx1  -35.871     33.944  -1.057  0.29063
Xs(Depth)Fx1        3.971      3.740   1.062  0.28823
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            X(Int) X(_)F1
Xs(dst_s)F1  0.654
Xs(Dpth)Fx1 -0.030  0.000

How do I ensure that autocorrelation is indeed being accounted for within each unique value of the block variable?
What is the simplest way to interpret the output for summary(b$mer)? The results do differ from a normal gam (package mgcv) using the same variables and parameters without the correlation=... term, indicating that *something* different is occurring. However, when I use a different variable for the correlation term (season), I get the SAME output:

dat2 <- data.frame(dist_slag = dat$dist_slag, Depth = dat$Depth, dolphin_presence = dat$dolphin_presence,
+ block = dat$block, season = dat$season)
head(dat2)
  dist_slag    Depth dolphin_presence block season
1  26475.47 -10.0934                0     1      F
2  26340.47 -10.4870                0     1      F
3  25886.33 -16.5752                0     1      F
4  25399.88 -22.2474                0     1      F
5  24934.29 -29.6797                0     1      F
6  24519.90 -26.2370                0     1      F
summary(dat2$season)
   F    S
3224 7568
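One detail worth noting: summary(b$mer) lists only the smooth-penalty groups (Xr, Xr.0) under Random effects, and block never appears, which suggests the gamm-style random=(form=~1|block) argument was not picked up. gamm4 expects an lme4-style formula for its random argument. A sketch (untested against the poster's data):

```r
library(gamm4)
# lme4-style random-effect specification: one random intercept per block
b <- gamm4(dolphin_presence ~ s(dist_slag) + s(Depth),
           random = ~(1 | block),
           family = binomial(), data = dat2)
summary(b$mer)  # 'block' should now appear under Random effects: Groups
```

Note that this fits a random intercept per block, not a residual AR(1) process; unlike gamm, gamm4 has no correlation= argument, so a within-block AR structure cannot be specified this way.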
Re: [R] `mapply(function(x) function() x, c(a, b))$a()' returns `b'
As I see it, you save the actual text of the function, and when you call it later on it takes the last value of x it has encountered as the value. I guess you want x not to be saved as "x" but as "a" or "b", i.e. as its value. I am not sure how to do that, however, as of yet.

On 03.05.2012, at 12:31, Casper Ti. Vector wrote:

As the title says, I want to apply a function (which itself returns a function) to a list (or vector), and get a list (or vector) of generated functions as the return value, but I get an unexpected result. Anyone with an idea about the reason for this phenomenon and a correct way to implement the requirement? Thanks very much :) -- Using GPG/PGP? Please get my current public key (ID: 0xAEF6A134, valid from 2010 to 2013) from a key server. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
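The underlying reason is lazy evaluation: each generated closure captures a promise for x, and in R versions before 3.2.0 (when mapply and lapply began forcing their arguments) that promise is only evaluated after the loop has finished, so every closure sees the last value. Forcing the argument inside the factory is the classic fix; a sketch:

```r
# force(x) evaluates the promise at creation time, so each closure
# captures its own value instead of the shared, late-bound one.
fns <- mapply(function(x) { force(x); function() x }, c("a", "b"))
fns$a()  # "a"
fns$b()  # "b"
```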
Re: [R] interactive loop
Dear Michael and Sarah, the superfluous points arose as errors (e.g. a double-click) in the measurement process. Thus, looking at the image I recognize them easily, and all I need is to write down their numbers. readline() serves that purpose well. Thanks a lot! Ondřej

On 2 May 2012 18:32, Sarah Goslee sarah.gos...@gmail.com wrote: And now we have two entirely different interpretations of the question. I think Ondřej needs to provide a more detailed explanation of the problem and intended result. Sarah

On Wed, May 2, 2012 at 12:23 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote: I think readline() will do what you want. It can display a message and take user input, assigning it to a character value, so you might need as.numeric(). Michael

On May 2, 2012, at 12:08 PM, Ondřej Mikula onmik...@gmail.com wrote: Dear R-helpers, I have a number of point configurations representing skull shapes, but some of them contain superfluous points. I want to write a loop in which each configuration is plotted and I am asked to write the numbers of points that are superfluous. However, I don't know how to introduce this interactive element. Would you give me some advice? Best regards Ondřej Mikula

-- Sarah Goslee http://www.functionaldiversity.org -- Ondřej Mikula Institute of Animal Physiology and Genetics Academy of Sciences of the Czech Republic Veveri 97, 60200 Brno, Czech Republic Institute of Vertebrate Biology Academy of Sciences of the Czech Republic Studenec 122, 67502 Konesin, Czech Republic __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] add an automatized linear regression in a function
Dear R users, At the moment I have a script and a function which calculate correlation matrices between all my data files. For each file, the script then chooses the best-correlated file and uses it to fill the missing data (NAs) in the analysed file: the data from the best-correlated file is put automatically into the gaps of the first file. If the best-correlated file has no data at that position, it takes the data from the second-best-correlated file. The problem is that, at the moment, it copies raw data from the best-correlated file, and I need to adapt this data to the file that is being filled. Consequently, I'd like to automate the calculation of a linear regression (after the selection of the best or second-best correlated data file) between the two files. Instead of copying the raw data from the best-correlated file, it should fill the gaps with the estimated data from the regression (in order to have more precise filled data). The idea is thus to do an lm() between the two files, extract the coefficients of the fitted line, calculate the estimated data for the whole file (NAs included), and finally fill the gaps with this estimated data. I hope you've understood my problem.
Here's the function:

process.all <- function(df.list, mat){
  f <- function(station) na.fill(df.list[[ station ]], df.list[[ max.cor[station] ]])
  g <- function(station){
    x <- df.list[[station]]
    if(any(is.na(x$data))){
      mat[row(mat) == col(mat)] <- -Inf
      nas <- which(is.na(x$data))
      ord <- order(mat[station, ], decreasing = TRUE)[-c(1, ncol(mat))]
      for(i in nas){
        for(y in ord){
          if(!is.na(df.list[[y]]$data[i])){
            x$data[i] <- df.list[[y]]$data[i]
            break
          }
        }
      }
    }
    x
  }
  n <- length(df.list)
  nms <- names(df.list)
  max.cor <- sapply(seq.int(n), get.max.cor, corhiver2008capt1)
  df.list <- lapply(seq.int(n), f)
  df.list <- lapply(seq.int(n), g)
  names(df.list) <- nms
  df.list
}

I succeeded for a small data.frame I've created, but I don't know how to do it in this particular case. Thanks a lot for your help! -- View this message in context: http://r.789695.n4.nabble.com/add-an-automatized-linear-regression-in-a-function-tp4606047.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
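A sketch of the regression-based fill being asked about, as a standalone helper. The names fill.with.lm, target and donor are hypothetical, and the function assumes the two series are aligned row by row:

```r
# Regress the station with gaps ('target') on its best-correlated
# neighbour ('donor'), then fill target's NAs with values predicted
# from the fitted line.
fill.with.lm <- function(target, donor) {
  fit <- lm(target ~ donor, na.action = na.exclude)
  est <- coef(fit)[1] + coef(fit)[2] * donor  # estimates for ALL rows
  gaps <- is.na(target) & !is.na(donor)
  target[gaps] <- est[gaps]
  target
}
```

Inside g(), this would replace the direct copy x$data[i] <- df.list[[y]]$data[i] with a fill based on the regression between the two stations.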
Re: [R] error fitting coxph model
On May 2, 2012, at 3:02 PM, Jessica Myers wrote: Hi, I am using coxph from the survival package to fit a large model (100,000 observations, ~35 covariates) using both ridge regression (on binary covariates) and penalized splines (for continuous covariates). In fitting, I get a strange error: Error in if (abs((y[nx] - target)/(y[nx - 1] - target)) > 0.6) doing.well <- FALSE else doing.well <- TRUE : missing value where TRUE/FALSE needed You appear to have missing values in 'y', 'nx', or 'target'. This could be a case of the newbie error of using if(){}else{} when ifelse() should have been used. Without at least the code we probably cannot resolve those two possibilities. Unfortunately, I can't reproduce this error without handing over my entire dataset, Why not? If the removal of missing values is unsuccessful and you were not committing the error I mentioned, then run it on two halves. Pick the one with the error. Rinse, lather, repeat. but I thought it would be worth checking if anyone had any insight. I should note that the outcome that I'm using has almost everyone having an event (~98,000 events out of 100,000). I have fit other models like this with no problem, but on one particular dataset it fails. Thanks! Jessica Myers Instructor in Medicine David Winsemius, MD West Hartford, CT
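As a side note on the if()/ifelse() point above, a minimal illustration (unrelated to the coxph internals, made-up data) of why an NA reaching if() throws exactly this error, while the vectorised ifelse() merely propagates it:

```r
# An NA inside if() is an error; inside ifelse() it just propagates elementwise.
r <- c(0.2, NA, 0.8)
v <- ifelse(r > 0.6, "bad", "ok")   # second element is NA, no error raised
res <- tryCatch(if (r[2] > 0.6) "bad" else "ok",
                error = function(e) "missing value error")
```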
Re: [R] `mapply(function(x) function() x, c("a", "b"))$a()' returns `"b"'
Now.. I just tried around, and this might be a bit of a strange way to do things..

createFunc <- function(v){
    v_out <- NULL
    for(i in v){
        v_out[[i]] <- substitute(function(){x}, list(x = i))
    }
    return(v_out)
}

> y <- createFunc(c("a", "b"))
> y
$a
function() { "a" }
$b
function() { "b" }
> eval(y$a)()
[1] "a"
> eval(y$b)()
[1] "b"

On 3 May 2012, at 12:31, Casper Ti. Vector wrote: As the title says, I want to apply a function (which itself returns a function) to a list (or vector) and get a list (or vector) of generated functions as the return value, but I get an unexpected result. Does anyone have an idea about the reason for this behaviour and a correct way to implement the requirement? Thanks very much :) -- Using GPG/PGP? Please get my current public key (ID: 0xAEF6A134, valid from 2010 to 2013) from a key server.
Re: [R] deparse(substitute(x)) on an object with S3 class
Note that print.testclass(testlist) also works as expected, so it might be that there's a forced evaluation somewhere inside S3 dispatch -- I was playing around with this the other day, having gotten myself confused by it, and stumbled across the internal logic (or at least something that made sense to me), though I seem to have forgotten it again. Hopefully someone else can clarify. Michael On Thu, May 3, 2012 at 6:15 AM, Remko Duursma remkoduur...@gmail.com wrote: Dear list, can someone explain to me why deparse(substitute(x)) does not seem to work when x is of a user-defined S3 class? In my actual problem, my print method is part of a package, and the method is registered in the NAMESPACE, if that should make a difference.

print.testclass <- function(x, ...){
    xname <- deparse(substitute(x))
    cat("Your object name is", xname, "\n")
}
testlist <- list()
testlist[[1]] <- 1:10
class(testlist) <- "testclass"

# This does not work as expected:
> testlist
Your object name is structure(list(1:10), class = "testclass")

# But this does:
> testlist2 <- unclass(testlist)
> print.testclass(testlist2)
Your object name is testlist2

thanks, Remko Duursma -- View this message in context: http://r.789695.n4.nabble.com/deparse-substitute-x-on-an-object-with-S3-class-tp4605592.html Sent from the R help mailing list archive at Nabble.com.
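A small runnable sketch of the behaviour under discussion: an explicit call to the print() generic still sees the original argument promise (UseMethod passes it along to the method), so the name survives, whereas auto-printing dispatches on the already-evaluated value.

```r
# deparse(substitute(x)) preserves the name through an explicit print() call.
print.testclass <- function(x, ...) {
  cat("Your object name is", deparse(substitute(x)), "\n")
}
obj <- structure(list(1:10), class = "testclass")
out <- capture.output(print(obj))   # explicit generic call: name is preserved
```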
Re: [R] `mapply(function(x) function() x, c("a", "b"))$a()' returns `"b"'
So, to get back to mapply:

eval(mapply(function(x) substitute(function() z, list(z = x)), c("a", "b"))$a)()

or like this:

mapply(function(x) eval(substitute(function(i) z * i, list(z = x))), c(2, 3))[[1]](2)

On 3 May 2012, at 16:02, Jessica Streicher wrote: Now.. I just tried around, and this might be a bit of a strange way to do things..

createFunc <- function(v){
    v_out <- NULL
    for(i in v){
        v_out[[i]] <- substitute(function(){x}, list(x = i))
    }
    return(v_out)
}

> y <- createFunc(c("a", "b"))
> y
$a
function() { "a" }
$b
function() { "b" }
> eval(y$a)()
[1] "a"
> eval(y$b)()
[1] "b"

On 3 May 2012, at 12:31, Casper Ti. Vector wrote: As the title says, I want to apply a function (which itself returns a function) to a list (or vector) and get a list (or vector) of generated functions as the return value, but I get an unexpected result. Does anyone have an idea about the reason for this behaviour and a correct way to implement the requirement? Thanks very much :) -- Using GPG/PGP? Please get my current public key (ID: 0xAEF6A134, valid from 2010 to 2013) from a key server.
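The more common idiom for the original question is to force the argument inside the closure, so each generated function captures its own value rather than a shared lazy promise (older versions of R needed this explicitly; since R 3.2.0 the apply functions force their arguments themselves):

```r
# force(x) evaluates the promise before the inner function is returned, so
# each closure keeps its own copy of x instead of a shared lazy promise.
fns <- mapply(function(x) { force(x); function() x }, c("a", "b"))
a_result <- fns$a()
b_result <- fns$b()
```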
[R] error in La.svd Lapack routine 'dgesdd'
Dear Philipp, this is just a tentative answer because debugging is really not possible without a reproducible example (or, at a very bare minimum, the output from traceback()). Anyway, thank you for reporting this interesting numerical issue; I'll try to replicate similar behaviour on a similarly dimensioned artificial dataset when I have some time (which might not be soon). For now, please see my remarks below, marked with '##'; I hope they are useful anyway. Bottom line: time fixed effects might be out of place here. Best wishes, Giovanni Giovanni Millo, PhD Research Dept., Assicurazioni Generali SpA Via Machiavelli 4, 34132 Trieste (Italy) tel. +39 040 671184 fax +39 040 671160 - original message Message: 8 Date: Wed, 2 May 2012 05:45:47 -0700 (PDT) From: Philipp Grueber philipp.grue...@ebs.edu To: r-help@r-project.org Subject: Re: [R] error in La.svd Lapack routine 'dgesdd' Message-ID: 1335962747113-4603097.p...@n4.nabble.com Content-Type: text/plain; charset=UTF-8 Dear R Users, I have an unbalanced panel with (on average) approx. 100 individuals over 1370 time intervals (with individual time series of different lengths, varying between 60 and 1370 time intervals). I use the following model:

res1 <- plm(x ~ c + d + e, data = pdata_frame, effect = "twoways", model = "within", na.action = na.omit)

## I have difficulty understanding why you would want to introduce ca. 1470 incidental parameters... I'd rather go with a more parsimonious specification: a trend, AR(n) or else... I repeatedly get the following error (which has been discussed in the past): Error in La.svd(x, nu, nv) : error code 1 from Lapack routine 'dgesdd' I found it hard to create a reproducible example. As noted by Douglas Bates, the error might be related to the scaling of the matrix. ## Too difficult for me to tell without output, references etc., although of course I trust D.B.'s opinion.
For variables x, c, d, and e in object pdata_frame, I find that all sd() are reasonably similar, both among the cross-sections and among the variables. However, extracting the demeaned data from plm(), the demeaned variables demXt$d and demXt$e have sd()s that are very small compared to those of dem_yt and demXt$c (approx. by a factor of 1e-15). I extract the demeaned data as follows:

dem_yt <- pmodel.response(res)
demXt <- model.matrix(res)

How is this possible? What is it that plm() does with my data so that the standard deviations change? ## it demeans them... (although the scale of the reduction is impressive; then again, you're estimating out 1500 constants!) I suspect effect="twoways" to play a central role because plm() works fine for effect="individual". ## sure, also because "individual" 'just' introduces 100 more parameters. I thought about the idea that maybe time effects simply do not apply here. ## You know your model. Yet time effects on T=1300 seem hazardous to me. However: in order to test my regression for time effects (which I detect for subsamples (by time) and for equation x~e at high levels of significance), I need both the model with and the model without time effects (as otherwise I can't compare the two models in an F-test), right? Any alternative tests? ## please see ?plmtest Another thought was that the impact of d and e changes over time (as in the subsamples I do see such a change). Any help is appreciated! ## HTH, G. Best wishes, Philipp Grueber - EBS Universitaet fuer Wirtschaft und Recht FARE Department Wiesbaden/ Germany http://www.ebs.edu/index.php?id=finaccL=0 -- View this message in context: http://r.789695.n4.nabble.com/error-in-La-svd-Lapack-routine-dgesdd-tp2125052p4603097.html Sent from the R help mailing list archive at Nabble.com.
[R] Modified Cholesky decomposition for sparse matrices
I am trying to estimate a covariance matrix from the Hessian of a posterior mode. However, this Hessian is indefinite (possibly because of numerical/roundoff issues), and thus the Cholesky decomposition does not exist. So I want to use a modified Cholesky algorithm to estimate a Cholesky factor of a pseudovariance matrix that is reasonably close to the original matrix. I know that there are R packages that contain code for the Gill-Murray and Schnabel-Eskow algorithms for standard, dense, base-R matrices. But my matrix is large (k=3) and sparse (block-arrow structure, stored as a dsCMatrix class from the Matrix package). Is anyone aware of existing code (or perhaps an algorithm that is easy to adapt) that would perform a modified Cholesky decomposition on a large, sparse indefinite matrix, preferably working on sparseMatrix classes? Alternatively, is there a way I could compute a sparse LDL' decomposition from an existing R function and quickly modify the output? Thanks, Michael --- Michael Braun Associate Professor of Management Science MIT Sloan School of Management 100 Main St., E62-535 Cambridge, MA 02139 bra...@mit.edu 617-253-3436
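Not an answer to the Gill-Murray/Schnabel-Eskow question, but a crude sparsity-preserving fallback is a diagonal shift: add a growing multiple of the identity until the sparse Cholesky succeeds. The helper name, starting shift, and test matrix below are my own inventions for illustration; only Matrix-package calls are used.

```r
library(Matrix)

# Hypothetical sketch: shift the whole diagonal until CHOLMOD accepts the
# matrix. This is a Tikhonov-style modification, NOT Gill-Murray or
# Schnabel-Eskow, but it preserves the sparsity pattern.
shifted_chol <- function(H, beta0 = 1e-8, maxit = 50) {
  H <- forceSymmetric(H)
  beta <- 0
  for (i in seq_len(maxit)) {
    ch <- tryCatch(Cholesky(H + Diagonal(nrow(H), beta), LDL = FALSE),
                   error = function(e) NULL)
    if (!is.null(ch)) return(list(factor = ch, shift = beta))
    beta <- if (beta == 0) beta0 else 10 * beta
  }
  stop("no suitable diagonal shift found")
}

# small indefinite sparse test matrix (one negative diagonal entry)
A <- sparseMatrix(i = 1:3, j = 1:3, x = c(1, -2, 3))
res <- shifted_chol(A)
```

The price is that a single shift perturbs every diagonal element equally, unlike the element-wise corrections of the proper modified-Cholesky algorithms.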
[R] bwplot: using a numeric variable to position boxplots
[Env: R 2.14.2 / Win Xp] In the examples below, I'm using lattice::bwplot to plot boxplots of 4 variables, grouped by a factor 'epoch' which also corresponds to a numeric year. I'd like to modify the plots to position the boxplots according to the numeric value of year, but I can't figure out how to do this. Also, I'd like to modify the strip labels that give the variable names to use the longer variable labels I define below as vlab.

install.packages("heplots", repos="http://R-Forge.R-project.org") # requires heplots 0.9-12 for Skulls data
library(heplots)
data(Skulls)
# make shorter labels for epochs
Skulls$epoch <- factor(Skulls$epoch, labels=sub("c", "", levels(Skulls$epoch)))
# create year variable
Skulls$year <- rep(c(-4000, -3300, -1850, -200, 150), each=30)

> str(Skulls)
'data.frame': 150 obs. of 6 variables:
 $ epoch: Ord.factor w/ 5 levels "4000BC"<"3300BC"<..: 1 1 1 1 1 1 1 1 1 1 ...
 $ mb : num 131 125 131 119 136 138 139 125 131 134 ...
 $ bh : num 138 131 132 132 143 137 130 136 134 134 ...
 $ bl : num 89 92 99 96 100 89 108 93 102 99 ...
 $ nh : num 49 48 50 44 54 56 48 48 51 51 ...
 $ year : num -4000 -4000 -4000 -4000 -4000 -4000 -4000 -4000 -4000 -4000 ...
> table(Skulls$epoch)
4000BC 3300BC 1850BC  200BC  AD150
    30     30     30     30     30

This is what I've tried. I reshape the data to a long format and can plot value ~ epoch or value ~ as.factor(year), but I really want to space the boxplots according to the numeric value of year.

# better variable labels -- how to incorporate these in the strip labels?
vlab <- c("maxBreadth", "basibHeight", "basialLength", "nasalHeight")

library(lattice)
library(reshape2)
Skulls$year <- rep(c(-4000, -3300, -1850, -200, 150), each=30)
sklong <- melt(Skulls, id=c("epoch", "year"))

bwplot(value ~ epoch | variable, data=sklong, scales="free",
    ylab="Variable value", xlab="Epoch",
    panel = function(x, y, ...) {
        panel.bwplot(x, y, ...)
        panel.linejoin(x, y, col="red", ...)
    }
)

bwplot(value ~ as.factor(year) | variable, data=sklong, scales="free",
    ylab="Variable value", xlab="Year",
    panel = function(x, y, ...) {
        panel.bwplot(x, y, ...)
        panel.linejoin(x, y, col="red", ...)
    }
)

-- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street Web: http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA
Re: [R] estimation problem
Although you have provided R code to illustrate your problem, it is fundamentally a statistics theory question, and belongs somewhere else like stats.stackexchange.com. When you post there, I recommend that you spend more effort to identify why the zeros are present. If they are indicators of unknown values, that will be very different than if zeros are valid members of the population. --- Jeff Newmiller DCN: jdnew...@dcn.davis.ca.us Research Engineer (Solar/Batteries/Software/Embedded Controllers) Sent from my phone. Please excuse my brevity. Kehl Dániel ke...@ktk.pte.hu wrote: Dear List-members, I have a problem where I have to estimate a mean, or a sum, of a population, but for some reason it contains a huge number of zeros. I cannot give real data, but I constructed a toy example as follows:

N1 <- 10
N2 <- 3000
x1 <- rep(0, N1)
x2 <- rnorm(N2, 300, 100)
x <- c(x1, x2)
n <- 1000
x_sample <- sample(x, n, replace=FALSE)

I want to estimate the sum of x based on x_sample (not knowing N1 and N2 but only their sum (N)). The sample mean has a huge standard deviation, and I am looking for a better estimator. I was thinking about trimmed (or left-trimmed, as my numbers are all positive) means or something similar, but if I calculate a trimmed mean I do not know N2 to multiply with. Do you have any idea, or could you give me some insight? Thanks a lot: Daniel
[R] NA's when subset in a dataframe
Dear community, I'm having this silly problem. I have a linear model. After fitting it, I wanted to know which data points had studentized residuals larger than 3, so I tried this:

d1 <- cooks.distance(lmmodel)
r <- sqrt(abs(rstandard(lmmodel)))
rstu <- abs(rstudent(lmmodel))
a <- cbind(mydata, d1, r, rstu)
alargerthan3 <- a[rstu > 3, ]

And suddenly a[rstu > 3, ] has 17 rows, 7 of them new rows where all the entries are NA's, even the rownames. Because of this I'm not sure of the dimension of a[rstu > 3, ] (Do I only have 8 entries?). Has this happened to anybody before? If so, why these extra NA rows? What's the problem? Is there any other way to know which data have studentized residuals larger than 3? If my data is needed, just tell me. Thanks in advance -- View this message in context: http://r.789695.n4.nabble.com/NA-s-when-subset-in-a-dataframe-tp4606172.html Sent from the R help mailing list archive at Nabble.com.
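For the record, those all-NA rows typically come from a logical index that itself contains NAs (rstudent() can return NA for some observations): row indexing with an NA subscript yields an all-NA row. Wrapping the condition in which() drops the NAs. A minimal illustration with made-up data:

```r
# A logical subscript containing NA produces an all-NA row; which() avoids it.
df  <- data.frame(x = 1:4)
idx <- c(TRUE, NA, FALSE, TRUE)
n_with_na <- nrow(df[idx, , drop = FALSE])        # rows 1 and 4, plus an NA row
n_clean   <- nrow(df[which(idx), , drop = FALSE]) # only rows 1 and 4
```

So `a[which(rstu > 3), ]` would give only the genuinely flagged rows.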
[R] Model Averaging Help
Dear All, I'm using AIC-weighted model averaging to calculate model-averaged parameter estimates and partial r-squares of each variable in a 10-variable linear regression. I've been using the MuMIn package to calculate parameter estimates, but am struggling with partial r-squares. There doesn't seem to be any function in the MuMIn package dealing with partial r-square (r.squaredLR() looks promising, but I think there is a problem updating MuMIn, so I can't use this (new?) function). I know you can calculate partial r-squares using plot(anova(ols()), what = 'partial R2') but I don't know how best to do this for all 1024 models and then calculate the AIC-weighted model average. Are there any packages out there that I'm missing? Or any ideas for combining MuMIn with anova.Design and Hmisc? Thanks in advance, and apologies if my problem is badly worded. Tom
Re: [R] Identifying the particular X or Y in a sorted list
Hello, Shankar Lanke wrote: Dear All, I have a data set as shown below. A (patient ID), B and C are the concentrations of drug in blood on day 1 and day 4, and D is the difference in concentration. To do this in R I have written code as follows and identified the number of patients who have a higher concentration on day 4. Here I want to identify specifically the patient IDs (is it patient 1 or 2 or 5 and 7) whose concentration is higher. How do I write code to get the list of A (patient IDs whose difference is higher on day 4)?

Data <- (myDf$B - myDf$C)
sum(Data > 0)

A  B  C  D (B-C)
1 14 10  4
2 12  7  5
3 11 15 -4
4  8  3  5
5  1  8 -7

I appreciate your help, thank you very much in advance. Regards

If I understand it correctly, this should do it:

x <- read.table(text="
A B C D
1 14 10 4
2 12 7 5
3 11 15 -4
4 8 3 5
5 1 8 -7
", header=TRUE)

which(x$D == max(x$D))
x$A[ which(x$D == max(x$D)) ]

Or you can save the values of the which() function and use them when needed. Hope this helps, Rui Barradas -- View this message in context: http://r.789695.n4.nabble.com/Identifying-the-particular-X-or-Y-in-a-sorted-list-tp4605124p4606062.html Sent from the R help mailing list archive at Nabble.com.
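If the goal is all matching patient IDs rather than just the maximum, the same data can be filtered directly on the sign of D (note D = B - C, so D < 0 picks the patients whose concentration rose by day 4 -- a sketch, given the slightly ambiguous wording of the question):

```r
# Same toy data as in the thread; vectorised selection of IDs by sign of D.
x <- read.table(text = "
A B C D
1 14 10 4
2 12 7 5
3 11 15 -4
4 8 3 5
5 1 8 -7
", header = TRUE)
higher_day1 <- x$A[x$D > 0]   # IDs with the higher concentration on day 1
higher_day4 <- x$A[x$D < 0]   # IDs with the higher concentration on day 4
```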
Re: [R] Simple plot loop
Jim, thanks for the reply. I tried what you recommend, but I still get an error when running it, just as I did with the similar loops I was trying. Here is the error: Error in xy.coords(x, y) : 'x' and 'y' lengths differ That comes from this code:

# Get file
library(zoo)
setwd("/Users/benjaminneal/Documents/All Panama/1110_Panama/CNAT_Segmenting")
Data = read.csv("120503_CNAT_Summary.csv", header=T)
# fill in missing data from last observation
Data2 <- na.locf(Data)
attach(Data2)
# Plot line, and make it loop
for(column in names(Data2)[2:length(Data2)])
    lines(MONTH, column, type="o", pch=22, lty=2, col="blue")

The problem perhaps is in my data. My columns are observations over time, column headers are individual names, and the first column is the time series in months. An example is here:

MONTH Tag101
0     234
2     236
4     239
8     300
10    320

This then repeats for different individuals . . . I think my problem must be that my length of MONTH is different than my length of observations of each column . . . except that it is not, as far as I can tell! Thank you for assisting me with this simple but frustrating problem - I have the greatest respect for those of you who provide these answers that so many of us read and utilize. Thank you. Ben Neal -Original Message- From: r-help-boun...@r-project.org on behalf of Jim Lemon Sent: Thu 5/3/2012 3:40 AM To: Ben Neal Cc: r-help@r-project.org Subject: Re: [R] Simple plot loop On 05/03/2012 05:50 PM, Ben Neal wrote: Trying to plot multiple lines from a simple matrix with headers (eight observations per column). I will be doing a number of these, with varying numbers of columns, and do not want to enter the header names for each one (I got frustrated and just wrote them out, which did work). Data reads fine, first plot is fine, but when I use the code at the bottom for a for-i loop it tells me that x and y do not match. . . One other issue is that I would prefer not to specify the first column either, but when I enter plot(MONTH, Data2[2]) . . .
it also does not plot. Should this not call the second column? Thank you very much for any comments. I know this is simple, but I appreciate any assistance. Cheers, Ben

# LOAD DATA FROM CSV
library(zoo)
setwd("/Users/benjaminneal/Documents/All Panama/1110_Panama/CNAT_Segmenting")
Data = read.csv("120503_CNAT_Summary.csv", header=T)
# fill in missing data from last observation
Data2 <- na.locf(Data)
attach(Data2)

# PLOT ALL ON ONE CHART
plot(MONTH, T102, type="o", ann=FALSE, ylim=c(1, 100), pch=22, lty=2, col="red")
title(main="5m and 10m Colpophylia natans colonies over time",
      ylab="% live coral / colony", xlab="Months", col.main="black", font.main=4)
lines(MONTH, T162, type="o", pch=22, lty=2, col="red")
lines(MONTH, T231, type="o", pch=22, lty=2, col="green")
lines(MONTH, T250, type="o", pch=22, lty=2, col="green")
## (many other similar lines here, with entered column headers . . . up to 75)
lines(MONTH, T373, type="o", pch=22, lty=2, col="blue")
lines(MONTH, T374, type="o", pch=22, lty=2, col="blue")
lines(MONTH, T377, type="o", pch=22, lty=2, col="blue")

# Tried to add lines another way with a for-i loop, but this is the part not working
for (i in 2:length(Data2)) {
    lines(MONTH, i, type="o", pch=22, lty=2, col="blue")
}

Hi Ben, I think what you may want in your loop is this:

for(column in names(Data2)[2:length(Data2)])
    lines(MONTH, column, type="o", pch=22, lty=2, col="blue")

But, if you want the first two lines to be green, you'll probably have to get a vector of colors:

colorvec <- rep("blue", length(Data2))
colorvec[1] <- "red"
colorvec[2:3] <- "green"

and change the above to:

columnnames <- names(Data2)
for(column in 2:length(Data2))
    lines(MONTH, columnnames[column], type="o", pch=22, lty=2, col=colorvec[column])

Jim
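For completeness, the xy.coords error in this thread comes from passing the column *name* (a length-1 character string) to lines() instead of the column's values; indexing the data frame with the name fixes it. A self-contained sketch with invented data:

```r
# Pass the column's values, not its name, to lines().
Data2 <- data.frame(MONTH = c(0, 2, 4, 8, 10),
                    T101  = c(23, 25, 28, 30, 32),
                    T102  = c(40, 38, 35, 33, 30))
plot(Data2$MONTH, Data2[[2]], type = "o", pch = 22, lty = 2, col = "blue",
     xlab = "Months", ylab = "Value", ylim = range(Data2[-1]))
for (column in names(Data2)[-1]) {
  # Data2[[column]] is the numeric vector; `column` alone is just a string
  lines(Data2$MONTH, Data2[[column]], type = "o", pch = 22, lty = 2, col = "blue")
}
```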
[R] Identifying the particular X or Y in a sorted list
Dear All, I have a data set as shown below. A (patient ID), B and C are the concentrations of drug in blood on day 1 and day 4, and D is the difference in concentration. To do this in R I have written code as follows and identified the number of patients who have a higher concentration on day 4. Here I want to identify specifically the patient ID (is it patient 1 or 2 or 5 and 7) whose concentration is higher. How do I write code to get the list of A (patient IDs whose difference is higher on day 4)?

Data <- (myDf$B - myDf$C)
sum(Data > 0)

A  B  C  D (B-C)
1 14 10  4
2 12  7  5
3 11 15 -4
4  8  3  5
5  1  8 -7

I appreciate your help, thank you very much in advance. Regards
Re: [R] error fitting coxph model
Hi David, Thanks for your input. My first thought was to look for missing values, but I can tell you there are no missing values in the input. The error is occurring somewhere deep inside coxpenal.fit, so I can't identify how any NAs might be created. Also, the if/else syntax is from coxpenal.fit, so I'm assuming they've done it right. But I like your idea about searching for the error in data halves. It's also worth mentioning that when I changed the df parameter on my pspline terms to df = 3, it worked fine, but the error below occurred with both df = 2 and df = 4. Thanks, Jessica On May 3, 2012, at 10:00 AM, David Winsemius wrote: On May 2, 2012, at 3:02 PM, Jessica Myers wrote: Hi, I am using coxph from the survival package to fit a large model (100,000 observations, ~35 covariates) using both ridge regression (on binary covariates) and penalized splines (for continuous covariates). In fitting, I get a strange error: Error in if (abs((y[nx] - target)/(y[nx - 1] - target)) > 0.6) doing.well <- FALSE else doing.well <- TRUE : missing value where TRUE/FALSE needed You appear to have missing values in 'y', 'nx', or 'target'. This could be a case of the newbie error of using if(){}else{} when ifelse() should have been used. Without at least the code we probably cannot resolve those two possibilities. Unfortunately, I can't reproduce this error without handing over my entire dataset, Why not? If the removal of missing values is unsuccessful and you were not committing the error I mentioned, then run it on two halves. Pick the one with the error. Rinse, lather, repeat. but I thought it would be worth checking if anyone had any insight. I should note that the outcome that I'm using has almost everyone having an event (~98,000 events out of 100,000). I have fit other models like this with no problem, but on one particular dataset it fails. Thanks!
Jessica Myers Instructor in Medicine David Winsemius, MD West Hartford, CT
[R] pdf, pairs and subfloats
Hi there! I have found a strange problem with getting pairs() plots to show properly in LaTeX \subfloat environments. If I generate images of these plots with pdf() and include them in subfloats, they will either show up in grayscale, or sometimes the data points of the pairs plots are missing. Mind you, the PDFs themselves are properly colored and look perfectly fine, both as a pdf image in Acrobat and in the document pdf if I include them without subfloat. If I build a png() instead of a pdf(), it will be properly colored, but it doesn't look as nice when reading the pdf (and most people who will read this will read it on their PC, not printed). Any ideas? I am using Eclipse with the StatET plugin and Sweave. I generate the document pdf with TeXlipse and pdflatex. -- If requested I will try to generate the images from the R console and try building with another LaTeX tool. That's work, though, so I wanted to see first if someone knows what's going on.
Re: [R] bwplot: using a numeric variable to position boxplots
Michael, I normally do this with panel.bwplot.intermediate.hh in the HH package. This function works by plotting each box with its own call to the underlying panel.bwplot function. This example from ?HH::position uses the positioned class to determine where to place each box.

> library(HH)
> ?position
starting httpd help server ... done
> ## boxplots coded by week
> tmp <- data.frame(Y=rnorm(40, rep(c(20,25,15,22), 10), 5),
+                   week=ordered(rep(1:4, 10)))
> position(tmp$week) <- c(1, 2, 4, 8)
> bwplot(Y ~ week, horizontal=FALSE,
+        scales=list(x=list(limits=c(0,9),
+                           at=position(tmp$week),
+                           labels=position(tmp$week))),
+        data=tmp, panel=panel.bwplot.intermediate.hh)

Rich On 5/3/12, Michael Friendly frien...@yorku.ca wrote: [Env: R 2.14.2 / Win Xp] In the examples below, I'm using lattice::bwplot to plot boxplots of 4 variables, grouped by a factor 'epoch' which also corresponds to a numeric year. I'd like to modify the plots to position the boxplots according to the numeric value of year, but I can't figure out how to do this. Also, I'd like to modify the strip labels that give the variable names to use the longer variable labels I define below as vlab.

install.packages("heplots", repos="http://R-Forge.R-project.org") # requires heplots 0.9-12 for Skulls data
library(heplots)
data(Skulls)
# make shorter labels for epochs
Skulls$epoch <- factor(Skulls$epoch, labels=sub("c", "", levels(Skulls$epoch)))
# create year variable
Skulls$year <- rep(c(-4000, -3300, -1850, -200, 150), each=30)

> str(Skulls)
'data.frame': 150 obs. of 6 variables:
 $ epoch: Ord.factor w/ 5 levels "4000BC"<"3300BC"<..: 1 1 1 1 1 1 1 1 1 1 ...
 $ mb : num 131 125 131 119 136 138 139 125 131 134 ...
 $ bh : num 138 131 132 132 143 137 130 136 134 134 ...
 $ bl : num 89 92 99 96 100 89 108 93 102 99 ...
 $ nh : num 49 48 50 44 54 56 48 48 51 51 ...
 $ year : num -4000 -4000 -4000 -4000 -4000 -4000 -4000 -4000 -4000 -4000 ...
> table(Skulls$epoch)
4000BC 3300BC 1850BC  200BC  AD150
    30     30     30     30     30

This is what I've tried.
I reshape the data to a long format and can plot value ~ epoch or value ~ as.factor(year), but I really want to space the boxplots according to the numeric value of year.

# better variable labels -- how to incorporate these in the strip labels?
vlab <- c("maxBreadth", "basibHeight", "basialLength", "nasalHeight")

library(lattice)
library(reshape2)
Skulls$year <- rep(c(-4000, -3300, -1850, -200, 150), each=30)
sklong <- melt(Skulls, id=c("epoch", "year"))

bwplot(value ~ epoch | variable, data=sklong, scales="free",
    ylab="Variable value", xlab="Epoch",
    panel = function(x, y, ...) {
        panel.bwplot(x, y, ...)
        panel.linejoin(x, y, col="red", ...)
    }
)

bwplot(value ~ as.factor(year) | variable, data=sklong, scales="free",
    ylab="Variable value", xlab="Year",
    panel = function(x, y, ...) {
        panel.bwplot(x, y, ...)
        panel.linejoin(x, y, col="red", ...)
    }
)

-- Michael Friendly Email: friendly AT yorku DOT ca Professor, Psychology Dept. York University Voice: 416 736-5115 x66249 Fax: 416 736-5814 4700 Keele Street Web: http://www.datavis.ca Toronto, ONT M3J 1P3 CANADA
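As an alternative to HH, base graphics can do the numeric positioning directly through boxplot()'s `at` argument; a sketch with the years from the thread and simulated values (the boxwex and xlim choices are my own, needed because the x range spans thousands of units):

```r
# Position boxes at the numeric years rather than at equally spaced levels.
years <- c(-4000, -3300, -1850, -200, 150)
set.seed(1)
dat <- data.frame(year = rep(years, each = 30),
                  mb   = rnorm(150, mean = 133, sd = 5))
boxplot(mb ~ year, data = dat, at = years, boxwex = 150,
        xlim = range(years) + c(-300, 300), xaxt = "n",
        xlab = "Year", ylab = "mb")
axis(1, at = years, labels = years)
```

This doesn't give lattice's conditioning on variable, but it shows the unequal spacing idea in isolation.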
Re: [R] estimation problem
Dear Jeff, thank you for the response. Of course I know this is a theory question; still, I hope to get some comments on it (somebody who has already dealt with similar problems might suggest a package, and that would not take longer than saying this is a theoretical question). The values are counts, so 0 means those cases do not have this item - they have 0. As such it means a real zero; they are valid members. thanks, daniel

On 2012.05.03. 16:42, Jeff Newmiller wrote: Although you have provided R code to illustrate your problem, it is fundamentally a statistics theory question, and belongs somewhere else like stats.stackexchange.com. When you post there, I recommend that you spend more effort to identify why the zeros are present. If they are indicators of unknown values, that will be very different than if zeros are valid members of the population.
--- Jeff Newmiller DCN: jdnew...@dcn.davis.ca.us Sent from my phone. Please excuse my brevity.

Kehl Dániel ke...@ktk.pte.hu wrote:
Dear List-members, I have a problem where I have to estimate a mean, or a sum, of a population, but for some reason it contains a huge amount of zeros. I cannot give real data but I constructed a toy example as follows

N1 <- 100000
N2 <- 3000
x1 <- rep(0, N1)
x2 <- rnorm(N2, 300, 100)
x  <- c(x1, x2)
n  <- 1000
x_sample <- sample(x, n, replace=FALSE)

I want to estimate the sum of x based on x_sample (not knowing N1 and N2 but their sum (N) only). The sample mean has a huge standard deviation, so I am looking for a better estimator. I was thinking about trimmed (or left-trimmed, as my numbers are all positive) means or something similar, but if I calculate a trimmed mean I do not know N2 to multiply by. Do you have any idea or could you give me some insight?
Thanks a lot: Daniel
Re: [R] [R-sig-hpc] Quickest way to make a large empty file on disk?
Thanks, all! I'll try these out. I'm trying to work up something that is platform independent (if possible) for use with mmap. I'll do some tests on these suggestions and see which works best. I'll try to report back in a few days. Cheers! --j

2012/5/3 Jens Oehlschlägel jens.oehlschlae...@truecluster.com
Jonathan, On some filesystems (e.g. NTFS, see below) it is possible to create 'sparse' memory-mapped files, i.e. reserving the space without the cost of actually writing initial values. Package 'ff' does this automatically and also allows accessing the file in parallel. Check the example below and see how creation of the big file is immediate. Jens Oehlschlägel

library(ff)
library(snowfall)
ncpus <- 2
n <- 1e8
system.time(
  x <- ff(vmode="double", length=n, filename="c:/Temp/x.ff")
)
   User  System elapsed
   0.01    0.00    0.02
# check finalizer; with an explicit filename we should have a 'close' finalizer
finalizer(x)
[1] "close"
# if not, set it to 'close' in order to not let slaves delete x on slave shutdown
finalizer(x) <- "close"
sfInit(parallel=TRUE, cpus=ncpus, type="SOCK")
R Version: R version 2.15.0 (2012-03-30)
snowfall 1.84 initialized (using snow 0.3-9): parallel execution on 2 CPUs.
sfLibrary(ff)
Library ff loaded.
Library ff loaded in cluster.
Warning message:
In library(package = ff, character.only = TRUE, pos = 2, warn.conflicts = TRUE, :
  'keep.source' is deprecated and will be ignored
sfExport("x")  # note: do not export the same ff multiple times
# explicitly opening avoids a gc problem
sfClusterEval(open(x, caching="mmeachflush"))
# opening with 'mmeachflush' instead of 'mmnoflush' is a bit slower
# but prevents OS write storms when the file is larger than RAM
[[1]]
[1] TRUE
[[2]]
[1] TRUE
system.time(
  sfLapply( chunk(x, length=ncpus), function(i){
    x[i] <- runif(sum(i))
    invisible()
  })
)
   User  System elapsed
   0.00    0.00   30.78
system.time(
  s <- sfLapply( chunk(x, length=ncpus), function(i) quantile(x[i], c(0.05, 0.95)) )
)
   User  System elapsed
   0.00    0.00    4.38
# for completeness
sfClusterEval(close(x))
[[1]]
[1] TRUE
[[2]]
[1] TRUE
csummary(s)
             5%  95%
Min.    0.04998 0.95
1st Qu. 0.04999 0.95
Median  0.05001 0.95
Mean    0.05001 0.95
3rd Qu. 0.05002 0.95
Max.    0.05003 0.95
# stop slaves
sfStop()
Stopping cluster
# with the close finalizer we are responsible for deleting the file explicitly
# (unless we want to keep it)
delete(x)
[1] TRUE
# remove r-side metadata
rm(x)
# truly free memory
gc()

*Sent:* Thursday, 03 May 2012 at 00:23 *From:* Jonathan Greenberg j...@illinois.edu *To:* r-help r-help@r-project.org, r-sig-...@r-project.org *Subject:* [R-sig-hpc] Quickest way to make a large empty file on disk?
R-helpers: What would be the absolute fastest way to make a large empty file (e.g. filled with all zeroes) on disk, given a byte size and a given number of empty values? I know I can use writeBin, but the object in this case may be far too large to store in main memory. I'm asking because I'm going to use this file in conjunction with mmap to do parallel writes to this file. Say, I want to create a blank file of 10,000 floating point numbers. Thanks! --j -- Jonathan A.
Greenberg, PhD Assistant Professor Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 415-763-5476 AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 http://www.geog.illinois.edu/people/JonathanGreenberg.html
___ R-sig-hpc mailing list r-sig-...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
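[Editor's note] For the platform-independent requirement, one trick worth testing alongside ff: open a binary connection, seek to one byte before the desired size, and write a single byte. On filesystems with sparse-file support this returns almost immediately; on others the OS may still zero-fill the gap. A rough sketch - the file name and size are made up for illustration, and note that ?seek warns that seeking on Windows connections is discouraged:

```r
make_empty_file <- function(path, bytes) {
  con <- file(path, "wb")
  on.exit(close(con))
  # position the write pointer at the last byte and write one zero byte;
  # the OS extends the file to 'bytes' bytes (as a hole, where supported)
  seek(con, where = bytes - 1, origin = "start", rw = "write")
  writeBin(as.raw(0L), con)
  invisible(path)
}
make_empty_file("blank.bin", 10000 * 8)   # room for 10,000 doubles
file.info("blank.bin")$size               # should report 80000
```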
Re: [R] estimation problem
On Thu, May 03, 2012 at 03:08:00PM +0200, Kehl Dániel wrote:
Dear List-members, I have a problem where I have to estimate a mean, or a sum, of a population, but for some reason it contains a huge amount of zeros. I cannot give real data but I constructed a toy example as follows
N1 <- 100000
N2 <- 3000
x1 <- rep(0, N1)
x2 <- rnorm(N2, 300, 100)
x  <- c(x1, x2)
n  <- 1000
x_sample <- sample(x, n, replace=FALSE)
I want to estimate the sum of x based on x_sample (not knowing N1 and N2 but their sum (N) only). The sample mean has a huge standard deviation, so I am looking for a better estimator.

Hi. I do not know the exact answer, but let me formulate the following observation. If the question is redefined to estimate the mean of the nonzero numbers, then an estimate is mean(x_sample[x_sample != 0]). Its standard deviation in your situation may be estimated as

res <- rep(NA, times=1000)
for (i in seq.int(along=res)) {
  x_sample <- sample(x, n, replace=FALSE)
  res[i] <- mean(x_sample[x_sample != 0])
}
sd(res)
[1] 18.72677  # this varies with the seed a bit

The observation is that this cannot be improved much, since the estimate is based on a very small sample. The average size of the sample of nonzero values is N2/(N1+N2)*n = 29.1. So, the standard deviation should be something close to 100/sqrt(29.1) = 18.5376. Petr Savicky.
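[Editor's note] To connect Petr's observation back to the original question of estimating the total: with only N known, the natural design-based estimator of the population total is N times the sample mean, and its standard error shows directly why Daniel finds it unsatisfying. A minimal self-contained sketch (the population below mirrors the toy example; the specific seed is arbitrary):

```r
set.seed(1)
x <- c(rep(0, 100000), rnorm(3000, 300, 100))  # many zeros, few nonzeros
N <- length(x)
n <- 1000
x_sample <- sample(x, n)
N * mean(x_sample)            # estimate of the population total
N * sd(x_sample) / sqrt(n)    # its (large) standard error
sum(x)                        # true total, for comparison
```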
Re: [R] `mapply(function(x) function() x, c(a, b))$a()' returns `b'
Thank you very much, though I still don't quite understand the explanation :) Nevertheless, I just found a seemingly simple (at least quicker to type) solution after trial and error:

eval(mapply(function(x) {x; function() x}, c("a", "b")))

I hope it may help future readers.

On Thu, May 03, 2012 at 04:18:38PM +0200, Jessica Streicher wrote:
So, to get back to mapply:
eval(mapply(function(x) substitute(function() z, list(z=x)), c("a", "b"))$a)()
or like this:
mapply(function(x) eval(substitute(function(i) z*i, list(z=x))), c(2,3))[[1]](2)
-- Using GPG/PGP? Please get my current public key (ID: 0xAEF6A134, valid from 2010 to 2013) from a key server.
signature.asc Description: Digital signature
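[Editor's note] The underlying issue is lazy evaluation: the argument x is a promise that is not evaluated until the returned closure is first called, by which time the loop has moved on. Evaluating the promise inside the factory function - the `{x; ...}` in the solution above, or more idiomatically force() - pins each closure to its own value. A small sketch (make_const is a made-up name for illustration):

```r
make_const <- function(x) {
  force(x)        # evaluate the promise now, so each closure keeps its own x
  function() x
}
fns <- sapply(c("a", "b"), make_const)  # a named list of two closures
fns$a()  # "a"
fns$b()  # "b"
```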
[R] LME4 to MCMCglmm
Hi all, I am trying to run an lme4 model (logistic regression with mixed effects) in MCMCglmm but am unsure how to implement it properly. Currently, my lme4 model formula looks as follows: outcome ~ (1 + var1 + var2 | study) + var1 + var2 In English, this means that I am fitting a random effects model, where the intercept, var1 and var2 are jointly distributed according to study. My question is now how I would translate this formula to the fixed and random terms in MCMCglmm. For the fixed part, I figured that I should make a variable nooutcome=abs(1-outcome) because it can then be modeled with a multinomial2 family as there is no binomial(logit) option available. Then, the fixed part would look as follows: cbind(outcome,nooutcome)~1+var1+var2 However, I am unsure how to specify the random effects over the intercept, var1 and var2 jointly. So far, I was able to generate the following: random=~us(var1):study+us(var2):study+us(1):study which I think corresponds to outcome ~ (1 | study) + (var1 | study) + (var2 | study) + var1 + var2 instead of outcome ~ (1 + var1 + var2 | study) + var1 + var2 I would appreciate any help. Thomas -- View this message in context: http://r.789695.n4.nabble.com/LME4-to-MCMCglmm-tp4606423.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
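[Editor's note] For what it's worth, MCMCglmm's syntax for jointly distributed random effects is a single us() term containing all three effects, rather than three separate us() terms. A hedged sketch, using the poster's own variable names and multinomial2 workaround (priors are omitted, and the data-frame name mydata is made up):

```r
library(MCMCglmm)
# us(1 + var1 + var2):study estimates one unstructured 3x3 covariance
# matrix for the intercept and both slopes, indexed by study --
# the analogue of (1 + var1 + var2 | study) in lme4
m <- MCMCglmm(cbind(outcome, nooutcome) ~ 1 + var1 + var2,
              random = ~ us(1 + var1 + var2):study,
              family = "multinomial2",
              data   = mydata)
```

(For a strictly binary 0/1 outcome, family = "categorical" with outcome alone on the left-hand side is the more usual MCMCglmm choice, which would avoid the nooutcome construction entirely.)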
Re: [R] Identifying the particular X or Y in a sorted list
You do not seem to have supplied either code or data. Please supply both. John Kane Kingston ON Canada

-Original Message- From: shankarla...@gmail.com Sent: Wed, 2 May 2012 22:06:54 -0400 To: r-help@r-project.org Subject: [R] Identifying the particular X or Y in a sorted list
Dear All, I have a data set as shown below. A (patient ID), B and C are the concentrations of drug in blood on day 1 and day 4, and D is the difference in concentration. To do this in R I have written code as follows, and identified the number of patients who have a higher concentration on day 4. Here I want to identify specifically the patient IDs (is it patient 1 or 2 or 5 and 7) whose concentration is higher. How do I write code to get the list of A (patient IDs whose difference is higher on day 4)?

Data <- myDf$B - myDf$C
sum(Data > 0)

 A  B  C  D (B-C)
 1 14 10  4
 2 12  7  5
 3 11 15 -4
 4  8  3  5
 5  1  8 -7

I appreciate your help, thank you very much in advance. Regards
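[Editor's note] A hedged sketch of an answer, with the data frame reconstructed from the table in the post (column names A, B, C are the poster's; myDf is assumed): patients whose day-4 concentration exceeds their day-1 concentration are those with a negative difference, and logical subsetting returns their IDs directly.

```r
myDf <- data.frame(A = 1:5,
                   B = c(14, 12, 11, 8, 1),   # day-1 concentration
                   C = c(10, 7, 15, 3, 8))    # day-4 concentration
myDf$D <- myDf$B - myDf$C
# IDs of patients with a higher concentration on day 4
myDf$A[myDf$D < 0]   # patients 3 and 5 in this toy data
```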
Re: [R] warning with glm.predict, wrong number of data rows
carol white wht_crl at yahoo.com writes: Hi, I split a data set into two partitions (80 and 42), use the first as the training set in glm and the second as the testing set in glm predict. But when I call glm.predict, I get the warning message: Warning message: 'newdata' had 42 rows but variable(s) found have 80 rows - [snip]

The warning correctly diagnoses the problem. The posting guide asks for a 'reproducible example', but you did not give us one. There is one below. Note what happens when predict() tries to reconstruct the variable 'x[1:4]' as dictated by the formula. How many elements can 'x[1:4]' have when newdata has (say) nrowsNew rows? Use the subset argument to select a subset of observations.

y <- sample(factor(1:2), 80, repl=T)
y <- sample(factor(1:2), 5, repl=T)
x <- 1:4
fit <- glm( y[1:4] ~ x[1:4], family = binomial)
fit
Call:  glm(formula = y[1:4] ~ x[1:4], family = binomial)
Coefficients:
(Intercept)       x[1:4]
 -1.110e-16    0.000e+00
Degrees of Freedom: 3 Total (i.e. Null);  2 Residual
Null Deviance:     5.545
Residual Deviance: 5.545   AIC: 9.545
predict(fit, newdata=data.frame(x=1:2))
            1             2             3             4
-1.110223e-16 -1.110223e-16            NA            NA
Warning message:
'newdata' had 2 rows but variable(s) found have 4 rows
predict(fit, newdata=data.frame(x=1:5))
            1             2             3             4
-1.110223e-16 -1.110223e-16 -1.110223e-16 -1.110223e-16
Warning message:
'newdata' had 5 rows but variable(s) found have 4 rows

HTH, Chuck [rest deleted]
[R] Cannot read or write to file in Linux Ubuntu
I am the proud owner of a new laptop since my old one died the other day. Currently I have a dual-boot Windows 7 Home and Ubuntu 12.04. I'll leave the Windows problems for another post. I know practically nothing about Linux, so I am probably doing something stupid but ... at the moment I cannot seem to read or write files in Ubuntu. I am not having any problem saving other documents to the hard drive, and R, from my few simple tests, seems to be working okay otherwise. At the moment I am trying:

mydata <- read.csv("DATA/media/DATA/rdata/tt1.csv", header = TRUE)
or
mydata <- read.csv("DATA/rdata/tt1.csv", header = TRUE)

where tt1.csv is a text file on what, from my reading of the path listed in gedit, is DATA/media/DATA/rdata. The csv data is simply:
aa, bb
2, 3
4, 5

What happens:
> mydata <- read.csv("DATA/rdata/tt1.csv", header = TRUE)
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") : cannot open file 'DATA/rdata/tt1.csv': No such file or directory

Am I totally screwing up the path? Or doing something else equally stupid? BTW I realise that 2.15 is out, but Ubuntu as of yesterday did not have it in the repositories and I have yet to figure out how to install it from a CRAN site.

> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: i686-pc-linux-gnu (32-bit)
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base

John Kane Kingston ON Canada
Re: [R] pdf, pairs and subfloats
I was bored and have tried doing it with the console and TeXworks (which also uses pdflatex). The TeXworks preview shows it properly colored, but in Acrobat Reader it is black and white again. Still scratching my head..

On 03.05.2012 at 17:14, Jessica Streicher wrote:
Hi there! I have found a strange problem with getting pairs()-plots to show properly in LaTeX \subfloat environments. If I generate images of these plots with pdf() and include them in subfloats, they will either show up in grayscale, or sometimes the datapoints of the pairs plots are missing. Mind you, the PDFs themselves are properly colored and look perfectly fine, both as a pdf image in Acrobat and in the document pdf if I include them without subfloat. If I build a png() instead of a pdf() it will be properly colored, but it doesn't look as nice when reading the pdf (and most people that will read this read it on their PC, not printed). Any ideas? I am using Eclipse with the StatET plugin and Sweave. I generate the document pdf with TeXlipse and pdflatex. -- If requested I will try to generate the images from the R console and try building with another LaTeX tool. That's work though, so I wanted to see first if someone knows what's going on.
Re: [R] Cannot read or write to file in Linux Ubuntu
Hi John, You're probably messing up the path, just as you suspect. If you use a relative path, like you are doing, then R looks for that location starting at R's current working directory, visible with getwd(). For Linux, that's the location at which you started R, if you started it from a terminal. The safest solution is to use an absolute path, which will likely be something resembling /home/john/DATA/... etc - note that it will always start with a / and go from there. If you know how to start a terminal window and cd to where your file is, pwd at the command prompt will give you the absolute path to that location, which is what you should be using until you get more comfortable with the file system. The error message means that R can't find the directory you're telling it to use. Sarah

On Thu, May 3, 2012 at 12:21 PM, John Kane jrkrid...@inbox.com wrote:
I am the proud owner of a new laptop since my old one died the other day. Currently I have a dual-boot Windows 7 Home and Ubuntu 12.04. I'll leave the Windows problems for another post. I know practically nothing about Linux, so I am probably doing something stupid but ... at the moment I cannot seem to read or write files in Ubuntu. I am not having any problem saving other documents to the hard drive, and R, from my few simple tests, seems to be working okay otherwise. At the moment I am trying:
mydata <- read.csv("DATA/media/DATA/rdata/tt1.csv", header = TRUE)
or
mydata <- read.csv("DATA/rdata/tt1.csv", header = TRUE)
where tt1.csv is a text file on what, from my reading of the path listed in gedit, is DATA/media/DATA/rdata. The csv data is simply:
aa, bb
2, 3
4, 5
What happens:
> mydata <- read.csv("DATA/rdata/tt1.csv", header = TRUE)
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") : cannot open file 'DATA/rdata/tt1.csv': No such file or directory
Am I totally screwing up the path? Or doing something else equally stupid?
BTW I realise that 2.15 is out but Ubuntu as of yesterday did not have it in the repositories and I have yet to figure out how to install it from a CRAN site.
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: i686-pc-linux-gnu (32-bit)
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base

-- Sarah Goslee http://www.functionaldiversity.org
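[Editor's note] Sarah's advice in code form - a short sketch. The absolute path below is only a guess at where the Ubuntu automounter put the DATA partition; substitute whatever pwd prints in a terminal opened at the file's location.

```r
getwd()                       # where R is currently looking
setwd("/media/DATA/rdata")    # assumed absolute path to the csv directory
mydata <- read.csv("tt1.csv", header = TRUE)
str(mydata)
```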
Re: [R] Cannot read or write to file in Linux Ubuntu
All of your tests are with relative paths. Use getwd() to identify your starting directory, and if it isn't where you expected, you can use setwd() to start in the right place.
--- Jeff Newmiller DCN: jdnew...@dcn.davis.ca.us Sent from my phone. Please excuse my brevity.

John Kane jrkrid...@inbox.com wrote:
I am the proud owner of a new laptop since my old one died the other day. Currently I have a dual-boot Windows 7 Home and Ubuntu 12.04. I'll leave the Windows problems for another post. I know practically nothing about Linux, so I am probably doing something stupid but ... at the moment I cannot seem to read or write files in Ubuntu. I am not having any problem saving other documents to the hard drive, and R, from my few simple tests, seems to be working okay otherwise. At the moment I am trying:
mydata <- read.csv("DATA/media/DATA/rdata/tt1.csv", header = TRUE)
or
mydata <- read.csv("DATA/rdata/tt1.csv", header = TRUE)
where tt1.csv is a text file on what, from my reading of the path listed in gedit, is DATA/media/DATA/rdata. The csv data is simply:
aa, bb
2, 3
4, 5
What happens:
> mydata <- read.csv("DATA/rdata/tt1.csv", header = TRUE)
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") : cannot open file 'DATA/rdata/tt1.csv': No such file or directory
Am I totally screwing up the path? Or doing something else equally stupid? BTW I realise that 2.15 is out, but Ubuntu as of yesterday did not have it in the repositories and I have yet to figure out how to install it from a CRAN site.
> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: i686-pc-linux-gnu (32-bit)
locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base

John Kane Kingston ON Canada
Re: [R] Two ecdf with log-scales
Thank you Steve, that's the thing I was looking for /Johannes

Original Message: Date: Thu, 3 May 2012 08:20:51 -0400 From: Steven Wolf wolfs...@msu.edu To: 'David Winsemius' dwinsem...@comcast.net, 'Johannes Radinger' jradin...@gmx.at CC: R-help@r-project.org Subject: RE: [R] Two ecdf with log-scales
I've done it this way before:

eX <- ecdf(distribution1)   # your first sample
eY <- ecdf(distribution2)   # your second sample
par(mar=c(5,5,2,1), xlog=TRUE)
plot(eX, do.points=FALSE, verticals=TRUE, col="black", xlab="xlabel",
     xlim=c(1,10), ylab="ylabel", lty=1, cex.lab=1.5, cex.axis=1.5,
     main="", lwd=3, log="x")
plot(eY, do.points=FALSE, verticals=TRUE, col="blue", add=TRUE,
     xlim=c(1,10), main="")

Warning: It makes a stair-step that may be difficult to see unless you use color. I had to change how the ecdf was plotted when I made b/w figures for my publication, so that the different dashed lines were distinct. HTH, -Steve

-Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Wednesday, May 02, 2012 10:17 AM To: Johannes Radinger Cc: R-help@r-project.org Subject: Re: [R] Two ecdf with log-scales
On May 2, 2012, at 6:14 AM, Johannes Radinger wrote:
Hi, I want to plot empirical cumulative density functions for two variables in one plot. For better visualization of the differences in the two cumulative curves I'd like to log-scale the axes. So far I found 3 possible functions to plot an ecdf:
1) ecdf() from the package 'stats'. I don't know how to successfully set the log scales. Combining two plots is not a problem:
plot(ecdf(x1))
lines(ecdf(x2), col.h="red")
2) gx.ecdf() from package 'rgr'. It is easily possible to plot log scales, but I don't know how to plot two densities:
gx.ecdf(x1, log=TRUE, ifqs=TRUE)
3) Ecdf() from package 'Hmisc'. No log option directly available, and here I also don't know how to 'stack' two plots:
Ecdf(x1, what="F")
Probably there are many more solutions (e.g. ggplot etc.)... Has anyone faced a similar task and found a simple solution? Any suggestions are welcome!
Have you searched the Archives? I seem to remember that log(0) was a barrier to persons attempting this in the past. (ISTR a posting in the last few weeks.) Maybe you could also provide a test data object that has the same range as your x1 and x2 variables, and provide commented, minimal, self-contained, reproducible code.
-- David Winsemius, MD West Hartford, CT
Re: [R] Problem with 'nls' fitting logistic model (5PL)
?nls.control
fit <- nls(MFI ~ a + b/((1+(nom/c)^d)^f), data=x, weights=x$weights,
           start=c(a=100, b=1, c=100, d=-1, f=1),
           control=nls.control(warnOnly=TRUE))
Warning message:
In nls(MFI ~ a + b/((1 + (nom/c)^d)^f), data = x, weights = x$weights, :
  step factor 0.000488281 reduced below 'minFactor' of 0.000976562
fit
Nonlinear regression model
  model: MFI ~ a + b/((1 + (nom/c)^d)^f)
   data: x
         a          b          c          d          f
  165.4360 23866.1247    38.6157    -0.4454     3.2210
 weighted residual sum-of-squares: 2627977
Number of iterations till stop: 23
Achieved convergence tolerance: 48.23
Reason stopped: step factor 0.000488281 reduced below 'minFactor' of 0.000976563
summary(fit)
Formula: MFI ~ a + b/((1 + (nom/c)^d)^f)
Parameters:
    Estimate Std. Error t value Pr(>|t|)
a  1.654e+02  2.742e+04   0.006    0.995
b  2.387e+04  3.027e+06   0.008    0.994
c  3.862e+01  3.030e+04   0.001    0.999
d -4.454e-01  5.754e+01  -0.008    0.994
f  3.221e+00  1.008e+03   0.003    0.998
Residual standard error: 540.4 on 9 degrees of freedom
Number of iterations till stop: 23
Achieved convergence tolerance: 48.23
Reason stopped: step factor 0.000488281 reduced below 'minFactor' of 0.000976563

Perhaps nls is a little more stringent than "ANY reliable statistical software" in what, by default, it considers a fit worth reporting? Keith J

Michal Figurski figur...@mail.med.upenn.edu wrote in message news:4fa2759c.3060...@mail.med.upenn.edu...
Bert, Thank you for your thoughts. I can assure you I have plotted the data back and forth many times, and that overfitting has nothing to do with it. This is not a _statistical_ problem, but a _technical_ problem. Something that works well in ANY reliable statistical software doesn't work in R with 'nls'. This just makes me think that 'nls' is a dysfunctional piece of junk that can't handle even a very simple problem. Not to mention that 'predict()' for 'nls' doesn't give you any sort of confidence interval. I wonder if there's another package in R that could be used to fit the 5P-logistic model. Any idea?
In the end I may attempt to write a fitting function myself, but that would be the last resort. -- Michal J. Figurski, PhD HUP, Pathology Laboratory Medicine Biomarker Research Laboratory 3400 Spruce St. 7 Maloney S Philadelphia, PA 19104 tel. (215) 662-3413 On 5/2/2012 3:47 PM, Bert Gunter wrote: Plot the data. You're clearly overfitting. (If you don't know what this means or why it causes the problems you see, try a statistical help list or consult your local statistician). -- Bert On Wed, May 2, 2012 at 12:32 PM, Michal Figurski figur...@mail.med.upenn.edu wrote: Dear R-Helpers, I'm working with immunoassay data and 5PL logistic model. I wanted to experiment with different forms of weighting and parameter selection, which is not possible in instrument software, so I turned to R. I am using R 2.14.2 under Win7 64bit, and the 'nls' library to fit the model - I started with the same model and weighting type (1/y) as in the instrument to see if I'll get similar results. However, in some instances I don't get any results - just errors. Here is an example calibration data, representative of my experiment. 
The instrument software had no problem fitting it:

x <- structure(list(
  SPL = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 7L, 7L),
                  .Label = c("St1", "St2", "St3", "St4", "St5", "St6", "St7"),
                  class = "factor"),
  MFI = c(10755.5, 9839, 5142.5, 4857, 1510.5, 1505, 502.5, 451, 215,
          195.5, 58, 57, 15, 15),
  nom = c(206, 206, 125, 125, 68, 68, 38, 38, 24, 24, 13, 13, 6.5, 6.5),
  weights = c(0.0013946353028683, 0.00152454517735542, 0.00291686922702965,
              0.00308832612723904, 0.0099304865938431, 0.00996677740863787,
              0.0298507462686567, 0.0332594235033259, 0.0697674418604651,
              0.0767263427109974, 0.258620689655172, 0.263157894736842, 1, 1)),
  .Names = c("SPL", "MFI", "nom", "weights"),
  row.names = c(NA, -14L), class = "data.frame")

And here is the nls fit:

fit <- nls(MFI ~ a + b/((1+(nom/c)^d)^f), data=x, weights=x$weights,
           start=c(a=100, b=1, c=100, d=-1, f=1))

I've tried every possible combination of starting values, including the values fitted by the instrument software - to no avail. I've probably seen all possible error messages from 'nls' trying to fit this. If anyone has an idea why it's not working - let me know. Best regards, -- Michal J. Figurski, PhD HUP, Pathology Laboratory Medicine Biomarker Research Laboratory 3400 Spruce St. 7 Maloney S Philadelphia, PA 19104 tel. (215) 662-3413
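[Editor's note] Since the thread asks whether another package can fit the 5PL model: the Levenberg-Marquardt algorithm is often more forgiving of poor starting values than nls's default Gauss-Newton, and minpack.lm::nlsLM is a near drop-in replacement with the same formula interface. A hedged sketch (assumes the minpack.lm package is installed; whether it converges on these particular data is untested, and Bert's caveat about a near-degenerate confidence region applies regardless of the optimizer):

```r
library(minpack.lm)
# same 5PL model and weights as in the post, fitted by Levenberg-Marquardt
fit2 <- nlsLM(MFI ~ a + b/((1 + (nom/c)^d)^f),
              data = x, weights = x$weights,
              start = c(a = 100, b = 1, c = 100, d = -1, f = 1))
summary(fit2)
```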
[R] Help with readBin
I'm trying to read a binary file created by Fortran code using readBin and readChar. Everything reads fine (integers and strings) except for double precision numbers: they are read as huge or very small numbers (1e-250, ...). I tried various endianness and swap settings, but nothing has worked so far. I also tried R 64-bit for Linux and Windows (R 2.14) and R 2.11 on Windows XP 32-bit. Any help would be appreciated. Thanks
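[Editor's note] One common cause worth checking (a hedged guess, since the poster's file layout is unknown): Fortran 'sequential unformatted' output brackets every record with 4-byte record-length markers. Reading doubles from byte 0 then misaligns them by 4 bytes, which produces exactly this kind of garbage magnitude (1e-250 and the like), even though 4-byte integers near the start may happen to read plausibly. A sketch that accounts for the markers - the file name is hypothetical, and little-endian is assumed:

```r
con <- file("fortran_output.bin", "rb")
# leading record marker: byte length of the record that follows
rec_len <- readBin(con, "integer", n = 1, size = 4, endian = "little")
vals    <- readBin(con, "double",  n = rec_len / 8, size = 8,
                   endian = "little")
# trailing marker; should equal rec_len if the layout guess is right
readBin(con, "integer", n = 1, size = 4, endian = "little")
close(con)
```

(Files written with Fortran access='stream' or 'direct' have no such markers, in which case this is not the explanation.)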
Re: [R] Problem with 'nls' fitting logistic model (5PL)
Thanks, Keith. I failed to cc the following reply to John Nash to the list. Your email persuaded me that it might be useful to do so. None of this changes the fact that the model is overfitted. You may be able to get convergence to some set of parameter estimates, but they won't have much meaning, since the confidence region will be huge (better word: essentially degenerate -- since it would have essentially 0 volume in 5d). Predictions should be OK, but if that is all that is wanted, a more parsimonious model would be better. Extrapolation is nuts, of course. In my experience (only), scientists tend to overfit nonlinear models. Failure to converge seems to me to be more appropriate in these circumstances than providing meaningless parameters. And, yes, I know I'm a curmudgeon. But facilitating bad practice should always be a concern. Best, Bert

On Thu, May 3, 2012 at 9:38 AM, Keith Jewell k.jew...@campden.co.uk wrote:
?nls.control
fit <- nls(MFI ~ a + b/((1+(nom/c)^d)^f), data=x, weights=x$weights,
           start=c(a=100, b=1, c=100, d=-1, f=1),
           control=nls.control(warnOnly=TRUE))
Warning message:
In nls(MFI ~ a + b/((1 + (nom/c)^d)^f), data = x, weights = x$weights, :
  step factor 0.000488281 reduced below 'minFactor' of 0.000976562
fit
Nonlinear regression model
  model: MFI ~ a + b/((1 + (nom/c)^d)^f)
   data: x
         a          b          c          d          f
  165.4360 23866.1247    38.6157    -0.4454     3.2210
 weighted residual sum-of-squares: 2627977
Number of iterations till stop: 23
Achieved convergence tolerance: 48.23
Reason stopped: step factor 0.000488281 reduced below 'minFactor' of 0.000976563
summary(fit)
Formula: MFI ~ a + b/((1 + (nom/c)^d)^f)
Parameters:
    Estimate Std. Error t value Pr(>|t|)
a  1.654e+02  2.742e+04   0.006    0.995
b  2.387e+04  3.027e+06   0.008    0.994
c  3.862e+01  3.030e+04   0.001    0.999
d -4.454e-01  5.754e+01  -0.008    0.994
f  3.221e+00  1.008e+03   0.003    0.998
Residual standard error: 540.4 on 9 degrees of freedom
Number of iterations till stop: 23
Achieved convergence tolerance: 48.23
Reason stopped: step factor 0.000488281 reduced below 'minFactor' of 0.000976563

Perhaps nls is a little more stringent than "ANY reliable statistical software" in what, by default, it considers a fit worth reporting? Keith J

Michal Figurski figur...@mail.med.upenn.edu wrote in message news:4fa2759c.3060...@mail.med.upenn.edu...
Bert, Thank you for your thoughts. I can assure you I have plotted the data back and forth many times, and that overfitting has nothing to do with it. This is not a _statistical_ problem, but a _technical_ problem. Something that works well in ANY reliable statistical software doesn't work in R with 'nls'. This just makes me think that 'nls' is a dysfunctional piece of junk that can't handle even a very simple problem. Not to mention that 'predict()' for 'nls' doesn't give you any sort of confidence interval. I wonder if there's another package in R that could be used to fit the 5P-logistic model. Any idea? In the end I may attempt to write a fitting function myself, but that would be the last resort. -- Michal J. Figurski, PhD HUP, Pathology Laboratory Medicine Biomarker Research Laboratory 3400 Spruce St. 7 Maloney S Philadelphia, PA 19104 tel. (215) 662-3413

On 5/2/2012 3:47 PM, Bert Gunter wrote: Plot the data. You're clearly overfitting. (If you don't know what this means or why it causes the problems you see, try a statistical help list or consult your local statistician). -- Bert

On Wed, May 2, 2012 at 12:32 PM, Michal Figurski figur...@mail.med.upenn.edu wrote:
Dear R-Helpers, I'm working with immunoassay data and the 5PL logistic model.
I wanted to experiment with different forms of weighting and parameter selection, which is not possible in the instrument software, so I turned to R. I am using R 2.14.2 under Win7 64-bit, and 'nls' to fit the model - I started with the same model and weighting type (1/y) as in the instrument to see if I'd get similar results. However, in some instances I don't get any results - just errors. Here is an example calibration data set, representative of my experiment. The instrument software had no problem fitting it:

x <- structure(list(
  SPL = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 7L, 7L),
                  .Label = c("St1", "St2", "St3", "St4", "St5", "St6", "St7"),
                  class = "factor"),
  MFI = c(10755.5, 9839, 5142.5, 4857, 1510.5, 1505, 502.5, 451, 215, 195.5,
          58, 57, 15, 15),
  nom = c(206, 206, 125, 125, 68, 68, 38, 38, 24, 24, 13, 13, 6.5, 6.5),
  weights = c(0.0013946353028683, 0.00152454517735542, 0.00291686922702965,
              0.00308832612723904, 0.0099304865938431, 0.00996677740863787,
              0.0298507462686567, 0.0332594235033259, 0.0697674418604651,
              0.0767263427109974, 0.258620689655172, 0.263157894736842, 1, 1)),
  .Names = c("SPL", "MFI", "nom", "weights"),
  row.names = c(NA, -14L), class = "data.frame")

And here is the nls fit: fit <-
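For readers who hit the same "step factor" stop with these data, one workaround worth trying (my suggestion, not something proposed in this thread) is the Levenberg-Marquardt fitter nlsLM() from the CRAN package minpack.lm, which is generally more tolerant of difficult surfaces than plain nls(). The sketch below re-creates the posted calibration data compactly; the weights reproduce the poster's 1/y scheme (each weight equals min(MFI)/MFI, which matches the posted values).

```r
# Sketch only: assumes minpack.lm is installed from CRAN.
library(minpack.lm)

# Re-create the posted calibration data.
x <- data.frame(
  SPL = factor(rep(paste0("St", 1:7), each = 2)),
  MFI = c(10755.5, 9839, 5142.5, 4857, 1510.5, 1505, 502.5, 451,
          215, 195.5, 58, 57, 15, 15),
  nom = rep(c(206, 125, 68, 38, 24, 13, 6.5), each = 2))
x$weights <- min(x$MFI) / x$MFI   # the poster's 1/y weighting, scaled to max 1

# Same 5PL model and starting values as in the thread, but fitted with
# Levenberg-Marquardt instead of nls()'s default Gauss-Newton.
fit <- nlsLM(MFI ~ a + b / ((1 + (nom / c)^d)^f),
             data = x, weights = x$weights,
             start = c(a = 100, b = 1, c = 100, d = -1, f = 1))
coef(fit)
```

Even if this converges, Bert's caveat stands: with only seven concentrations the five-parameter fit is near-degenerate, so the standard errors will remain enormous.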
Re: [R] Cannot read or write to file in Linux Ubuntu
Thanks. I had not realised there were relative paths until Sarah mentioned them. It's working now: see my post to Sarah. John Kane Kingston ON Canada

-Original Message-
From: jdnew...@dcn.davis.ca.us
Sent: Thu, 03 May 2012 09:30:10 -0700
To: jrkrid...@inbox.com, r-help@r-project.org
Subject: Re: [R] Cannot read or write to file in Linux Ubuntu

All of your tests are with relative paths. Use getwd() to identify your starting directory, and if it isn't the right one you can use setwd() to start in the right place.
---
Jeff Newmiller, DCN: jdnew...@dcn.davis.ca.us, Research Engineer (Solar/Batteries/Software/Embedded Controllers)
Sent from my phone. Please excuse my brevity.
---

John Kane jrkrid...@inbox.com wrote: I am the proud owner of a new laptop since my old one died the other day. Currently I have a dual-boot Windows 7 Home and Ubuntu 12.04. I'll leave the Windows problems for another post. I know practically nothing about Linux, so I am probably doing something stupid, but at the moment I cannot seem to read or write files in Ubuntu. I am not having any problem saving other documents to the hard drive, and R, from my few simple tests, seems to be working okay otherwise. At the moment I am trying:

mydata <- read.csv("DATA/media/DATA/rdata/tt1.csv", header = TRUE)

or

mydata <- read.csv("DATA/rdata/tt1.csv", header = TRUE)

where tt1.csv is a text file on what, from my reading of the path listed in gedit, is DATA/media/DATA/rdata. The csv data is simply:

aa, bb
2, 3
4, 5

What happens:

> mydata <- read.csv("DATA/rdata/tt1.csv", header = TRUE)
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") : cannot open file 'DATA/rdata/tt1.csv': No such file or directory

Am I totally screwing up the path? Or doing something else equally stupid?
BTW I realise that 2.15 is out, but Ubuntu as of yesterday did not have it in the repositories and I have yet to figure out how to install it from a CRAN site.

> sessionInfo()
R version 2.14.1 (2011-12-22)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

John Kane Kingston ON Canada
Re: [R] Cannot read or write to file in Linux Ubuntu
Thanks Sarah, I suspected something like that but am still groping around in Linux. I vaguely remember how to cd to someplace. Shades of DOS 3.2! Or was that Unix... or both? Also, I think I was trying to be a bit too smart-alecky in where I was placing my data folder, so I moved it to my home folder to simplify figuring out the path. Still thinking in Windows terms. After a bit of trial and error:

john@john-K53U:~$ cd /home/john/rdata
john@john-K53U:~/rdata$ dir
tti.csv
john@john-K53U:~/rdata$ pwd
/home/john/rdata

so

mydata <- read.csv("/home/john/rdata/tti.csv", header = TRUE)

works just fine. I like the idea of staying with absolute paths. I am most appreciative. John Kane Kingston ON Canada

-Original Message-
From: sarah.gos...@gmail.com
Sent: Thu, 3 May 2012 12:29:14 -0400
To: jrkrid...@inbox.com
Subject: Re: [R] Cannot read or write to file in Linux Ubuntu

Hi John, You're probably messing up the path, just as you suspect. If you use a relative path, like you are doing, then R looks for that location starting at R's current working directory, visible with getwd(). For Linux, that's the location at which you started R, if you started it from a terminal. The safest solution is to use an absolute path, which will likely be something resembling /home/john/DATA/... etc. - note that it will always start with a / and go from there. If you know how to start a terminal window and cd to where your file is, pwd at the command prompt will give you the absolute path to that location, which is what you should be using until you get more comfortable with the file system. The error message means that R can't find the directory you're telling it to use. Sarah

On Thu, May 3, 2012 at 12:21 PM, John Kane jrkrid...@inbox.com wrote: I am the proud owner of a new laptop since my old one died the other day. Currently I have a dual-boot Windows 7 Home and Ubuntu 12.04. I'll leave the Windows problems for another post.
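For anyone else working through the same confusion, the working-directory mechanics can be checked safely with a throwaway file (tempdir() here stands in for the real data directory):

```r
# getwd() shows where R resolves relative paths from; list.files() shows
# what is visible there.
getwd()
list.files()

# An absolute path works no matter what the working directory is.
f <- file.path(tempdir(), "tti.csv")
write.csv(data.frame(aa = c(2, 4), bb = c(3, 5)), f, row.names = FALSE)
mydata <- read.csv(f, header = TRUE)
mydata
##   aa bb
## 1  2  3
## 2  4  5
```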
<rest of the original post quoted in full; snipped>

-- Sarah Goslee http://www.functionaldiversity.org
Re: [R] Help with readBin
On 03/05/2012 12:41 PM, kapo coulibaly wrote: I'm trying to read a binary file created by a Fortran code using readBin and readChar. Everything reads fine (integers and strings) except for double precision numbers: they come back as huge or very small numbers (1e-250, ...). I tried various endianness and swap settings, but nothing has worked so far. I also tried R 64-bit for Linux and Windows (R 2.14) and R 2.11 on Windows XP 32-bit. Any help would be appreciated.

As I wrote to someone else with a similar problem a couple of weeks ago: you need to see what's in the file. The hexView package can dump it in various formats; see example(viewFormat) for a couple. Duncan Murdoch
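The "huge or very small number" symptom is exactly what a byte-order mismatch produces, which can be demonstrated in a few self-contained lines (a sketch, not the poster's actual file):

```r
# Write one double big-endian, then read it back both ways.
f <- tempfile()
con <- file(f, "wb")
writeBin(1.5, con, size = 8, endian = "big")
close(con)

con <- file(f, "rb")
wrong <- readBin(con, what = "double", n = 1, size = 8, endian = "little")
close(con)

con <- file(f, "rb")
right <- readBin(con, what = "double", n = 1, size = 8, endian = "big")
close(con)

right  # 1.5
wrong  # a nonsense denormal, on the order of 1e-319
```

If the values are plausible with one endian= setting and absurd with the other, byte order (or the low-level record layout, e.g. gfortran's 4-byte record markers) is the thing to fix.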
[R] How to replace NA with zero (0)
Hello, When I generate data with the code below, NAs appear as part of the generated data; I would prefer to have zero (0) instead of NA. Is there a command I can issue to replace the NA with zero (0), even after generating the data? Thank you

library(survival)
p1 <- 0.8; b <- 1.5; rr <- 1000
for (i in 1:rr) {
  r <- runif(45, min = 0, max = 1)
  t <- rweibull(45, p1, b)
  w <- Surv(r, t, type = "interval2")
  x[1:45] <- w[, 1]
  u <- x[1:45]
  y[1:45] <- w[, 2]
  v <- y[1:45]
}
w
u
v

Chris G Researcher Institute for Mathematical Research UPM
[R] Permute/bootstrap F-statistics
Hi guys, I really like the package adegenet for population structure analysis. I used the function Fst from the pegas package (wrapped within adegenet) to generate F-statistic estimates for my data set. However, this does not provide me with confidence intervals or p-values. I was wondering if there is a way to generate bootstrapped/permuted p-values or confidence intervals? Thanks, DF
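In the absence of a built-in interval, a generic permutation scheme - shuffle the population labels, recompute the statistic - can be wrapped around almost any estimator. The sketch below uses a hypothetical placeholder statistic (my_stat, the absolute difference in group means) purely as a stand-in; pegas's actual Fst call would slot in where my_stat is invoked, and its exact interface is not verified here.

```r
# Generic label-permutation p-value; my_stat is a placeholder statistic
# taking (values, population labels) and returning one number.
perm_test <- function(values, pops, my_stat, n_perm = 999) {
  obs  <- my_stat(values, pops)
  perm <- replicate(n_perm, my_stat(values, sample(pops)))
  # +1 correction: the observed arrangement counts as one permutation
  p <- (sum(perm >= obs) + 1) / (n_perm + 1)
  list(observed = obs, p.value = p,
       null.quantiles = quantile(perm, c(0.025, 0.975)))
}

# Toy usage: two groups of 20 with different means.
set.seed(1)
vals <- c(rnorm(20, mean = 0), rnorm(20, mean = 1))
pops <- rep(c("A", "B"), each = 20)
my_stat <- function(v, p) abs(diff(tapply(v, p, mean)))
res <- perm_test(vals, pops, my_stat)
res$p.value
```

A bootstrap confidence interval would instead resample individuals within populations and take quantiles of the resampled statistics.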
Re: [R] Cannot read or write to file in Linux Ubuntu
I like the idea of staying with absolute paths.

Before you write too much R code that builds in absolute paths, please consider how difficult it will be to adjust all of those paths if you need to run on a different computer or you need to reorganize your overall directory structure. If you keep related R files in the same project directory, you can collapse all of those paths down to short relative paths and do one setwd() at the beginning, or learn to manually set your base working directory as a matter of habit before each working session. (This habit is useful in more areas than just R programming.)
---
Jeff Newmiller, DCN: jdnew...@dcn.davis.ca.us, Research Engineer (Solar/Batteries/Software/Embedded Controllers)
Sent from my phone. Please excuse my brevity.
---

John Kane jrkrid...@inbox.com wrote: Thanks Sarah, I suspected something like that but am still groping around in Linux. <earlier messages in this thread quoted in full; snipped>
[R] Runtime column name creation
Hello all, I have a data frame with column names s1, s2, s3, ..., s11. I have a function that takes two parameters: one is used as a subscript for the column names and another as an index into the chosen column. For example:

my_func <- function(subscr, index) {
  if (subscr == 1) {
    df$s1[index] <- some_value
  }
}

The problem is, I do not want to create a bunch of if statements (one for each of the 1:11 column names). Instead, I want to create the column name at run time based on the subscr value. I tried

eval(as.name(paste("df$s", subscr, sep = "")))[index] <- some_value

and it complains that object 'df$s1' is not found. Could someone please help me with this? (Needless to say, I have just started programming in R.) Thanks, Shankar
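One direct way to do this - shown here as a sketch with made-up values - is to index the column by its name with [[ ]], which accepts a character string, avoiding eval() entirely. Note also that R functions work on a copy, so the modified frame has to be returned (or assigned with <<-).

```r
df <- data.frame(s1 = 1:5, s2 = 6:10, s3 = 11:15)

my_func <- function(df, subscr, index, value) {
  col <- paste("s", subscr, sep = "")  # build the column name at run time
  df[[col]][index] <- value            # [[ ]] takes a character name
  df                                   # return the modified copy
}

df <- my_func(df, 2, 3, 99)
df$s2  # 6 7 99 9 10
```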
Re: [R] breaking a label in two lines when using expression()
Your code did not run on my computer; however, the atop() function should do what you are looking for, I guess. This is more or less your axis - I only changed your formula a bit; I think it looks better this way. E.g.:

par(mar = c(5, 7, 0.5, 0.5), las = 1, adj = 0.5, cex.lab = 1.5)
plot(1, type = "n", xlab = "Week",
     ylab = expression(atop("Mean Respiration Rate",
                            umol %.% L^-1 %.% g ~ (AFDM)^-1)))
[R] overlapping confidence bands for predicted probabilities from a logistic model
Dear list, I'm a bit perplexed why the 95% confidence bands for the predicted probabilities for units where x=0 and x=1 overlap in the following instance. I've simulated binary data to which I've then fitted a simple logistic regression model, with one covariate, and the coefficient on x is statistically significant at the 0.05 level. I've then used two different methods to generate 95% confidence bands for predicted probabilities, for each of two possible values of x. Given the result of the model, I expected the bands not to overlap… but they do. Can anyone please explain where I've gone wrong with the following code, OR why it makes sense for the bands to overlap, despite the statistically significant beta coefficient? There may be a good statistical reason for this, but I'm not aware of it. Many thanks, Malcolm Fairbrother

n <- 120
set.seed(030512)
x <- rbinom(n, 1, 0.5)
dat <- within(data.frame(x), ybe <- rbinom(n, 1, plogis(-0.5 + x)))
mod1 <- glm(ybe ~ x, dat, family = binomial)
summary(mod1)  # coefficient on x is statistically significant at the 0.05 level… almost at the 0.01 level
pred <- predict(mod1, newdata = data.frame(x = c(0, 1)), se.fit = TRUE)
with(pred, cbind(low = plogis(fit - 1.96 * se.fit), est = plogis(fit),
                 up = plogis(fit + 1.96 * se.fit)))  # confidence bands based on SEs

# simulation-based confidence bands:
sims <- t(replicate(200, coef(glm(simulate(mod1)$sim_1 ~ x, data = dat, family = binomial))))
pred0 <- plogis(quantile(sims %*% c(1, 0), c(0.025, 0.5, 0.975)))
pred1 <- plogis(quantile(sims %*% c(1, 1), c(0.025, 0.5, 0.975)))
rbind(pred0, pred1)
# the upper bound of the prediction for x=0 is greater than the lower bound
# of the prediction for x=1, using both methods
[R] Creating a point pattern with cartesian co-ordinates
I have the following data from an image analysis program, in which the x and y co-ordinates are locations of the centroids of shapes on a two-dimensional plot. The Y co-ordinates were positive, but I changed them to negative as the resulting scatterplot was upside down (the image analysis program reads from the top of the image to the bottom, so it seems). I now need these as point pattern data, as I want to subsequently add marks to them (area of shape). Can anybody assist?

X Y
159.163 -165.203 162.469 -233 167.77 -213.503 169.988 -174.764 172.113 -246.258 172.494 -232.865 177.855 -213.164 178.568 -250.038 179.486 -186.033 180.208 -244.185 184.28 -170.754 185.447 -181.114 185.597 -193.815 188.779 -204.507 192.978 -220.553 198.208 -224.535 199.346 -185.853 200.3 -229.917 204.028 -190.552 205.1 -217.909 208.686 -242.057 212.575 -243.511 212.942 -223.68 212.994 -259.469 215.643 -200.613 217.51 -219.069 218.193 -225.037 220.049 -172.226 220.181 -251.714 221.946 -185.973 222.738 -195.468 223.08 -177.071 225.058 -262.459 225.265 -220.209 228.767 -199.047 229.196 -188.481 230.268 -245.925 233.604 -207.356 234.708 -247.057 234.874 -213.096 239.822 -251.908 240.771 -178.647 242.287 -193.157 242.468 -249.385 243.207 -233.702 247.955 -196.673 248.178 -252.502 250.82 -147.687 252.076 -209.894 253.883 -199.777 255.635 -251.216 257.03 -78.085 257.108 -229.982 258.46 -119.812 258.641 -218.726 258.953 -166.529 259.216 -150.816 260.544 -146.563 260.557 -84.336 260.59 -67.364 260.742 -190.558 262.247 -79.629 262.316 -86.505 262.436 -227.381 263.622 -127.775 263.814 -85.251 263.874 -94.635 264.945 -75.656 265.038 -223.406 265.313 -204.995 269.082 -183.29 269.428 -209.686 269.984 -246.628 270.371 -84.1 271.263 -72.811 271.886 -195.877 272.59 -250.311 273.572 -97.019 274.086 -199.21 274.329 -235.096 276.979 -77.21 277.313 -246.859 278.096 -188.909 278.223 -68.026 278.35 -139.616 280.89 -195.992 281.413 -78.605 282.904 -92.113 284.499 -231.532 285.708 -240.821 286.862 -221.241 287.679
-135.925 290.702 -68.571 290.744 -76.063 291.361 -239.134 292.471 -89.893 292.915 -229.442 293.193 -99.059 294.381 -81.307 295.605 -141.156 298.543 -100.885 298.689 -239.704 299.711 -137.435 299.925 -221.066 300.618 -130.093 302.588 -93.782 304.273 -203.95 304.766 -194.382 305.091 -110.146 306.04 -88.153 306.761 -186.272 307.554 -173.64 307.778 -80.67 310.273 -87.523 310.368 -103.525 310.621 -71.728 311.168 -193.219 311.29 -198.237 311.338 -168.259 311.925 -115.413 312.178 -203.015 312.577 -98.959 314.495 -81.542 315.035 -185.075 315.256 -92.839 316.889 -196.098 317.982 -95.894 318.453 -181.028 320.048 -67.681 320.888 -88.357 321.327 -75.035 322.997 -104.616 323.373 -95.794 324.414 -193.768 325.789 -84.277 327.046 -89.805 328.459 -78.971 329.2 -251.549 329.913 -72.674 332.794 -263.695 333.451 -200.654 333.756 -124.516 334.524 -174.331 335.301 -213.95 335.759 -166.324 335.848 -210.629 336.07 -75.13 337.105 -87.3 337.78 -178.097 338.997 -65.673 340.501 -184.499 341.346 -193.458 341.55 -230.801 342.054 -80.684 343.082 -202.06 344.873 -69.301 345.798 -232.187 347.177 -201.136 347.53 -103.792 348.168 -87.125 349.156 -236.411 349.531 -252.794 350.503 -92.074 350.79 -116.177 350.953 -177.364 354.308 -123.398 354.417 -237.45 355.424 -197.235 358.092 -164.202 361.405 -240.484 363.009 -213.804 364.806 -243.099
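For the spatstat route (assuming that is the package intended - the post does not name one), ppp() builds a marked point pattern directly from the x and y vectors. The sketch below uses the first few centroids from the post and invented area marks:

```r
library(spatstat)  # assumed installed from CRAN

x <- c(159.163, 162.469, 167.770, 169.988)    # first centroids from the post
y <- c(-165.203, -233.000, -213.503, -174.764)
area <- c(12.1, 8.4, 9.9, 15.2)               # hypothetical mark values

# The observation window here is just the bounding box of the points;
# use the true image extent if it is known.
pp <- ppp(x, y, window = owin(xrange = range(x), yrange = range(y)),
          marks = area)
npoints(pp)  # 4
```

Marks can also be attached later via marks(pp) <- area, and plot(pp) will scale the symbols by the mark values.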
Re: [R] Help with readBin
I believe this is the structure of the file I'm trying to read: record marker (4 bytes), 2 integers (4 bytes each), 2 doubles (8 bytes each), one string (16 bytes, i.e. 16 characters), 3 integers (4 bytes each), 1 record marker (4 bytes), and a big array of doubles (8 bytes each). Everything in the file is read correctly except for the doubles. If it's any indication, I've read similar files before with readBin; the only difference is this one was created with code compiled with gfortran on Linux 64-bit. I was able to read the same output binary file when the Fortran source code was compiled on Windows XP 32-bit. The values I'm expecting should be between 0 and about 32. The code I used is:

# Loading required libraries
library(tcltk)

# Tk input box function
inputBox <- function() {
  tt <- tktoplevel()
  Zmin <- tclVar(0)
  Zmax <- tclVar(0)
  dZ <- tclVar(0)
  entry.Zmin <- tkentry(tt, width = 20, textvariable = Zmin)
  entry.Zmax <- tkentry(tt, width = 20, textvariable = Zmax)
  entry.dZ <- tkentry(tt, width = 20, textvariable = dZ)
  lbl.Zmin <- tklabel(tt, text = "Number of layers")
  lbl.Zmax <- tklabel(tt, text = "Number of Stress Periods")
  lbl.dZ <- tklabel(tt, text = "dZ")
  tkgrid(lbl.Zmin, entry.Zmin)
  tkgrid(entry.Zmin)
  tkgrid(lbl.Zmax, entry.Zmax)
  tkgrid(entry.Zmax)
  # tkgrid(lbl.dZ, entry.dZ)
  # tkgrid(entry.dZ)
  OnOK <- function() {
    # NameVal <- c(tclvalue(Zmin), tclvalue(Zmax), tclvalue(dZ))
    tkdestroy(tt)
  }
  OK.but <- tkbutton(tt, text = " OK ", command = OnOK)
  # tkbind(entry.Name, "<Return>", OnOK)
  tkgrid(OK.but, columnspan = 3)
  tkfocus(tt)
  tkwait.window(tt)
  res <- as.numeric(c(tclvalue(Zmin), tclvalue(Zmax)))  # , tclvalue(dZ)
  return(res)
}

# Main program
# Model parameters input (number of layers and stress periods)
param <- inputBox()

# Select and open Modflow binary file for reading
fich <- tclvalue(tkgetOpenFile(title = "Modflow Binary File",
                               filetypes = "{{hds binary Files} {.hds}} {{All files} *}"))
zz <- file(fich, "rb")

# Cycling through time steps and layers
for (k in 1:param[2]) {
  for (i in 1:param[1]) {
    readBin(zz, what = "numeric", n = 1, size = 4)  # record marker typical of Fortran access="sequential" in gfortran
    N1 <- readBin(zz, what = "integer", n = 2, size = 4)
    N2 <- readBin(zz, what = "double", n = 2, size = 8)
    txt1 <- readChar(zz, 16)
    print(txt1)
    N3 <- readBin(zz, what = "integer", n = 3, size = 4)
    tnber <- N3[1] * N3[2]
    readBin(zz, what = "integer", n = 1, size = 4)  # record marker typical of Fortran access="sequential" in gfortran
    N4 <- readBin(zz, what = real(), n = tnber, size = 4)
    readBin(zz, what = "integer", n = 2, size = 4)  # record markers typical of Fortran access="sequential" in gfortran
    print(N4[1:10])
  }
}
close(zz)

On Thu, May 3, 2012 at 1:26 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: <original question snipped> As I wrote to someone else with a similar problem a couple of weeks ago: you need to see what's in the file. The hexView package can dump it in various formats; see example(viewFormat) for a couple. Duncan Murdoch
[R] Error with the 'segmented' package for R
Hi everyone, I have encountered this problem while using the 'segmented' package for R i386 2.15.0 (for Windows 32-bit OS) and I just cannot find either an explanation or a solution for it. I am trying to run this data:

gpp   temp
1.661    5
5.028   10
9.772   15
8.692   20
5.693   25
6.293   30
7.757    5
4.604   10
8.763   15
8.134   20
4.616   25
8.417   30
3.483    5
5.046   10
8.306   15
9.142   20
4.686   25
7.301   30

With 'segmented' I wanted to find the breakpoint of the curve; however, I get this error message:

Error in 1:ncol(U) : argument of length 0
In addition: Warning message:
In o0$boot.restart <- ris : Coercing LHS to a list

I was using the following code:

library(segmented)
curva <- read.table("gppdata.txt", header = TRUE)
attach(curva)
fit.glm <- glm(gpp ~ temp, weights = NULL, family = gaussian)
fit.seg <- segmented(fit.glm, seg.Z = ~temp, psi = 15)
summary(fit.seg)

Is anyone able to help me and point me to the source of the problem? Thanks for your help. Regards, Szymon
Re: [R] Runtime column name creation
Hello, Don't cross-post, please. You could even have saved yourself some time: the answer has already been given on R-devel. Rui Barradas
Re: [R] How to replace NA with zero (0)
On Thu, May 3, 2012 at 10:43 AM, Christopher Kelvin chris_kelvin2...@yahoo.com wrote: Is there a command I can issue to replace the NA with zero (0) even if it is after generating the data?

Chris, I didn't try your example code, so this suggestion is far more general, but you might try something along the lines of:

x[which(is.na(x))] <- 0

Best, James
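As a small toy illustration of that suggestion: is.na() already returns a logical mask that can index directly (the which() wrapper works too but is not needed), and the same pattern extends to whole data frames.

```r
x <- c(1, NA, 3, NA, 5)
x[is.na(x)] <- 0         # logical indexing; which() is optional
x                        # 1 0 3 0 5

d <- data.frame(a = c(1, NA), b = c(NA, 4))
d[is.na(d)] <- 0         # works element-wise on a data frame too
```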
Re: [R] Cannot read or write to file in Linux Ubuntu
On Thu, May 3, 2012 at 1:53 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote: I like the idea of staying with absolute paths. Before you write too much R code that builds in absolute paths, please consider how difficult it will be to adjust all of those paths if you need to run on a different computer or you need to reorganize your overall directory structure. If you keep related R files in the same project directory, you can collapse all of those paths down to short relative paths and do one setwd() at the beginning, or learn to manually set your base working directory as a matter of habit before each working session. (This habit is useful in more areas than just R programming.)

I agree with this, which is why I suggested you use absolute paths *until you get more comfortable with the file system.* It's a good way to diagnose your problem and figure out how to deal with paths on Linux, but not a good long-term strategy unless you expect that you will never ever move anything or change to a new computer. An intermediate solution that I use a lot is to put something like this at the beginning of an R script file:

basepath <- "/home/sarahg/whatever"

and then load files using something like

read.table(paste(basepath, "plantdata.csv", sep = "/"))

This eases portability between computers: I exchange a lot of analyses with postdocs, students and techs, and somehow they've not all become convinced that my way of organizing directories is the best one. Using an object with the absolute path for the data means that we can pass things back and forth while changing only one value within the script rather than all input/output commands. Sarah

-- Sarah Goslee http://www.functionaldiversity.org
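A small refinement of the same idea: file.path() supplies the separator itself, so it is a drop-in replacement for the paste(..., sep = "/") call (the path below is the one from the post, to be adjusted per machine):

```r
basepath <- "/home/sarahg/whatever"      # adjust per machine
file.path(basepath, "plantdata.csv")     # "/home/sarahg/whatever/plantdata.csv"
```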
Re: [R] overlapping confidence bands for predicted probabilities from a logistic model
Your post suggests statistical confusion on several levels. But as this concerns statistics, not R, consult your local statistician or post on a statistical help list (e.g. stats.stackexchange.com). Incidentally, the short answer to your question about the overlap is: why shouldn't they? You will hopefully receive a more informative longer answer from the above suggested (or similar) resources. -- Bert On Thu, May 3, 2012 at 9:37 AM, Malcolm Fairbrother m.fairbrot...@bristol.ac.uk wrote: Dear list, I'm a bit perplexed why the 95% confidence bands for the predicted probabilities for units where x=0 and x=1 overlap in the following instance. I've simulated binary data to which I've then fitted a simple logistic regression model, with one covariate, and the coefficient on x is statistically significant at the 0.05 level. I've then used two different methods to generate 95% confidence bands for predicted probabilities, for each of two possible values of x. Given the result of the model, I expected the bands not to overlap… but they do. Can anyone please explain where I've gone wrong with the following code, OR why it makes sense for the bands to overlap, despite the statistically significant beta coefficient? There may be a good statistical reason for this, but I'm not aware of it. 
Many thanks, Malcolm Fairbrother
n <- 120
set.seed(030512)
x <- rbinom(n, 1, 0.5)
dat <- within(data.frame(x), ybe <- rbinom(n, 1, plogis(-0.5 + x)))
mod1 <- glm(ybe ~ x, data=dat, family=binomial)
summary(mod1) # coefficient on x is statistically significant at the 0.05 level… almost at the 0.01 level
pred <- predict(mod1, newdata=data.frame(x=c(0,1)), se.fit=TRUE)
with(pred, cbind(low = plogis(fit - 1.96*se.fit), est = plogis(fit), up = plogis(fit + 1.96*se.fit))) # confidence bands based on SEs
# simulation-based confidence bands:
sims <- t(replicate(200, coef(glm(simulate(mod1)$sim_1 ~ x, data=dat, family=binomial))))
pred0 <- plogis(quantile(sims %*% c(1,0), c(0.025, 0.5, 0.975)))
pred1 <- plogis(quantile(sims %*% c(1,1), c(0.025, 0.5, 0.975)))
rbind(pred0, pred1)
# the upper bound of the prediction for x=0 is greater than the lower bound of the prediction for x=1, using both methods
-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
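For what it's worth, the overlap Bert alludes to has a simple arithmetic core: the standard error of a difference is sqrt(se1^2 + se2^2), which is smaller than se1 + se2, so two 95% intervals can overlap while the difference between them is still significant. A minimal sketch with made-up numbers (not Malcolm's fitted values):

```r
# Two estimates with equal standard errors, chosen so that the test is
# significant at 0.05 yet the individual 95% intervals overlap.
m1 <- 0;   se1 <- 0.1
m2 <- 0.3; se2 <- 0.1
z <- (m2 - m1) / sqrt(se1^2 + se2^2)   # about 2.12
p <- 2 * pnorm(-abs(z))                # about 0.034, below 0.05
upper1 <- m1 + 1.96 * se1              # 0.196
lower2 <- m2 - 1.96 * se2              # 0.104, so the intervals overlap
```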
Re: [R] Identifying the particular X or Y in a sorted list
Dear All, Thank you very much in advance. I have a data set as shown below: A (patient ID), B and C are the concentration of drug in blood on day 1 and day 4, and D is the difference in concentration. In R I have written code as follows and identified the number of patients who have a higher concentration on day 4. Now I want to identify specifically which patient IDs (is it patient 1, or 2, or 5 and 7?) have the higher concentration. How do I write code to get the list of A (patient IDs whose difference is higher on day 4)?
A: 1 2 3 4 5
B: 7 2 3 6 9
C: 4 6 9 2 5
(B-C): 3 -4 -6 4 4
DF1 <- list(A, B, C)
DF1
DF2 <- (DF1$C - DF1$B)
length(DF2)
sum(DF2 > 0)
# I want to subtract B from C to see and identify how many patients have greater concentrations and who these patients (A) are.
On Thu, May 3, 2012 at 12:15 PM, John Kane jrkrid...@inbox.com wrote: You do not seem to have supplied either code or data. Please supply both. John Kane Kingston ON Canada -Original Message- From: shankarla...@gmail.com Sent: Wed, 2 May 2012 22:06:54 -0400 To: r-help@r-project.org Subject: [R] Identifying the particular X or Y in a sorted list Dear All, I have a data set as shown below: A (patient ID), B and C are the concentration of drug in blood on day 1 and day 4, D is the difference in conc. How do I write code to get the list of A (patient IDs whose difference is higher on day 4)?
Data <- (myDf$B - myDf$C)
sum(Data > 0)
A B C D(B-C)
1 14 10 4
2 12 7 5
3 11 15 -4
4 8 3 5
5 1 8 -7
I appreciate your help, thank you very much in advance.
Regards -- Regards, Shankar Lanke Ph.D. Assistant Professor Department of Pharmaceutical Sciences College of Pharmacy The University of Findlay (C) 678-232-3567 (O) 419-434-5448 Fax# 419-434-4390
Re: [R] How to Export an R outcome to an Excel Spreadsheet
The others have made suggestions for csv output. For xlsx output go to CRAN, click on the packages link at the left, then select packages listed by name. Then use your browser to search for the string 'xls'. There are several packages that offer this capability. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 5/1/12 1:41 PM, Paul Bernal paulberna...@gmail.com wrote: Hello R community, I basically created a normal distribution with mean 2500 and standard deviation 450 with a sample of size 50 and assigned it to a variable named genvar2 with the following command:
genvar2 <- rnorm(mean=2500, sd=450, n=50)
Now, the output of genvar2 generates the following: [1] 2478.126 2671.259 2163.879 2440.796 2702.234 1871.514 2525.127 2830.688 [9] 2704.148 3464.478 2609.795 3368.288 2661.613 2731.901 2535.846 2165.461 [17] 1870.069 3513.533 2053.342 2447.887 2605.913 2188.192 2514.004 2965.374 [25] 3550.454 1783.323 2568.323 2324.673 2528.994 2433.895 2751.111 2727.282 [33] 1837.081 1896.721 3445.993 1357.462 2348.177 2368.423 2029.738 2500.372 [41] 2000.008 3088.112 3003.325 2763.740 2475.636 1860.988 2292.909 2134.172 [49] 2291.116 2851.066 I want to export all of these numbers into a column in an Excel file (in .xlsx and .csv format). How can I do this? Any help will be greatly appreciated, Best regards, Paul
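To make the csv route concrete, here is a sketch of what Paul describes; the output path is chosen here purely for illustration:

```r
# Simulate the values and write them as a single-column csv that Excel
# can open directly.
set.seed(42)                                 # only so the sketch is reproducible
genvar2 <- rnorm(n = 50, mean = 2500, sd = 450)
out <- file.path(tempdir(), "genvar2.csv")   # illustrative destination
write.csv(data.frame(genvar2 = genvar2), out, row.names = FALSE)
length(read.csv(out)$genvar2)                # 50 values round-trip
```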
Re: [R] Problem with 'nls' fitting logistic model (5PL)
On Wed, May 2, 2012 at 3:32 PM, Michal Figurski figur...@mail.med.upenn.edu wrote: Dear R-Helpers, I'm working with immunoassay data and the 5PL logistic model. I wanted to experiment with different forms of weighting and parameter selection, which is not possible in the instrument software, so I turned to R. I am using R 2.14.2 under Win7 64bit, and 'nls' to fit the model - I started with the same model and weighting type (1/y) as in the instrument to see if I'd get similar results. However, in some instances I don't get any results - just errors. Here is example calibration data, representative of my experiment. The instrument software had no problem fitting it:
x <- structure(list(SPL = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 5L, 6L, 6L, 7L, 7L), .Label = c("St1", "St2", "St3", "St4", "St5", "St6", "St7"), class = "factor"), MFI = c(10755.5, 9839, 5142.5, 4857, 1510.5, 1505, 502.5, 451, 215, 195.5, 58, 57, 15, 15), nom = c(206, 206, 125, 125, 68, 68, 38, 38, 24, 24, 13, 13, 6.5, 6.5), weights = c(0.0013946353028683, 0.00152454517735542, 0.00291686922702965, 0.00308832612723904, 0.0099304865938431, 0.00996677740863787, 0.0298507462686567, 0.0332594235033259, 0.0697674418604651, 0.0767263427109974, 0.258620689655172, 0.263157894736842, 1, 1)), .Names = c("SPL", "MFI", "nom", "weights"), row.names = c(NA, -14L), class = "data.frame")
And here is the nls fit:
fit <- nls(MFI ~ a + b/((1+(nom/c)^d)^f), data=x, weights=x$weights, start=c(a=100, b=1, c=100, d=-1, f=1))
This looks quite linear to me:
fo <- log(MFI) ~ log(nom)
plot(fo, x)
abline(lm(fo, x))
Try using that as the basis for an alternate model. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
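One further option, not raised in the thread: base stats ships a self-starting four-parameter logistic, SSfpl, which avoids hand-picking start values entirely. Whether dropping the fifth (asymmetry) parameter is acceptable for this assay is an open question; the data below are synthetic, not Michal's:

```r
# Fit a 4PL on the log-concentration scale with self-generated starting
# values; SSfpl is A + (B - A) / (1 + exp((xmid - input) / scal)).
set.seed(1)
logx <- seq(1, 6, length.out = 30)
y <- 10 + (10000 - 10) / (1 + exp((3.5 - logx) / 0.5)) + rnorm(30, sd = 50)
d <- data.frame(logx = logx, y = y)
fit4 <- nls(y ~ SSfpl(logx, A, B, xmid, scal), data = d)
coef(fit4)["xmid"]   # recovered inflection point, near the true 3.5
```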
Re: [R] Help with readBin
On 03/05/2012 1:57 PM, kapo coulibaly wrote: I believe here is the structure of the file I'm trying to read: record marker (4 bytes), 2 integers (4 bytes each), 2 doubles (8 bytes each), one string (16 bytes or 16 characters), 3 integers (4 bytes each), 1 record marker (4 bytes) and a big array of doubles (8 bytes each). Everything in the file is read correctly except for the doubles. If it's any indication, I've read similar files before with readBin; the only difference is this one was created with code compiled with gfortran in Linux 64 bit. I was able to read the same output binary file when the fortran source code was compiled in Windows XP 32 bit. The values I'm expecting should be between 0 and about 32. The first two doubles read properly as 1, and the next one as 33.674, when I follow your description above on the dump you sent me privately. I'm on a system with endian=little; you might want to specify that explicitly in your readBin calls. And please, in future, spend some time making your questions easier to answer: put the dump in the public message. Don't post irrelevant code; extract just the bit that's doing the reading.
Duncan Murdoch The code I used is:
# Loading required libraries
library(tcltk)
# Tk input box function
inputBox <- function() {
  tt <- tktoplevel()
  Zmin <- tclVar(0)
  Zmax <- tclVar(0)
  dZ <- tclVar(0)
  entry.Zmin <- tkentry(tt, width=20, textvariable=Zmin)
  entry.Zmax <- tkentry(tt, width=20, textvariable=Zmax)
  entry.dZ <- tkentry(tt, width=20, textvariable=dZ)
  lbl.Zmin <- tklabel(tt, text="Number of layers")
  lbl.Zmax <- tklabel(tt, text="Number of Stress Periods")
  lbl.dZ <- tklabel(tt, text="dZ")
  tkgrid(lbl.Zmin, entry.Zmin)
  tkgrid(entry.Zmin)
  tkgrid(lbl.Zmax, entry.Zmax)
  tkgrid(entry.Zmax)
  # tkgrid(lbl.dZ, entry.dZ)
  # tkgrid(entry.dZ)
  OnOK <- function() {
    # NameVal <- c(tclvalue(Zmin), tclvalue(Zmax), tclvalue(dZ))
    tkdestroy(tt)
  }
  OK.but <- tkbutton(tt, text=" OK ", command=OnOK)
  # tkbind(entry.Name, "<Return>", OnOK)
  tkgrid(OK.but, columnspan=3)
  tkfocus(tt)
  tkwait.window(tt)
  res <- as.numeric(c(tclvalue(Zmin), tclvalue(Zmax)))  # , tclvalue(dZ)
  return(res)
}
# Main program
# Model parameters input (number of layers and stress periods)
param <- inputBox()
# Select and open Modflow binary file for reading
fich <- tclvalue(tkgetOpenFile(title="Modflow Binary File", filetypes="{{hds binary Files} {.hds}} {{All files} *}"))
zz <- file(fich, "rb")
# Cycling through time steps and layers
for (k in 1:param[2]) {
  for (i in 1:param[1]) {
    readBin(zz, what="numeric", n=1, size=4)  # record marker typical of fortran access=sequential in gfortran
    N1 <- readBin(zz, what="integer", n=2, size=4)
    N2 <- readBin(zz, what="double", n=2, size=8)
    txt1 <- readChar(zz, 16)
    print(txt1)
    N3 <- readBin(zz, what="integer", n=3, size=4)
    tnber <- N3[1]*N3[2]
    readBin(zz, what="integer", n=1, size=4)  # record marker
    N4 <- readBin(zz, what=real(), n=tnber, size=4)
    readBin(zz, what="integer", n=2, size=4)  # record markers
    print(N4[1:10])
  }
}
close(zz)
On Thu, May 3, 2012 at 1:26 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 03/05/2012 12:41 PM, kapo coulibaly wrote: I'm trying to read a binary file created by a fortran code using readBin and
readChar. Everything reads fine (integers and strings) except for double precision numbers; they are read as huge or very small numbers (1E-250, ...). I tried various endianness and swap settings, but nothing has worked so far. I also tried R 64 bit for Linux and Windows (R 2.14) and R 2.11 on Windows XP 32 bit. Any help would be appreciated. As I wrote to someone else with a similar problem a couple of weeks ago: You need to see what's in the file. The hexView package can dump it in various formats; see example(viewFormat) for a couple. Duncan Murdoch
Re: [R] Identifying the particular X or Y in a sorted list
Have you read An Introduction to R? If I understand correctly, your question is very basic and suggests that you have not yet made even a minimal effort on your own to learn R before asking for help from this list. -- Bert answer: A[C > B] On Thu, May 3, 2012 at 11:14 AM, Shankar Lanke shankarla...@gmail.com wrote: Dear All, Thank you very much in advance. I have a data set as shown below: A (patient ID), B and C are the concentration of drug in blood on day 1 and day 4, D is the difference in concentration. In R I have written code as follows and identified the number of patients who have a higher concentration on day 4. Now I want to identify specifically which patient IDs (is it patient 1, or 2, or 5 and 7?) have the higher concentration. How do I write code to get the list of A (patient IDs whose difference is higher on day 4)?
A: 1 2 3 4 5
B: 7 2 3 6 9
C: 4 6 9 2 5
(B-C): 3 -4 -6 4 4
DF1 <- list(A, B, C)
DF1
DF2 <- (DF1$C - DF1$B)
length(DF2)
sum(DF2 > 0)
# I want to subtract B from C to see and identify how many patients have greater concentrations and who these patients (A) are.
On Thu, May 3, 2012 at 12:15 PM, John Kane jrkrid...@inbox.com wrote: You do not seem to have supplied either code or data. Please supply both. John Kane Kingston ON Canada -Original Message- From: shankarla...@gmail.com Sent: Wed, 2 May 2012 22:06:54 -0400 To: r-help@r-project.org Subject: [R] Identifying the particular X or Y in a sorted list Dear All, I have a data set as shown below: A (patient ID), B and C are the concentration of drug in blood on day 1 and day 4, D is the difference in conc. How do I write code to get the list of A (patient IDs whose difference is higher on day 4)?
Data <- (myDf$B - myDf$C)
sum(Data > 0)
A B C D(B-C)
1 14 10 4
2 12 7 5
3 11 15 -4
4 8 3 5
5 1 8 -7
I appreciate your help, thank you very much in advance. Regards -- Regards, Shankar Lanke Ph.D. Assistant Professor Department of Pharmaceutical Sciences College of Pharmacy The University of Findlay (C) 678-232-3567 (O) 419-434-5448 Fax# 419-434-4390 -- Bert Gunter Genentech Nonclinical Biostatistics
Re: [R] Identifying the particular X or Y in a sorted list
I'm sorry, it's still not clear what you are doing but perhaps this is close?
mydata <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(7, 2, 3, 6, 9),
  c = c(4, 6, 9, 2, 5))
mydata$d <- mydata$b - mydata$c
mydata
subset(mydata, mydata$d == max(mydata$d))
John Kane Kingston ON Canada -Original Message- From: shankarla...@gmail.com Sent: Thu, 3 May 2012 14:14:17 -0400 To: jrkrid...@inbox.com Subject: Re: [R] Identifying the particular X or Y in a sorted list Dear All, Thank you very much in advance. I have a data set as shown below: A (patient ID), B and C are the concentration of drug in blood on day 1 and day 4, D is the difference in concentration. In R I have written code as follows and identified the number of patients who have a higher concentration on day 4. Now I want to identify specifically which patient IDs (is it patient 1, or 2, or 5 and 7?) have the higher concentration. How do I write code to get the list of A (patient IDs whose difference is higher on day 4)?
A: 1 2 3 4 5
B: 7 2 3 6 9
C: 4 6 9 2 5
(B-C): 3 -4 -6 4 4
DF1 <- list(A, B, C)
DF1
DF2 <- (DF1$C - DF1$B)
length(DF2)
sum(DF2 > 0)
# I want to subtract B from C to see and identify how many patients have greater concentrations and who these patients (A) are.
On Thu, May 3, 2012 at 12:15 PM, John Kane jrkrid...@inbox.com wrote: You do not seem to have supplied either code or data. Please supply both. John Kane Kingston ON Canada -Original Message- From: shankarla...@gmail.com Sent: Wed, 2 May 2012 22:06:54 -0400 To: r-help@r-project.org Subject: [R] Identifying the particular X or Y in a sorted list Dear All, I have a data set as shown below: A (patient ID), B and C are the concentration of drug in blood on day 1 and day 4, D is the difference in conc. Here I want to identify specifically which patient IDs have the higher concentration.
How do I write code to get the list of A (patient IDs whose difference is higher on day 4)?
Data <- (myDf$B - myDf$C)
sum(Data > 0)
A B C D(B-C)
1 14 10 4
2 12 7 5
3 11 15 -4
4 8 3 5
5 1 8 -7
I appreciate your help, thank you very much in advance. Regards -- Regards, Shankar Lanke Ph.D. Assistant Professor Department of Pharmaceutical Sciences College of Pharmacy The University of Findlay (C) 678-232-3567 (O) 419-434-5448 Fax# 419-434-4390
Re: [R] Cannot read or write to file in Linux Ubuntu
Thanks Jeff and Sarah. I was thinking mainly of using the base path and paste() routine, which is something I do in Windows. It will take me a while to figure out relative paths. John Kane Kingston ON Canada -Original Message- From: sarah.gos...@gmail.com Sent: Thu, 3 May 2012 14:07:12 -0400 To: jdnew...@dcn.davis.ca.us Subject: Re: [R] Cannot read or write to file in Linux Ubuntu On Thu, May 3, 2012 at 1:53 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote: I like the idea of staying with absolute paths. Before you write too much R code that builds in absolute paths, please consider how difficult it will be to adjust all of those paths if you need to run on a different computer or you need to reorganize your overall directory structure. If you keep related R files in the same project directory, you can collapse all of those paths down to short relative paths, and do one setwd at the beginning, or learn to manually set your base working directory as a matter of habit before each working session. (This habit is useful in more areas than just R programming.) I agree with this, which is why I suggested you use absolute paths *until you get more comfortable with the file system.* It's a good way to diagnose your problem and figure out how to deal with paths on Linux, but not a good long-term strategy unless you expect that you will never ever move anything or change to a new computer. An intermediate solution that I use a lot is to put something like this at the beginning of the R script file: basepath = "/home/sarahg/whatever" and then load files using something like read.table(paste(basepath, "plantdata.csv", sep="/")) This eases portability between computers: I exchange a lot of analyses with postdocs, students and techs, and somehow they've not all become convinced that my way of organizing directories is the best one.
Using an object with the absolute path for the data means that we can pass things back and forth by changing only one value within the script rather than all of the input/output commands. Sarah -- Sarah Goslee http://www.functionaldiversity.org
Re: [R] Help with readBin
I did mention in my initial email that I tried little, big, swap and .Platform$endian without any success; I keep getting the same very small numbers. Thanks On Thu, May 3, 2012 at 2:36 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 03/05/2012 1:57 PM, kapo coulibaly wrote: I believe here is the structure of the file I'm trying to read: record marker (4 bytes), 2 integers (4 bytes each), 2 doubles (8 bytes each), one string (16 bytes or 16 characters), 3 integers (4 bytes each), 1 record marker (4 bytes) and a big array of doubles (8 bytes each). Everything in the file is read correctly except for the doubles. If it's any indication, I've read similar files before with readBin; the only difference is this one was created with code compiled with gfortran in Linux 64 bit. I was able to read the same output binary file when the fortran source code was compiled in Windows XP 32 bit. The values I'm expecting should be between 0 and about 32. The first two doubles read properly as 1, and the next one as 33.674, when I follow your description above on the dump you sent me privately. I'm on a system with endian=little; you might want to specify that explicitly in your readBin calls. And please, in future, spend some time making your questions easier to answer: put the dump in the public message. Don't post irrelevant code; extract just the bit that's doing the reading.
Duncan Murdoch
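Duncan's endian point can be checked in isolation by round-tripping a few doubles through a scratch file, stating the byte order explicitly on both sides (the values are arbitrary, echoing the ones he quotes):

```r
# Write three doubles little-endian, then read them back the same way;
# if the endian arguments disagree, nonsense values like 1E-250 reappear.
f <- tempfile()
con <- file(f, "wb")
writeBin(c(1, 1, 33.674), con, size = 8, endian = "little")
close(con)
con <- file(f, "rb")
vals <- readBin(con, what = "double", n = 3, size = 8, endian = "little")
close(con)
vals   # 1.000 1.000 33.674
```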
Re: [R] deparse(substitute(x)) on an object with S3 class
On Fri, May 4, 2012 at 2:02 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Note that print.testclass(testlist) also works as expected, so it might be that there's a forced evaluation somewhere inside S3 dispatch Indeed. Without evaluating the argument, the S3 method dispatch can't work out which method to dispatch to. In principle it could go back and get a copy of the unevaluated promise, but it doesn't (and that would only move the problem somewhere else, since the evaluation could have side effects that would then happen multiple times). Section 5.4 of the R Language Definition says, in part: When the method is invoked it is called with arguments that are the same in number and have the same names as in the call to the generic. They are matched to the arguments of the method according to the standard R rules for argument matching. However the object, i.e. the first argument, has been evaluated. -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland
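Thomas's point is easy to demonstrate with a toy generic (all names below are invented for the demo): a side effect in the first argument fires before the method body runs, because dispatch needs the evaluated object's class.

```r
describe <- function(x) UseMethod("describe")   # toy generic
describe.myclass <- function(x) paste("class:", class(x)[1])
evaluated <- FALSE
make_obj <- function() {
  evaluated <<- TRUE                            # records that evaluation happened
  structure(list(), class = "myclass")
}
res <- describe(make_obj())
evaluated   # TRUE: the argument was evaluated to pick the method
res         # "class: myclass"
```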
Re: [R] Very small random effect estimation in lmer but not in stata xtmixed
On Thu, May 3, 2012 at 11:50 PM, Mohammed Mohammed m.a.moham...@bham.ac.uk wrote: Hi folks, I am using the lmer function (in the lme4 library) to analyse some data where individuals are clustered into sets (using the SetID variable) with a single fixed effect (cc - 0 or 1). The lmer model and output is shown below. Whilst the fixed effects are consistent with Stata (using xtmixed, see below), the std dev of the random effect for SetID is very, very small (3.5803e-05) compared to Stata's (see below, 1.002). Any ideas why this should be happening please? You're not fitting the same model. Like SAS, Stata by default assumes that random effects are independent of each other, so your Stata model has the correlation between the random effects forced to zero. The R model estimates the correlation, and finds it to be far from zero (-0.69). -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland
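In lme4's formula language the two parameterisations Thomas contrasts look like this; the variable names follow the poster's description, and the lmer calls are left commented since the original data are not available:

```r
# (cc | SetID) estimates the intercept-slope correlation; splitting the
# term in two forces that correlation to zero, matching Stata's and
# SAS's default of independent random effects.
f_correlated   <- y ~ cc + (cc | SetID)
f_uncorrelated <- y ~ cc + (1 | SetID) + (0 + cc | SetID)
# lme4::lmer(f_correlated, data = dat)    # what the poster fitted in R
# lme4::lmer(f_uncorrelated, data = dat)  # comparable to xtmixed's default
```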
Re: [R] Identifying the particular X or Y in a sorted list
Dear John, Thank you very much for your response; I appreciate your input. I am able to subtract the two columns (B - C). The subset information I need is how many As there are and which As they are. For example, persons P, Q, R, S, T earned $7, 2, 3, 6, 9 in the 1st month and $4, 6, 9, 2, 5 in the 2nd month. I want to identify who earned more in the 1st month, plus the difference (only if it is positive). In this case P, S, T earned $3, 4, 4 more.
mydata <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(7, 2, 3, 6, 9),
  c = c(4, 6, 9, 2, 5))
mydata$d <- mydata$b - mydata$c
mydata
subset(mydata, mydata$d == max(mydata$d))
On Thu, May 3, 2012 at 2:47 PM, John Kane jrkrid...@inbox.com wrote: I'm sorry, it's still not clear what you are doing but perhaps this is close?
mydata <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(7, 2, 3, 6, 9),
  c = c(4, 6, 9, 2, 5))
mydata$d <- mydata$b - mydata$c
mydata
subset(mydata, mydata$d == max(mydata$d))
John Kane Kingston ON Canada -Original Message- From: shankarla...@gmail.com Sent: Thu, 3 May 2012 14:14:17 -0400 To: jrkrid...@inbox.com Subject: Re: [R] Identifying the particular X or Y in a sorted list Dear All, Thank you very much in advance. I have a data set as shown below: A (patient ID), B and C are the concentration of drug in blood on day 1 and day 4, D is the difference in concentration. In R I have written code as follows and identified the number of patients who have a higher concentration on day 4. Here I want to identify specifically which patient IDs have the higher concentration.
A: 1 2 3 4 5
B: 7 2 3 6 9
C: 4 6 9 2 5
(B-C): 3 -4 -6 4 4
DF1 <- list(A, B, C)
DF1
DF2 <- (DF1$C - DF1$B)
length(DF2)
sum(DF2 > 0)
# I want to subtract B from C to see and identify how many patients have greater concentrations and who these patients (A) are.
On Thu, May 3, 2012 at 12:15 PM, John Kane jrkrid...@inbox.com wrote: You do not seem to have supplied either code or data. Please supply both. John Kane Kingston ON Canada

-----Original Message----- From: shankarla...@gmail.com Sent: Wed, 2 May 2012 22:06:54 -0400 To: r-help@r-project.org Subject: [R] Identifying the particular X or Y in a sorted list

Dear All, I have a data set as shown below: A (patient ID), B and C are the concentrations of drug in blood on day 1 and day 4, D is the difference in concentration. To do this in R I have written code as follows and identified the number of patients who have a higher concentration on day 4. Here I want to identify specifically the patient IDs (is it patient 1 or 2 or 5 and 7?) whose concentration is higher. How do I write code to get the list of A (patient IDs whose difference is greater on day 4)?

Data <- myDf$B - myDf$C
sum(Data > 0)

A   B   C   D (B-C)
1  14  10   4
2  12   7   5
3  11  15  -4
4   8   3   5
5   1   8  -7

I appreciate your help, thank you very much in advance. Regards

--
Regards, Shankar Lanke Ph.D.
Assistant Professor, Department of Pharmaceutical Sciences, College of Pharmacy, The University of Findlay (C) 678-232-3567 (O) 419-434-5448 Fax# 419-434-4390
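[The answers above only show how to find the row with the maximum difference. For the question actually asked -- who earned more in month 1, and by how much -- a base-R sketch using the hypothetical persons P-T from the example:]

```r
mydata <- data.frame(id     = c("P", "Q", "R", "S", "T"),
                     month1 = c(7, 2, 3, 6, 9),
                     month2 = c(4, 6, 9, 2, 5))
mydata$diff <- mydata$month1 - mydata$month2

# Persons who earned more in month 1, together with the (positive) difference:
higher <- subset(mydata, diff > 0)
higher$id    # "P" "S" "T"
higher$diff  # 3 4 4
```

`subset()` keeps only the rows satisfying the condition, so both the IDs and the differences come out of the same filtered data frame.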
Re: [R] Help with readBin
You can do the following to allow others to recreate your problem.

yourFileBytes <- readBin(yourFile, what = "integer", size = 1, n = 300) # is 300 bytes enough to see the problem?
dput(yourFileBytes)

Put the output of dput(yourFileBytes) in your mail. Someone can (and you should) recreate the problem with

bytes <- ... copy 'n paste the printout of dput(bytes) here ...
tf <- tempfile()
stopifnot(is.integer(bytes), all(abs(bytes) <= 128)) # to make sure bytes was copied correctly
writeBin(bytes, con = tf, size = 1)

Then show just the commands needed to read a couple of rows of your file, along with the expected output, as precisely as you can. E.g.,

con <- file(tf, "rb")
readBin(con, what = "integer", size = 4, n = 2) # expect 3 then something less than 10
readBin(con, what = "numeric", size = 8, n = 3) # expect 2 numbers in range (0, 32] then 2.57
...

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of kapo coulibaly Sent: Thursday, May 03, 2012 10:57 AM To: r-help@r-project.org Subject: Re: [R] Help with readBin

I believe here is the structure of the file I'm trying to read: record marker (4 bytes), 2 integers (4 bytes each), 2 doubles (8 bytes each), one string (16 bytes or 16 characters), 3 integers (4 bytes each), 1 record marker (4 bytes) and a big array of doubles (8 bytes each). Everything in the file is read correctly except for the doubles. If it's any indication, I've read a similar file before with readBin; the only difference is this one was created with code compiled with gfortran on 64-bit Linux. I was able to read the same output binary file when the Fortran source code was compiled on 32-bit Windows XP. The values I'm expecting should be between 0 and about 32.
The code I used is:

# Loading required libraries
library(tcltk)

# Tk input box function
inputBox <- function() {
  tt <- tktoplevel()
  Zmin <- tclVar(0)
  Zmax <- tclVar(0)
  dZ <- tclVar(0)
  entry.Zmin <- tkentry(tt, width = 20, textvariable = Zmin)
  entry.Zmax <- tkentry(tt, width = 20, textvariable = Zmax)
  entry.dZ <- tkentry(tt, width = 20, textvariable = dZ)
  lbl.Zmin <- tklabel(tt, text = "Number of layers")
  lbl.Zmax <- tklabel(tt, text = "Number of Stress Periods")
  lbl.dZ <- tklabel(tt, text = "dZ")
  tkgrid(lbl.Zmin, entry.Zmin)
  tkgrid(entry.Zmin)
  tkgrid(lbl.Zmax, entry.Zmax)
  tkgrid(entry.Zmax)
  # tkgrid(lbl.dZ, entry.dZ)
  # tkgrid(entry.dZ)
  OnOK <- function() {
    # NameVal <- c(tclvalue(Zmin), tclvalue(Zmax), tclvalue(dZ))
    tkdestroy(tt)
  }
  OK.but <- tkbutton(tt, text = " OK ", command = OnOK)
  # tkbind(entry.Name, "<Return>", OnOK)
  tkgrid(OK.but, columnspan = 3)
  tkfocus(tt)
  tkwait.window(tt)
  res <- as.numeric(c(tclvalue(Zmin), tclvalue(Zmax))) # , tclvalue(dZ)))
  return(res)
}

# Main program
# Model parameters input (number of layers and stress periods)
param <- inputBox()

# Select and open Modflow binary file for reading
fich <- tclvalue(tkgetOpenFile(title = "Modflow Binary File",
                               filetypes = "{{hds binary Files} {.hds}} {{All files} *}"))
zz <- file(fich, "rb")

# Cycling through time steps and layers
for (k in 1:param[2]) {
  for (i in 1:param[1]) {
    readBin(zz, what = "numeric", n = 1, size = 4)  # record marker typical of fortran access="sequential" in gfortran
    N1 <- readBin(zz, what = "integer", n = 2, size = 4)
    N2 <- readBin(zz, what = "double", n = 2, size = 8)
    txt1 <- readChar(zz, 16)
    print(txt1)
    N3 <- readBin(zz, what = "integer", n = 3, size = 4)
    tnber <- N3[1] * N3[2]
    readBin(zz, what = "integer", n = 1, size = 4)  # record marker typical of fortran access="sequential" in gfortran
    N4 <- readBin(zz, what = "numeric", n = tnber, size = 4)
    readBin(zz, what = "integer", n = 2, size = 4)  # record marker typical of fortran access="sequential" in gfortran
    print(N4[1:10])
  }
}
close(zz)

On Thu, May 3, 2012 at 1:26 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 03/05/2012 12:41 PM, kapo coulibaly wrote: I'm trying to read a binary file created by a fortran code using readBin and readChar.
Everything reads fine (integers and strings) except for double precision numbers; they are read as huge or very small numbers (1E-250, ...). I tried various endianness settings and swapping, but nothing has worked so far. I also tried R 64-bit for Linux and Windows (R 2.14) and R 2.11 on 32-bit Windows XP. Any help would be appreciated.

As I wrote to someone else with a similar problem a couple of weeks ago: You need to see what's in the file. The hexView package can dump it in various formats; see example(viewFormat) for a couple.

Duncan Murdoch
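[A small self-contained round trip illustrating the endianness issue discussed in this thread -- the file here is a throwaway temp file, not the poster's .hds file. Doubles written with writeBin read back exactly with the native byte order, while reading with the wrong byte order produces precisely the kind of huge or tiny values described:]

```r
tf <- tempfile()
writeBin(c(1.5, 2.25, 32.0), con = tf, size = 8)  # three doubles, native endianness

# Read back with the native byte order: values are exact
con <- file(tf, "rb")
ok <- readBin(con, what = "numeric", size = 8, n = 3)  # 1.5 2.25 32
close(con)

# Read back with swapped byte order: nonsense magnitudes (e.g. 1e-300-ish)
con <- file(tf, "rb")
bad <- readBin(con, what = "numeric", size = 8, n = 3, endian = "swap")
close(con)
```

If the values only come out right with `endian = "swap"`, the file was written on a machine with the opposite byte order; if neither helps, the offsets into the record (e.g. an unexpected record marker) are usually the culprit.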
[R] Identifying case by groups in a data frame
Hi everyone, I would like to identify, within each group, the case that is just bigger than the average plus the standard deviation. For example, using Species as the group and Petal.Width as my variable in the iris data. What's the best way to do it? Creating a function? So the question is to identify the single element of each species that is just larger than a cut-off point (i.e. larger than mean + sd). I made this, but I cannot relate it back to the original data to identify the cases. If you have some light to shed, please share it.

data(iris)
library(plyr)
petal.wid.avg <- ddply(iris, .(Species), function(df)
  return(c(petal.wid.avg = mean(df$Petal.Width),
           petal.wid.sd  = sd(df$Petal.Width))))
petal.wid.avg$avgsd <- petal.wid.avg$petal.wid.avg + petal.wid.avg$petal.wid.sd
petal.wid.avg

Thanks in advance: José
Re: [R] NA's when subset in a dataframe
On Thursday, 3 May 2012 at 07:37 -0700, agent dunham wrote: Dear community, I'm having this silly problem. I have a linear model. After fitting it, I wanted to know which data had studentized residuals larger than 3, so I tried this:

d1 <- cooks.distance(lmmodel)
r <- sqrt(abs(rstandard(lmmodel)))
rstu <- abs(rstudent(lmmodel))
a <- cbind(mydata, d1, r, rstu)
alargerthan3 <- a[rstu > 3, ]

And suddenly a[rstu > 3, ] has 17 rows, 7 of them new rows where all the entries are NA's, even their rownames. Because of this I'm not sure of the dimension of a[rstu > 3, ] (do I only have 8 entries?). Has this happened to anybody before? If so, why the extra NA rows? What's the problem? Is there any other way to know which data have studentized residuals larger than 3? If it's needed to upload my data, just tell me.

A small reproducible example would have been better. Anyway, see page 88 of The R Inferno. In your case, the simplest solutions are to do:

alargerthan3 <- a[which(rstu > 3), ]

or

alargerthan3 <- subset(a, rstu > 3)

Cheers
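[The behaviour described can be reproduced in a few lines with toy data (not the original model): a logical subscript containing NA yields all-NA rows, while which() drops the NAs and subset() treats NA as FALSE:]

```r
x  <- c(2, NA, 5, 1, NA, 4)
df <- data.frame(x = x, y = letters[1:6])

# NA > 3 is NA, so the two NA positions become all-NA rows in the result:
nrow(df[x > 3, ])         # 4 rows: two real matches plus two all-NA rows
nrow(df[which(x > 3), ])  # 2 rows: which() keeps only the TRUE positions
nrow(subset(df, x > 3))   # 2 rows: subset() drops NA comparisons too
```

This is why the poster saw 17 rows instead of the expected count: studentized residuals that are NA (e.g. from dropped observations) turn into phantom NA rows under plain logical indexing.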
[R] May-July 2012 ***R/S-PLUS Courses***by XLSolutions Corp at 9 USA locations
Seattle May 17-18 * The XLSolutions May-July 2012 R/S-PLUS course schedule is now available online at 9 USA cities, with 13 new courses. *** Suggest a future course date/city.

(1) R-PLUS: A Point-and-Click Approach to R
(2) S-PLUS / R: Programming Essentials
(3) R/S+ Fundamentals and Programming Techniques
(4) R/S-PLUS Functions by Example
(5) S/R-PLUS Programming 3: Advanced Techniques and Efficiencies
(6) R/S+ System: Advanced Programming
(7) R/S-PLUS Graphics: Essentials
(8) R/S-PLUS Graphics for SAS Users
(9) R/S-PLUS Graphical Techniques for Marketing Research
(10) Multivariate Statistical Methods in R/S-PLUS: Practical Research Applications
(11) Introduction to Applied Econometrics with R/S-PLUS
(12) Exploratory Analysis for Large and Complex Problems in R/S-PLUS
(13) Determining Power and Sample Size Using R/S-PLUS
(14) R/S-PLUS: Data Preparation for Data Mining
(15) Data Cleaning Techniques in R/S-PLUS
(16) R/S-PLUS: Applied Clustering Techniques

More on the website: http://www.xlsolutions-corp.com/courselistlisting.aspx Ask for the group discount and reserve your seat now - earlybird rates. Payment due after the class! Email Sue Turner: sue at xlsolutions-corp.com Phone: 206-686-1578 Please let us know if you and your colleagues are interested in this class to take advantage of the group discount. Register now to secure your seat. Cheers, Elvis Miller, PhD Manager Training. XLSolutions Corporation 206 686 1578 www.xlsolutions-corp.com elvis at xlsolutions-corp.com
Re: [R] Identifying case by groups in a data frame
Dear José, Here is one way:

# aux. function
foo <- function(x, ...) {
  m <- mean(x, ...)
  S <- sd(x, ...)
  x > m + S
}

# result
iris$rule <- with(iris, ave(Petal.Width, list(Species), FUN = foo))
head(iris)

HTH, Jorge.-

On Thu, May 3, 2012 at 5:19 PM, Jose Bustos Melo wrote: Hi everyone, I would like to identify, within each group, the case that is just bigger than the average plus the standard deviation. For example, using Species as the group and Petal.Width as my variable in the iris data. What's the best way to do it? Creating a function? So the question is to identify the single element of each species that is just larger than a cut-off point (i.e. larger than mean + sd). I made this, but I cannot relate it back to the original data to identify the cases. If you have some light to shed, please share it.

data(iris)
library(plyr)
petal.wid.avg <- ddply(iris, .(Species), function(df)
  return(c(petal.wid.avg = mean(df$Petal.Width),
           petal.wid.sd  = sd(df$Petal.Width))))
petal.wid.avg$avgsd <- petal.wid.avg$petal.wid.avg + petal.wid.avg$petal.wid.sd
petal.wid.avg

Thanks in advance: José
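[Building on Jorge's ave() idea, a sketch of actually pulling out the flagged rows from the original data -- the column name `flag` below is made up, and this uses the iris data from the question. ave() returns a numeric vector (the logicals get coerced to 0/1), so rows can be selected where it equals 1:]

```r
data(iris)

# 1 where Petal.Width exceeds that species' mean + sd, else 0
iris$flag <- with(iris, ave(Petal.Width, Species,
                            FUN = function(x) x > mean(x) + sd(x)))

# The cases themselves, with all their original columns intact:
flagged <- iris[iris$flag == 1, ]
table(flagged$Species)  # how many flagged cases per species
```

Because the flag is computed group-wise but stored on the original data frame, the link back to the original rows that the poster was missing is preserved.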
[R] GAM, how to set qr=TRUE
Hello, I don't understand what went wrong or how to fix this. How do I set qr=TRUE for gam? When I produce a fit using gam like this:

fit = gam(y ~ s(x), data = as.data.frame(l_yx), family = family, control = list(keepData = T))

...then try to use predict (see #1 below in the traceback()):

traceback()
6: stop("lm object does not have a proper 'qr' component.\n Rank zero or should not have used lm(.., qr=FALSE).") at #81
5: qr.lm(object) at #81
4: summary.glm(object, dispersion = dispersion) at #81
3: summary(object, dispersion = dispersion) at #81
2: predict.glm(fit, data.frame(x = xx), type = "response", se.fit = T, col = prediction_col, lty = prediction_ln) at #81
1: predict(fit, data.frame(x = xx), type = "response", se.fit = T, col = prediction_col, lty = prediction_ln) at #81

...I get this error:

Error in qr.lm(object) : lm object does not have a proper 'qr' component. Rank zero or should not have used lm(.., qr=FALSE).

I read this post: http://tolstoy.newcastle.edu.au/R/devel/06/04/5133.html So I tried adding qr=T to the gam call, but it didn't make any difference. This is how I did it:

fit = gam(y ~ s(x), data = as.data.frame(l_yx), family = family, control = list(keepData = T), qr = T)

It's all very strange because I've produced fits with this data many times before with no issues (and never having had to do anything with the qr parameter). I don't understand why this is coming up or how to fix it.

PS - I don't think this matters, but I am calling a script called FunctionGamFit.r like this:

err = system(paste('C:\\Program Files\\R\\R-2.14.1\\bin\\R.exe', 'CMD BATCH FunctionGamFit.r'), wait = T)

...to produce the fit. Thanks for any help! ben
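[For comparison, a minimal working sketch with simulated data (assuming the mgcv package that ships with R). Note that the traceback above shows predict dispatching to predict.glm rather than a gam method, which suggests a class or dispatch problem; one thing worth checking is whether both the gam and mgcv packages are attached in the same session, since they define competing gam() functions and methods:]

```r
library(mgcv)  # note: attaching both 'gam' and 'mgcv' in one session can cause dispatch clashes

# Simulated data, purely illustrative
set.seed(1)
x <- runif(200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.3)

fit <- gam(y ~ s(x))
p   <- predict(fit, newdata = data.frame(x = seq(0, 1, by = 0.1)),
               type = "response", se.fit = TRUE)
length(p$fit)  # predictions with standard errors, no qr error
```

If this works in a fresh session but the original script fails, comparing `class(fit)` in both (it should include "gam") narrows down where the object loses its class.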
Re: [R] Simple plot loop
Ben, I think that your original for-loop would work if you just replaced the 'i' in the lines() call with 'Data2[, i]':

for (i in 2:length(Data2)) {
  lines(MONTH, Data2[, i], type = "o", pch = 22, lty = 2, col = "blue")
}

Peter Ehlers

On 2012-05-03 07:04, Ben Neal wrote: Jim, thanks for the reply. I tried what you recommend, but I still get an error when running it, just as I did with the similar loops I was trying. Here is the error:

Error in xy.coords(x, y) : 'x' and 'y' lengths differ

That comes from this code:

# Get file
library(zoo)
setwd("/Users/benjaminneal/Documents/All Panama/1110_Panama/CNAT_Segmenting")
Data = read.csv("120503_CNAT_Summary.csv", header = T)

# fill in missing data from last observation
Data2 <- na.locf(Data)
attach(Data2)

# Plot line, and make it loop
for (column in names(Data2)[2:length(Data2)])
  lines(MONTH, column, type = "o", pch = 22, lty = 2, col = "blue")

The problem perhaps is in my data. My columns are observations over time, column headers are individual names, and the first column is the time series in months. An example is here:

MONTH  Tag101
0      234
2      236
4      239
8      300
10     320

This then repeats for different individuals... I think my problem must be that my length of MONTH is different than my length of observations of each column... except that it is not, as far as I can tell! Thank you for assisting me with this simple but frustrating problem - I have the greatest respect for those of you who provide these answers that so many of us read and utilize. Thank you. Ben Neal

-----Original Message----- From: r-help-boun...@r-project.org on behalf of Jim Lemon Sent: Thu 5/3/2012 3:40 AM To: Ben Neal Cc: r-help@r-project.org Subject: Re: [R] Simple plot loop

On 05/03/2012 05:50 PM, Ben Neal wrote: Trying to plot multiple lines from a simple matrix with headers (eight observations per column). I will be doing a number of these, with varying numbers of columns, and do not want to enter the header names for each one (I got frustrated and just wrote them out, which did work).
Data reads fine, first plot is fine, but when I use the code at the bottom for a "for i" loop it tells me that x and y do not match... One other issue is that I would prefer not to specify the first column either, but when I enter plot(MONTH, Data2[2], ...) it also does not plot. Should this not call the second column? Thank you very much for any comments. I know this is simple, but I appreciate any assistance. Cheers, Ben

## LOAD DATA FROM CSV
library(zoo)
setwd("/Users/benjaminneal/Documents/All Panama/1110_Panama/CNAT_Segmenting")
Data = read.csv("120503_CNAT_Summary.csv", header = T)

# fill in missing data from last observation
Data2 <- na.locf(Data)
attach(Data2)

# PLOT ALL ON ONE CHART
plot(MONTH, T102, type = "o", ann = FALSE, ylim = c(1, 100), pch = 22, lty = 2, col = "red")
title(main = "5m and 10m Colpophylia natans colonies over time",
      ylab = "% live coral / colony", xlab = "Months",
      col.main = "black", font.main = 4)
lines(MONTH, T162, type = "o", pch = 22, lty = 2, col = "red")
lines(MONTH, T231, type = "o", pch = 22, lty = 2, col = "green")
lines(MONTH, T250, type = "o", pch = 22, lty = 2, col = "green")
## (many other similar lines here, with entered column headers...
up to 75)

lines(MONTH, T373, type = "o", pch = 22, lty = 2, col = "blue")
lines(MONTH, T374, type = "o", pch = 22, lty = 2, col = "blue")
lines(MONTH, T377, type = "o", pch = 22, lty = 2, col = "blue")

# Tried to add lines another way with a "for i" loop, but this is the part not working
for (i in 2:length(Data2)) {
  lines(MONTH, i, type = "o", pch = 22, lty = 2, col = "blue")
}

Hi Ben, I think what you may want in your loop is this:

for (column in names(Data2)[2:length(Data2)])
  lines(MONTH, column, type = "o", pch = 22, lty = 2, col = "blue")

But, if you want the first two lines to be green, you'll probably have to get a vector of colors:

colorvec <- rep("blue", length(Data2))
colorvec[1] <- "red"
colorvec[2:3] <- "green"

and change the above to:

columnnames <- names(Data2)
for (column in 2:length(Data2))
  lines(MONTH, columnnames[column], type = "o", pch = 22, lty = 2, col = colorvec[column])

Jim
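[Putting Peter's fix together as a runnable sketch -- the two colony columns below are made up in place of the poster's 75. The key point is to index the data frame with `Data2[, i]` rather than passing the loop counter or a column-name string to lines():]

```r
# Made-up stand-in for the CSV data: months plus two colony columns
MONTH <- c(0, 2, 4, 8, 10)
Data2 <- data.frame(MONTH = MONTH,
                    T102  = c(90, 85, 80, 72, 65),
                    T162  = c(95, 93, 90, 88, 82))

pdf(NULL)  # off-screen device so the sketch runs without a display
plot(Data2$MONTH, Data2[, 2], type = "o", ylim = c(1, 100),
     pch = 22, lty = 2, col = "red",
     xlab = "Months", ylab = "% live coral / colony")
for (i in 3:ncol(Data2)) {
  lines(Data2$MONTH, Data2[, i], type = "o", pch = 22, lty = 2, col = "blue")
}
dev.off()
```

Note that `lines(MONTH, column, ...)` with `column` a character string (as in the quoted loop) cannot work: lines() needs the numeric vector `Data2[, i]` (or `Data2[[column]]`), not the column's name.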
[R] Difference between 10 and 10L
Good Evening. We have been searching through the R documentation manuals without success on this one. What is the purpose or result of the L in the following?

n = 10 and n = 10L

or

c(5, 10) versus c(5L, 10L)

Thanks, Joe
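[For the record: the `L` suffix marks an integer literal; without it R stores the number as a double. The values compare equal, but the storage types differ, which matters for identical(), memory use, and interfaces that require integers. A quick check in any R session:]

```r
typeof(10)              # "double"
typeof(10L)             # "integer"
is.integer(c(5L, 10L))  # TRUE
identical(10, 10L)      # FALSE: different storage types
10 == 10L               # TRUE: values still compare equal

# Integer vectors use roughly half the memory of double vectors:
object.size(1:1000000)
object.size(as.numeric(1:1000000))
```

This is documented in the R Language Definition under constants: `10L` is an integer constant, plain `10` is a numeric (double) constant.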