Re: [R] R looks for a folder not specified
On 01/17/2011 01:31 PM, l.chhay wrote:
> Dear R community,
>
> I have been getting this warning message after running a function sourced from an R script, and I can't work out why R is looking for a folder that was never specified: it appends \NA to the specified directory, although assess_rev was not asked to do so. (The R code has been tested by another user and works fine there.) I have tried saving the files in a new folder and on a different drive, but that doesn't fix the problem. I have also checked that I have the appropriate access permissions. Has anyone encountered this problem, where R starts looking for a non-existent \NA folder?
>
> source("S:\\research\\boxcox\\Series_to_experiment_with\\Revisions_analysis\\assess_rev3.R")
> assess_rev(data.dir="S:\\research\\boxcox\\Series_to_experiment_with\\Revisions_analysis\\NEPAL_JUL81_B1\\lam0\\", rev.lag=13, series="NEPAL_JUL81_B1", comp="adj")
>
> Error in file(file, "r") : cannot open the connection
> In addition: Warning message:
> In file(file, "r") : cannot open file 'S:\research\boxcox\Series_to_experiment_with\Revisions_analysis\NEPAL_JUL81_B1\lam0\NA': No such file or directory

Hi Leanne,

I thought it was the line break in the directory string, but the line wasn't broken when I started to answer. My next guess is that you don't need the trailing \\ on the directory string.

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
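A path that ends in NA often means a file name was built from a missing value. The sketch below only illustrates that mechanism. It assumes, without confirmation from the thread, that assess_rev pastes data.dir together with a file name it looks up elsewhere; `build_path`, `safe_path` and `file_name` are hypothetical names, and the directory is shortened for illustration.

```r
# Hypothetical helper: joins a directory and a file name the way many scripts do.
build_path <- function(data.dir, file_name) paste0(data.dir, file_name)

data.dir  <- "S:\\research\\lam0\\"  # illustrative directory, not the poster's full path
file_name <- NA                      # e.g. a lookup that found no matching file

p <- build_path(data.dir, file_name)
print(p)  # paste0() turns the missing value into the literal text "NA"

# Hypothetical guard: fail early with a clearer message than file() gives.
safe_path <- function(data.dir, file_name) {
  if (is.na(file_name)) stop("no file name found for directory: ", data.dir)
  paste0(data.dir, file_name)
}
```

If this is what happens, checking the value that supplies the file name (for example a pattern match on the directory listing) would be a reasonable next step.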
Re: [R] Truetype and Opentype font in pdf device
On Sat, 15 Jan 2011, Kohske Takahashi wrote:
> Dear all,
>
> I want to know whether TrueType or OpenType fonts are available in the pdf device (i.e., pdf() or dev.copy2pdf()), and if so, how to use them.

They are not in general available in PDF, the language. The cairo-based devices embed individual glyph information from such fonts (perhaps as vectors and perhaps as bitmaps).

> Currently I can do the following:
> 1. Convert the ttf to afm using ttf2afm, e.g.:
>    $ ttf2afm Impact.ttf Impact.afm
> 2. Put the afm file in $R_HOME/library/grDevices/afm
> 3. Register a new Type 1 font:
>    pdfFonts(Impact=Type1Font("Impact", rep("Impact.afm", 4), encoding = "TeXtext.enc"))
> 4. Specify the fontfamily in gpar:
>    grid.text('hello grid world', gp=gpar(fontfamily="Impact"))
>
> But obviously it would be better if TrueType or OpenType fonts were directly available without conversion to a Type 1 font.

But you would still need to make the font available to your PDF viewer/printer, and in general that does need conversion (to a font type supported in PDF or to bitmaps/vectors).

> Also, I found that the Cairo package can handle TrueType or OpenType fonts. However, the package seems not to support fontfamily, hence I cannot use it through gpar.

Have you not considered the built-in and fully featured cairo_pdf device? (Not Windows, but you didn't tell us your OS.)

> Does anyone know about this topic?
> Thank you in advance.
>
> --
> Kohske Takahashi takahashi.koh...@gmail.com
> Research Center for Advanced Science and Technology,
> The University of Tokyo, Japan.
> http://www.fennel.rcast.u-tokyo.ac.jp/profilee_ktakahashi.html

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
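For readers who want to try the cairo_pdf device suggested in the reply, here is a minimal sketch. It assumes R was built with cairo support (checked with capabilities("cairo")) and uses "sans" as a placeholder family; which family names resolve to which fonts depends on the fonts installed on the system, so treat the family value as an assumption to replace.

```r
library(grid)

out <- tempfile(fileext = ".pdf")
if (capabilities("cairo")) {
  # "sans" is a placeholder; substitute the name of a font installed on your system.
  cairo_pdf(out, width = 4, height = 3, family = "sans")
  grid.text("hello grid world", gp = gpar(fontfamily = "sans"))
  dev.off()
}
```

Because the device embeds glyph information itself, no afm conversion step is involved in this route.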
[R] Finding NAs in DF
Hi,

What is an efficient way to take this DF
data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))
and get
c(NA,"TWO","BOTH","ONE")
as the result, where NA corresponds to a row without NAs, "TWO" indicates an NA in the second column and "ONE" an NA in the first column?

Thanks for any pointers.

Joh
Re: [R] Finding NAs in DF
Hi,

I hope you made a mistake in c(NA,"TWO","BOTH","ONE"), because if not, I have no idea what you're looking for... But would this do?

df <- data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))
apply(df, 1, FUN=function(x) length(x[is.na(x)]))
[1] 0 1 2 1

There might be better ways to do it, but it works.

HTH,
Ivan

On 1/17/2011 11:01, Johannes Graumann wrote:
> Hi,
> What is an efficient way to take this DF
> data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))
> and get
> c(NA,"TWO","BOTH","ONE")
> as the result, where NA corresponds to a row without NAs, "TWO" indicates an NA in the second column and "ONE" an NA in the first column?
> Thanks for any pointers.
> Joh

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de
**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
Re: [R] median by geometric mean
Will this do?

x <- runif(20, 1, 100)
exp(median(log(x)))

S Ellison

Skull Crossbones witch.of.agne...@gmail.com wrote on 15/01/2011 16:26:
> Hi All,
> I need to calculate the median for an even number of data points. However, instead of calculating the arithmetic mean of the two middle values, I need to calculate their geometric mean. I can code this in R, possibly in a few lines, but I am wondering whether there is already a built-in function. Can somebody give a hint?
> Thanks in advance
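A quick check of why the log/exp trick works: median() averages the two middle values of log(x) arithmetically, and exponentiating an arithmetic mean of logs gives a geometric mean. The sketch below uses made-up values and assumes all data are strictly positive, since log() is undefined otherwise.

```r
x <- c(1, 4, 9, 100)                 # even number of points; middle values are 4 and 9

geo_median <- exp(median(log(x)))    # geometric mean of the two middle values
direct     <- sqrt(4 * 9)            # the same quantity computed by hand

# With an odd number of points the result is just the ordinary median.
odd_case <- exp(median(log(c(2, 5, 11))))
```

Here geo_median and direct agree up to floating-point rounding, and odd_case recovers the middle value 5.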
[R] PANEL DATA SIMULATION
Dear R community, and especially Giovanni Millo,

For my master's thesis I need to simulate panel data with the fixed effects correlated with the predictor, so I ran the following code:

set.seed(1970)
### Panel data simulation with alphai correlated with xi ###
n <- 5
t <- 4
nt <- n*t
pData <- data.frame(id = rep(paste("JohnDoe", 1:n, sep = "_"), each = t), time = rep(1981:1984, n))
rho <- 0.95
alphai <- rnorm(n, mean=0, sd=1)  # alphai simulation
x <- as.matrix(rnorm(nt, 1))  # xi simulation
akro <- kronecker(alphai, matrix(1,t,1))  # kronecker of alphai
cormat <- matrix(c(1,rho,rho,1), nrow=2, ncol=2)  # correlation matrix
cormat.chold <- chol(cormat)  # Cholesky transformation of correlation matrix
akrox <- cbind(akro, x)
ax <- akrox %*% cormat.chold
ai <- as.matrix(ax[,1])
pData$alphai <- as.vector(ai)
xcorr <- as.matrix(ax[,2:(1+ncol(x))])
pData$xcorrei <- as.vector(xcorr)
pData$yi <- 5 + pData$alphai + 5*pData$xcorrei + rnorm(nt)
## panel data frame ##
library(plm)
pData <- pdata.frame(pData, c("id", "time"))
pData

I think the panel is generated correctly, but my doubt is about the simulation of the correlated variables:

alphai <- rnorm(n, mean=0, sd=1)  # alphai simulation
x <- as.matrix(rnorm(nt, 1))  # xi simulation
akro <- kronecker(alphai, matrix(1,t,1))  # kronecker of alphai
cormat <- matrix(c(1,rho,rho,1), nrow=2, ncol=2)  # correlation matrix
cormat.chold <- chol(cormat)  # Cholesky transformation of correlation matrix
akrox <- cbind(akro, x)
ax <- akrox %*% cormat.chold
ai <- as.matrix(ax[,1])
pData$alphai <- as.vector(ai)
xcorr <- as.matrix(ax[,2:(1+ncol(x))])

Is this method correct, or is there a better way to do this? I must generate a variable xi correlated with alphai for various values of rho, for example rho = (0, 0.5, 0.6, 0.8, 0.95, 0.99). How do I simulate the xi associated with each value of rho and put them all in the data frame at once? I have tried various ways without success.

Please give your opinion and suggestions to improve my simulation.

Thank you, best regards
Carlos Brás
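One possible way to build one x column per rho value is sketched below. It relies on the fact that the second column of cbind(a, e) %*% chol(cormat) equals rho * a + sqrt(1 - rho^2) * e, so the Cholesky step can be written out directly. The names `n_t`, `rhos`, `e`, `xs` and the `x_rho_` column prefix are illustrative choices, not from the post, and the noise here has mean 0 rather than the mean 1 used in the original code. The target correlation holds only in expectation for independent unit-variance inputs; with 5 individuals the sample correlation can be far from rho.

```r
set.seed(1970)
n    <- 5                                # individuals
n_t  <- 4                                # time periods (named n_t to avoid masking t())
rhos <- c(0, 0.5, 0.6, 0.8, 0.95, 0.99)

alphai <- rnorm(n)                       # individual effects
akro   <- rep(alphai, each = n_t)        # same values as kronecker(alphai, matrix(1, n_t, 1))
e      <- rnorm(n * n_t)                 # noise drawn independently of alphai

# One column per rho: x = rho * alpha + sqrt(1 - rho^2) * e
xs <- sapply(rhos, function(rho) rho * akro + sqrt(1 - rho^2) * e)
colnames(xs) <- paste0("x_rho_", rhos)

pData <- data.frame(
  id     = rep(paste("JohnDoe", 1:n, sep = "_"), each = n_t),
  time   = rep(1981:1984, n),
  alphai = akro,
  xs
)
```

Each response variable can then be built from the matching column, for example 5 + pData$alphai + 5 * pData[["x_rho_0.95"]] + rnorm(n * n_t).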
Re: [R] Finding NAs in DF
Simpler would be:

rowSums(is.na(df))

On 17/01/2011 10:13, Ivan Calandra wrote:
> Hi,
> I hope you made a mistake in c(NA,"TWO","BOTH","ONE"), because if not, I have no idea what you're looking for... But would this do?
>
> df <- data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))
> apply(df, 1, FUN=function(x) length(x[is.na(x)]))
> [1] 0 1 2 1
>
> There might be better ways to do it, but it works.
> HTH,
> Ivan
>
> On 1/17/2011 11:01, Johannes Graumann wrote:
>> Hi,
>> What is an efficient way to take this DF
>> data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))
>> and get
>> c(NA,"TWO","BOTH","ONE")
>> as the result, where NA corresponds to a row without NAs, "TWO" indicates an NA in the second column and "ONE" an NA in the first column?
>> Thanks for any pointers.
>> Joh

--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner' and 'The R Inferno')
[R] Queries about the statistical methodology used by R-Integration on data with a normal distribution
Dear Sir,

Our bank purchased a product called Inforsense years ago, and one department uses it to conduct data analysis with the Bestfit function called General R. As we understand that General-R is the R-Integration plugin developed by the R Project, we would like to ask some questions about a difficulty we have encountered.

When re-performing the statistics, we found that even when General-R considers the data to be normally distributed, the confidence interval it determines still differs greatly from the result calculated in an Excel spreadsheet*. As we are not very familiar with statistics and are not sure how many kinds of statistical models there are for normally distributed data, we would like to ask whether R uses a different methodology from Excel to determine the confidence interval.

Thank you very much.

*P.S. We simply use the Excel functions (Mean, Stdev and Confidence) to calculate the confidence interval for the data.

--
Thanks and regards,
Daniel Cheng
Internal Audit Department
Wing Lung Bank Limited
Direct Line: 2710 4101
Fax: 2783 7292
Email: danielch...@winglungbank.com
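One common source of such a discrepancy can be sketched as follows. Excel's CONFIDENCE function returns a half-width based on the normal distribution, while R's t.test() builds its interval from Student's t distribution, which is wider for small samples. Whether General-R calls t.test() or something else is not stated in the post, so this is an assumption, and the data below are made up for illustration.

```r
x <- c(10.2, 9.8, 11.1, 10.5, 9.4, 10.9, 10.0, 9.7)  # made-up illustrative data
n <- length(x)
m <- mean(x)
s <- sd(x)

half_z <- qnorm(0.975) * s / sqrt(n)           # normal-based half-width, as in Excel's CONFIDENCE(0.05, s, n)
half_t <- qt(0.975, df = n - 1) * s / sqrt(n)  # t-based half-width

ci_z <- c(m - half_z, m + half_z)              # interval from Mean +/- Confidence
ci_t <- c(m - half_t, m + half_t)              # interval from the t distribution
ci_r <- as.numeric(t.test(x)$conf.int)         # what t.test() reports; equals ci_t
```

The two half-widths converge as n grows, so a large difference on a large sample would point to some other cause, such as a different confidence level or a different quantity being estimated.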
[R] What's wrong with Omegahat?
The Omegahat project page seems to be down, or has the registration of the omegahat domain ended?

http://www.omegahat.org/

Where can I find the RCurl package?

-J
Re: [R] Selecting the first occurrence of a value after an occurrence of a different value
On Sun, Jan 16, 2011 at 11:09:58PM -0800, surreyj wrote:
> Hello,
> Back again. I thought the problem was solved, but I realised that the only reason I was getting the correct answer was that my data set happened to have only two rfts to choose from, so it looked correct.
> I have been using:
> onlyfirstresponseafterrft <- which(!diff(as.numeric(factor(Stat, levels = c("MagDwn", "Resp") [...]
> to get my results, and what is being delivered is the rows at which a Resp occurs after a MagDwn, except that I only want the first Resp after a MagDwn...
> This seems simple to figure out and I have tried a lot of things, but it's just not happening!

Is it required to select the positions with Resp that are the end-points of subsequences of the form

MagDwn other^* Resp ?

For the sequence

1 MagDwn
2 other
3 MagDwn
4 Resp
5 other
6 Resp
7 MagDwn
8 other
9 Resp

this would be positions 4 and 9. The positions of Resp at these end-points may be computed, for example, as

Vals <- c("MagDwn", "Resp", "other")
Stat <- Vals[c(1, 3, 1, 2, 3, 2, 1, 3, 2)]
ind <- which(Stat %in% c("MagDwn", "Resp"))
Reduced <- Stat[ind]
ind[which(diff(Reduced == "Resp") == 1) + 1]
# [1] 4 9

The positions of the corresponding MagDwn are

ind[which(diff(Reduced == "Resp") == 1)]
# [1] 3 7

Petr Savicky.
Re: [R] Selecting the first occurrence of a value after an occurrence of a different value
Hi Freddy,

I have a long column of event codes, e.g. below (in multiple files), that I am trying to analyse.

OutMag
FirstResp
InMag
MagUp
OutMag
MagDwn
Resp
Resp
Resp
InMag
MagUp
OutMag
InMag
OutMag
InMag
OutMag
InMag
OutMag
InMag
MagDwn
OutMag
Resp
MagUp
InMag
MagDwn
OutMag
Resp
MagUp

With these files I have been using which to select the appropriate event codes, so that I can analyse the timing between events using the appropriate time column. This has worked well so far, and now I am faced with the problem that I need to select the first Resp that occurs after each MagDwn. Sometimes there will be just one Resp after a MagDwn and sometimes there will be many, e.g. up to 500.

This code

onlyfirstresponseafterrft <- which(!diff(as.numeric(factor(Stat, levels = c("MagDwn", "Resp")

allowed me to select the first Resp (or so I thought), but in the file I tried it on there were only two occurrences of Resp after each MagDwn. Now that I have tried it on files with more Resp, it is actually selecting all but the last one that happens before the next MagDwn, whereas I only need the first one.

Hope that makes sense, thanks for your help.
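The approach Petr Savicky posted in this thread can be applied directly to the event sequence listed above. The sketch below copies that sequence into a vector and reuses his indexing idea; `first_resp` is an illustrative name.

```r
Stat <- c("OutMag", "FirstResp", "InMag", "MagUp", "OutMag", "MagDwn",
          "Resp", "Resp", "Resp", "InMag", "MagUp", "OutMag", "InMag",
          "OutMag", "InMag", "OutMag", "InMag", "OutMag", "InMag", "MagDwn",
          "OutMag", "Resp", "MagUp", "InMag", "MagDwn", "OutMag", "Resp", "MagUp")

# Keep only the two codes of interest, remembering their original positions.
ind     <- which(Stat %in% c("MagDwn", "Resp"))
Reduced <- Stat[ind]

# In the reduced sequence, a FALSE -> TRUE step of (Reduced == "Resp")
# marks a Resp whose predecessor is a MagDwn, i.e. the first Resp after it.
first_resp <- ind[which(diff(Reduced == "Resp") == 1) + 1]
first_resp
```

For this sequence the selected rows are 7, 22 and 27; the later Resp at rows 8 and 9 are skipped because their predecessor in the reduced sequence is another Resp.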
Re: [R] question about svm(e1071)
Dear Prof. Ligges,

Thank you for the reply. Does the order of calculation change when the samples are shuffled? Does that happen because of Sequential Minimal Optimization (SMO)? I noticed that when I set scale=F, the SVs were identical. However, the differences between the coefs are sometimes relatively large.

Best,
Hiro

### Script start ###
set.seed(50)
s <- sample(ncol(data))
m <- svm(x=t(data), y=factor(data.cl), scale=F, type="C-classification", kernel="linear")
m.s <- svm(x=t(data[,s]), y=factor(data.cl[s]), scale=F, type="C-classification", kernel="linear")
sum(abs(m$SV[order(rownames(m$SV)),] - m.s$SV[order(rownames(m.s$SV)),]))
[1] 0
sum(abs(m$coefs[order(rownames(m$SV))] - m.s$coefs[order(rownames(m.s$SV))]))
[1] 0.3227749
### Script end ###

-----Original Message-----
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Saturday, January 15, 2011 3:10 AM
To: 武藤裕紀(創薬資源研究部1GBIOINF)
Cc: r-help@r-project.org
Subject: Re: [R] question about svm(e1071)

Looking at your results suggests that the differences are probably due to expected minor numerical inaccuracies and the possibly alternating sign of the support vectors.

Best,
Uwe Ligges

On 13.01.2011 01:28, muto...@chugai-pharm.co.jp wrote:
> Dear all,
>
> I ran an svm calculation using the e1071 library with microarray data (http://www.iu.a.u-tokyo.ac.jp/~kadota/R/data_Singh_RMA_3274.txt). Then I shuffled the data samples and ran the svm calculation again. The results of the 2 calculations were different (in SV, coefs and weights). I attached the script below. Could you please tell me why this happens? If possible, please tell me how to make them equal.
>
> Best regards,
> Hiro
>
> ### Script start ###
> library(e1071)
> data <- read.table('http://www.iu.a.u-tokyo.ac.jp/~kadota/R/data_Singh_RMA_3274.txt', header=TRUE, row.names=1, sep="\t", quote="")
> data.cl <- rep(NA, ncol(data))
> data.cl[grep('Normal', colnames(data))] <- 'Normal'
> data.cl[grep('Tumour', colnames(data))] <- 'Tumour'
> s <- sample(ncol(data))
> m <- svm(x=t(data), y=factor(data.cl), scale=T, type="C-classification", kernel="linear")
> m.s <- svm(x=t(data[,s]), y=factor(data.cl[s]), scale=T, type="C-classification", kernel="linear")
> w <- t(m$coefs) %*% m$SV
> w.s <- t(m.s$coefs) %*% m.s$SV
> # SV and coefs are slightly different
> sum(abs(m$SV[order(rownames(m$SV)),] - m.s$SV[order(rownames(m.s$SV)),]))
> sum(abs(m$coefs[order(rownames(m$SV))] - m.s$coefs[order(rownames(m.s$SV))]))
> # ranks of the weights are not identical
> all(rank(w)==rank(w.s))
> ### Script end ###
[R] Help for R plot
Hi all,

How can I plot with the coordinate axes as in my attachment? I want to trim the axes so that the plot looks like the figure in the attachment. Does anyone have such an example?

Thanks.

attachment: Screen shot 2011-01-17 at 11.22.20 AM.png
Re: [R] What's wrong with Omegahat?
Hi,

Strange indeed. Try this URL:
http://cran.r-project.org/web/packages/RCurl/index.html

Kind regards,
Vanessa

2011/1/17 johannes rara johannesr...@gmail.com
> The Omegahat project page seems to be down, or has the registration of the omegahat domain ended?
>
> http://www.omegahat.org/
>
> Where can I find the RCurl package?
>
> -J
Re: [R] xyplot: modify axis tick marks
Thanks for all your great suggestions! I've learnt a lot about graphics now.

Kang Min

On Jan 17, 1:46 am, Hugo Mildenberger hugo.mildenber...@web.de wrote:
> Using lattice and the rainfall$Time series as proposed below by Dennis also gives a nice result:
>
> rainfall$Time <- seq(from = as.Date('1993-01-01'), to = as.Date('2007-12-01'), by = 'month')
> xyplot(rainfall~Time, data=rainfall, type=c("g","p","l","smooth"))
>
> On Sunday 16 January 2011 17:33:18 Dennis Murphy wrote:
>> Hi:
>> Try this, since your data have no missing months:
>>
>> rainfall$Time <- seq(from = as.Date('1993-01-01'), to = as.Date('2007-12-01'), by = 'month')
>> g <- ggplot(rainfall, aes(x = Time, y = rainfall))
>> g + geom_path()
>>
>> HTH,
>> Dennis
>>
>> On Sun, Jan 16, 2011 at 5:01 AM, Kang Min ngokang...@gmail.com wrote:
>>> Hi,
>>> I would like to plot time against rainfall data (data is at the end) using xyplot. The basic code looks like this:
>>>
>>> xyplot(rainfall~time, type="a")
>>>
>>> When I do this, the graph looks OK except that the x-axis has too many values. I would just like to display the years and not the months on the x-axis. I've been fiddling around with 'scales' and have read previous posts about axis problems, but I still can't find the solution.
>>> Thanks in advance.
>>>
>>> Kang Min
>>>
>>> time rainfall
>>> Jan1993 176.4
>>> Feb1993 69.2
>>> Mar1993 250.5
>>> Apr1993 283.9
>>> May1993 129.9
>>> Jun1993 115.5
>>> Jul1993 240
>>> Aug1993 106.8
>>> Sep1993 61.7
>>> Oct1993 175.5
>>> Nov1993 250.8
>>> Dec1993 308.5
>>> Jan1994 56.9
>>> Feb1994 133.5
>>> Mar1994 288.2
>>> Apr1994 154
>>> May1994 169.6
>>> Jun1994 184.7
>>> Jul1994 53.8
>>> Aug1994 45.1
>>> Sep1994 23.7
>>> Oct1994 84.7
>>> Nov1994 322.2
>>> Dec1994 425.4
>>> Jan1995 349.4
>>> Feb1995 334
>>> Mar1995 67.7
>>> Apr1995 242.3
>>> May1995 84.4
>>> Jun1995 63.7
>>> Jul1995 173.6
>>> Aug1995 211.6
>>> Sep1995 29.5
>>> Oct1995 101.1
>>> Nov1995 372.8
>>> Dec1995 302.5
>>> Jan1996 173.2
>>> Feb1996 180.2
>>> Mar1996 129.7
>>> Apr1996 178.2
>>> May1996 107.5
>>> Jun1996 265.8
>>> Jul1996 162.3
>>> Aug1996 258.4
>>> Sep1996 297
>>> Oct1996 300
>>> Nov1996 180.2
>>> Dec1996 185.5
>>> Jan1997 15.4
>>> Feb1997 105.4
>>> Mar1997 34.3
>>> Apr1997 118.4
>>> May1997 41.6
>>> Jun1997 78.9
>>> Jul1997 18.6
>>> Aug1997 86.6
>>> Sep1997 31.1
>>> Oct1997 78.4
>>> Nov1997 158.3
>>> Dec1997 351.9
>>> Jan1998 268.8
>>> Feb1998 32.5
>>> Mar1998 58.8
>>> Apr1998 187.7
>>> May1998 370.8
>>> Jun1998 198.8
>>> Jul1998 259.2
>>> Aug1998 195
>>> Sep1998 258.2
>>> Oct1998 222.7
>>> Nov1998 107.2
>>> Dec1998 463.4
>>> Jan1999 193.9
>>> Feb1999 67.4
>>> Mar1999 181.4
>>> Apr1999 88.5
>>> May1999 157.1
>>> Jun1999 103.4
>>> Jul1999 225.4
>>> Aug1999 204
>>> Sep1999 125.9
>>> Oct1999 205
>>> Nov1999 241.5
>>> Dec1999 340.5
>>> Jan2000 275.2
>>> Feb2000 237.8
>>> Mar2000 238.3
>>> Apr2000 311.6
>>> May2000 96.8
>>> Jun2000 157.5
>>> Jul2000 116.1
>>> Aug2000 113.5
>>> Sep2000 81.1
>>> Oct2000 120.9
>>> Nov2000 385.7
>>> Dec2000 236
>>> Jan2001 425.8
>>> Feb2001 86.6
>>> Mar2001 297.3
>>> Apr2001 203.3
>>> May2001 164.9
>>> Jun2001 137.1
>>> Jul2001 111.3
>>> Aug2001 158.3
>>> Sep2001 162
>>> Oct2001 252.2
>>> Nov2001 175.3
>>> Dec2001 609
>>> Jan2002 221.2
>>> Feb2002 50.8
>>> Mar2002 55.6
>>> Apr2002 116.5
>>> May2002 236.6
>>> Jun2002 83.1
>>> Jul2002 233.7
>>> Aug2002 54.2
>>> Sep2002 124.2
>>> Oct2002 10.8
>>> Nov2002 307.2
>>> Dec2002 255
>>> Jan2003 444.2
>>> Feb2003 172.9
>>> Mar2003 154.6
>>> Apr2003 159.9
>>> May2003 81.8
>>> Jun2003 50.3
>>> Jul2003 170.4
>>> Aug2003 193.6
>>> Sep2003 205.3
>>> Oct2003 351.4
>>> Nov2003 133.8
>>> Dec2003 273
>>> Jan2004 600.9
>>> Feb2004 31.9
>>> Mar2004 269.4
>>> Apr2004 57.1
>>> May2004 137.6
>>> Jun2004 127.2
>>> Jul2004 166.6
>>> Aug2004 185.2
>>> Sep2004 128.9
>>> Oct2004 125.6
>>> Nov2004 166.2
>>> Dec2004 139.8
>>> Jan2005 163.2
>>> Feb2005 8.4
>>> Mar2005 82.4
>>> Apr2005 81.7
>>> May2005 331.1
>>> Jun2005 82.3
>>> Jul2005 104
>>> Aug2005 58.5
>>> Sep2005 175.7
>>> Oct2005 314.5
>>> Nov2005 362.9
>>> Dec2005 166
>>> Jan2006 454.4
>>> Feb2006 115.5
>>> Mar2006 83.1
>>> Apr2006 239.8
>>> May2006 205.7
>>> Jun2006 236.8
>>> Jul2006 153.8
>>> Aug2006 127.3
>>> Sep2006 83.3
>>> Oct2006 102
>>> Nov2006 185.6
>>> Dec2006 765.9
>>> Jan2007 450.1
>>> Feb2007 105.5
>>> Mar2007 269.1
>>> Apr2007 240.2
>>> May2007 127.2
>>> Jun2007 139
>>> Jul2007 141.7
>>> Aug2007 190.7
>>> Sep2007 149
>>> Oct2007 237.2
>>> Nov2007 367.9
>>> Dec2007 468.6
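For the original request (years only on the x-axis in xyplot), one option is to pass explicit tick positions and labels through the scales argument once the time column is a Date. The sketch below uses only the first 24 monthly values from the posted data to stay short; `rain`, `yrs` and `p` are illustrative names.

```r
library(lattice)

# First 24 monthly values from the posted data (Jan1993 to Dec1994).
rain <- c(176.4, 69.2, 250.5, 283.9, 129.9, 115.5, 240, 106.8, 61.7, 175.5, 250.8, 308.5,
          56.9, 133.5, 288.2, 154, 169.6, 184.7, 53.8, 45.1, 23.7, 84.7, 322.2, 425.4)
rainfall <- data.frame(
  Time     = seq(as.Date("1993-01-01"), by = "month", length.out = length(rain)),
  rainfall = rain
)

# One tick per year, labelled with the year only.
yrs <- seq(as.Date("1993-01-01"), as.Date("1995-01-01"), by = "year")
p <- xyplot(rainfall ~ Time, data = rainfall, type = c("g", "l"),
            scales = list(x = list(at = yrs, labels = format(yrs, "%Y"))))
```

Calling print(p) draws the plot. For the full series the same idea applies with the sequence of years extended to 2008.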
[R] cannot allocate vector of size ... in RHLE5 PAE kernel
Dear R community,

I'm running 32-bit R on a 64-bit machine (with 16 GB of RAM) using a PAE kernel, as you can see here:

$ uname -a
Linux mymachine 2.6.18-238.el5PAE #1 SMP Sun Dec 19 14:42:44 EST 2010 i686 i686 i386 GNU/Linux

When I try to create a large matrix ( Q.obs <- matrix(NA, nrow=6940, ncol=9000) ), I get the following error:

Error: cannot allocate vector of size 238.3 Mb

However, the amount of free memory on my machine seems to be much larger than this:

system("free")
             total       used       free     shared    buffers     cached
Mem:      12466236    6354116    6112120          0      67596    2107556
-/+ buffers/cache:    4178964    8287272
Swap:     12582904          0   12582904

I tried to increase the memory limit available to R by using:

$ R --min-vsize=10M --max-vsize=5000M --min-nsize=500k --max-nsize=5000M

but it didn't work. Any hint about how I can get R to use all the memory available on the machine?

Thanks in advance,
Mauricio

--
===
Linux user #454569 -- Ubuntu user #17469
Re: [R] Finding NAs in DF
Neither version does what I am looking for, as they do not differentiate where the NA is if there is just one. My originally wished-for result therefore holds, but should probably be rewritten as

c(NA,"B","AB","A")

Joh

On Monday 17 January 2011 14:06:30 Patrick Burns wrote:
> Simpler would be:
>
> rowSums(is.na(df))
>
> On 17/01/2011 10:13, Ivan Calandra wrote:
>> Hi,
>> I hope you made a mistake in c(NA,"TWO","BOTH","ONE"), because if not, I have no idea what you're looking for... But would this do?
>>
>> df <- data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))
>> apply(df, 1, FUN=function(x) length(x[is.na(x)]))
>> [1] 0 1 2 1
>>
>> There might be better ways to do it, but it works.
>> HTH,
>> Ivan
>>
>> On 1/17/2011 11:01, Johannes Graumann wrote:
>>> Hi,
>>> What is an efficient way to take this DF
>>> data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))
>>> and get
>>> c(NA,"TWO","BOTH","ONE")
>>> as the result, where NA corresponds to a row without NAs, "TWO" indicates an NA in the second column and "ONE" an NA in the first column?
>>> Thanks for any pointers.
>>> Joh
Re: [R] Finding NAs in DF
Try this:

# The sums of the NA column indices are 0 (none), 1 (A only), 2 (B only) and 3 (both),
# so the labels are given in that level order.
factor(sapply(apply(is.na(df), 1, which), sum), labels = c(NA, "ONE", "TWO", "BOTH"))

On Mon, Jan 17, 2011 at 9:23 AM, Johannes Graumann johannes_graum...@web.de wrote:
> Neither version does what I am looking for, as they do not differentiate where the NA is if there is just one. My originally wished-for result therefore holds, but should probably be rewritten as
> c(NA,"B","AB","A")
> Joh
>
> On Monday 17 January 2011 14:06:30 Patrick Burns wrote:
>> Simpler would be:
>>
>> rowSums(is.na(df))
>>
>> On 17/01/2011 10:13, Ivan Calandra wrote:
>>> Hi,
>>> I hope you made a mistake in c(NA,"TWO","BOTH","ONE"), because if not, I have no idea what you're looking for... But would this do?
>>>
>>> df <- data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))
>>> apply(df, 1, FUN=function(x) length(x[is.na(x)]))
>>> [1] 0 1 2 1
>>>
>>> There might be better ways to do it, but it works.
>>> HTH,
>>> Ivan
>>>
>>> On 1/17/2011 11:01, Johannes Graumann wrote:
>>>> Hi,
>>>> What is an efficient way to take this DF
>>>> data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))
>>>> and get
>>>> c(NA,"TWO","BOTH","ONE")
>>>> as the result, where NA corresponds to a row without NAs, "TWO" indicates an NA in the second column and "ONE" an NA in the first column?
>>>> Thanks for any pointers.
>>>> Joh

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
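For the rewritten target c(NA,"B","AB","A"), one possibility is to paste together the names of the columns that hold an NA in each row. This is a sketch rather than a tuned solution; `na_cols` is an illustrative name, and the approach extends to more than two columns without changes.

```r
df <- data.frame(A = c(1, 2, NA, NA), B = c(1, NA, NA, 4))

# For each row, concatenate the names of the columns that contain NA.
na_cols <- apply(is.na(df), 1, function(r) paste(names(df)[r], collapse = ""))

# Rows without any NA produce an empty string; turn that into a real NA.
na_cols[na_cols == ""] <- NA
na_cols
```

For this data frame the four rows come out as NA, "B", "AB" and "A".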
Re: [R] cannot allocate vector of size ... in RHLE5 PAE kernel
>>>>> "MZ" == Mauricio Zambrano hzambran.newsgro...@gmail.com
>>>>>     on Mon, 17 Jan 2011 11:46:44 +0100 writes:

MZ> Dear R community,
MZ> I'm running 32-bit R on a 64-bit machine (with 16 GB of RAM) using a
MZ> PAE kernel, as you can see here:

MZ> $ uname -a
MZ> Linux mymachine 2.6.18-238.el5PAE #1 SMP Sun Dec 19 14:42:44 EST 2010
MZ> i686 i686 i386 GNU/Linux

MZ> When I try to create a large matrix ( Q.obs <- matrix(NA, nrow=6940,
MZ> ncol=9000) ), I get the following error:

MZ> Error: cannot allocate vector of size 238.3 Mb

MZ> However, the amount of free memory on my machine seems to be much
MZ> larger than this:

MZ> system("free")
MZ>              total       used       free     shared    buffers     cached
MZ> Mem:      12466236    6354116    6112120          0      67596    2107556
MZ> -/+ buffers/cache:    4178964    8287272
MZ> Swap:     12582904          0   12582904

MZ> I tried to increase the memory limit available to R by using:

MZ> $ R --min-vsize=10M --max-vsize=5000M --min-nsize=500k --max-nsize=5000M

MZ> but it didn't work.
MZ> Any hint about how I can get R to use all the memory available on the machine?

Install a 64-bit version of Linux, i.e., Ubuntu in your case, and work from there. I don't think there's a way around that.

Martin
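The size in the error message can be reproduced with a little arithmetic. matrix(NA, ...) creates a logical matrix, which R stores with 4 bytes per element. The sketch below also shows the size the matrix would have once it is filled with doubles; `n_elem`, `mb_logical` and `mb_double` are illustrative names.

```r
n_elem <- 6940 * 9000               # number of cells in the matrix

mb_logical <- n_elem * 4 / 2^20     # matrix(NA, ...) is logical: 4 bytes per element
mb_double  <- n_elem * 8 / 2^20     # the same matrix holding doubles: 8 bytes per element

round(c(logical = mb_logical, double = mb_double), 1)
```

The logical size matches the 238.3 Mb in the error. Each such object needs one contiguous block inside the address space of the R process, and a 32-bit process has a 32-bit address space regardless of how much RAM a PAE kernel can manage, which is consistent with the advice to move to a 64-bit system.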
[R] PANEL DATA SIMULATION(sorry for my previous email with no subject)
Dear R community, and especially Giovanni Millo,

For my master's thesis I need to simulate panel data with the fixed effects correlated with the predictor, so I ran the following code:

    set.seed(1970)
    ### Panel data simulation with alphai correlated with xi ###
    n <- 5
    t <- 4
    nt <- n*t
    pData <- data.frame(id = rep(paste("JohnDoe", 1:n, sep = "_"), each = t), time = rep(1981:1984, n))
    rho <- 0.95
    alphai <- rnorm(n, mean=0, sd=1)                       # alphai simulation
    x <- as.matrix(rnorm(nt, 1))                           # xi simulation
    akro <- kronecker(alphai, matrix(1, t, 1))             # kronecker of alphai
    cormat <- matrix(c(1, rho, rho, 1), nrow=2, ncol=2)    # correlation matrix
    cormat.chold <- chol(cormat)                           # Cholesky transformation of correlation matrix
    akrox <- cbind(akro, x)
    ax <- akrox %*% cormat.chold
    ai <- as.matrix(ax[,1])
    pData$alphai <- as.vector(ai)
    xcorr <- as.matrix(ax[,2:(1+ncol(x))])
    pData$xcorrei <- as.vector(xcorr)
    pData$yi <- 5 + pData$alphai + 5*pData$xcorrei + rnorm(nt)
    ## panel data frame ##
    library(plm)
    pData <- pdata.frame(pData, c("id", "time"))
    pData

I think the panel is correctly generated, but my doubt is about the simulation of the correlated variables:

    alphai <- rnorm(n, mean=0, sd=1)                       # alphai simulation
    x <- as.matrix(rnorm(nt, 1))                           # xi simulation
    akro <- kronecker(alphai, matrix(1, t, 1))             # kronecker of alphai
    cormat <- matrix(c(1, rho, rho, 1), nrow=2, ncol=2)    # correlation matrix
    cormat.chold <- chol(cormat)                           # Cholesky transformation of correlation matrix
    akrox <- cbind(akro, x)
    ax <- akrox %*% cormat.chold
    ai <- as.matrix(ax[,1])
    pData$alphai <- as.vector(ai)
    xcorr <- as.matrix(ax[,2:(1+ncol(x))])

Is this method correct, or is there a better way to do this? I must generate a variable xi correlated with alphai for various values of rho, for example rho = (0, 0.5, 0.6, 0.8, 0.95, 0.99). How do I simulate the xi associated with each value of rho and put them all in the data frame at once? I tried various ways without success. Please give your opinion and suggestions to improve my simulation.

Thank you, best regards,
Carlos Brás
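One possible way to handle several values of rho at once is sketched here; it is not taken from the thread. For a 2x2 correlation matrix, multiplying cbind(akro, x) by chol(cormat) leaves the first column unchanged and turns the second into rho*akro + sqrt(1 - rho^2)*x, so that formula can simply be applied per rho. The sketch assumes the noise is standard normal (mean 0, unlike the mean-1 x in the post) and independent of alphai; the column names x_rho_* are made up for illustration. The target correlation holds only in the population sense, so with n = 5 individuals the sample correlations can be far from rho.

```r
set.seed(1970)
n <- 5; t <- 4; nt <- n * t
rhos <- c(0, 0.5, 0.6, 0.8, 0.95, 0.99)

alphai <- rnorm(n)                 # individual effects
akro <- rep(alphai, each = t)      # same layout as kronecker(alphai, matrix(1, t, 1))
z <- rnorm(nt)                     # independent standard-normal noise (assumption)

# One column per rho: rho * alpha + sqrt(1 - rho^2) * z
xmat <- sapply(rhos, function(rho) rho * akro + sqrt(1 - rho^2) * z)
colnames(xmat) <- paste0("x_rho_", rhos)   # hypothetical column names

pData <- data.frame(id = rep(paste("JohnDoe", 1:n, sep = "_"), each = t),
                    time = rep(1981:1984, n),
                    alphai = akro,
                    xmat)
head(pData)
```

Each x_rho_* column can then be used in turn to build the response before calling pdata.frame().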
Re: [R] R looks for a folder not specified
On 16/01/2011 9:31 PM, l.chhay wrote:
> Dear R community,
> I have been getting this warning message after running a function sourced from an R script, and can't seem to work out why R is looking for a folder that wasn't even specified (it attaches a \NA to the specified directory, where assess_rev has not asked to do so at all. R code has been tested by another user and that works fine). I have tried saving the files in a new folder and onto a different drive, but that doesn't seem to fix the problem. I have also checked to make sure that I have the appropriate access options. Has anyone encountered this problem where R starts looking for this non-existent \NA folder?
>
>     source("S:\\research\\boxcox\\Series_to_experiment_with\\Revisions_analysis\\assess_rev3.R")
>     assess_rev(data.dir="S:\\research\\boxcox\\Series_to_experiment_with\\Revisions_analysis\\NEPAL_JUL81_B1\\lam0\\", rev.lag=13, series="NEPAL_JUL81_B1", comp="adj")
>     Error in file(file, "r") : cannot open the connection
>     In addition: Warning message:
>     In file(file, "r") : cannot open file 'S:\research\boxcox\Series_to_experiment_with\Revisions_analysis\NEPAL_JUL81_B1\lam0\NA': No such file or directory
>
> (I am working from a Windows7 machine, using R 2.9.2.)

It looks as though something in the assess_rev function (which was presumably defined in that script you sourced) constructs a path from the data.dir argument and makes use of a missing value. The missing value is converted to NA and is appended to the directory, and you see the error message.

Duncan Murdoch
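The mechanism described in the reply can be reproduced in a few lines. This is only an illustration: the body of assess_rev() is not shown in the thread, so the lookup table and file names below are invented, and the directory is shortened.

```r
# Hypothetical stand-in for whatever assess_rev() does internally.
data_dir <- "S:\\research\\lam0\\"        # shortened, made-up directory
files <- c(adj = "adj_series.txt")        # made-up lookup table of file names
fname <- files["trend"]                   # no such entry, so the result is NA

# paste() silently turns the missing value into the text "NA"
path <- paste(data_dir, fname, sep = "")
path

# A guard like this would surface the real problem earlier
if (is.na(fname)) message("file name is missing; check the lookup inputs")
```

If this is what happens, the thing to look for inside the function is which lookup or argument yields NA on the affected machine.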
[R] intercept point coordinates
Hi List,

Can someone help me to calculate the coordinates of the red and green points? In this example I found their approximate location by trying, but as I have to analyse many similar curves, I'd rather calculate the exact location.

    data <- c(0.008248005, 0.061242387, 0.099095516, 0.189943027, 0.227796157, 0.258078661, 0.280790538, 0.303502416, 0.386779301, 0.454914934, 0.545762445, 0.591186201, 0.682033712, 0.757739971, 0.825875604, 0.848587482, 0.803163726, 0.833446230, 0.878869985, 0.871299359, 0.878869985, 0.947005619, 1.0, 0.992429374, 0.954576245, 0.894011237, 0.765310597, 0.621468704, 0.492768064, 0.333784920, 0.258078661, 0.174801775, 0.099095516, 0.008248005)
    plot(data, type="l")
    abline(h=0.9)
    points(21.35, .9, pch=20, col="red")
    points(26, .9, pch=20, col="green")

Thank you,
Tonja
[R] t-test calculation correct?
A multinomial logit model (N=192) revealed (besides others) the following statistics for the outcome, y, and one predictor, x:

- y = A (baseline, n=34)
- y = B (n=26), B(x)=0.7323 (SE=0.2384)
- y = C (n=132), B(x)=0.6535 (SE=0.2041)

With a t-test I want to explore whether the two predictors differ significantly, and I use the following calculation (according to Bortz, 2005, p.140):

    ##
    dm <- 0.7323 - 0.6535
    se.dm <- sqrt( (0.2384 / (34 + 26)) + (0.2041 / (34 + 132)) )
    t.val <- dm / se.dm
    pval <- (1 - pt(t.val, df=(34+26+132) )) * 2
    ##

My question is where this calculation is wrong and why.

Ref.: Bortz, J. (2005). Statistik für Human- und Sozialwissenschaftler (6. Aufl). Berlin: Springer.

--
Sascha Vieweg, saschav...@gmail.com
[R] transform a df with a condition
Hi,

Try the following...

    df <- data.frame(A = c(1,1,3,2,2,3,3), B = c(2,1,1,2,7,8,7), K = c("a.1", "d.2", "f.3", "a.1", "k.4", "f.9", "f.5"))
    df$ID <- rownames(df)
    df$K <- as.character(as.character(df$K))
    changefunction <- function(z) {
      tmp <- lapply(split(z, z[,4]), function(x) within(x, if(A==3) B <- 5))
      dat2 <- tmp
      df <- unsplit(dat2, df$ID)
      tmp <- lapply(split(df, df[,4]), function(x) within(x, if(A==3) K <- chartr("f", "m", K)))
      dat2 <- tmp
      df <- unsplit(dat2, df$ID)
      return(df)
    }
    dfnew <- changefunction(df)
    df$ID <- NULL
    dfnew$ID <- NULL
    dfnew

Regards,
Vijayan Padmanabhan

What is expressed without proof can be denied without proof - Euclide.
Re: [R] Finding NAs in DF
Maybe something along the lines:

    apply(df, 1, FUN=function(x) which(is.na(x)))

It's not exactly what you want, but it might work combined with the other solutions.

HTH,
Ivan

On 1/17/2011 12:23, Johannes Graumann wrote:
> Both versions do not do what I am looking for, as they do not differentiate where the NA is, if there is just one. My originally wished-for result therefore holds, but should probably be rewritten as c(NA,"B","AB","A").
> Joh
>
> On Monday 17 January 2011 14:06:30 Patrick Burns wrote:
>> Simpler would be:
>>     rowSums(is.na(df))
>>
>> On 17/01/2011 10:13, Ivan Calandra wrote:
>>> Hi, I hope you made a mistake in c(NA,"TWO","BOTH","ONE") because if not, I have no idea what you're looking for... But would that do?
>>>     df <- data.frame(A=c(1,2,NA,NA), B=c(1,NA,NA,4))
>>>     apply(df, 1, FUN=function(x) length(x[is.na(x)]))
>>>     [1] 0 1 2 1
>>> There might be better ways to do it, but it works.
>>> HTH, Ivan
>>>
>>> On 1/17/2011 11:01, Johannes Graumann wrote:
>>>> Hi, What is an efficient way to take this DF
>>>>     data.frame(A=c(1,2,NA,NA), B=c(1,NA,NA,4))
>>>> and get c(NA,"TWO","BOTH","ONE") as the result, where NA corresponds to a row without NAs, "TWO" indicates NA in the second and "ONE" in the first column. Thanks for any pointers.
>>>> Joh

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php
[R] help on strange installation process of JRI / rJava 0.9.0
Hi Nidhi,

... On the other hand, I also found that JRI.jar is missing from both of these (piodev...) installations.

... this is resolved now in the /mnt/tools/r installation.

Background: When compiling R, one needs to provide the option --enable-R-shlib so that R is capable of dynamically linking libraries. The desired JRI package (Java calls into R) is part of the rJava package. However, installation of the rJava package includes the JRI package only if R was compiled with the above option. Technically this "autodetect" feature, as they call it, might make sense, but from the user's and admin's perspective a more visible warning from the rJava installation would certainly be desirable when JRI is not going to be provided. Hence I recompiled and reinstalled R-2-12-1 and rJava 0.9.0 accordingly.

Best,
Mirko
Re: [R] Finding NAs in DF
Building on the previous responses, does this give you what you want:

    > x
       A  B
    1  1  1
    2  2 NA
    3 NA NA
    4 NA  4
    > # determine where the NAs are
    > row.na <- apply(x, 1, is.na)
    > # now convert to list of columns with NAs
    > apply(row.na, 2, function(a) paste(colnames(x)[a], collapse = ','))
    [1] ""    "B"   "A,B" "A"

On Mon, Jan 17, 2011 at 5:01 AM, Johannes Graumann johannes_graum...@web.de wrote:
> Hi, What is an efficient way to take this DF
>     data.frame(A=c(1,2,NA,NA), B=c(1,NA,NA,4))
> and get c(NA,"TWO","BOTH","ONE") as the result, where NA corresponds to a row without NAs, "TWO" indicates NA in the second and "ONE" in the first column. Thanks for any pointers.
> Joh

--
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
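For the rewritten target c(NA,"B","AB","A") mentioned elsewhere in this thread, the same idea can be condensed as below. This is a sketch rather than anything posted to the list; the variable name na_cols is made up.

```r
df <- data.frame(A = c(1, 2, NA, NA), B = c(1, NA, NA, 4))

# For each row, glue together the names of the columns that are NA
na_cols <- apply(is.na(df), 1, function(r) paste(names(df)[r], collapse = ""))

# Rows without any NA come back as "", which is recoded to NA here
na_cols[na_cols == ""] <- NA
na_cols
```

Because the labels are built from the column names, the same lines should work unchanged for data frames with more than two columns.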
Re: [R] Using summaryBy with weighted data
You might use the plyr package to get group-wise weighted means:

    library(plyr)
    ddply(mydata, ~group, summarise, b=mean(weights), c=weighted.mean(response, weights))

hth,
david freedman
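If plyr is not available, a base R version of the same group-wise weighted mean could look like this. The data frame is invented for illustration; only the column names group, response and weights follow the reply above.

```r
# Made-up data; 'group', 'response' and 'weights' mirror the names used above
mydata <- data.frame(group    = c("a", "a", "b", "b"),
                     response = c(1, 3, 10, 20),
                     weights  = c(1, 3, 1, 1))

# Split into one data frame per group, then take the weighted mean in each
wm <- sapply(split(mydata, mydata$group),
             function(d) weighted.mean(d$response, d$weights))
wm
```

The result is a named numeric vector with one entry per group.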
Re: [R] data frame column name change
or

    d = data.frame(Col1=c(1,2,3), Col2=c(2,3,4), Col3=c(3,4,5))
    names(d)
    names(d)[1] = "NewName1"
    names(d)

HTH
Pete
Re: [R] Survfit: why different survival curves but same parameter estimates?
-- begin included message --
I'm trying to estimate a Cox proportional hazard model with time-varying covariates using coxph. The parameter estimates are fine but there is something wrong with the survival curves I get with survfit (results are not plausible).
-- end inclusion --

This sounds wrong to me also. Could you give more information so that I can verify the issue? (See the posting guide.) What version of R and of the survival package? Is it possible to send me a copy of the example that fails?

Terry Therneau
Re: [R] How to doulbe all the value on a matrix
If I have understood your question correctly, how about the following ...

    m = matrix(c(7,11,15,17,10,19,4,18,18), nrow = 3, ncol=3)
    sum_m = sum(m)
    new_m = summ-m

HTH
Pete
Re: [R] How to doulbe all the value on a matrix
typo ... should have been

    m = matrix(c(7,11,15,17,10,19,4,18,18), nrow = 3, ncol=3)
    sum_m = sum(m)
    new_m = sum_m-m
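As a quick check of the corrected lines, the vectorised expression can be compared with the element-by-element loop from the original question. The variable names loop_m and new_m are only for this illustration.

```r
a <- matrix(c(7, 17, 4, 11, 10, 18, 15, 19, 18), 3, 3, byrow = TRUE)

# Loop version from the question
loop_m <- a
for (i in 1:9) loop_m[i] <- sum(a) - a[i]

# Vectorised version: sum(a) is a scalar, so it is recycled over every cell
new_m <- sum(a) - a
new_m
```

Both versions agree because subtracting a matrix from a scalar operates cell by cell.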
Re: [R] Help for R plot
On 2011-01-17 02:26, Fabrice Tourre wrote:
> Hi all, How to plot as the coordinate as in my attachment? I want to trim the coordinate and one of plot as the figure in attachment. Does any one have such example? Thanks.

Maybe you're looking for something like axis.break or gap.plot in the plotrix package?

Peter Ehlers
Re: [R] intercept point coordinates
On 2011-01-17 04:14, Tonja Krueger wrote:
> Hi List,
> Can someone help me to calculate the coordinates of the red and green points? In this example I found their approximate location by trying, but as I have to analyse many similar curves, I'd rather calculate the exact location.
>
>     data <- c(0.008248005, 0.061242387, 0.099095516, 0.189943027, 0.227796157, 0.258078661, 0.280790538, 0.303502416, 0.386779301, 0.454914934, 0.545762445, 0.591186201, 0.682033712, 0.757739971, 0.825875604, 0.848587482, 0.803163726, 0.833446230, 0.878869985, 0.871299359, 0.878869985, 0.947005619, 1.0, 0.992429374, 0.954576245, 0.894011237, 0.765310597, 0.621468704, 0.492768064, 0.333784920, 0.258078661, 0.174801775, 0.099095516, 0.008248005)
>     plot(data, type="l")
>     abline(h=0.9)
>     points(21.35, .9, pch=20, col="red")
>     points(26, .9, pch=20, col="green")
>
> Thank you, Tonja

You can do this with graphics, using the locator() function:

    plot(data, type="l")
    abline(h=0.9)
    v <- locator(2)

Now click on the intersection points and then extract v$x. It's best to blow up the relevant region of the plot (perhaps for each point separately) with appropriate xlim/ylim settings and to maximize your plot window.

Alternatively, you can identify that the points lie in data[21:22] and in data[25:26]. Then use the approx() function:

    approx(data[21:22], 21:22, xout = 0.9)
    approx(data[25:26], 25:26, xout = 0.9)

Your points were close - just a bit high: approx gives 21.31012 and 25.90112.

Peter Ehlers
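Since many similar curves have to be analysed, the bracketing indices can also be found automatically instead of by eye. The helper below is a sketch, not part of the thread: the name crossings() is made up, it assumes the curve is sampled at x = 1, 2, 3, ... and uses the same linear interpolation as approx(). Points that sit exactly on the line are not reported by this simple version.

```r
y <- c(0.008248005, 0.061242387, 0.099095516, 0.189943027, 0.227796157,
       0.258078661, 0.280790538, 0.303502416, 0.386779301, 0.454914934,
       0.545762445, 0.591186201, 0.682033712, 0.757739971, 0.825875604,
       0.848587482, 0.803163726, 0.833446230, 0.878869985, 0.871299359,
       0.878869985, 0.947005619, 1.0, 0.992429374, 0.954576245, 0.894011237,
       0.765310597, 0.621468704, 0.492768064, 0.333784920, 0.258078661,
       0.174801775, 0.099095516, 0.008248005)

# Hypothetical helper: x positions where the polyline through y crosses height h
crossings <- function(y, h) {
  d <- y - h
  i <- which(d[-length(d)] * d[-1] < 0)   # segments whose ends lie on opposite sides of h
  i + (h - y[i]) / (y[i + 1] - y[i])      # linear interpolation within each segment
}

x_cross <- crossings(y, 0.9)
x_cross
```

For this curve the helper finds the same two segments, 21:22 and 25:26, that were identified by inspection in the reply.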
Re: [R] effects packages for mixed model?
Dear John,

I've wanted to extend the effects package to mixed-effects models for some time now. The basics are quite simple and you should be able to do the computations yourself using the estimated fixed effects and their covariance matrix. The tricky computations are for models that have data-dependent bases, such as those including regression spline or orthogonal polynomial terms. In the limited time I've had to look at the problem, I haven't figured out how to get so-called safe predictions for mixed models. Simply using predict() isn't sufficient, since the effect() function has to manipulate the model matrix directly.

Regards,
John

John Fox
Senator William McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of array chip
> Sent: January-17-11 1:08 AM
> To: r-help@r-project.org
> Subject: [R] effects packages for mixed model?
>
> Hi, I am wondering if there is a similar effects package for mixed models, just like what the effects package does for linear and generalized linear models? Specifically I am looking for a way to calculate what SAS calls least squares means (LS means) in mixed models (I understand there is a substantial debate on whether such adjusted means should be computed in the first place).
> Thank you, John
[R] Problem about for loop
Hi everyone,

My function looks like this:

    e <- rnorm(n=50, mean=0, sd=sqrt(0.5625))
    x0 <- c(rep(1,50))
    x1 <- rnorm(n=50,mean=2,sd=1)
    x2 <- rnorm(n=50,mean=2,sd=1)
    x3 <- rnorm(n=50,mean=2,sd=1)
    x4 <- rnorm(n=50,mean=2,sd=1)
    y <- 1+ 2*x1+4*x2+3*x3+2*x4+e
    x2[1] = 10 #influential observation
    y[1] = 10 #influential observation
    data.x <- matrix(c(x0,x1,x2,x3,x4),ncol=5)
    data.y <- matrix(y,ncol=1)
    data.k <- cbind(data.x,data.y)
    dataX <- data.k[,1:5]
    dataY <- data.k[,6]
    theta <- function(data) {
      B.cap <- solve(crossprod(dataX)) %*% crossprod(dataX,dataY)
      P <- dataX %*% solve(crossprod(dataX)) %*% t(dataX)
      Y.cap <- P %*% dataY
      e <- dataY - Y.cap
      dX <- nrow(dataX) - ncol(dataX)
      var.cap <- crossprod(e) / (dX)
      ei <- as.vector(dataY - dataX %*% B.cap)
      pi <- diag(P)
      var.cap.i <- (((dX) * var.cap) / (dX - 1)) - (ei^2 / ((dX-1) * (1 - pi)))
      ti <- ei / sqrt(var.cap * (1 - pi))
      Ci <- (ti^2 / (ncol(dataX))) * (pi / (1 - pi))
      output.i <- mean(Ci)
    }
    result <- list()
    for ( i in 1:5){
      data <- replicate(1, data.k[sample(50,50,replace=T),], simplify = FALSE)
      output.j <- theta(data)
      result <- c(result,(list(output.j)))
    }
    table <- do.call(rbind.data.frame,result)
    names(table)=c("cooks")
    table

This function gives the same results each time: the data change every time, but the mean(Ci) values are always the same. Does anyone have an idea why? Thanks for any idea.
Re: [R] cannot allocate vector of size ... in RHLE5 PAE kernel
Thanks for your answer Martin, but -unfortunately- the decision about installing a 32 bits OS in the 64 bits machine was taken by the IT guys of my work and not by me. By the way, due to strong limitations about software installation in my work place, this problem didn't happen in Ubuntu, but in Red Hat Enterprise 5. At home I have Ubuntu 10.10 32 bits, but I can not run the code I need in that machine.

Cheers,
Mauricio

--
Linux user #454569 -- Ubuntu user #17469

2011/1/17 Martin Maechler maech...@stat.math.ethz.ch:
> MZ == Mauricio Zambrano hzambran.newsgro...@gmail.com on Mon, 17 Jan 2011 11:46:44 +0100 writes:
>
> MZ> Dear R community,
> MZ> I'm running R 32 bits in a 64-bits machine (with 16Gb of Ram) using a PAE kernel, as you can see here:
> MZ>   $ uname -a
> MZ>   Linux mymachine 2.6.18-238.el5PAE #1 SMP Sun Dec 19 14:42:44 EST 2010 i686 i686 i386 GNU/Linux
> MZ> When I try to create a large matrix ( Q.obs <- matrix(NA, nrow=6940, ncol=9000) ), I got the following error:
> MZ>   Error: cannot allocate vector of size 238.3 Mb
> MZ> However, the amount of free memory in my machine seems to be much larger than this:
> MZ>   system("free")
> MZ>                   total     used     free  shared  buffers   cached
> MZ>   Mem:         12466236  6354116  6112120       0    67596  2107556
> MZ>   -/+ buffers/cache:     4178964  8287272
> MZ>   Swap:        12582904        0 12582904
> MZ> I tried to increase the memory limit available for R by using:
> MZ>   $ R --min-vsize=10M --max-vsize=5000M --min-nsize=500k --max-nsize=5000M
> MZ> but it didn't work.
> MZ> Any hint about how can I get R using all the memory available in the machine ?
>
> Install a 64-bit version of Linux, i.e., ubuntu in your case and work from there. I don't think there's a way around that.
> Martin
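The 238.3 Mb figure in the error message can be reproduced by hand, which shows that the failing request is for this one matrix only. The arithmetic below relies on the fact that matrix(NA, ...) creates a logical matrix and that R stores logical values in 4 bytes each. The explanation of why the request fails is an assumption on my part rather than something stated in the thread: a 32-bit process has only a few GB of address space no matter how much RAM a PAE kernel can see, and R needs a single contiguous block for each vector.

```r
n_cells <- 6940 * 9000        # number of elements in Q.obs
bytes   <- n_cells * 4        # logical NA: 4 bytes per element
size_mb <- bytes / 1024^2     # R reports sizes in units of 1024^2 bytes
round(size_mb, 1)
```

Under that assumption, the figures reported by free describe the machine as a whole rather than what a single 32-bit R process can address.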
[R] How to doulbe all the value on a matrix
Hi,

Is there an expression to double the values of a matrix without using a loop? What I need is this. Suppose we have this matrix:

    m
         [,1] [,2] [,3]
    [1,]    7   17    4
    [2,]   11   10   18
    [3,]   15   19   18

and I want this matrix:

         [,1] [,2] [,3]
    [1,]  112  102  115
    [2,]  108  109  101
    [3,]  104  100  101

where, for instance, m[1,1] was obtained as (7+17+4+11+10+18+15+19+18)-7. With this loop I am able to get the result I need, but I wanted to know if there is a more R way of doing this:

    a <- matrix(c(7,17,4,11,10,18,15,19,18),3,3,T)
    m = a
    for(i in 1:9){
      m[c(i)] <- sum(a)-a[c(i)]
    }
    m

thanks
AD
Re: [R] Problem about for loop
Hi,

Looks like the function theta takes a variable data, but that variable is not being used in the body of the function (you are using the global dataX and dataY, which will be the same each time the function is called).

Martyn

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of ufuk beyaztas
Sent: 17 January 2011 13:21
To: r-help@r-project.org
Subject: [R] Problem about for loop

Hi everyones, my function like;

    e <- rnorm(n=50, mean=0, sd=sqrt(0.5625))
    x0 <- c(rep(1,50))
    x1 <- rnorm(n=50,mean=2,sd=1)
    x2 <- rnorm(n=50,mean=2,sd=1)
    x3 <- rnorm(n=50,mean=2,sd=1)
    x4 <- rnorm(n=50,mean=2,sd=1)
    y <- 1+ 2*x1+4*x2+3*x3+2*x4+e
    x2[1] = 10 #influential observarion
    y[1] = 10 #influential observarion
    data.x <- matrix(c(x0,x1,x2,x3,x4),ncol=5)
    data.y <- matrix(y,ncol=1)
    data.k <- cbind(data.x,data.y)
    dataX <- data.k[,1:5]
    dataY <- data.k[,6]
    theta <- function(data) {
      B.cap <- solve(crossprod(dataX)) %*% crossprod(dataX,dataY)
      P <- dataX %*% solve(crossprod(dataX)) %*% t(dataX)
      Y.cap <- P %*% dataY
      e <- dataY - Y.cap
      dX <- nrow(dataX) - ncol(dataX)
      var.cap <- crossprod(e) / (dX)
      ei <- as.vector(dataY - dataX %*% B.cap)
      pi <- diag(P)
      var.cap.i <- (((dX) * var.cap) / (dX - 1)) - (ei^2 / ((dX-1) * (1 - pi)))
      ti <- ei / sqrt(var.cap * (1 - pi))
      Ci <- (ti^2 / (ncol(dataX))) * (pi / (1 - pi))
      output.i <- mean(Ci)
    }
    result <- list()
    for ( i in 1:5){
      data <- replicate(1, data.k[sample(50,50,replace=T),], simplify = FALSE)
      output.j <- theta(data)
      result <- c(result,(list(output.j)))
    }
    table <- do.call(rbind.data.frame,result)
    names(table)=c("cooks")
    table

This function give same results each time, the data is changing every time but mean(Ci)s are always same. Does anyone have an idea about how to be? Thanks for any idea
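A version of theta() that actually uses its argument might look like the sketch below. It is an illustration of the point made in the reply, not the original poster's final code: the data are regenerated in a compact form with a fixed seed, the residual variance is kept as a plain scalar, and each bootstrap sample is passed as a matrix rather than wrapped in a list by replicate().

```r
set.seed(1)                      # assumption: fixed seed only so the sketch is repeatable
n <- 50
X <- cbind(1, matrix(rnorm(n * 4, mean = 2), ncol = 4))   # columns x0..x4
y <- as.vector(X %*% c(1, 2, 4, 3, 2) + rnorm(n, sd = 0.75))
X[1, 3] <- 10; y[1] <- 10        # influential observation (x2 and y), as in the post
data.k <- cbind(X, y)

# Mean Cook's distance computed from the matrix that is passed in
theta <- function(d) {
  X <- d[, -ncol(d), drop = FALSE]           # design matrix: all but the last column
  y <- d[, ncol(d)]                          # response: last column
  H <- X %*% solve(crossprod(X)) %*% t(X)    # hat matrix
  e <- as.vector(y - H %*% y)                # residuals
  h <- diag(H)                               # leverages
  s2 <- sum(e^2) / (nrow(X) - ncol(X))       # residual variance (scalar)
  ti <- e / sqrt(s2 * (1 - h))               # internally studentized residuals
  mean((ti^2 / ncol(X)) * (h / (1 - h)))     # mean Cook's distance
}

res <- sapply(1:5, function(i) theta(data.k[sample(n, n, replace = TRUE), ]))
res
```

Because the resampled rows are now read inside the function, the five values differ from one bootstrap sample to the next.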
Re: [R] rootogram for normal distributions
I was distracted enough by the possibility of hijacking hist() for this to give it a go. The following code implements a basic hanging rootogram based on a normal density, with hist() breaks used as bins and bin midpoints used as the hanging location (not exact, I suspect, but perhaps good enough). Extensions to other distributions are reasonably obvious.

S Ellison

    rootonorm <- function(x, breaks="Sturges", col="lightgrey", gap=0.2, ...) {
      h <- hist(x, breaks=breaks)
      nbins <- length(h$counts)
      mu <- mean(x)
      s <- sd(x)
      normdens <- dnorm(h$mids, mu, s)
      plot.range <- range(pretty(h$breaks))
      plot(z <- seq(plot.range[1], plot.range[2], length.out=200), dens <- dnorm(z, mu, s), type="n", ...)
      d.gap <- min(diff(h$breaks)) * gap / 2
      for(i in 1:nbins) {
        rect(h$breaks[i]+d.gap, normdens[i]-h$density[i], h$breaks[i+1]-d.gap, normdens[i], col=col)
      }
      lines(z, dens, lwd=2)
      points(h$mids, normdens)
    }
    set.seed(17*13)
    y <- rnorm(500, 10, 3)
    rootonorm(y)

Deepayan Sarkar deepayan.sar...@gmail.com 17/01/2011 05:06:54:

On Sun, Jan 16, 2011 at 11:58 AM, Hugo Mildenberger hugo.mildenber...@web.de wrote:
> Thank you very much for your qualified answers, and also for the link to the Tukey paper. I appreciate Tukey's writings very much.

Yes, thanks to Hadley for the nice reference, I hadn't seen it before.

> Looking at the lattice code (below), a possible implementation might involve binning, not so? I see a problematic part here:
>     xx <- sort(unique(x))
> Unique certainly works well with Poisson distributed data, but is essentially a no-op when confronted with continuous floating-point numbers.

True, but as Achim said, rootogram() is intended to work with data arising from discrete distributions, not continuous ones. I see now that this is not as explicit as it could be in the help page (although "frequency distribution" gives a hint), which I will try to improve. I don't think automatic handling of continuous distributions is simple (because it is not clear how you would specify the reference distribution). However, a little preliminary work will get you close with the current implementation:

    xnorm <- rnorm(1000)
    ## 'discretize' by binning and replacing data by bin midpoints
    h <- hist(xnorm, plot = FALSE) # add arguments for more control
    xdisc <- with(h, rep(mids, counts))
    ## Option 1: Assume bin probabilities proportional to dnorm()
    norm.factor <- sum(dnorm(h$mids, mean(xnorm), sd(xnorm)))
    rootogram(~ xdisc, dfun = function(x) {
      dnorm(x, mean(xnorm), sd(xnorm)) / norm.factor
    })
    ## Option 2: Compute probabilities explicitly using pnorm()
    ## pdisc <- diff(pnorm(h$breaks)) ## or estimated:
    pdisc <- diff(pnorm(h$breaks, mean = mean(xnorm), sd = sd(xnorm)))
    pdisc <- pdisc / sum(pdisc)
    rootogram(~ xdisc, dfun = function(x) {
      f <- factor(x, levels = h$mids)
      pdisc[f]
    })

-Deepayan
Re: [R] median by geometric mean
On 2011-01-17 02:19, S Ellison wrote:
> Will this do?
>     x <- runif(20, 1, 100)
>     exp( median( log( x) ) )
> S Ellison

That's what Hadley proposed, too. It's fine for your example, but there is potentially a small problem with this method: the data must be positive. Since it's not unusual to see data with some zeros, the log() would fail. Depending on what type of data I was going to use this modification of the median for, I would consider modifying the (quite short) median.default function, with appropriate additional data checks.

Peter Ehlers

>> Skull Crossbones witch.of.agne...@gmail.com 15/01/2011 16:26
>> Hi All, I need to calculate the median for even number of data points. However instead of calculating the arithmetic mean of the two middle values, I need to calculate their geometric mean. Though I can code this in R, possibly in a few lines, but wondering if there is already some built in function. Can somebody give a hint? Thanks in advance
Re: [R] t-test calculation correct?
As this is apparently a post hoc test, this is wrong. The results are biased. You have provided a nice example of how to do irreproducible science. Consult a local statistician for what this means if you do not know.

-- Bert Gunter

On Mon, Jan 17, 2011 at 4:35 AM, Sascha Vieweg <saschav...@gmail.com> wrote:
> A multinomial logit model (N=192) revealed (besides others) the following statistics for the outcome, y, and one predictor, x:
>
> - y = A (baseline, n=34)
> - y = B (n=26), B(x)=0.7323 (SE=0.2384)
> - y = C (n=132), B(x)=0.6535 (SE=0.2041)
>
> With a t-test I want to explore whether the two predictors differ significantly, and I use the following calculation (according to Bortz, 2005, p.140):
>
> ##
> dm <- 0.7323 - 0.6535
> se.dm <- sqrt( (0.2384 / (34 + 26)) + (0.2041 / (34 + 132)) )
> t.val <- dm / se.dm
> pval <- (1 - pt(t.val, df=(34+26+132) )) * 2
> ##
>
> My question is where this calculation is wrong and why.
>
> Ref.: Bortz, J. (2005). Statistik für Human- und Sozialwissenschaftler (6. Aufl). Berlin: Springer.
>
> -- Sascha Vieweg, saschav...@gmail.com

-- Bert Gunter Genentech Nonclinical Biostatistics 467-7374 http://devo.gene.com/groups/devo/depts/ncb/home.shtml
Re: [R] median by geometric mean
I've been reminded by Prof. Brian Ripley that R's log() function will indeed handle zeros appropriately. Apologies to S Ellison and Hadley Wickham.

Peter Ehlers

On 2011-01-17 06:55, Peter Ehlers wrote:
> On 2011-01-17 02:19, S Ellison wrote:
>> Will this do?
>>
>> x <- runif(20, 1, 100)
>> exp( median( log( x) ) )
>>
>> S Ellison
>
> That's what Hadley proposed, too. It's fine for your example, but there is potentially a small problem with this method: the data must be positive. Since it's not unusual to see data with some zeros, the log() would fail. Depending on what type of data I was going to use this modification of the median for, I would consider modifying the (quite short) median.default function, with appropriate additional data checks.
>
> Peter Ehlers
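A quick check of the point about zeros, as a small sketch with made-up vectors: log(0) is -Inf rather than an error, so the log/median/exp route still returns the geometric mean of the two middle values when one of them is zero.

```r
log(0)                          # -Inf, no error

x_zero <- c(0, 0, 3, 5)         # middle values are 0 and 3
gm_zero <- exp(median(log(x_zero)))
gm_zero                         # 0, the same as sqrt(0 * 3)

x_pos <- c(0, 2, 4, 8)          # a zero outside the middle does no harm
gm_pos <- exp(median(log(x_pos)))
gm_pos                          # the same as sqrt(2 * 4)
```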
Re: [R] How to doulbe all the value on a matrix
ADias wrote:
> Is there an expression to double the values of a matrix - without using a loop?

Why so complicated?

Dieter

m = matrix(rep(1,20),nrow=4)
m
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    1    1    1    1
[2,]    1    1    1    1    1
[3,]    1    1    1    1    1
[4,]    1    1    1    1    1

m*3
     [,1] [,2] [,3] [,4] [,5]
[1,]    3    3    3    3    3
[2,]    3    3    3    3    3
[3,]    3    3    3    3    3
[4,]    3    3    3    3    3
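The reply above multiplies by 3; for the doubling that was actually asked about, the same vectorized arithmetic applies. A minimal sketch with a made-up matrix:

```r
# Arithmetic operators work element-wise on matrices, so no loop is needed.
m <- matrix(1:6, nrow = 2)
doubled <- m * 2
doubled
```

The result keeps the dimensions of m; m + m would give the same values.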
Re: [R] Help for R plot
Fabrice Tourre wrote:
> How to plot as the coordinate as in my attachment? I want to trim the coordinate and one of plot as the figure in attachment. Does any one have such example?

http://markmail.org/message/3jn2sqoep36ckswb (for a lattice-lookalike) and package plotrix

Dieter
Re: [R] Problems with TeachingDemos package
What happens if you just load the R2wd package and then run wdGet() yourself? Also, what OS, version of R, version of TeachingDemos, and version of R2wd are you using?

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of gaiarrido
> Sent: Saturday, January 15, 2011 3:30 AM
> To: r-help@r-project.org
> Subject: Re: [R] Problems with TeachingDemos package
>
> R2wd is working but I received an error:
>
> wdtxtStart()
> Error en R2wd::wdGet() : tentativa de aplicar una no-función
>
> The translation is "attempt to apply non-function".
>
> -----
> Mario Garrido Escudero
> PhD student
> Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
> Universidad de Salamanca
Re: [R] How to doulbe all the value on a matrix
try ...

new_m = m[c(2,7,8),c(1,4,6,7)]

HTH
Pete
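To spell out what the indexing above does, here is a small sketch on a made-up 10 x 10 matrix (the seed and values are arbitrary): the first index vector picks rows, the second picks columns, and the result is the sub-matrix at their intersections.

```r
set.seed(1)  # arbitrary seed, only so the example is repeatable
m <- matrix(sample(1:20, 100, replace = TRUE), nrow = 10)

rows <- c(2, 7, 8)     # rows to keep
cols <- c(1, 4, 6, 7)  # columns to keep
new_m <- m[rows, cols]

dim(new_m)   # 3 rows, 4 columns
```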
Re: [R] median by geometric mean
Just in case some of x are negative (the desired median still exists, as long as the two middle values are non-negative), how about:

x <- runif(20, -1, 100)
exp(median(log(pmax(0, x))))

It'll give 0 (because log(0) is -Inf) if the two middle values are negative, when I guess we should get NaN, but I can't see a 1-line way to handle that!

Keith J

"Peter Ehlers" <ehl...@ucalgary.ca> wrote in message news:4d3468ef.5010...@ucalgary.ca...
> I've been reminded by Prof. Brian Ripley that R's log() function will indeed handle zeros appropriately. Apologies to S Ellison and Hadley Wickham.
>
> Peter Ehlers
Re: [R] median by geometric mean -- are we missing what's important?
Folks:

I know this may be overreaching, but are we missing what's important? WHY do the zeros occur? Are they values less than a known or unknown LOD? -- and/or is there positive mass on zero? In either case, using logs to calculate a geometric mean may not make sense. Paraphrasing Greg Snow, what is the scientific question? What is the model?

Cheers,
Bert

On Mon, Jan 17, 2011 at 9:13 AM, Keith Jewell <k.jew...@campden.co.uk> wrote:
> Just in case some of x are negative (the desired median still exists, as long as the two middle values are non-negative), how about:
>
> x <- runif(20, -1, 100)
> exp(median(log(pmax(0, x))))
>
> It'll give 0 (because log(0) is -Inf) if the two middle values are negative, when I guess we should get NaN, but I can't see a 1-line way to handle that!
>
> Keith J

-- Bert Gunter Genentech Nonclinical Biostatistics
Re: [R] median by geometric mean -- are we missing what's important?
On Mon, Jan 17, 2011 at 9:23 AM, Bert Gunter <gunter.ber...@gene.com> wrote:
> Folks:
>
> I know this may be overreaching, but are we missing what's important? WHY do the zeros occur? Are they values less than a known or unknown LOD? -- and/or is there positive mass on zero? In either case, using logs to calculate a geometric mean may not make sense.

Isn't this a bit of a general problem with the geometric mean? If there are 0s or an odd number of negative numbers, it becomes 0 or imaginary (please do correct me if I'm wrong):

sqrt(prod(c(2, 0, 54)))
sqrt(prod(c(-2, 2)))

> Paraphrasing Greg Snow, what is the scientific question? What is the model?
>
> Cheers,
> Bert

-- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
Re: [R] CSV value not being read as it appears
David Scott wrote:
> As a further note, this is a reminder that whenever you get data via a spreadsheet the first thing to do is examine it and clean up any problems. A basic requirement is to tabulate any categorical variable.

I like using the ‘describe’ function in the ‘Hmisc’ package for this. If you run the result through the ‘latex’ function, you get an even nicer output, with small histograms for each numerical variable.

-- Karl Ove Hufthammer
Re: [R] CSV value not being read as it appears
Peter Ehlers wrote:
> It is hardly R's fault that Excel users routinely commit crimes against data.

A ‘fortune’ candidate?

-- Karl Ove Hufthammer
[R] Sampling question
Dear list,

I have a sampling question. I have a data frame of 1500 species and 13 life history traits. Small example code:

traits <- data.frame(letters[1:9], sample(letters, 9), sample(letters, 9), sample(letters, 9), sample(letters, 9), sample(letters, 9), sample(letters, 9), sample(letters, 9), sample(letters, 9))
colnames(traits) <- c("species", 1:8)

What I want to do is sample a number of species from the data frame in increments of 50 (50 species, 100 species, 150, 200, ... up to 1500). When I sample them I also want the traits associated with them to be kept intact. For each species number I would like 1000 repetitions. So I would like 50 species with their life history traits randomly sampled 1000 times, then 100 species with their life history traits sampled 1000 times. I appreciate that as I get to the higher numbers, i.e. 1500 species, this will only be sampled once, therefore I will need to use replace = TRUE.

Then I have a function I want to run on each sample, which requires the name of the sample: GFD(50species_sample1), GFD(50species_sample2), etc. to GFD(50species_sample1000), then GFD(100species_sample1) etc., with the results put into a data frame.

I am relatively new to R. I could probably hack together some code, but I am unsure how to join it up so that I sample, retain the data and then use it in a function. I appreciate this is a lot to ask, so any help would be greatly appreciated.

Thanks in advance,
Chris
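One way the pieces described above might be joined up, as a sketch only: the toy data frame, the reduced sizes and repetition count, and the stand-in GFD() are all assumptions made so the example runs quickly. Sampling row indices (rather than species names) keeps each species together with its traits, and the sampled data frame can be handed straight to the function without ever being given a name.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Toy stand-in for the 1500 x 13 data frame in the question
traits <- data.frame(species = paste0("sp", 1:200),
                     trait1  = sample(letters, 200, replace = TRUE),
                     trait2  = sample(letters, 200, replace = TRUE),
                     stringsAsFactors = FALSE)

# Placeholder for the poster's GFD(); here it only counts distinct trait1 values
GFD <- function(d) length(unique(d$trait1))

sizes <- seq(50, 200, by = 50)  # would be seq(50, 1500, by = 50) on the real data
n_rep <- 20                     # the question asks for 1000

results <- do.call(rbind, lapply(sizes, function(n) {
  stat <- replicate(n_rep, {
    rows <- sample(nrow(traits), n, replace = TRUE)  # whole rows, traits intact
    GFD(traits[rows, ])
  })
  data.frame(n_species = n, replicate = seq_len(n_rep), GFD = stat)
}))

head(results)
```

Each row of results holds one sample size, one repetition number and the value the function returned for that sample.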
[R] to append a column to a data frame, has I use loop/if in my case?
days=Sys.Date()-1:70
price=abs(rnorm(70))
regular=rep(c(0,0,0,0,1,0,1,0,0,1),c(7,7,7,7,7,7,7,7,7,7))
y=data.frame(cbind(days,price,regular))

y is like

   days      price regular
1 14990 0.16149463       0
2 14989 1.69519358       0
3 14988 1.57821998       0
4 14987 0.47614311       0
5 14986 0.87016180       0
6 14985 2.55679229       0
7 14984 0.89753533       0

The output I want: another column appended to y, whose value is the max price in the most recent 2 **regular** weeks. So if the current row is today, then get the max price of the past 14 days (including today) if the last 2 weeks are regular weeks. If one of the last 2 weeks is not a regular week, then I need to go back further to find the max price, as I need the max price for the last 2 **regular** weeks.

How can I do that? Or do I have to use loop/if to do it?

BTW, why are the days like 14990, 14989 after cbind(days,price,regular)? Before the cbind, days is in the format 2010-12-23.
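On the last point (the dates turning into numbers), a small sketch with a fixed date and made-up prices: cbind() builds a numeric matrix, which can hold only one type, so the Date class is dropped and only the underlying day count since 1970-01-01 remains. Passing the vectors to data.frame() directly keeps each column's class.

```r
days    <- as.Date("2011-01-16") - 0:6   # fixed date instead of Sys.Date()
price   <- c(0.16, 1.70, 1.58, 0.48, 0.87, 2.56, 0.90)  # made-up prices
regular <- rep(0, 7)

m <- cbind(days, price, regular)        # numeric matrix: Date class is lost
y <- data.frame(days, price, regular)   # each column keeps its own class

class(m[, "days"])   # "numeric"
class(y$days)        # "Date"
m[1, "days"]         # days since 1970-01-01
```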
[R] Log difference in a dataframe column
What am I doing wrong here? And what's the right way to calculate the log differences in a column in a df?

# first 3 rows of 5000 rows
y[1:3,]
        Date  Open  High   Low Close
1 1983-03-30 29.96 30.51 29.96 30.35
2 1983-03-31 30.35 30.55 30.20 30.24
3 1983-04-04 30.25 30.65 30.24 30.39

# equation in question ... why is this giving zeros?
y1 <- 100*log(y[,5]/(lag(y[,5],1)))

# first 10 values from the equation ... all zeros
head(y1,10)
 [1] 0 0 0 0 0 0 0 0 0 0
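A likely explanation, sketched below with the three Close values shown above and assuming lag() here is stats::lag: on a plain numeric vector it only shifts the time attributes and leaves the values where they are, so every ratio is 1 and every log is 0. Differencing the logs directly avoids the problem.

```r
close <- c(30.35, 30.24, 30.39)   # the Close values shown above

# stats::lag() does not move the values of a plain vector
same_values <- as.numeric(stats::lag(close, 1))
zeros <- as.numeric(100 * log(close / stats::lag(close, 1)))

# Log differences in percent: one value fewer than the input
log_diff <- 100 * diff(log(close))

# Padded with NA so it can be stored as a column next to the prices
log_diff_col <- c(NA, log_diff)
```

Here zeros reproduces the all-zero result from the question, while log_diff is negative for the first step (30.35 to 30.24) and positive for the second.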
[R] Using anova() with glmmPQL()
Dear R HELP,

ABOUT glmmPQL and the anova command. Here is an example of a repeated-measures ANOVA focussing on the way starling masses vary according to (i) roost situation and (ii) time (two time points only).

library(nlme);library(MASS)
stmass=c(78,88,87,88,83,82,81,80,80,89,78,78,85,81,78,81,81,82,76,74,79,73,79,75,77,78,80,78,83,84,77,68,75,70,74,84,80,75,76,75,85,88,86,95,100,87,98,86,89,94,84,88,91,96,86,87,93,87,94,96,91,90,87,84,86,88,92,96,83,85,90,87,85,81,84,86,82,80,90,77)
# 10 birds per roost situation, in the same order for both months
roostsitu=factor(rep(rep(c("tree","nest-box","inside","other"),each=10),times=2),levels=c("tree","nest-box","inside","other"))
mnth=factor(c(rep("Nov",times=40),rep("Jan",times=40)),levels=c("Nov","Jan"))
subjectnum=c(1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10)
subject=factor(paste(roostsitu,subjectnum,sep=""))
dataf=data.frame(mnth,roostsitu,subjectnum,subject,stmass)
lmeres=lme(fixed=stmass~mnth*roostsitu,random=~1|subject/mnth,na.action=na.exclude)
anova(object=lmeres,test="Chisq")

               numDF denDF   F-value p-value
(Intercept)        1    36 31143.552  <.0001
mnth               1    36    95.458  <.0001
roostsitu          3    36    10.614  <.0001
mnth:roostsitu     3    36     0.657  0.5838

I can conclude from this that variation with both roost situation and month is significant, but with no interaction term. So far so good.

However, say I were interested only in whether or not those starlings were heavier or lighter than 78g: seemingly, I could change my response variable and analyse like this -

stmassheavy=ifelse(stmass>78,1,0)
lmeres1=lme(fixed=stmassheavy~mnth*roostsitu,random=~1|subject/mnth,na.action=na.exclude,family=binomial)
anova(object=lmeres1,test="Chisq")

but I get errors doing that. After a certain amount of web searching, I find that I'm supposed to use glmmPQL for this, so I tried:

lmeres2=glmmPQL(fixed=stmassheavy~mnth*roostsitu,random=~1|subject/mnth,na.action=na.exclude,family=binomial)
anova(object=lmeres2,test="Chisq")

The glmmPQL command runs, but I get

Error in anova.glmmPQL(object = lmeres, test = "Chisq") : 'anova' is not available for PQL fits

Looking into this, I find that I am not supposed to use the anova command in conjunction with glmmPQL (several posts from Brian Ripley: http://r.789695.n4.nabble.com/R-glmmPQL-in-2-3-1-td808574.html and http://www.biostat.wustl.edu/archives/html/s-news/2002-06/msg00055.html and http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg46894.html ), even though it appears that the earlier versions of glmmPQL did allow the anova command to work (before ~2004).

However, I couldn't find any other way to run a repeated-measures ANOVA with family=binomial. After a while longer on Google, I found a 'workaround' from Spencer Graves (on http://markmail.org/message/jddj6aq66wdidrog#query:how%20to%20use%20anova%20with%20glmmPQL+page:1+mid:jddj6aq66wdidrog+state:results ):

class(lmeres2)="lme"
anova(object=lmeres2,test="Chisq")

               numDF denDF   F-value p-value
(Intercept)        1    36 182.84356  <.0001
mnth               1    36 164.57288  <.0001
roostsitu          3    36  17.79263  <.0001
mnth:roostsitu     3    36   3.26912  0.0322

This does give me a result and tells me that the interaction term is significant here. HOWEVER, on that link Douglas Bates told Spencer Graves that this wasn't an appropriate method.

I haven't found any other workarounds for this except some general advice that I should move on to using the lmer command (which I can't do because I need to get p-values for my fits and according to https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html I won't get those from lmer).

My questions are:
(1) Is lmer the only way to do a binomial repeated-measures ANOVA in R? (which means that there's no way to do such an ANOVA in R without losing the p-values) and
(2) if I am supposed to be using glmmPQL for this simple situation, what am I doing wrong?

Thanks very much for any help anyone can give me.

best,
Toby Marthews
Re: [R] cannot allocate vector of size ... in RHLE5 PAE kernel
Following the advice of a colleague, I put the gc() and gcinfo(TRUE) commands just before the line where I got the problem, and their output was:

          used (Mb) gc trigger  (Mb)  max used   (Mb)
Ncells  471485 12.6    1704095  45.6   7920371  211.5
Vcells 6408885 48.9  113919753 869.2 347651599 2652.4

Garbage collection 538 = 323+101+114 (level 2) ...
13.0 Mbytes of cons cells used (29%)
49.0 Mbytes of vectors used (7%)
Error: cannot allocate vector of size 238.1 Mb

If I understood correctly, I should have enough memory for allocating the new matrix ( Q.obs <- matrix(NA, nrow=6940, ncol=9000) ).

Thanks in advance for any help,

Mauricio

-- === Linux user #454569 -- Ubuntu user #17469 ===

2011/1/17 Martin Maechler <maech...@stat.math.ethz.ch>:
> MZ == Mauricio Zambrano <hzambran.newsgro...@gmail.com> on Mon, 17 Jan 2011 11:46:44 +0100 writes:
>
> MZ> Dear R community,
> MZ> I'm running R 32 bits in a 64-bits machine (with 16Gb of Ram) using a PAE kernel, as you can see here:
> MZ>
> MZ> $ uname -a
> MZ> Linux mymachine 2.6.18-238.el5PAE #1 SMP Sun Dec 19 14:42:44 EST 2010 i686 i686 i386 GNU/Linux
> MZ>
> MZ> When I try to create a large matrix ( Q.obs <- matrix(NA, nrow=6940, ncol=9000) ), I got the following error:
> MZ>
> MZ> Error: cannot allocate vector of size 238.3 Mb
> MZ>
> MZ> However, the amount of free memory in my machine seems to be much larger than this:
> MZ>
> MZ> system("free")
> MZ>                       total    used     free shared buffers  cached
> MZ> Mem:               12466236 6354116  6112120      0   67596 2107556
> MZ> -/+ buffers/cache:          4178964  8287272
> MZ> Swap:              12582904       0 12582904
> MZ>
> MZ> I tried to increase the memory limit available for R by using:
> MZ>
> MZ> $ R --min-vsize=10M --max-vsize=5000M --min-nsize=500k --max-nsize=5000M
> MZ>
> MZ> but it didn't work.
> MZ>
> MZ> Any hint about how can I get R using all the memory available in the machine?
>
> Install a 64-bit version of Linux, i.e., ubuntu in your case and work from there. I don't think there's a way around that.
>
> Martin
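As a rough check on where the figure in the error message comes from (a sketch of the arithmetic only): a bare NA is a logical value, stored in 4 bytes per element, so the requested matrix needs one contiguous block of about 238 Mb. A 32-bit R process has a limited address space of its own regardless of how much RAM a PAE kernel can see, which is presumably why the advice is to move to a 64-bit system.

```r
n_elements <- 6940 * 9000        # cells in the requested matrix
bytes      <- n_elements * 4     # logical NA: 4 bytes per element
size_mb    <- bytes / 1024^2
round(size_mb, 1)                # matches the size reported in the error

# A numeric (double) matrix of the same shape would need twice as much
size_mb_double <- n_elements * 8 / 1024^2
```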
Re: [R] How to doulbe all the value on a matrix
Hi,

yes, it works perfectly.

I have another question: is there a way of selecting with a vector the values I wish to take out from a matrix? Example: I have this matrix and I want to take out the numbers in bold and get the second matrix below.

m
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]   17    1    6    5   19    2   19    1   15     8
 [2,]    7    7   20    3    2   16    9    9   19    13
 [3,]    2    4    3   11   18   11   14   13    3     1
 [4,]    3    7    5    7   17   18   10    6    5    15
 [5,]    8   20   13   10    8   12   20   19    1    16
 [6,]    9   14    1   12   12   12   17   18   10    17
 [7,]    3   10   11    2   12    9   18    6   19     9
 [8,]   13    2   17   16   18    8    9   14    9    16
 [9,]    9    4   11    4    1   17    9    7   20    12
[10,]    9    1    4    8    8   19   19    8   17    18

     [,1] [,2] [,3] [,4]
[1,]    7    3   16    9
[2,]    3    2    9   18
[3,]   13   16    8    9

thanks
AD
[R] Importing multiple text files with lapply.
Hello,

I'm trying to read in 50 text files with dates as content to create a list of tables. a is the list of filenames that need to be read in. The following command returns the following error:

mylist <- lapply(a, read.table(header=TRUE, sep="\n"))
Error in read.table(header = TRUE, sep = "\n") :
  element 1 is empty;
   the part of the args list of 'is.character' being evaluated was:
   (file)

Does anyone have any suggestions?

Yours, Simon Kiss

Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada N3T 2C9
Cell: +1 519 761 7606
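The error arises because read.table(header=TRUE, sep="\n") is evaluated on its own, without a file, before lapply() ever sees it. A likely fix, sketched here with three temporary files standing in for the 50 real ones, is to pass the function itself and let lapply() forward the extra arguments:

```r
# Three small temporary files stand in for the poster's 50 files
a <- replicate(3, tempfile(fileext = ".txt"))
for (f in a) writeLines(c("date", "2011-01-15", "2011-01-17"), f)

# Pass the function, not a call to it; further arguments follow it
mylist <- lapply(a, read.table, header = TRUE, sep = "\n")
names(mylist) <- basename(a)   # label each table by its file name

str(mylist[[1]])
```

An equivalent spelling is lapply(a, function(f) read.table(f, header = TRUE, sep = "\n")), which makes the per-file call explicit.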
Re: [R] R-help Digest, Vol 95, Issue 17
Of the issues with optimization methods (optim, optimx, and others) that I see, a good percentage arise because the objective function (or gradient, if user-supplied) is mis-coded. However, an almost equal number are due to functions getting into overflow or underflow territory and yielding quantities that the optimization tools cannot handle (NA or Inf etc.)

Two general approaches I find helpful:

1) Even if there are no actual bounds on parameters, put in reasonable limits. They don't need to be too tight, just enough to keep the parameters from giving a silly objective function.

2) Do some evaluations of the objective to make sure it is really being properly calculated. Never hurts to have some known outcomes.

Beyond this, we get into reparametrizations. Great idea, but far too much work for most of us, even if we work in the field.

Best, JN

On 01/17/2011 06:00 AM, r-help-requ...@r-project.org wrote:
> From: Uwe Ligges <lig...@statistik.tu-dortmund.de>
> To: Jinrui Xu <jinru...@umich.edu>
> Cc: r-help@r-project.org
> Subject: Re: [R] fgev_error_matrix_singular
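The two suggestions above might look like this in practice. This is a sketch on a made-up problem (an exponential rate fitted to three invented observations), not the fgev case from the thread: first the objective is checked at a point where its value is known by hand, then loose bounds keep the parameter out of territory where the objective is undefined.

```r
x <- c(1, 2, 3)   # invented data

# Negative log-likelihood of an exponential distribution with the given rate
nll <- function(rate) -sum(dexp(x, rate = rate, log = TRUE))

# 2) Evaluate the objective where the answer is known:
#    at rate = 1 the value is sum(x) = 6
nll(1)

# 1) Loose bounds stop the optimizer from trying rate <= 0
fit <- optim(par = 1, fn = nll, method = "L-BFGS-B",
             lower = 1e-6, upper = 100)
fit$par           # should be close to 1 / mean(x)
```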
[R] Fw: Re: help in calculating ar on ranked vector
--- On Mon, 1/17/11, Raymond Wong <raywong...@yahoo.ca> wrote:

From: Raymond Wong <raywong...@yahoo.ca>
Subject: Re: [R] help in calculating ar on ranked vector
To: Uwe Ligges <lig...@statistik.tu-dortmund.de>
Received: Monday, January 17, 2011, 11:56 AM

Thanks Uwe:

Here is my code. The first set of print statements works, but not the second.

#
z <- as.vector(na.omit(z)) # remove NA
nz <- length(z)
rz <- rank(z, ties.method="average")
#
print(ar(z, order.max=1, method="burg"))
print(ar(z, order.max=1, method="ols"))
print(ar(z, order.max=1, method="mle"))
print(ar(z, order.max=1, method="yule-walker"))
#
# **
#
print(ar(rz, order.max=1, method="burg"))
print(ar(rz, order.max=1, method="ols"))
print(ar(rz, order.max=1, method="mle"))
print(ar(rz, order.max=1, method="yule-walker"))
#

What did I miss? Thanks a million.

Raymond

--- On Fri, 1/14/11, Uwe Ligges <lig...@statistik.tu-dortmund.de> wrote:

> From: Uwe Ligges <lig...@statistik.tu-dortmund.de>
> Subject: Re: [R] help in calculating ar on ranked vector
> To: Raymond Wong <raywong...@yahoo.ca>
> Cc: R-help@r-project.org
> Received: Friday, January 14, 2011, 12:42 PM
>
> Works for my examples. But you have not specified what your actual call to ar() was.
>
> Uwe Ligges
>
> On 12.01.2011 21:17, Raymond Wong wrote:
>> I was using ar(stats) to calculate the autoregressive coefficient. It works on vector z, but it will not work on vector rz <- rank(z, ties.method="average"). What did I miss? Any info will be greatly appreciated. TIA
Re: [R] Using summaryBy with weighted data
Thanks Josh. I built on your example and ended up with the code below--if you or anyone sees any issues please let me know. It would be great if there were a slicker way to get these kinds of summary stats in R, but this gets the job done.

  # takes data frame z with weights w and data x, returns weighted mean, weighted SE, and N
  msenw = function(z){
    N = length(na.omit(z)$response)
    i = which(!is.na(z$response))
    return( c( W.M = weighted.mean(z$response, z$weights, na.rm=T),
               W.SE = sqrt(wtd.var(z$response, weights = z$weights))/sqrt(sum(z$weights[i])),
               N=N ) )
  }
  library(doBy)
  library(Hmisc)
  ## make up some data (easier)
  mydata <- data.frame(response = rnorm(100), group = rep(1:5, each = 20), weights = runif(100, 0, 1))
  xy <- by(mydata, mydata$group, msenw)
  data.frame( group = names(c(xy)), do.call(rbind, xy) )
  ## can be extended to other data using:
  xy <- by(data.frame(response = mydata$response, weights = mydata$weights), mydata$group, msenw)

Solomon Messing www.stanford.edu/~messing

On Jan 16, 2011, at 11:16 PM, Joshua Wiley wrote:

Dear Solomon,

On Sun, Jan 16, 2011 at 10:27 PM, Solomon Messing solomon.mess...@gmail.com wrote:

Dear Soren and R users: I am trying to use the summaryBy function with weights. Is this possible? An example that illustrates what I am trying to do follows:

  library(doBy)
  ## make up some data
  response = rnorm(100)
  group = c(rep(1,20), rep(2,20), rep(3,20), rep(4,20), rep(5,20))
  weights = runif(100, 0, 1)
  mydata = data.frame(response,group,weights)
  ## run summaryBy without weights:
  summaryBy(response~group, data = mydata, FUN = mean)
  ## attempt to run summaryBy with weights, throws error
  summaryBy(x~group, data = mydata, FUN = weighted.mean, w=weights )
  ## throws the error:
  # Error in tapply(lh.data[, lh.var[vv]], rh.string.factor, function(x) { :
  #   arguments must have same length

My guess is that summaryBy is not giving weighted.mean() each group of weights, but instead is passing all of the weights in the data set each time it calls weighted.mean().

Yes, of course. It has no way of knowing that the weights should also be being broken down by group--they are not in the formula.

Do you know if there is some way to get summaryBy to pass weights to weighted.mean() only for each group? Ideally there would be a way to pass more than one variable to a function (e.g., response and weights) or just an entire object (mydata) broken down by group.

Then you would just make a wrapper function to pass the right values to the x and w arguments of weighted.mean. Instead here is a somewhat hacked version:

  library(doBy)
  ## make up some data (easier)
  mydata <- data.frame(response = rnorm(100), group = rep(1:5, each = 20), weights = runif(100, 0, 1))
  ## manually compute weighted mean
  tmp <- summaryBy(response*weights ~ group, data = mydata, FUN = sum)
  tmp[,2] <- tmp[,2]/with(mydata, tapply(weights, group, sum))
  tmp ## weighted means
  ## here's the 'problem', if you will, even with +, they are passed one at a time
  summaryBy(response + weights ~ group, data = mydata, FUN = str)
  summaryBy(mydata ~ group, data = mydata, FUN = str)
  ## here is an option using by():
  xy <- by(mydata, mydata$group, function(z) weighted.mean(z$response, z$weights))
  xy
  ## if you don't like the formatting
  data.frame(group = names(c(xy)), weighted.mean = c(xy))

HTH, Josh

I suspect this functionality would be a tremendous benefit to R users who regularly work with weighted data, such as myself. Thanks, Solomon Messing www.stanford.edu/~messing

PS I know this basic example can be done using lapply(split(...)) approach referenced here: http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg12349.html but for more complex tasks the lapply approach will mean writing a lot of extra code to run everything and then to get things formatted as nicely as summaryBy() was designed to do.

-- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
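The point made in the thread, that the weights have to be split by group together with the responses, can be sketched in base R as follows. This is only an illustration on made-up data (the seed and the object names `pieces` and `wmeans` are assumptions, not part of the original posts); it uses split() and sapply() rather than summaryBy().

```r
# Made-up data with the same layout as in the thread
set.seed(1)
mydata <- data.frame(response = rnorm(100),
                     group    = rep(1:5, each = 20),
                     weights  = runif(100, 0, 1))

# Split the whole data frame by group, so that each piece carries
# its own responses AND its own weights
pieces <- split(mydata, mydata$group)

# Weighted mean per group: x and w now always have the same length
wmeans <- sapply(pieces, function(d) weighted.mean(d$response, d$weights))
wmeans
```

Because each piece is a complete sub-data frame, the same pattern extends to any function that needs more than one column at a time.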
[R] [Fwd: Re: R-help Digest, Vol 95, Issue 17]
Apologies if this is posted twice. The r-help mailing system gave an error (reported to moderator) on first try, but it may have gone through.

Original Message
Subject: Re: R-help Digest, Vol 95, Issue 17
From: Prof. John C Nash nas...@uottawa.ca
Date: Mon, 17 January, 2011 1:04 pm
To: r-help@r-project.org
Cc: lig...@statistik.tu-dortmund.de jinru...@umich.edu

For those issues with optimization methods (optim, optimx, and others) I see, a good percentage are because the objective function (or gradient if user-supplied) is mis-coded. However, an almost equal number are due to functions getting into overflow or underflow territory and yielding quantities that the optimization tools cannot handle (NA or Inf etc.)

Two general approaches I find helpful:

1) even if there are no actual bounds on parameters, put in reasonable limits. They don't need to be too tight, just enough to keep the parameters from giving a silly objective function

2) do some evaluations of the objective to make sure it is really being properly calculated. Never hurts to have some known outcomes.

Beyond this, we get into reparametrizations. Great idea, but far too much work for most of us, even if we work in the field.

Best, JN

On 01/17/2011 06:00 AM, r-help-requ...@r-project.org wrote:
From: Uwe Ligges lig...@statistik.tu-dortmund.de
To: Jinrui Xu jinru...@umich.edu
Cc: r-help@r-project.org
Subject: Re: [R] fgev_error_matrix_singular
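The two suggestions above can be illustrated with a small sketch. The objective here is a toy one chosen for illustration (the negative log-likelihood of an exponential sample; the data, seed and bounds are assumptions, not anything from the fgev thread), and optim()'s "L-BFGS-B" method is used because it accepts box constraints.

```r
# Toy objective: negative log-likelihood of an exponential sample
set.seed(42)
x <- rexp(200, rate = 2)
negloglik <- function(rate) -sum(dexp(x, rate = rate, log = TRUE))

# (2) Check the objective against a value that can be worked out by hand:
# for rate = 1 the negative log-likelihood is simply sum(x)
stopifnot(isTRUE(all.equal(negloglik(1), sum(x))))

# (1) Loose bounds keep the optimiser away from rate <= 0, where the
# objective is NaN or infinite
fit <- optim(par = 1, fn = negloglik, method = "L-BFGS-B",
             lower = 1e-6, upper = 1e3)
fit$par   # should be close to 1 / mean(x), the closed-form estimate
```

The bounds are deliberately wide; they are there only to stop the search from wandering into a region where the function cannot be evaluated.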
[R] Summing data frame columns on identical data
Dear all, I have 9 data frames, and I'm simply trying to sum the values of column 3 (on a row-by-row basis). However, there are a slightly different number of rows in each data frame, so I'm receiving the following error:

  Error in Ops.data.frame(mrunoff_207101[3], mrunoff_207102[3]) : + only defined for equally-sized data frames.

Here is what I'm attempting to do:

  arunoff_2071 <- cbind(mrunoff_207101[1:2], (mrunoff_207101[3] + mrunoff_207102[3] + mrunoff_207103[3] + mrunoff_207104[3] + mrunoff_207105[3] + mrunoff_207106[3] + mrunoff_207107[3] + mrunoff_207108[3] + mrunoff_207109[3]))

Is there an easy way of summing based on congruent values in columns 1 and 2? The only way I can think of would be to use merge, but this would involve doing this for every pair of data frames. The data for each data frame look like this:

  head(mrunoff_207101)
    Latitude Longitude          FPC
  1     5.75      0.25 0.0112384744
  2     6.25      0.25 0.0019959067
  3     6.75      0.25 0.0003245941
  4     7.25      0.25 0.0011973676
  5     7.75      0.25 0.0001062602
  6     8.25      0.25 0.0451578423

Any suggestions on how to achieve this easily will be very welcome. Many thanks, Steve
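One possible approach, not taken from the thread, is to stack all the data frames and then sum FPC within each Latitude/Longitude pair. The sketch below uses two tiny made-up data frames (`d1`, `d2`) in place of the nine mrunoff_* objects, and it assumes that a location missing from one data frame should simply contribute nothing to the sum.

```r
# Two made-up data frames with a different number of rows
d1 <- data.frame(Latitude = c(5.75, 6.25, 6.75), Longitude = 0.25, FPC = c(1, 2, 3))
d2 <- data.frame(Latitude = c(5.75, 6.25),       Longitude = 0.25, FPC = c(10, 20))

# Stack everything, then sum FPC for each Latitude/Longitude combination
all_months <- do.call(rbind, list(d1, d2))
total <- aggregate(FPC ~ Latitude + Longitude, data = all_months, FUN = sum)
total
```

If only locations present in every data frame should be kept, repeated merge() calls, as mentioned in the question, would be the more appropriate route.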
Re: [R] Importing multiple text files with lapply.
try:

  mylist <- lapply(a, read.table, header = TRUE, sep = '\n')

also is the separator really '\n' meaning a new-line? What exactly does the data look like?

On Mon, Jan 17, 2011 at 11:47 AM, Simon Kiss simonjk...@yahoo.ca wrote:

Hello, I'm trying to read in 50 text files with dates as content to create a list of tables. a is the list of filenames that need to be read in. The following command returns the following error:

  mylist <- lapply(a, read.table(header=TRUE, sep="\n"))
  Error in read.table(header = TRUE, sep = "\n") : element 1 is empty; the part of the args list of 'is.character' being evaluated was: (file)

Does anyone have any suggestions? Yours, Simon Kiss

Simon J. Kiss, PhD Assistant Professor, Wilfrid Laurier University 73 George Street Brantford, Ontario, Canada N3T 2C9 Cell: +1 519 761 7606

-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve?
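The difference between the failing call and the corrected one is that lapply() wants the function itself, with any extra arguments listed after it, rather than the result of calling the function. A self-contained sketch follows; the temporary files and their contents are made up for illustration, and sep is left at its default since the real separator is not known.

```r
# Create two small throwaway files so the example can run anywhere
tmpdir <- tempfile("dates_")
dir.create(tmpdir)
writeLines(c("date", "2011-01-15", "2011-01-16"), file.path(tmpdir, "a1.txt"))
writeLines(c("date", "2011-01-17"),               file.path(tmpdir, "a2.txt"))
a <- list.files(tmpdir, full.names = TRUE)

# Pass read.table itself; the arguments after it are handed on to
# read.table() for every element of `a`
mylist <- lapply(a, read.table, header = TRUE, stringsAsFactors = FALSE)
names(mylist) <- basename(a)
sapply(mylist, nrow)
```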
Re: [R] Log difference in a dataframe column
On 2011-01-17 07:44, eric wrote:

What am I doing wrong here? And what's the right way to calculate the log differences in a column in a df?

  # first 3 rows of 5000 rows
  y[1:3,]
          Date  Open  High   Low Close
  1 1983-03-30 29.96 30.51 29.96 30.35
  2 1983-03-31 30.35 30.55 30.20 30.24
  3 1983-04-04 30.25 30.65 30.24 30.39
  # equation in question ... why is this giving zeros?
  y1 <- 100*log(y[,5]/(lag(y[,5],1)))
  # first 10 values from the equation ... all zeros
  head(y1,10)
  [1] 0 0 0 0 0 0 0 0 0 0

Well, take a look at the output of lag(). Try it with as.ts(y[, 5]) replacing y[, 5]. Peter Ehlers
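A short sketch of why the zeros appear: on a plain numeric vector, lag() only shifts the time attribute and leaves the values in place, so every ratio is 1. The example below uses the three Close values shown in the post (the names `close_px`, `zeros` and `logret` are made up) and swaps in diff(log(x)) as an alternative to the as.ts() route suggested in the reply.

```r
close_px <- c(30.35, 30.24, 30.39)   # Close prices from the three rows shown

# lag() changes only the "tsp" attribute; the values do not move,
# so the element-by-element ratio is 1 and its log is 0
zeros <- 100 * log(close_px / lag(close_px, 1))
as.numeric(zeros)

# diff() works on positions, giving the log difference between
# each value and the one before it
logret <- 100 * diff(log(close_px))
logret
```

The result of diff() is one element shorter than the input, so it needs padding (for example with a leading NA) before being stored back in the data frame.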
Re: [R] The Percentile of a User-Defined pdf
I got it to work:

  # To get a percentile of a single-variable function:
  # Step 1: Integrate over the domain to get the normalization constant:
  Z <- integrate(function(x) sqrt(1+x^-1), 1, 2)$value
  Z
  # Step 2: Find the .975 percentile
  x975 <- uniroot(function(t) integrate(function(x) sqrt(1+x^-1), 1, t)$value/Z - .975, lower=1, upper=2, tol=5e-4)$root
  x975
  # To get a percentile of a marginal of a bivariate function of (x,y):
  # Define and compute the marginal x distribution:
  fx <- function(x) { sapply(x, function(x) integrate(function(y) sqrt(sin(x)+1/y), 0, 10)$value) }
  # Then proceed as above in the single-variable case.

Thank you Dieter and David. Nissim Kaufmann Dept. of Mathematics and Statistics University at Albany

In reply to http://www.mail-archive.com/r-help@r-project.org/msg121420.html
[R] matrix manipulations
Hi, I am having some difficulties with matrix operations. It is a little hard to explain it so please bear with me. I have a very large data set, large enough that it needs to be split in parts in order to deal with. I can work things on these parts but the problem lies in adding together these parts for the final answer.

So that been said, let's say that i split the data in 2 parts, 1 and 2. Each part has data belonging to 6 different categories, and each category has 2 different classes, these classes being the same for each category. The classes are called land and water and each category is labeled cat1 to cat6. I am using the command (function) table to tabulate each class for each category, but since i split the data in 2 parts, one part has only some of the 6 categories, and the other some other of the 6 categories (and not necessarily exclusive).

So let's built some results after i used the table function.

  m1 <- matrix(c(32, 35, 36, 12, 15, 16), nrow = 2, ncol = 3, byrow = TRUE, dimnames = list(c("land", "water"), c("cat2", "cat5", "cat6")))
  m1
        cat2 cat5 cat6
  land    32   35   36
  water   12   15   16

  m2 <- matrix(c(45, 46, 47, 48, 21, 22, 23, 24), nrow = 2, ncol = 4, byrow = TRUE, dimnames = list(c("land", "water"), c("cat1", "cat2", "cat3", "cat4")))
  m2
        cat1 cat2 cat3 cat4
  land    45   46   47   48
  water   21   22   23   24

So my end desired result should be a matrix (or a data frame) that has 6 columns called cat1 to cat6 and 2 rows labeled land and water, and for the category that appears in both m1 and m2 the end result will be a sum. results will be m3:

        cat1 cat2 cat3 cat4 cat5 cat6
  land    45   78   47   48   35   36
  water   21   34   23   24   15   16

To do this i thought in making an empty matrix for each m1 and m2 (called m01 and m02 respectively) with 6 columns and 2 rows, and do a long if else statement in which i match the name of the first column in m1 with the name of the first column in m01 and if they match get the data from m1, if not leave it 0 and so on. Same thing for m2 and m02. This is long and extremely clunky but afterwards i can add m01 with m02 and get my desired result m3.

Is there any way i can do this more elegantly? My real data is split in 4 parts, but the problem is the same. Thanks for all your inputs, and sorry for this long email, but i didn't know how else i could explain what i wanted to do. Monica
Re: [R] How to doulbe all the value on a matrix
I believe you want to select a subset of rows and subset of columns of your original matrix m.

If you had wanted only the first row of m, you could have used m[1,]
Alternatively, if you had wanted only the second column of m then you could have used m[,2]
m[1,2] would give you the element at row 1, column 2.

You are requesting rows 2, 7, and 8 and columns 1, 4, 6 and 7. The syntax is m[required rows, required columns]. c() allows you to specify multiple rows/columns at the same time.

HTH Pete

-- View this message in context: http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221255.html Sent from the R help mailing list archive at Nabble.com.
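The indexing rule described above can be tried on a small made-up matrix. The values below are random and are not the matrix from the thread; only the row and column numbers (2, 7, 8 and 1, 4, 6, 7) are taken from the reply.

```r
# A made-up 10 x 10 matrix of values between 1 and 20
set.seed(1)
m <- matrix(sample(1:20, 100, replace = TRUE), nrow = 10)

rows <- c(2, 7, 8)      # required rows
cols <- c(1, 4, 6, 7)   # required columns

# One vector of row numbers and one of column numbers select the
# whole block in a single step
sub <- m[rows, cols]
dim(sub)   # 3 4
```

The two index vectors can be stored and reused, which is what "selecting with a vector" amounts to here.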
Re: [R] matrix manipulations
Monica - Perhaps this small example can demonstrate how factors can solve your problem:

  d1 = data.frame(cat=sample(c('cat2','cat5','cat6'),100,replace=TRUE), group=sample(c('land','water'),100,replace=TRUE))
  d2 = data.frame(cat=sample(c('cat1','cat3','cat4'),100,replace=TRUE), group=sample(c('land','water'),100,replace=TRUE))
  d1$cat = factor(d1$cat,levels=c('cat1','cat2','cat3','cat4','cat5','cat6'))
  d2$cat = factor(d2$cat,levels=c('cat1','cat2','cat3','cat4','cat5','cat6'))
  table(d1$group,d1$cat) + table(d2$group,d2$cat)
          cat1 cat2 cat3 cat4 cat5 cat6
    land    14   17   18   22   19   23
    water   19   15   16   11   10   16

This works because when you include all possible levels in a factor, R will automatically put zeroes in the right places when you use table():

  table(d1$group,d1$cat)
          cat1 cat2 cat3 cat4 cat5 cat6
    land     0   17    0    0   19   23
    water    0   15    0    0   10   16
  table(d2$group,d2$cat)
          cat1 cat2 cat3 cat4 cat5 cat6
    land    14    0   18   22    0    0
    water   19    0   16   11    0    0

Hope this helps.

- Phil Spector Statistical Computing Facility Department of Statistics UC Berkeley spec...@stat.berkeley.edu
Re: [R] matrix manipulations
Try this:

  library(reshape)
  xtabs(rowSums(cbind(value.x, value.y), na.rm = TRUE) ~ X1 + X2, merge(melt(m1), melt(m2), by = c('X1', 'X2'), all = TRUE), exclude = FALSE)

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O
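A base R alternative, sketched here as an illustration rather than as anything proposed in the thread, is to start from a zero matrix with all six categories and add each part into the columns that match its names. It assumes every part uses the same row names and only column names from cat1 to cat6; `parts` and the loop variable `p` are made-up names.

```r
m1 <- matrix(c(32, 35, 36, 12, 15, 16), nrow = 2, byrow = TRUE,
             dimnames = list(c("land", "water"), c("cat2", "cat5", "cat6")))
m2 <- matrix(c(45, 46, 47, 48, 21, 22, 23, 24), nrow = 2, byrow = TRUE,
             dimnames = list(c("land", "water"), c("cat1", "cat2", "cat3", "cat4")))
parts <- list(m1, m2)   # with the real data this list would hold all 4 parts

# Zero matrix covering every class and every category
m3 <- matrix(0, nrow = 2, ncol = 6,
             dimnames = list(c("land", "water"), paste0("cat", 1:6)))

# Indexing by dimnames puts each part in the right cells, so no
# if/else matching of column names is needed
for (p in parts) {
  m3[rownames(p), colnames(p)] <- m3[rownames(p), colnames(p)] + p
}
m3
```

Adding a further part only means appending it to the list.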
[R] Replacing rows in a data frame
R-helpers, Below is a simple example of some output that I am getting while trying to work with a data frame in R 2.12.1 for Mac.

  testdat <- data.frame(matrix(ncol=10, nrow=10))
  colnames(testdat) <- c('a','b','c','d','e','f','g','h','i','j')
  testdat[seq(1,10,3),] <- c(1,0,0,0,0,0,0,0,0,0)
  testdat
      a  b  c  d  e  f  g  h  i  j
  1   1  0  0  0  0  1  0  0  0  0
  2  NA NA NA NA NA NA NA NA NA NA
  3  NA NA NA NA NA NA NA NA NA NA
  4   0  0  0  0  0  0  0  0  0  0
  5  NA NA NA NA NA NA NA NA NA NA
  6  NA NA NA NA NA NA NA NA NA NA
  7   0  0  1  0  0  0  0  1  0  0
  8  NA NA NA NA NA NA NA NA NA NA
  9  NA NA NA NA NA NA NA NA NA NA
  10  0  0  0  0  0  0  0  0  0  0

The output is not what I would have anticipated. Since seq(1,10,3) gives the vector [1 4 7 10], I expected rows 1, 4, 7 and 10 of the data.frame testdat to contain the same data, a 1 for variable 'a' and zeros for all other variables. I guess I assumed the assignment would proceed by rows, but it appears from the resulting output to be proceeding by columns. Can someone point out how I can modify this simple code so that the assignments proceed by rows? Thank you. Brant
Re: [R] Replacing rows in a data frame
Try this:

  testdat[seq(1,10,3),] <- t(replicate(4, c(1,0,0,0,0,0,0,0,0,0)))

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O
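The same idea can be written with matrix(..., byrow = TRUE), which avoids hard-coding the number of rows. This is a sketch of a variant, not the code from the reply; the names `rows` and `v` are made up. The value vector is recycled down the columns of the selected block, so laying it out as a matrix row by row first gives each selected row a full copy of `v`.

```r
testdat <- data.frame(matrix(ncol = 10, nrow = 10))
colnames(testdat) <- letters[1:10]

rows <- seq(1, 10, 3)                  # 1 4 7 10
v    <- c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0)

# Build a length(rows) x length(v) matrix whose every row is v,
# then assign it to the selected rows in one step
testdat[rows, ] <- matrix(v, nrow = length(rows), ncol = length(v), byrow = TRUE)
testdat[rows, ]
```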
Re: [R] R scheduling request
You could write a batch file and then have your OS schedule to run R on the batch file whenever you want (see Rscript for one approach of running the batch). Inside of R you can use Sys.sleep to wait a certain amount of time before running the next command. If you load the tcltk2 package then you can use the tclTaskSchedule function.

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Alessandro Oggioni
Sent: Saturday, January 15, 2011 6:19 AM
To: r-help
Subject: [R] R scheduling request

Dear all, I have used R.rps to produce a Google API chart (googleVis) with a data request in another server. But i don't understand how is possible to scheduling a request data to the server and after produce a update of the charts. Thanks in advance. Alessandro Oggioni
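The Sys.sleep() option mentioned above amounts to a loop that runs a task, waits, and runs it again. In the sketch below `update_chart()` is a hypothetical stand-in for the real "fetch the data and redraw the chart" step, and the one-second interval and three runs are chosen only so that the example finishes quickly.

```r
# Hypothetical task; here it just returns a time stamp
update_chart <- function() format(Sys.time(), "%H:%M:%S")

interval <- 1    # seconds to wait between runs
n_runs   <- 3    # number of runs for this demonstration
stamps   <- character(n_runs)

for (i in seq_len(n_runs)) {
  stamps[i] <- update_chart()
  if (i < n_runs) Sys.sleep(interval)   # pause before the next run
}
stamps
```

A loop like this keeps the R session busy for as long as it runs, which is one reason the OS scheduler with Rscript may be the more robust choice for long intervals.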
[R] How to still processing despite bug errors?
Hi, everybody. I am working on processing EEG data from 1000 patients. I have a specific syntax to perform the spectral analysis and a loop to analyse all subjects. Each subject's data are in separate folders (P1, P2, P3...).

My question is: in some cases, errors can appear for one subject. I want to know if it is possible to jump to the next subject and run the same syntax, showing a message like:

  Working on P1
  Error X in P1
  Working on P2...

The idea is to let the computer process all the subjects continuously and, at the end, see only the problems for the subjects that R could not analyse. Each subject takes 20 minutes to analyse and I don't want to stay for days in front of the PC, waiting for the next error in order to start the syntax again with the next subject.

Any ideas? Thanks in advance. Altay Lino de Souza.
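One common way to get this behaviour, offered here as a sketch rather than as a reply from the thread, is to wrap the per-subject work in tryCatch() so that an error is reported and recorded instead of stopping the loop. `analyse_subject()` is a hypothetical stand-in for the real spectral analysis and is made to fail for P2 on purpose.

```r
subjects <- paste0("P", 1:4)

# Stand-in for the real analysis; fails deliberately for P2
analyse_subject <- function(id) {
  if (id == "P2") stop("Error X")
  paste("result for", id)
}

results <- list()
failed  <- character(0)

for (id in subjects) {
  message("Working on ", id)
  results[[id]] <- tryCatch(
    analyse_subject(id),
    error = function(e) {
      # Report the problem, remember the subject, and carry on
      message("Error in ", id, ": ", conditionMessage(e))
      failed <<- c(failed, id)
      NULL
    }
  )
}

failed   # subjects that could not be analysed
```

After the loop, `failed` lists the subjects to look at again and `results` holds the output for the others.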
Re: [R] How to doulbe all the value on a matrix
On Jan 17, 2011, at 11:16 AM, ADias wrote:

Hi, yes it works perfectly. I have another question: Is there way of selecting with a vector the values I wish to take out from a matrix. Example: I have this matrix and I want to take out the numbers in bold and get the second matrix below

This is a plain text mailing list (despite what the Nabble mirror may (mis-)lead you into believing) ... no bold. -- david.

  m
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
  [1,] 17165 192 191 15 8
  [2,]77 2032 1699 1913
  [3,]243 11 18 11 14 133 1
  [4,]3757 17 18 106515
  [5,]8 20 13 108 12 20 19116
  [6,]9 141 12 12 12 17 18 1017
  [7,]3 10 112 129 186 19 9
  [8,] 132 17 16 1889 14916
  [9,]94 1141 1797 2012
  [10,]91488 19 198 1718

  [,1] [,2] [,3] [,4]
  [1,]73 169
  [2,]329 18
  [3,] 13 1689

thanks AD

-- View this message in context: http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221230.html Sent from the R help mailing list archive at Nabble.com.

David Winsemius, MD West Hartford, CT
Re: [R] Using summaryBy with weighted data
Hi: Does this do what you need?

  wstats <- function(d) {
    require(Hmisc)
    N <- length(d$response[!is.na(d$response)])
    c(WM = wtd.mean(d$response, d$weights),
      WSE = sqrt(wtd.var(d$response, d$weights)),
      N = N)
  }
  library(plyr)
  ddply(mydata, .(group), wstats)
    group            WM       WSE  N
  1     1  0.1302255752 1.1911298 20
  2     2 -0.2814664362 0.8582928 20
  3     3 -0.3640550516 1.2618343 20
  4     4  0.0002852392 1.1463205 20
  5     5 -0.0070283053 1.2315683 20

The trick to writing this function for input into plyr is that the argument is a data frame. When called in ddply(), the function wstats() will be applied to each sub-frame corresponding to the grouping factor(s). Inside it, the variables of interest are extracted relative to the input data frame and the three quantities are computed. I used wtd.mean() and wtd.var() from Hmisc, as both will remove NAs by default. In the ddply call, the function name is simply cited since a sub-data frame is the sole argument of the function.

I couldn't figure out how to get doBy to get this to work, as it seems best suited to functions of one argument (a single response), but here's an alternative using the data.table package:

  library(data.table)
  # Assumes Hmisc is already loaded...
  myDT <- data.table(mydata, key = 'group')
  myDT[, list(N = length(response[!is.na(response)]),
              wtdMean = wtd.mean(response, weights),
              wtdSE = sqrt(wtd.var(response, weights))), by = 'group']
       group  N       wtdMean     wtdSE
  [1,]     1 20  0.1302255752 1.1911298
  [2,]     2 20 -0.2814664362 0.8582928
  [3,]     3 20 -0.3640550516 1.2618343
  [4,]     4 20  0.0002852392 1.1463205
  [5,]     5 20 -0.0070283053 1.2315683

data.table uses a different model of data organization from data frames. A simplistic description is that you can think of a data.table as analogous to a table in a DBMS. Notice that the 'function call' is indexed inside the data table: the first 'subscript' (i, left empty here) selects rows, analogous to a 'where' clause in SQL; the second 'subscript' (j) says what to compute or return, analogous to a 'select' statement; and the third argument is the by group(s), or sub-data tables, to which (in this case) the j expression is applied.

For functions that take multiple arguments and that are meant to be applied in a groupwise fashion, I find plyr and data.table to be very good options. There are also base package alternatives (e.g., some combination of lapply(), mapply() and do.call()) and several other packages, but plyr and data.table are generally pretty good at handling most of the niggling details. Having said that, both have learning curves - data.table, in particular, will be much easier to pick up if you have some background in SQL, since its syntax uses primary principles of SQL.

data.table has a vignette and FAQ, along with an independent help list - for details, see its page on R-forge: http://r-forge.r-project.org/projects/datatable/ For plyr's documentation, see http://had.co.nz/plyr/ A link to its mailing list is found on that page as well.

HTH, Dennis
[R] Dealing with Latex output in Openoffice
I am making considerable use of Harrell's rms package, but I do not use LaTeX for writing. (I have enough trouble convincing my co-authors to use OpenOffice!) rms makes copious use of LaTeX output for various mixed graphical and text outputs, amongst other things. Does someone have a convenient strategy for dealing with LaTeX output and OpenOffice, either within or outside of an odfWeave environment?

Thanks,
Rob
Re: [R] How to doulbe all the value on a matrix
OK!! So, the idea is to get the 2nd matrix from the 1st matrix with the use of a vector. Is it possible? In the example I have a 10x10 matrix, and from that one I get a second 3x4 matrix selected with a vector.

thanks
ADias

2011/1/17 David Winsemius dwinsem...@comcast.net

On Jan 17, 2011, at 11:16 AM, ADias wrote:

Hi, yes it works perfectly. I have another question: Is there a way of selecting with a vector the values I wish to take out from a matrix? Example: I have this matrix and I want to take out the numbers in bold and get the second matrix below

This is a plain text mailing list (despite what the Nabble mirror may (mis-)lead you into believing) ... no bold. -- david.

m
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 17165 192 191 15 8
[2,]77 2032 1699 1913
[3,]243 11 18 11 14 133 1
[4,]3757 17 18 106515
[5,]8 20 13 108 12 20 19116
[6,]9 141 12 12 12 17 18 1017
[7,]3 10 112 129 186 19 9
[8,] 132 17 16 1889 14916
[9,]94 1141 1797 2012
[10,]91488 19 198 1718

     [,1] [,2] [,3] [,4]
[1,]    7    3   16    9
[2,]    3    2    9   18
[3,]   13   16    8    9

thanks
AD

-- View this message in context: http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221230.html
Sent from the R help mailing list archive at Nabble.com.

David Winsemius, MD
West Hartford, CT
Re: [R] Using summaryBy with weighted data
It is currently not possible to pass weights in summaryBy.

Regards
Søren

From: Joshua Wiley [jwiley.ps...@gmail.com]
Sent: 17 January 2011 08:16
To: Solomon Messing
Cc: r-help@r-project.org; Søren Højsgaard
Subject: Re: [R] Using summaryBy with weighted data

Dear Solomon,

On Sun, Jan 16, 2011 at 10:27 PM, Solomon Messing solomon.mess...@gmail.com wrote:

Dear Soren and R users: I am trying to use the summaryBy function with weights. Is this possible? An example that illustrates what I am trying to do follows:

library(doBy)
## make up some data
response = rnorm(100)
group = c(rep(1,20), rep(2,20), rep(3,20), rep(4,20), rep(5,20))
weights = runif(100, 0, 1)
mydata = data.frame(response,group,weights)
## run summaryBy without weights:
summaryBy(response~group, data = mydata, FUN = mean)
## attempt to run summaryBy with weights, throws error
summaryBy(x~group, data = mydata, FUN = weighted.mean, w=weights )
## throws the error:
# Error in tapply(lh.data[, lh.var[vv]], rh.string.factor, function(x) { :
# arguments must have same length

My guess is that summaryBy is not giving weighted.mean() each group of weights, but instead is passing all of the weights in the data set each time it calls weighted.mean().

Yes, of course. It has no way of knowing that the weights should also be broken down by group; they are not in the formula.

Do you know if there is some way to get summaryBy to pass weights to weighted.mean() only for each group? Ideally there would be a way to pass more than one variable to a function (e.g., response and weights) or just an entire object (mydata) broken down by group.

Then you would just make a wrapper function to pass the right values to the x and w arguments of weighted.mean.
Instead here is a somewhat hacked version:

library(doBy)
## make up some data (easier)
mydata <- data.frame(response = rnorm(100), group = rep(1:5, each = 20), weights = runif(100, 0, 1))
## manually compute weighted mean
tmp <- summaryBy(response*weights ~ group, data = mydata, FUN = sum)
tmp[,2] <- tmp[,2]/with(mydata, tapply(weights, group, sum))
tmp ## weighted means
## here's the 'problem', if you will, even with +, they are passed one at a time
summaryBy(response + weights ~ group, data = mydata, FUN = str)
summaryBy(mydata ~ group, data = mydata, FUN = str)
## here is an option using by():
xy <- by(mydata, mydata$group, function(z) weighted.mean(z$response, z$weights))
xy
## if you don't like the formatting
data.frame(group = names(c(xy)), weighted.mean = c(xy))

HTH,
Josh

I suspect this functionality would be a tremendous benefit to R users who regularly work with weighted data, such as myself.

Thanks,
Solomon Messing
www.stanford.edu/~messing

PS I know this basic example can be done using the lapply(split(...)) approach referenced here: http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg12349.html but for more complex tasks the lapply approach will mean writing a lot of extra code to run everything and then to get things formatted as nicely as summaryBy() was designed to do.

-- Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
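
For comparison, the split()-based route mentioned in the PS can be sketched in base R as follows. This is only an illustration under stated assumptions: the seed and the object names wm and result are introduced here and are not part of the thread, and the data mimic the made-up example above.

```r
# A minimal sketch: weighted mean per group with split() and sapply().
# The seed and the names 'wm' and 'result' are assumptions for illustration.
set.seed(1)
mydata <- data.frame(response = rnorm(100),
                     group    = rep(1:5, each = 20),
                     weights  = runif(100, 0, 1))

# split() hands each group's rows (response AND weights) to the function,
# which is what a formula interface with a single left-hand side cannot do
wm <- sapply(split(mydata, mydata$group),
             function(z) weighted.mean(z$response, z$weights))

result <- data.frame(group = names(wm), weighted.mean = unname(wm))
result
```

The point of the design is simply that the whole sub-data-frame is passed to the function, so any number of columns can be used inside it.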
Re: [R] cannot allocate vector of size ... in RHLE5 PAE kernel
Mauricio,

I tried your matrix allocation on Gentoo-hardened 32 and 64 bit systems. Both work ok, using R-2.11.1 and R-2.12.2 respectively, and both use a recent 2.6.36 kernel revision. This is from the 32 bit system with 512 MB physical memory:

system("free")
             total       used       free     shared    buffers     cached
Mem:        469356      61884     407472          0       1368      21592
-/+ buffers/cache:      38924     430432
Swap:      1927796      36096    1891700

gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 120116  3.3     350000  9.4   350000  9.4
Vcells  78413  0.6     786432  6.0   391299  3.0

bs <- matrix(NA, nrow=6940, ncol=9000)
gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   120123   3.3     350000   9.4   350000   9.4
Vcells 31308414 238.9   34854943 266.0 31308428 238.9

system("free")
             total       used       free     shared    buffers     cached
Mem:        469356     307528     161828          0       1404      22508
-/+ buffers/cache:     283616     185740
Swap:      1927796      36084    1891712

MZ> I tried to increase the memory limit available for R by using:
MZ> $ R --min-vsize=10M --max-vsize=5000M --min-nsize=500k --max-nsize=5000M

Hmm, I wonder if specifying 5000M is a good idea within a 32-bit environment. Depending on R's internal implementation, maybe that value could overflow and tacitly wrap around on a 32 bit integer (5000M > 2^32 - 1). You may try to specify 1000M instead.

But I think it's more probable that the system or VM configuration has set up a memory usage limit per user or per process. How to view/change this on Red Hat I don't know.
But you may try to compile a small C program using malloc() and see what happens if you request, say, 1 gigabyte:

#include <stdlib.h>
#include <stdio.h>

int main(void)
{
    const size_t size = 1000000000LU;  /* about 1 gigabyte */
    void* p = malloc(size);
    if ( p ) {
        fprintf(stderr, "successfully allocated %zu bytes\n", size);
    } else {
        /* %m is a glibc extension that expands to strerror(errno) */
        fprintf(stderr, "allocation of %zu bytes failed: %m\n", size);
    }
    free(p);
    return 0;
}

Put this into a file named, say, tmalloc.c and compile it using

gcc tmalloc.c -o tmalloc

Hugo

On Monday 17 January 2011 16:42:43 Mauricio Zambrano wrote:

Following the advice of a colleague, I put the gc() and gcinfo(TRUE) commands just before the line where I got the problem, and their output was:

          used (Mb) gc trigger  (Mb)  max used   (Mb)
Ncells  471485 12.6    1704095  45.6   7920371  211.5
Vcells 6408885 48.9  113919753 869.2 347651599 2652.4

Garbage collection 538 = 323+101+114 (level 2) ...
13.0 Mbytes of cons cells used (29%)
49.0 Mbytes of vectors used (7%)
Error: cannot allocate vector of size 238.1 Mb

If I understood correctly, I should have enough memory for allocating the new matrix (Q.obs <- matrix(NA, nrow=6940, ncol=9000)).

Thanks in advance for any help,
Mauricio

MZ == Mauricio Zambrano hzambran.newsgro...@gmail.com on Mon, 17 Jan 2011 11:46:44 +0100 writes:

MZ> Dear R community,
MZ> I'm running R 32 bits in a 64-bits machine (with 16Gb of Ram) using a
MZ> PAE kernel, as you can see here:
MZ> $ uname -a
MZ> Linux mymachine 2.6.18-238.el5PAE #1 SMP Sun Dec 19 14:42:44 EST 2010
MZ> i686 i686 i386 GNU/Linux
MZ> When I try to create a large matrix ( Q.obs <- matrix(NA, nrow=6940,
MZ> ncol=9000) ), I got the following error:
MZ> Error: cannot allocate vector of size 238.3 Mb
MZ> However, the amount of free memory in my machine seems to be much
MZ> larger than this:
MZ> system("free")
MZ>              total       used       free     shared    buffers     cached
MZ> Mem:      12466236    6354116    6112120          0      67596    2107556
MZ> -/+ buffers/cache:    4178964    8287272
MZ> Swap:     12582904          0   12582904
MZ> I tried to increase the memory limit available for R by using:
MZ> $ R --min-vsize=10M --max-vsize=5000M --min-nsize=500k --max-nsize=5000M
MZ> but it didn't work.
MZ> Any hint about how can I get R using all the memory available in the machine ?

Install a 64-bit version of Linux, i.e., ubuntu in your case and work from there. I don't think there's a way around that.

Martin
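
The sizes quoted in this thread can be checked with a little arithmetic. The sketch below is only a back-of-the-envelope illustration; the variable names are introduced here, and it assumes the usual 4 bytes per logical element (matrix(NA, ...) is a logical matrix) while ignoring the small object header.

```r
# A rough size check for matrix(NA, nrow = 6940, ncol = 9000).
# Assumption: logical values take 4 bytes each; header overhead is ignored.
n_cells <- 6940 * 9000            # number of elements in the matrix
size_mb <- n_cells * 4 / 1024^2   # payload size in Mb
round(size_mb, 1)                 # about 238.3, matching the first error message

# The 5000M limit requested on the command line, expressed in bytes,
# compared with the largest value a 32-bit unsigned integer can hold
requested_bytes <- 5000 * 1024^2
requested_bytes > 2^32 - 1        # TRUE
```

So the failing request is for a single contiguous block of roughly 238 Mb, and a 32-bit process may be unable to find such a block in its address space even when the machine as a whole has plenty of free RAM.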
[R] how to cut a multidimensional array along a chosen dimension and store each piece into a list
Dear R-Helpers,

I wonder whether there is a function which cuts a multidimensional array along a chosen dimension and then stores each piece (still an array, with one dimension fewer) in a list. For example,

arr <- array(seq(1*2*3*4), dim=c(1,2,3,4))
# I made a point of setting the length of the first dimension to 1, to test whether I need to worry about the drop=FALSE option.

brkArrIntoListAlong <- function(arr, alongWhichDim){
  return(outlist)
}

I have tried splitter_a in the plyr package but did not get what I want.

library(plyr)
plyr:::splitter_a(arr,3)

I understand that I can write a for loop to make it happen, but I am searching for a better solution. Thanks in advance.

-Sean
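
One possible way to fill in the skeleton above, offered as a sketch rather than a definitive answer: index the array with drop = FALSE so that length-1 dimensions survive, then remove only the chosen dimension by resetting dim. The helper name split_along is hypothetical.

```r
# A minimal sketch; 'split_along' is a hypothetical helper name.
split_along <- function(arr, along) {
  d <- dim(arr)
  lapply(seq_len(d[along]), function(i) {
    idx <- lapply(d, seq_len)   # full index range for every dimension
    idx[[along]] <- i           # fix the chosen dimension at slice i
    piece <- do.call(`[`, c(list(arr), idx, list(drop = FALSE)))
    array(piece, dim = d[-along])   # drop only the chosen dimension
  })
}

arr <- array(seq(1*2*3*4), dim = c(1, 2, 3, 4))
pieces <- split_along(arr, 3)
length(pieces)      # one piece per level of dimension 3
dim(pieces[[1]])    # the length-1 first dimension is kept
```

In R 3.6.0 or later, base R's asplit() may also be worth a look for this kind of task.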
[R] Manipulation
Dear R family,

I am a relative newbie and have been dabbling with R for a little while. Simple things really, but my employers are beginning to see the benefits of using R instead of Excel. We have a remote monitoring station measuring groundwater levels. We download the data as a .csv file and, up until now, we have been using Excel to analyse the data. It’s been a hassle trying to wrestle with that damn program as my boss wants to do things that Excel was never meant to do, so I’ve convinced my boss to give R a chance. It’s been a steep learning curve, but I’m fairly confident I can reduce the amount of labour involved in producing and improving the graphs we show our clients.

The groundwater levels are measured by pressure sensors lowered into the monitoring wells. After a certain time, the sensors were lowered further into the well, thus creating a disparity in the measurements. The data frame I import into R looks something like this:

Date       Waterhead (mm)   Parameter 1   Parameter 2, etc.
10-01-01   100
10-01-02   105
10-01-03   101
10-01-04    99
10-01-05    85
10-01-06   200   # <- Sensor lowered
10-01-07   199
10-01-08   195
10-01-09   185
10-01-10   170

For example, on 10-01-06, the sensor was lowered by 115 mm. When I download the csv file, I download the data from the beginning of the measurement period. I then need to adjust the height by 115 mm to account for the lowering of the sensor. My question to you is how do I do that in R? I am after a formula or a manipulation that selects the first five measurements of a column in the data frame and adds a fixed amount. This adjustment has to be made every time I download the csv file and import it into R, so that when I display my data, it is based on the following data frame:

Date       Waterhead (mm)
10-01-01   215
10-01-02   220
10-01-03   216
10-01-04   214
10-01-05   200
10-01-06   200
10-01-07   199
10-01-08   195
10-01-09   185
10-01-10   170

In short, I want to select a fixed number of rows from my data frame, add a constant to the rows of one of the columns, and insert the new values into their respective rows without affecting the subsequent rows.

I hope I have produced a reproducible example. I have been searching high and low for a solution, but have come up against a brick wall. I feel I have read something that tackles this some time in the past, but can’t find it again. Thanks in advance!

Sincerely,
Michael Hopgood
MRM Konsult AB

-- View this message in context: http://r.789695.n4.nabble.com/Manipulation-tp3221260p3221260.html
Sent from the R help mailing list archive at Nabble.com.
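
The adjustment described above amounts to indexed assignment on one column. A minimal sketch follows; the names gw, add_offset and the column name Waterhead are assumptions made for illustration, since the real csv column names are not shown in full.

```r
# A minimal sketch of adding a constant to selected rows of one column.
# 'gw', 'add_offset' and 'Waterhead' are illustrative names only.
gw <- data.frame(
  Date      = sprintf("10-01-%02d", 1:10),
  Waterhead = c(100, 105, 101, 99, 85, 200, 199, 195, 185, 170),
  stringsAsFactors = FALSE
)

add_offset <- function(df, rows, column, offset) {
  df[rows, column] <- df[rows, column] + offset  # only the chosen rows change
  df
}

gw_adj <- add_offset(gw, rows = 1:5, column = "Waterhead", offset = 115)
gw_adj
```

Because the function returns a modified copy, the freshly imported data stay untouched and the same call can be repeated after every download.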
Re: [R] How to doulbe all the value on a matrix
Pete Brecknock wrote:

try ...

new_m = m[c(2,7,8),c(1,4,6,7)]

HTH
Pete

Hi Pete, I haven't understood what you wanted to say here. Can you explain please?

thanks
ADias

-- View this message in context: http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221252.html
Sent from the R help mailing list archive at Nabble.com.
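
To unpack the one-liner above: in m[rows, cols] the first vector picks rows and the second picks columns, and the result is the sub-matrix where they cross. The sketch below uses a made-up 10x10 matrix (not the poster's data) so that each value reveals its own position.

```r
# Illustration only: a made-up matrix where m[i, j] == i + 10 * (j - 1)
m <- matrix(1:100, nrow = 10)

rows <- c(2, 7, 8)       # which rows to keep
cols <- c(1, 4, 6, 7)    # which columns to keep

new_m <- m[rows, cols]   # every combination of the chosen rows and columns
dim(new_m)               # 3 rows by 4 columns
new_m
```

The selection vectors can be stored in variables first, as here, which is what "selecting with a vector" amounts to.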
[R] Difficult with round() function
Dear list,

I'm writing a function to re-grid a data set from finer to coarser resolutions in R as follows (I use this function with sapply/apply):

gridResize <- function(startVec = stop("What's your input vector"),
                       to = stop("Missing 'to': How long do you want the final vector to be?")){
  from <- length(startVec)
  shortVec <- numeric()
  tics <- from*to
  for(j in 1:to){
    interval <- ((j/to)*tics - (1/to)*tics + 1):((j/to)*tics)
    benchmarks <- interval/to
    #FIRST RUN ASSUMES FINAL BENCHMARK/TO IS AN INTEGER...
    positions <- which(round(benchmarks) == benchmarks)
    indeces <- benchmarks[positions]
    fracs <- numeric()
    #SINCE MUCH OF THE TIME THIS WILL NOT BE THE CASE, THIS SCRIPT DEALS WITH THE REMAINDER...
    for(i in 1:length(positions)){
      if(i == 1) fracs[i] <- positions[i]/length(benchmarks) else{
        fracs[i] <- (positions[i] - sum(positions[1:(i-1)]))/length(benchmarks)
      }
    }
    #AND UPDATES STARTVEC INDECES AND FRACTION MULTIPLIERS
    if(max(positions) != length(benchmarks)) indeces <- c(indeces, max(indeces) + 1)
    if(sum(fracs) != 1) fracs <- c(fracs, 1 - sum(fracs))
    fromVals <- startVec[indeces]
    if(any(is.na(fromVals))){
      NAindex <- which(is.na(fromVals))
      if(sum(fracs[-NAindex]) >= 0.5) shortVec[j] <- sum(fromVals*fracs, na.rm=TRUE) else shortVec[j] <- NA
    }else{shortVec[j] <- sum(fromVals*fracs)}
  }
  return(shortVec)
}

For the simple test case

test <- gridResize(startVec = c(2,4,6,8,10,8,6,4,2), to = 7)

the function works fine. For larger vectors, however, it breaks down. E.g.:

test <- gridResize(startVec = rnorm(300, 9, 20), to = 200)

This returns the error:

Error in positions[1:(i - 1)] : only 0's may be mixed with negative subscripts

and the problem seems to be in the line positions <- which(round(benchmarks) == benchmarks). In this particular example the code cracks up at j = 27. When I set j = 27 and run the calculation manually I discover the following:

benchmarks[200]
[1] 40
benchmarks[200] == 40
[1] FALSE
round(benchmarks[200]) == 40
[1] TRUE

Even though my benchmark calculation seems to be returning clean integers to serve as inputs for the creation of the 'positions' variable, for whatever reason R doesn't read it that way. I would be very grateful for any advice on how I can either alter my approach entirely (I am sure there is a far more elegant way to regrid data in R) or apply a simple fix for this rounding error.

Many thanks in advance,
Aaron

-- Aaron Polhamus aaronpolha...@gmail.com
Statistical consultant, Revolution Analytics
MSc Applied Statistics, The University of Oxford, 2009
838a NW 52nd St, Seattle, WA 98107
Cell: +1 (206) 380.3948
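
The behaviour reported above looks like ordinary floating-point rounding: a value that prints as a whole number need not be exactly equal to it. The sketch below is not a fix for gridResize itself, only an illustration of comparing with a tolerance; the helper name is_whole is an assumption. It also shows why an empty positions vector can lead to the subscript error, since 1:length(x) counts down to 0 when x is empty.

```r
# Illustration of exact versus tolerant comparison (see also R FAQ 7.31).
x <- sqrt(2)^2
x == 2                     # FALSE: x differs from 2 in the last bits
isTRUE(all.equal(x, 2))    # TRUE: equal within a small tolerance

# Hypothetical helper: is each value a whole number, up to a tolerance?
is_whole <- function(v, tol = sqrt(.Machine$double.eps)) {
  abs(v - round(v)) < tol
}
whole <- is_whole(c(x, 2.5, 40))

# When nothing matches, 1:length() still loops (over 1 and 0),
# whereas seq_along() gives an empty sequence
positions <- integer(0)
bad_loop  <- 1:length(positions)
safe_loop <- seq_along(positions)
```

Replacing round(benchmarks) == benchmarks with a tolerance-based test, and 1:length(positions) with seq_along(positions), would be one way to try this idea in the function above.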
[R] Extraction and replacement of data in a data frame
Dear R family,

I am a relative newbie and have been dabbling with R for a little while. Simple things really, but my employers are beginning to see the benefits of using R instead of Excel. We have a remote monitoring station measuring groundwater levels. We download the data as a .csv file and, up until now, we have been using Excel to analyse the data. It’s been a hassle trying to wrestle with that damn program as my boss wants to do things that Excel was never meant to do, so I’ve convinced my boss to give R a chance. It’s been a steep learning curve, but I’m fairly confident I can reduce the amount of labour involved in producing and improving the graphs we show our clients.

The groundwater levels are measured by pressure sensors lowered into the monitoring wells. After a certain time, the sensors were lowered further into the well, thus creating a disparity in the measurements. The data frame I import into R looks something like this:

Date       Waterhead (mm)
10-01-01   100
10-01-02   105
10-01-03   101
10-01-04    99
10-01-05    85
10-01-06   200
10-01-07   199
10-01-08   195
10-01-09   185
10-01-10   170

For example, on 10-01-06, the sensor was lowered by 115 mm. When I download the csv file, I download the data from the beginning of the measurement period. I then need to adjust the height by 115 mm to account for the lowering of the sensor. My question to you is how do I do that in R? I am after a formula or a manipulation that selects the first five measurements and adds a fixed amount. This adjustment has to be made every time I download the csv file and import it into R, so that when I display my data, it is based on the following data frame:

Date       Waterhead (mm)
10-01-01   215
10-01-02   220
10-01-03   216
10-01-04   214
10-01-05   200
10-01-06   200
10-01-07   199
10-01-08   195
10-01-09   185
10-01-10   170

In short, I want to select a fixed number of rows of a column from my data frame, add a constant to these, and insert the new values into their respective rows without affecting the subsequent rows.

I hope I have produced a reproducible example. I have been searching high and low for a solution, but have come up against a brick wall. I feel I have read something that tackles this some time in the past, but can’t find it again. Thanks in advance!

-- View this message in context: http://r.789695.n4.nabble.com/Extraction-and-replacement-of-data-in-a-data-frame-tp3221261p3221261.html
Sent from the R help mailing list archive at Nabble.com.
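
Since the file grows with every download, selecting rows by date rather than by position may be more robust. The following is a sketch under stated assumptions: the dates are read as yy-mm-dd (so 10-01-06 is 6 January 2010), and the names gw, lowered_on and Waterhead are invented for the example.

```r
# A minimal sketch using a date cutoff instead of fixed row numbers.
# Assumes the dates are yy-mm-dd; all object names are illustrative.
gw <- data.frame(
  Date      = sprintf("10-01-%02d", 1:10),
  Waterhead = c(100, 105, 101, 99, 85, 200, 199, 195, 185, 170),
  stringsAsFactors = FALSE
)
gw$Date <- as.Date(gw$Date, format = "%y-%m-%d")

lowered_on <- as.Date("2010-01-06")   # day the sensor was lowered
before <- gw$Date < lowered_on        # TRUE for readings taken before that day

gw$Waterhead[before] <- gw$Waterhead[before] + 115
gw
```

With a logical mask like this, the same lines keep working when later downloads append more rows at the end.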
Re: [R] How to still processing despite bug errors?
Altay,

simply run your tests under control of an exception handler:

help(try)
help(tryCatch)

On Monday 17 January 2011 22:05:07 Altay wrote:

Hi, everybody. I am processing EEG data from 1000 patients. I have a specific syntax to perform the spectral analysis and a loop to analyse all subjects. Each subject's data are in separate folders (P1, P2, P3...). My question is: in some cases, errors can appear for one subject. I want to know if it is possible to jump to the next subject and run the same syntax, displaying an error like:

Working on P1
Error X in P1
Working on P2...

The idea is to let the computer process all the subjects continuously and, at the end, see only the problems for the subjects that R could not analyse. Each subject takes 20 minutes to analyse and I don't want to spend days in front of the PC, waiting for the next error in order to start the syntax again with the next subject. Any ideas? Thanks in advance.

Altay Lino de Souza.
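
As a sketch of what the tryCatch() suggestion might look like in a loop: the function analyse_subject below is a hypothetical stand-in for the real spectral analysis, and it fails on purpose for one subject so that the error path is exercised.

```r
# A minimal sketch; 'analyse_subject' is a hypothetical placeholder for the
# real 20-minute analysis and deliberately fails for subject P3.
analyse_subject <- function(id) {
  if (id == "P3") stop("could not read EEG file")
  paste("spectrum for", id)
}

subjects <- paste0("P", 1:5)
results  <- list()
failures <- character()

for (id in subjects) {
  message("Working on ", id)
  out <- tryCatch(
    analyse_subject(id),
    error = function(e) {
      # report the problem and carry on with the next subject
      message("Error in ", id, ": ", conditionMessage(e))
      NULL
    }
  )
  if (is.null(out)) failures <- c(failures, id) else results[[id]] <- out
}

failures   # subjects that need a second look
```

At the end, failures lists the subjects whose analysis did not complete, while results holds everything that did.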
[R] Accessing MySQL Database in R
I have a local installation of MySQL on my computer. I enter the following to access MySQL from the command line:

/Applications/MAMP/Library/bin/mysql -h localhost -u root -p

I am then prompted for a password, and I use: root

This connects me to MySQL in the command line. I now want to access MySQL databases in R. I enter the following:

mysql <- dbDriver("MySQL")
conn <- dbConnect(mysql, user='root', host='localhost', password='root')

I get the following error message:

Error in mysqlNewConnection(drv, ...) : RS-DBI driver: (Failed to connect to database: Error: Access denied for user 'root'@'localhost' (using password: YES)

Does anyone know why these aren't equivalent?

-- View this message in context: http://r.789695.n4.nabble.com/Accessing-MySQL-Database-in-R-tp3221264p3221264.html
Sent from the R help mailing list archive at Nabble.com.
[R] selection statistics from function
Hi,

My code:

e <- rnorm(n=50, mean=0, sd=sqrt(0.5625))
x0 <- c(rep(1,50))
x1 <- rnorm(n=50,mean=2,sd=1)
x2 <- rnorm(n=50,mean=2,sd=1)
x3 <- rnorm(n=50,mean=2,sd=1)
x4 <- rnorm(n=50,mean=2,sd=1)
y <- 1+ 2*x1+4*x2+3*x3+2*x4+e
x2[1] = 10 #influential observation
y[1] = 10 #influential observation
data.x <- matrix(c(x0,x1,x2,x3,x4),ncol=5)
data.y <- matrix(y,ncol=1)
data.k <- cbind(data.x,data.y)
result <- list()
for( i in 1: 3100) {
  data <- data.k[sample(50,50,replace=TRUE),]
  dataX <- data[,1:5]
  dataY <- data[,6]
  B.cap <- solve(crossprod(dataX)) %*% crossprod(dataX,dataY)
  P <- dataX %*% solve(crossprod(dataX)) %*% t(dataX)
  Y.cap <- P %*% dataY
  e <- dataY - Y.cap
  dX <- nrow(dataX) - ncol(dataX)
  # as.numeric() turns the 1x1 crossprod result into a plain scalar
  var.cap <- as.numeric(crossprod(e)) / (dX)
  ei <- as.vector(dataY - dataX %*% B.cap)
  pi <- diag(P)
  var.cap.i <- (((dX) * var.cap) / (dX - 1)) - (ei^2 / ((dX-1) * (1 - pi)))
  ti <- ei / sqrt(var.cap * (1 - pi))
  Ci <- (ti^2 / (ncol(dataX))) * (pi / (1 - pi))
  result <- c(result,list(mean(Ci)))
}
table <- do.call(rbind.data.frame,result)
names(table)=c("Cook's Distance")
table

I want to find the statistics (mean(Ci)) for the resampled data sets that do not contain the influential observation, that is, those that do not contain the value of 10. Can someone help me? Thanks for any advice!

-- View this message in context: http://r.789695.n4.nabble.com/selection-statistics-from-function-tp3221267p3221267.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Summing data frame columns on identical data
Hi:

Try this based on the following toy example:

### Generate a list of named data frames
# There are more efficient ways to do this with replicate, but I forgot :)
# A function to generate a data frame
dmake <- function() data.frame(A = factor(rep(1:5, each = 10)), B = rep(rep(c(0.25, 0.5), each = 5), 5), y = rnorm(50))
# Create an empty list
dflist <- vector('list', 5)
# populate it
for(i in 1:5) dflist[[i]] <- dmake()
# Give names to the list components:
names(dflist) <- paste('df', 1:5, sep = '')

library(plyr)
# Function to sum y by A-B combinations for a generic data frame
dsum <- function(d) ddply(d, .(A, B), summarise, sumY = sum(y))
# Apply it to each component of the list:
# Returns a list
summlist <- llply(dflist, dsum)
# Returns a data frame
summdf <- ldply(dflist, dsum)

Since you state that the individual data frames have different lengths, you may want to add another variable to dsum to return length, perhaps something like

dsum2 <- function(d) ddply(d, .(A, B), summarise, sumY = sum(y), n = length(y))

and apply either or both of the llply/ldply calls with dsum2 substituted for dsum.

If you want to combine certain groups together, you can create a new factor that merges levels. The following post in the archives provides a clue: http://r.789695.n4.nabble.com/Documentation-detail-was-Merging-factor-levels-td911547.html

Since you already have the data frames, you can do something like

# Names of existing data frames in the workspace
filelist <- c('mydf1', 'mydf2', 'anotherdf', 'what_more', 'oyvey')
# simplify = FALSE keeps each data frame intact as one list element
dflist <- sapply(filelist, get, simplify = FALSE)

and then move on to the summarization stage.

HTH,
Dennis

On Mon, Jan 17, 2011 at 10:42 AM, Steve Murray smurray...@hotmail.com wrote:

Dear all,

I have 9 data frames, and I'm simply trying to sum the values of column 3 (on a row-by-row basis). However, there is a slightly different number of rows in each data frame, so I'm receiving the following error:

Error in Ops.data.frame(mrunoff_207101[3], mrunoff_207102[3]) : + only defined for equally-sized data frames.

Here is what I'm attempting to do:

arunoff_2071 <- cbind(mrunoff_207101[1:2], (mrunoff_207101[3] + mrunoff_207102[3] + mrunoff_207103[3] + mrunoff_207104[3] + mrunoff_207105[3] + mrunoff_207106[3] + mrunoff_207107[3] + mrunoff_207108[3] + mrunoff_207109[3]))

Is there an easy way of summing based on congruent values in columns 1 and 2? The only way I can think of would be to use merge, but this would involve doing this for every pair of data frames. The data for each data frame look like this:

head(mrunoff_207101)
  Latitude Longitude          FPC
1     5.75      0.25 0.0112384744
2     6.25      0.25 0.0019959067
3     6.75      0.25 0.0003245941
4     7.25      0.25 0.0011973676
5     7.75      0.25 0.0001062602
6     8.25      0.25 0.0451578423

Any suggestions on how to achieve this easily will be very welcome.

Many thanks,
Steve
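
A base R alternative to the plyr route, sketched here with two tiny made-up data frames (df1 and df2 are invented, not the poster's data): stack the frames with rbind() and let aggregate() sum FPC for each Latitude/Longitude pair. The assumption is that a cell missing from one frame should simply contribute nothing to the total.

```r
# A minimal sketch: stack the data frames, then sum FPC per grid cell.
# 'df1' and 'df2' are invented stand-ins for the monthly runoff frames.
df1 <- data.frame(Latitude = c(5.75, 6.25, 6.75), Longitude = 0.25,
                  FPC = c(1, 2, 3))
df2 <- data.frame(Latitude = c(6.25, 6.75, 7.25), Longitude = 0.25,
                  FPC = c(10, 20, 30))

stacked <- do.call(rbind, list(df1, df2))   # works for any number of frames

# One row per Latitude/Longitude combination found in any of the frames
total <- aggregate(FPC ~ Latitude + Longitude, data = stacked, FUN = sum)
total
```

With nine frames, the list passed to do.call() would just hold all nine; no pairwise merging is needed.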
Re: [R] How to doulbe all the value on a matrix
You have the capability of using the Nabble interface to post plain text. I have checked. There is a little button above your composition frame that lets you change to plain text.

On Jan 17, 2011, at 4:38 PM, André Dias wrote:

OK!! So, the idea is to get the 2nd matrix from the 1st matrix with the use of a vector. Is it possible?

Peter B already gave you an answer.

m[c(2,7, 8), c(1,4,6, 7)]
     V1 V4 V6 V7
[1,]  7  3 16  9
[2,]  3  2  9 18
[3,] 13 16  8  9

In the example I have a 10x10 matrix, and from that one I get a second 3x4 matrix selected with a vector.

Then you should have presented the logical conditions for selecting that matrix rather than expecting us to guess what they might be. Please (re?)read the Posting Guide about what is expected of questioners regarding presenting complete examples of data and code.

-- David.

thanks
ADias

2011/1/17 David Winsemius dwinsem...@comcast.net

On Jan 17, 2011, at 11:16 AM, ADias wrote:

Hi, yes it works perfectly. I have another question: Is there a way of selecting with a vector the values I wish to take out from a matrix? Example: I have this matrix and I want to take out the numbers in bold and get the second matrix below

This is a plain text mailing list (despite what the Nabble mirror may (mis-)lead you into believing) ... no bold. -- david.

m
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 17165 192 191 15 8
[2,]77 2032 1699 1913
[3,]243 11 18 11 14 133 1
[4,]3757 17 18 106515
[5,]8 20 13 108 12 20 19116
[6,]9 141 12 12 12 17 18 1017
[7,]3 10 112 129 186 19 9
[8,] 132 17 16 1889 14916
[9,]94 1141 1797 2012
[10,]91488 19 198 1718

     [,1] [,2] [,3] [,4]
[1,]    7    3   16    9
[2,]    3    2    9   18
[3,]   13   16    8    9

thanks
AD

-- View this message in context: http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221230.html
Sent from the R help mailing list archive at Nabble.com.

David Winsemius, MD
West Hartford, CT
Re: [R] matrix manipulations
Hi, I've got 2 very good solutions, thank you very much. One, from Henrique Dallazuanna using the library reshape and one line of code - although it will take me quite some time to understand it. Here it is what he sent: library(reshape) xtabs(rowSums(cbind(value.x, value.y), na.rm = TRUE) ~ X1 + X2, merge(melt(m1), melt(m2), by = c('X1', 'X2'), all = TRUE), exclude = FALSE) The other is from Phil Spector ( code below) that i can understand quite easily, although until now to my shame i never quite used factor levels and their properties and i don't know their uses and possibilities. Until now i tried to avoid them and transform them in something else (like character strings). Again, thanks for all your help, Monica Date: Mon, 17 Jan 2011 12:13:09 -0800 From: spec...@stat.berkeley.edu To: pisican...@hotmail.com CC: r-help@r-project.org Subject: Re: [R] matrix manipulations Monica - Perhaps this small example can demonstrate how factors can solve your problem: d1 = data.frame(cat=sample(c('cat2','cat5','cat6'),100,replace=TRUE),group=sample(c('land','water'),100,replace=TRUE)) d2 = data.frame(cat=sample(c('cat1','cat3','cat4'),100,replace=TRUE),group=sample(c('land','water'),100,replace=TRUE)) d1$cat = factor(d1$cat,levels=c('cat1','cat2','cat3','cat4','cat5','cat6')) d2$cat = factor(d2$cat,levels=c('cat1','cat2','cat3','cat4','cat5','cat6')) table(d1$group,d1$cat) + table(d2$group,d2$cat) cat1 cat2 cat3 cat4 cat5 cat6 land 14 17 18 22 19 23 water 19 15 16 11 10 16 This works because when you include all possible levels in a factor, R will automatically put zeroes in the right places when you use table(): table(d1$group,d1$cat) cat1 cat2 cat3 cat4 cat5 cat6 land 0 17 0 0 19 23 water 0 15 0 0 10 16 table(d2$group,d2$cat) cat1 cat2 cat3 cat4 cat5 cat6 land 14 0 18 22 0 0 water 19 0 16 11 0 0 Hope this helps. 
- Phil Spector
  Statistical Computing Facility
  Department of Statistics
  UC Berkeley
  spec...@stat.berkeley.edu

On Mon, 17 Jan 2011, Monica Pisica wrote:

Hi, I am having some difficulties with matrix operations. It is a little hard to explain, so please bear with me.

I have a very large data set, large enough that it needs to be split into parts in order to deal with it. I can work on these parts, but the problem lies in adding the parts together for the final answer.

That being said, let's say that I split the data into 2 parts, 1 and 2. Each part has data belonging to 6 different categories, and each category has 2 different classes, these classes being the same for each category. The classes are called land and water, and the categories are labeled cat1 to cat6. I am using the function table() to tabulate each class for each category, but since I split the data into 2 parts, one part has only some of the 6 categories and the other has some others of the 6 categories (and they are not necessarily exclusive).

So let's build some results as they look after I have used the table function.

    m1 <- matrix(c(32, 35, 36, 12, 15, 16), nrow = 2, ncol = 3, byrow = TRUE,
                 dimnames = list(c("land", "water"), c("cat2", "cat5", "cat6")))
    m1

          cat2 cat5 cat6
    land    32   35   36
    water   12   15   16

    m2 <- matrix(c(45, 46, 47, 48, 21, 22, 23, 24), nrow = 2, ncol = 4, byrow = TRUE,
                 dimnames = list(c("land", "water"), c("cat1", "cat2", "cat3", "cat4")))
    m2

          cat1 cat2 cat3 cat4
    land    45   46   47   48
    water   21   22   23   24

So my desired end result should be a matrix (or a data frame) that has 6 columns called cat1 to cat6 and 2 rows labeled land and water, and for a category that appears in both m1 and m2 the end result will be a sum. The result will be m3:

          cat1 cat2 cat3 cat4 cat5 cat6
    land    45   78   47   48   35   36
    water   21   34   23   24   15   16

To do this I thought of making an empty matrix for each of m1 and m2 (called m01 and m02 respectively) with 6 columns and 2 rows, and writing a long if-else statement in which I match the name of the first column in m1 with the name of the first column in m01 and, if they match, take the data from m1, otherwise leave it 0, and so on. Same thing for m2 and m02. This is long and extremely clunky, but afterwards I can add m01 to m02 and get my desired result m3.

Is there any way I can do this more elegantly? My real data is split into 4 parts, but the problem is the same.

Thanks for all your inputs, and sorry for this long email, but I didn't know how else I could explain what I wanted to do.

Monica

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
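For readers who want to stay with plain matrices, the padding idea Monica describes can be sketched without the long if-else chain by indexing on the dimnames. This is only a minimal sketch: `add_by_name` is a made-up helper name, and it assumes every part carries row and column names, as the matrices in the question do.

```r
# The two partial tables from the question
m1 <- matrix(c(32, 35, 36, 12, 15, 16), nrow = 2, byrow = TRUE,
             dimnames = list(c("land", "water"), c("cat2", "cat5", "cat6")))
m2 <- matrix(c(45, 46, 47, 48, 21, 22, 23, 24), nrow = 2, byrow = TRUE,
             dimnames = list(c("land", "water"), c("cat1", "cat2", "cat3", "cat4")))

# Hypothetical helper: sum any number of named matrices, matching cells by name
add_by_name <- function(parts) {
  rows <- sort(unique(unlist(lapply(parts, rownames))))
  cols <- sort(unique(unlist(lapply(parts, colnames))))
  # Start from an all-zero matrix that has every row and column seen in any part
  total <- matrix(0, nrow = length(rows), ncol = length(cols),
                  dimnames = list(rows, cols))
  for (p in parts) {
    # Character indexing picks out exactly the cells this part contributes to
    total[rownames(p), colnames(p)] <- total[rownames(p), colnames(p)] + p
  }
  total
}

m3 <- add_by_name(list(m1, m2))
m3
```

Because the helper takes a list, the same call should cover the real case of 4 parts, e.g. `add_by_name(list(p1, p2, p3, p4))` with whatever the four tables are called.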
Re: [R] Using summaryBy with weighted data
Thanks Dennis, looks like there's even less boilerplate code with plyr.

By the way, what I labelled W.SE is meant to represent the weighted standard error of the mean. Your WSE calculations appear to be providing the weighted standard deviation of the variable. Is this a matter of needing to change my labels to W.SEM to avoid this kind of confusion, or is there literature suggesting that I should be using the standard deviation of the variable to estimate the weighted standard error of the mean?

Thanks,
-Solomon

On Jan 17, 2011, at 1:11 PM, Dennis Murphy wrote:

Hi:

Does this do what you need?

    wstats <- function(d) {
        require(Hmisc)
        N <- length(d$response[!is.na(d$response)])
        c(WM = wtd.mean(d$response, d$weights),
          WSE = sqrt(wtd.var(d$response, d$weights)),
          N = N)
    }

    library(plyr)
    ddply(mydata, .(group), wstats)

      group            WM       WSE  N
    1     1  0.1302255752 1.1911298 20
    2     2 -0.2814664362 0.8582928 20
    3     3 -0.3640550516 1.2618343 20
    4     4  0.0002852392 1.1463205 20
    5     5 -0.0070283053 1.2315683 20

The trick to writing this function for input into plyr is that the argument is a data frame. When called in ddply(), the function wstats() will be applied to each sub-frame corresponding to the grouping factor(s). Inside it, the variables of interest are extracted relative to the input data frame and the three quantities are computed. I used wtd.mean() and wtd.var() from Hmisc, as both will remove NAs by default. In the ddply call, the function name is simply cited since a sub-data frame is the sole argument of the function.

I couldn't figure out how to get doBy to work for this, as it seems best suited to functions of one argument (a single response), but here's an alternative using the data.table package:

    library(data.table)
    # Assumes Hmisc is already loaded...
    myDT <- data.table(mydata, key = 'group')
    myDT[, list(N = length(response[!is.na(response)]),
                wtdMean = wtd.mean(response, weights),
                wtdSE = sqrt(wtd.var(response, weights))),
         by = 'group']

         group  N       wtdMean     wtdSE
    [1,]     1 20  0.1302255752 1.1911298
    [2,]     2 20 -0.2814664362 0.8582928
    [3,]     3 20 -0.3640550516 1.2618343
    [4,]     4 20  0.0002852392 1.1463205
    [5,]     5 20 -0.0070283053 1.2315683

data.table uses a different model of data organization from data frames. A simplistic description is that you can think of a data.table as analogous to a table in a DBMS. Notice that the 'function call' is written inside the data table's subscript: the first 'subscript', i, selects rows (analogous to a 'where' clause in SQL) and is left empty here; the second 'subscript', j, holds the expressions to compute (analogous to a 'select' statement); the third argument is the by group(s), or sub-data tables, to which (in this case) the j expressions apply.

For functions that take multiple arguments and that are meant to be applied in a groupwise fashion, I find plyr and data.table to be very good options. There are also base package alternatives (e.g., some combination of lapply(), mapply() and do.call()) and several other packages, but plyr and data.table are generally pretty good at handling most of the niggling details. Having said that, both have learning curves. data.table, in particular, will be much easier to pick up if you have some background in SQL, since its syntax borrows basic principles of SQL.

data.table has a vignette and FAQ, along with an independent help list; for details, see its page on R-forge: http://r-forge.r-project.org/projects/datatable/

For plyr's documentation, see http://had.co.nz/plyr/ A link to its mailing list is found on that page as well.

HTH,
Dennis

On Mon, Jan 17, 2011 at 10:24 AM, Solomon Messing solomon.mess...@gmail.com wrote:

Thanks Josh. I built on your example and ended up with the code below; if you or anyone sees any issues please let me know.
It would be great if there were a slicker way to get these kinds of summary stats in R, but this gets the job done.

    # takes a data frame z with columns `response` and `weights`;
    # returns weighted mean, weighted SE of the mean, and N
    msenw = function(z){
        N = length(na.omit(z)$response)
        i = which(!is.na(z$response))
        return( c(
            W.M = weighted.mean(z$response, z$weights, na.rm=TRUE),
            W.SE = sqrt(wtd.var(z$response, weights = z$weights))/sqrt(sum(z$weights[i])),
            N = N ) )
    }

    library(doBy)
    library(Hmisc)

    ## make up some data (easier)
    mydata <- data.frame(response = rnorm(100), group = rep(1:5, each = 20), weights = runif(100, 0, 1))

    xy <- by(mydata, mydata$group, msenw)
    data.frame( group = names(c(xy)), do.call(rbind, xy) )

    ## can be extended to other data using:
    xy <- by(data.frame(response = mydata$response, weights =
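
To make the standard deviation versus standard error distinction in this thread concrete, here is a minimal base R sketch that reports both side by side. `wstats_base` is a made-up name. The variance uses the denominator sum(w) - 1, which is an assumption about how the weights are to be read (as frequency-like weights), and the SEM divides by sqrt(sum(w)) exactly as msenw does above; that is only one possible convention, not a recommendation from the literature.

```r
# Hypothetical helper: weighted mean, weighted SD, and one candidate weighted SEM
wstats_base <- function(x, w) {
  keep <- !is.na(x) & !is.na(w)        # drop incomplete pairs
  x <- x[keep]
  w <- w[keep]
  wm <- sum(w * x) / sum(w)            # weighted mean
  # Assumption: frequency-style weights, hence the sum(w) - 1 denominator
  wvar <- sum(w * (x - wm)^2) / (sum(w) - 1)
  c(WM = wm,
    WSD = sqrt(wvar),                  # spread of the variable itself
    WSEM = sqrt(wvar) / sqrt(sum(w)),  # SEM as computed in msenw above
    N = length(x))
}

# Made-up data with the same column names as in the thread
set.seed(1)
mydata <- data.frame(response = rnorm(100),
                     group = rep(1:5, each = 20),
                     weights = runif(100, 0, 1))

# Apply the helper to each group and bind the results into one matrix
res <- do.call(rbind, lapply(split(mydata, mydata$group),
                             function(d) wstats_base(d$response, d$weights)))
res
```

The WSD column corresponds to what the plyr and data.table answers label WSE and wtdSE, while WSEM corresponds to W.SE in msenw, so relabelling along these lines would avoid the confusion Solomon mentions.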
[R] sweave.bat
Hi everyone,

I am trying to run Sweave.bat (batchfiles_0.6-1) from the command line on Windows, but I get this error:

    C:\batchfiles_0.6-1>Sweave.bat Sweave-test-1
    Error: rterm.exe not found

I don't know how to set up the path, if that is the problem. I also ran Rcmd.bat and got the following, so I don't know whether it is a path problem:

    C:\batchfiles_0.6-1>Rcmd,bat
    R_ARCH=/x64
    R_ARCH0=x64
    R_ARCH0=x64
    cmdpath=C:\R\R-2.12.1\bin\x64\Rcmd.exe
    args=,bat
    'bat' is not recognized as an internal or external command,
    operable program or batch file.

The path of rterm.exe on my computer is: C:\R\R-2.12.1\bin\x64

Thank you in advance!

-- Sebastián Daza sebastian.d...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extraction and replacement of data in a data frame
Date: Mon, 17 Jan 2011 12:51:43 -0800
From: michael.hopg...@mrm.se
To: r-help@r-project.org
Subject: [R] Extraction and replacement of data in a data frame

Dear R family, I am a relative newbie and have been dabbling with R for a little while. Simple things really, but my employers are beginning to see the benefits of using R instead of excel.

We have a remote monitoring station measuring groundwater levels. We download the data as a .csv file and up until now, we have been using excel to analyse the data. It’s been a hassle trying to wrestle with that damn program as my boss wants to do things that excel was never meant to do, so I’ve convinced my boss to give R a chance. It’s been a steep learning curve, but I’m fairly confident I can reduce the amount of labour involved in producing and improving the graphs we show our clients.

The groundwater levels are measured by pressure sensors lowered into the monitoring wells. After a certain time, the sensors were lowered further into the well, thus creating a disparity in the measurements. The data frame I import into R looks something like this:

    Date      Waterhead (mm)
    10-01-01  100
    10-01-02  105
    10-01-03  101
    10-01-04   99
    10-01-05   85
    10-01-06  200
    10-01-07  199
    10-01-08  195
    10-01-09  185
    10-01-10  170

For example, on 10-01-06, the sensor was lowered by 115 mm. When I download the csv file, I download the data from the beginning of the measurement period. I then need to adjust the height by 115 mm to account for the lowering of the sensor. My question to you is how do I do that in R? I am after a formula or a manipulation that selects the first five measurements and adds a fixed amount. This is something that has to be done every time I download the csv file and import it into R, so that when I display my data, it is based on the following data frame:

See if this helps. I'm still learning how to write good R, but this seems to work.
Just as a personal preference, I converted your data to csv:

    cat xxx.txt | awk '{print "20"$1","$2}' > xxx.csv

I've never used POSIX dates before, I'm just copying what I've seen here, but it seemed to work as shown below:

    x <- read.table("xxx.csv", sep=",")
    str(x)
    x$V1 = as.POSIXct(x$V1)
    str(x)
    y = (x$V1 > as.POSIXct("2010-01-05"))
    y
    x$V2[y] = x$V2[y] + 10000
    x

The output ends like this:

    5  2010-01-05   200
    6  2010-01-06 10200
    7  2010-01-07 10199
    8  2010-01-08 10195
    9  2010-01-09 10185
    10 2010-01-10 10170

    Date      Waterhead (mm)
    10-01-01  215
    10-01-02  220
    10-01-03  216
    10-01-04  214
    10-01-05  200
    10-01-06  200
    10-01-07  199
    10-01-08  195
    10-01-09  185
    10-01-10  170

In short, I want to select a fixed number of rows of a column from my data frame, add a constant to these, and insert the new values into their respective rows without affecting the subsequent rows.

I hope I have produced a reproducible example. I have been searching high and low for a solution, but have come up against a brick wall. I feel I have read something that tackles this some time in the past, but can’t find it again.

Thanks in advance!

-- View this message in context: http://r.789695.n4.nabble.com/Extraction-and-replacement-of-data-in-a-data-frame-tp3221261p3221261.html Sent from the R help mailing list archive at Nabble.com.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Manipulation
Hi Michael,

This can be accomplished using the basic extract and assign functions:

    dat <- structure(list(Date = structure(1:10, .Label = c("10-01-01", "10-01-02",
        "10-01-03", "10-01-04", "10-01-05", "10-01-06", "10-01-07", "10-01-08",
        "10-01-09", "10-01-10"), class = "factor"),
        Waterhead = c(100, 105, 101, 99, 85, 200, 199, 195, 185, 170)),
        .Names = c("Date", "Waterhead"), row.names = c(NA, -10L), class = "data.frame")

    dat[1:5, "Waterhead"] <- dat[1:5, "Waterhead"] + 115

You may find it helpful to work through "An Introduction to R" (http://cran.r-project.org/manuals.html) and/or one or more of the fine contributed introductory tutorials (http://cran.r-project.org/other-docs.html).

Best,
Ista

On Mon, Jan 17, 2011 at 3:49 PM, michael.hopgood michael.hopg...@mrm.se wrote:

Dear R family, I am a relative newbie and have been dabbling with R for a little while. Simple things really, but my employers are beginning to see the benefits of using R instead of excel.

We have a remote monitoring station measuring groundwater levels. We download the data as a .csv file and up until now, we have been using excel to analyse the data. It’s been a hassle trying to wrestle with that damn program as my boss wants to do things that excel was never meant to do, so I’ve convinced my boss to give R a chance. It’s been a steep learning curve, but I’m fairly confident I can reduce the amount of labour involved in producing and improving the graphs we show our clients.

The groundwater levels are measured by pressure sensors lowered into the monitoring wells. After a certain time, the sensors were lowered further into the well, thus creating a disparity in the measurements. The data frame I import into R looks something like this:

    Date      Waterhead (mm)  Parameter 1  Parameter 2, etc.
    10-01-01  100
    10-01-02  105
    10-01-03  101
    10-01-04   99
    10-01-05   85
    10-01-06  200   # <- Sensor lowered #
    10-01-07  199
    10-01-08  195
    10-01-09  185
    10-01-10  170

For example, on 10-01-06, the sensor was lowered by 115 mm.
When I download the csv file, I download the data from the beginning of the measurement period. I then need to adjust the height by 115 mm to account for the lowering of the sensor. My question to you is how do I do that in R? I am after a formula or a manipulation that selects the first five measurements of a column in the data frame and adds a fixed amount. This is something that has to be done every time I download the csv file and import it into R, so that when I display my data, it is based on the following data frame:

    Date      Waterhead (mm)
    10-01-01  215
    10-01-02  220
    10-01-03  216
    10-01-04  214
    10-01-05  200
    10-01-06  200
    10-01-07  199
    10-01-08  195
    10-01-09  185
    10-01-10  170

In short, I want to select a fixed number of rows from my data frame, add a constant to the rows of one of the columns, and insert the new values into their respective rows without affecting the subsequent rows.

I hope I have produced a reproducible example. I have been searching high and low for a solution, but have come up against a brick wall. I feel I have read something that tackles this some time in the past, but can’t find it again.

Thanks in advance!

Sincerely,
Michael Hopgood
MRM Konsult AB

-- View this message in context: http://r.789695.n4.nabble.com/Manipulation-tp3221260p3221260.html Sent from the R help mailing list archive at Nabble.com.

-- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
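
Since the file is downloaded again from the start of the measurement period each time, it may be more robust to select the rows by date than by position. The following is a minimal sketch under two assumptions that are not stated in the thread: that the dates are in yy-mm-dd form, and that the column names are `Date` and `Waterhead`. `lowered_on` and `offset_mm` are made-up names.

```r
# Rebuild the example data, parsing the two-digit years as 2010
dat <- data.frame(
  Date = as.Date(sprintf("10-01-%02d", 1:10), format = "%y-%m-%d"),
  Waterhead = c(100, 105, 101, 99, 85, 200, 199, 195, 185, 170)
)

lowered_on <- as.Date("2010-01-06")  # day the sensor was lowered
offset_mm  <- 115                    # how far it was lowered, in mm

# Logical index: TRUE for every reading taken before the sensor was moved
before <- dat$Date < lowered_on

# Add the offset to those rows only; later rows are left untouched
dat$Waterhead[before] <- dat$Waterhead[before] + offset_mm
dat
```

The resulting Waterhead column matches the target table in the question (215, 220, 216, 214, 200 for the first five days, and the original values afterwards), and the same lines keep working as new rows are appended to the csv file.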
Re: [R] Accessing MySQL Database in R
Hi:

Because R does not have a direct interface to MySQL? You need to load a communication package; the two most common ones are RODBC and RMySQL. The former requires that you register your MySQL database table(s) with ODBC before using the RODBC package on them, whereas the latter works with specific version combinations of MySQL and R. The RODBC package has a very informative vignette; for information re the RMySQL package, see http://biostat.mc.vanderbilt.edu/wiki/Main/RMySQL

HTH,
Dennis

On Mon, Jan 17, 2011 at 1:30 PM, schlafly andrewschla...@gmail.com wrote:

I have a local installation of MySQL on my computer. I enter the following to access MySQL from the command line:

    /Applications/MAMP/Library/bin/mysql -h localhost -u root -p

I am then prompted for a password, and I use: root

This connects me to MySQL in the command line. I now want to access MySQL databases in R. I enter the following:

    mysql <- dbDriver("MySQL")
    conn <- dbConnect(mysql, user='root', host='localhost', password='root')

I get the following error message:

    Error in mysqlNewConnection(drv, ...) :
      RS-DBI driver: (Failed to connect to database: Error: Access denied for user 'root'@'localhost' (using password: YES)

Does anyone know why these aren't equivalent?

-- View this message in context: http://r.789695.n4.nabble.com/Accessing-MySQL-Database-in-R-tp3221264p3221264.html Sent from the R help mailing list archive at Nabble.com.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to cut a multidimensional array along a chosen dimension and store each piece into a list
On Mon, Jan 17, 2011 at 2:20 PM, Sean Zhang seane...@gmail.com wrote:

Dear R-Helpers, I wonder whether there is a function which cuts a multidimensional array along a chosen dimension and then stores each piece (still an array, with one dimension less) in a list. For example:

    arr <- array(seq(1*2*3*4), dim=c(1,2,3,4))
    # I made a point of setting the length of the first dimension to 1
    # to test whether I need to worry about the drop=F option.

    brkArrIntoListAlong <- function(arr, alongWhichDim){
        return(outlist)
    }

I have tried splitter_a in the plyr package but did not get what I want.

    library(plyr)
    plyr:::splitter_a(arr,3)

Well, you're really not supposed to call internal functions. You probably want:

    alply(arr, 3)

but you don't say what is wrong with the output.

Hadley

-- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
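
In case a base R version is useful, here is a minimal sketch of the kind of function Sean outlines. It is not what plyr does internally; `split_along` is a made-up name, and the sketch assumes that dimnames do not need to be preserved. It indexes with drop = FALSE and then removes only the split dimension, so other dimensions of length 1 (such as the first one in the example) survive.

```r
# Hypothetical helper: cut an array into a list of slices along one dimension
split_along <- function(arr, along) {
  d <- dim(arr)
  lapply(seq_len(d[along]), function(k) {
    # Index list: take everything on every dimension except `along`
    idx <- rep(list(TRUE), length(d))
    idx[[along]] <- k
    # Equivalent to arr[ , , k, , drop = FALSE] for along = 3
    piece <- do.call(`[`, c(list(arr), idx, list(drop = FALSE)))
    # Remove only the split dimension; this also discards any dimnames
    dim(piece) <- d[-along]
    piece
  })
}

arr <- array(seq(1 * 2 * 3 * 4), dim = c(1, 2, 3, 4))
pieces <- split_along(arr, 3)
length(pieces)       # one list element per level of dimension 3
dim(pieces[[1]])     # the length-1 first dimension is kept
```

Each element of `pieces` is an array with one dimension less than `arr`, so `pieces[[2]][1, 2, 3]` is the same cell as `arr[1, 2, 2, 3]`. More recent versions of R also provide asplit() in base, which may cover the same need.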