[R] Bootstrapping in R
Hi all,

1. I have 3 vectors a, b and c, each of length 25. I want to define a new data frame z such that z[1] = (a[1] b[1] c[1]), z[2] = (a[2] b[2] c[2]), and so on. How do I do this in R?

2. Then I want to draw bootstrap samples from z.

Kindly suggest how I can do this in R.

Thanks,
Preetam

--
Preetam Pal
(+91)-9432212774
M-Stat 2nd Year, Room No. N-114
Statistics Division, C.V.Raman Hall
Indian Statistical Institute, B.H.O.S. Kolkata.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Regarding Modeling - Please! QUICK HELP
I'm a student currently working with the *sleepstudy* dataset in matrix.pkg. It deals with the reaction times of sleep-deprived students over a period of days. I am trying to model reaction times in order to describe the variation between students by the number of days they haven't slept. This is what I'm running in R, but unfortunately I'm missing something:

logmod11 <- lmer(log(Reaction) ~ (Subject|Days), REML=FALSE)

This is obviously incorrect, so if someone could give me some quick help I'd really appreciate it.

Thanks!

--
J. Andrew Cochrane
University of Illinois | 2013
College of Liberal Arts and Sciences | Statistics
(630) 991-7502
Re: [R] Assigning a variable value based on multiple columns
Hi Jason,

I think the easiest approach would be to keep your current ifelse statements as they are, but temporarily change your NA values into something else (e.g., -999). To do this in one line, you can use the package gdata. In this code, I assume that your data are stored in the variable dataset:

# install package gdata if not yet installed
install.packages("gdata")
# load package gdata
library(gdata)
# change NA into -999
dataset <- NAToUnknown(dataset, -999)
# do your ifs/ifelses here...
# ...
# change -999 back into NA
dataset <- unknownToNA(dataset, -999)

And that should do it.

Hope this helps,
Patrick

2013/4/24 Jason Stout, M.D. jason.st...@duke.edu:

Hi All,

I'm hoping someone can help me with a relatively simple problem. Take the following dataset:

ID  Diabetes  ESRD  HIV  Contact
1   0         0     NA   0
2   1         0     NA   0
3   NA        1     0    0
4   0         NA    0    1
5   1         1     1    0

I want to generate a column called TSTcutoff based on the values in the row. TSTcutoff would be the lowest applicable value of 15 (if Diabetes=ESRD=HIV=Contact=0), 10 (if Diabetes or ESRD=1 AND HIV=Contact=0), or 5 (if HIV OR Contact=1). I was thinking this could be done with a series of IFELSE statements, but the NA values make this more challenging. I want to ignore NA values when calculating TSTcutoff. So the final dataset should look like this:

ID  Diabetes  ESRD  HIV  Contact  TSTcutoff
1   0         0     NA   0        15
2   1         0     NA   0        10
3   NA        1     0    0        10
4   0         NA    0    1        5
5   1         1     1    0        5

Thanks for any suggestions.

Jason Stout, MD, MHS
Box 102359-DUMC
Durham, NC 27710
FAX 919-681-7494
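As a possible alternative to recoding NA, here is a minimal base-R sketch that ignores missing values directly. It assumes the data frame is called dataset (the name used in Patrick's reply) and rebuilds Jason's five example rows by hand; the helper vectors high and mid are names introduced only for this illustration.

```r
# Jason's example data; NA marks an unknown risk factor.
dataset <- data.frame(
  ID       = 1:5,
  Diabetes = c(0, 1, NA, 0, 1),
  ESRD     = c(0, 0, 1, NA, 1),
  HIV      = c(NA, NA, 0, 0, 1),
  Contact  = c(0, 0, 0, 1, 0)
)

# `x %in% 1` is TRUE only where x equals 1 and FALSE (never NA) where x is NA,
# so a missing value is treated as "risk factor not known to be present".
high <- dataset$HIV %in% 1 | dataset$Contact %in% 1
mid  <- dataset$Diabetes %in% 1 | dataset$ESRD %in% 1

# The lowest applicable cutoff wins: 5 before 10 before 15.
dataset$TSTcutoff <- ifelse(high, 5, ifelse(mid, 10, 15))
print(dataset)
```

Whether an NA should count as "absent" is a clinical decision rather than a programming one; the sketch simply follows the stated wish to ignore NA values.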
Re: [R] Regarding Modeling - Please! QUICK HELP
Hi Andrew,

I don't know the dataset at all (and you seem to assume that your readers will), but anyway: it looks like you're trying to fit an intercept-only model. If that's the case, try:

logmod11 <- lmer(log(Reaction) ~ 1 + (1|Subject), REML=FALSE)

1 is the intercept, and anything in parentheses is a random effect. In this case the intercept is random, and your level-2 grouping variable is Subject (several lines per Subject). If you want to add Days as a predictor, try:

logmod11 <- lmer(log(Reaction) ~ 1 + Days + (Days|Subject), REML=FALSE)

Here, both the intercept and the coefficient for Days are random (allowed to vary for each Subject). Don't forget to supply the dataset after your formula if it's not attached to your environment.

Hope this helps,
Patrick

2013/4/24 Andrew Cochrane jandrew.cochr...@gmail.com:

I'm a student currently working with the *sleepstudy* dataset in matrix.pkg. It deals with the reaction times of sleep-deprived students over a period of days. I am trying to model reaction times in order to describe the variation between students by the number of days they haven't slept. This is what I'm running in R, but unfortunately I'm missing something:

logmod11 <- lmer(log(Reaction) ~ (Subject|Days), REML=FALSE)

This is obviously incorrect, so if someone could give me some quick help I'd really appreciate it. Thanks!
Re: [R] Regarding Modeling - Please! QUICK HELP
Sorry, this list has a "No homework" policy. Please ask your lecturer or tutor about this.

cheers,
Rolf Turner

On 25/04/13 14:18, Andrew Cochrane wrote:

I'm a student currently working with the *sleepstudy* dataset in matrix.pkg. It deals with the reaction times of sleep-deprived students over a period of days. I am trying to model reaction times in order to describe the variation between students by the number of days they haven't slept. This is what I'm running in R, but unfortunately I'm missing something:

logmod11 <- lmer(log(Reaction) ~ (Subject|Days), REML=FALSE)

This is obviously incorrect, so if someone could give me some quick help I'd really appreciate it. Thanks!
[R] Missing data
Dear r-users,

I would like to investigate how to fill in missing data. I start with a complete data series and introduce missing values into it. Then I use some method to fill in the missing values and compare the result with the original data to see how good the method is.

My question is: how do I introduce missing data into my complete data systematically, for example so that every 10th value is erased and treated as missing?

Here are some rainfall data:

125 130.3 327.2 252.2 33.8 6.1 5.1 0.5 0.5 0 2.3 0 0 0 0 0 0 0 0 0 0.8 5.1 0 0.3 0 0 0 0 0 0 45.7 43.4 0 0 0 0 0

Thank you so much for any help given. I hope my question is clear.
Re: [R] R Interactive Mode
On 04/24/2013 06:53 PM, Hrachya Astsatryan wrote:

Dear all,

We are doing some research on time series analysis of NDVI, and we found the ndvits package, which is a great tool. Unfortunately, when we run it, TimeSeriesAnalysis asks us to enter "Village" or "Country".

library(ndvits, lib.loc="/home/vahe/R/i686-pc-linux-gnu-library/2.15")
ndvidirectory=paste(system.file("extdata/VITO_Mzimba", package="ndvits"), "/", sep="")
region="Mzimba"
Ystart=2004
Yend=2006
shape="SLP_Mzimba"
shapedir=paste(system.file("extdata/shape", package="ndvits"), "/", sep="")
outfile = "mzimbaTS2.txt"
outfile2 = "MzimbaTS2.pdf"
outfiel3 = "my.pdf"
signal = TimeSeriesAnalysis(shape, shapedir, ndvidirectory, region, Ystart, Yend, outfile, outfile2)

How is it possible to call the package with the "Village" option selected by default? We don't want to enter the parameter interactively, and we don't want to change anything in the package code, which would be quite difficult.

Hi Hrach,

I thought I would have a shot at this, and it has been an education. There are a lot of dependencies. I was unable to trace which function asks the question "Village or Country", and it may be hidden. As a guess, I would say that this disambiguates names that may refer to both an area and a part of that area. Perhaps you could try something like:

region="Mzimba (Village)"

Jim
Re: [R] Tables package - remove NAs and NaN
On Wed, Apr 24, 2013 at 9:23 PM, Santosh santosh2...@gmail.com wrote:

Dear Rxperts,

Sorry if I am posting a really dumb request. I am new to Subversion and am trying to use it to download the tables package, as suggested by Duncan. I installed a Subversion client (from CollabNet) and tried to access the tables package using the command below.

svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/

I don't know what's wrong here, but I would suggest that you use an SVN GUI for Windows (like RapidSVN or TortoiseSVN). This should avoid space-related issues.

Regards,
Liviu

I get the following error message:

C:\Users\santosh\temp>svn checkout svn://scm.r-forge.r-project.org/svnroot/tables/
svn: E730060: Unable to connect to a repository at URL 'svn://scm.r-forge.r-project.org/svnroot/tables'
svn: E730060: Can't connect to host 'scm.r-forge.r-project.org': A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

Is there anything additional I need to do with Subversion or with the commands?

Regards,
Santosh

On Tue, Apr 23, 2013 at 5:13 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:
On 13-04-23 6:31 AM, Duncan Murdoch wrote:
On 13-04-22 10:40 PM, David Winsemius wrote:
On Apr 22, 2013, at 5:49 PM, Santosh wrote:

Dear Rxperts,

q <- data.frame(p=rep(c("A","B"),each=10,len=30), a=rep(c(1,2,3),each=10), id=seq(30), b=round(runif(30,10,20)), c=round(runif(30,40,70)))

The operation below...

tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) + (b + c)*(mean+sd), data=q)

yields some rows of NAs and NaN as shown below:

           b            c
 p   a  N  mean  sd     mean  sd
 A   1  10 16.30 2.497  52.30  9.358
     2   0 NaN   NA     NaN   NA
     3  10 15.60 2.716  60.30  8.001
 B   1   0 NaN   NA     NaN   NA
     2  10 15.40 2.366  57.70 10.414
     3   0 NaN   NA     NaN   NA
 All    30 15.77 2.473  56.77  9.601

How do I remove the rows having N=0? I would like the resulting table to look like this:

           b            c
 p   a  N  mean  sd     mean  sd
 A   1  10 16.30 2.497  52.30  9.358
     3  10 15.60 2.716  60.30  8.001
 B   2  10 15.40 2.366  57.70 10.414
 All    30 15.77 2.473  56.77  9.601

Here's a bit of a hack:

tabular( (`p a`=interaction(p,a, drop=TRUE, sep=" ")) ~ (N = 1) + (b + c)*(mean+sd), data=q)

         b            c
 p a  N  mean sd      mean sd
 A 1  10 12.8 0.7888  52.1 8.020
 B 2  10 16.3 3.0569  54.9 8.711
 A 3  10 14.6 3.7771  56.5 6.980

I have been rather hoping that Duncan Murdoch would have noticed the earlier thread, but maybe he can comment on whether there is a more direct route.

This isn't something that the package is designed to handle: if you say p*a, it wants all combinations of p and a. If I wanted a table like that, I'd use a different hack.

One possibility is to create that interaction column, but display it as just the initial letter, labelled p, and then add another column to contain the a values as data. It would be tricky to get the formatting right.

Another possibility is to generate the whole table with the N=0 rows, and then post-process it to remove those rows and adjust the row labels appropriately. This approach probably gives the nicer result, but the post-processing is quite messy: you need to delete some rows from the table, from its rowLabels attribute, and from the justification attributes of both the table and its rowLabels. (I should add a [ method to the package to hide this messiness.)

I've done this now, in version 0.7.54 on R-forge. To leave out the rows with N=0, you can select a subset of the table where N (the first column) is non-zero:

tab <- tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) + (b + c)*(mean+sd), data=q)
tab[ tab[,1] > 0, ]

and it produces this:

           b            c
 p   a  N  mean  sd     mean sd
 A   1  10 16.20 3.458  56.3 10.155
     3  10 13.60 2.119  58.1  8.075
 B   2  10 14.40 2.547  51.2  9.438
 All    30 14.73 2.888  55.2  9.419

Indexing of tables isn't as general as indexing of matrices, but most of the simple forms should work. I haven't tested yet, but I expect this will be fine in LaTeX or HTML (also new, not on CRAN yet) output as well.

Duncan Murdoch
Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’
Dear Duncan,

On Wed, Apr 24, 2013 at 11:04 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

What I've done sometimes in debugging is to change that error to a warning in the getNamespace() function, and add some tracing code to the serialization code to print the names of objects as they are loaded. (This goes in ReadItem in src/main/serialize.c.) I wouldn't expect Liviu to make those changes, but perhaps a verbose option could be added to load(), so that it could be available to users.

I have added this in R-devel. The format of the printed output may well change before this is ever released, but it should be enough to identify the bad item already. You'll need a build of R-devel from r62658 or newer to see this. Then

load("/tmp/a.rda", verbose=TRUE)

will print the names of objects as they are read (the names are read after the attributes and before the value). If you want to see reams of mostly useless information, you can try verbose=n (for some number n=2 or more); this prints names and component numbers to a greater depth.

Thank you for adding this in R. I will likely test this feature when it gets released.

Best regards,
Liviu
[R] installing package
Hi,

I am trying to install a package (Bioconductor), but every time I try to install it I get this message:

> source("http://bioconductor.org/biocLite.R")
Warning in install.packages("BiocInstaller", repos = a["BioCsoft", "URL"]) :
  'lib = "C:/Program Files/R/R-3.0.0/library"' is not writable
Error in install.packages("BiocInstaller", repos = a["BioCsoft", "URL"]) :
  unable to install packages
> biocLite("methylumi")

I normally use Mac computers, but I could not get the right path for the folders I should use, so now I am trying on a Windows platform instead. But now I cannot install one of the packages my pipeline needs. Can anyone help? I know it is probably a simple problem, but I have never used R before and don't know how to solve problems in it.

Best,
Gitte Andersen
E-mail: gitt...@hum-gen.au.dk
Re: [R] installing package
Hi,

Do you have administrator rights?

Regards,
Pascal

On 04/25/2013 04:19 PM, Gitte Brinch Andersen wrote:

Hi,

I am trying to install a package (Bioconductor), but every time I try to install it I get this message:

> source("http://bioconductor.org/biocLite.R")
Warning in install.packages("BiocInstaller", repos = a["BioCsoft", "URL"]) :
  'lib = "C:/Program Files/R/R-3.0.0/library"' is not writable
Error in install.packages("BiocInstaller", repos = a["BioCsoft", "URL"]) :
  unable to install packages
> biocLite("methylumi")

I normally use Mac computers, but I could not get the right path for the folders I should use, so now I am trying on a Windows platform instead. But now I cannot install one of the packages my pipeline needs. Can anyone help? I know it is probably a simple problem, but I have never used R before and don't know how to solve problems in it.

Best,
Gitte Andersen
E-mail: gitt...@hum-gen.au.dk
Re: [R] Loop for main title in a plot
You could use bquote. Something like this:

a <- c(1,2,3,4)
b <- c(1,2,3,4)
nTrials <- length(a)
for (trial in 1:nTrials) {
  plot(x=a[1:trial], y=b[1:trial],
       ylab=expression(paste(Apple[P])),
       xlab=expression(paste(Banana^th)),
       main=bquote(italic("i-")~.(trial)^th~choice))
}

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Eva Günther
Sent: Thursday, 25 April 2013 06:22
To: r-help@r-project.org
Subject: [R] Loop for main title in a plot

Hi all,

I have a problem with putting my plot inside a loop. Here is a simple example for one plot:

# Plot simple graph with super- and subscript
a <- c(1,2,3,4)
b <- c(1,2,3,4)
plot(x=a, y=b,
     ylab=expression(paste(Apple[P])),
     xlab=expression(paste(Banana^th)),
     main=expression(paste(italic("i-")~4^th~choice)))

Now I would like the title (main) to be a function of the number of trials:

for (trial in 1:nTrials) {
  plot( main=expression(paste(italic("i-")~trial^th~choice)))
}

E.g. for nTrials = 5 the titles should look like this:

5th plot: i ^th choice
4th plot: i-1 ^th choice
3rd plot: i-2 ^th choice
and so on.

I am having trouble creating that; could you please help me? Thank you!!
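The titles Eva lists (i, i-1, i-2, ...) can also be built up front, which makes it easy to check them before plotting. The following is only a sketch of that idea: the list name titles and the variable offset are introduced here for illustration, and the mapping offset = nTrials - trial is an assumption read off her "5th plot / 4th plot" example.

```r
nTrials <- 5

# One plotmath title per trial. bquote() substitutes the current value of
# `offset` wherever .(offset) appears; everything else stays unevaluated.
titles <- lapply(seq_len(nTrials), function(trial) {
  offset <- nTrials - trial
  if (offset == 0) {
    bquote(italic(i)^th ~ choice)              # last plot: plain "i"
  } else {
    bquote(italic(i - .(offset))^th ~ choice)  # earlier plots: "i - 1", "i - 2", ...
  }
})

# Show the unevaluated title expressions as text.
for (trial in seq_len(nTrials)) {
  cat(trial, ":", deparse(titles[[trial]]), "\n")
}
```

Inside the plotting loop, each title would then be passed as main = titles[[trial]].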
[R] fdrtool qvalues
Hi,

I've just started using R and fdrtool, and I'm not sure if the q-values I'm getting back are accurate. I ran fdrtool on p-values obtained from a two-way ANOVA on proteomics data. So I have 266 data values (protein spots) for two factors (ft, vr, and the interaction) for each biological sample.

One of the two factors (vr) has a highly significant effect, with 119 protein spots significantly affected at p < 0.05. When I run fdrtool, the q-values are slightly higher than the p-values (as expected), but only up to p = 0.01. Between p = 0.01 and p = 0.05, the q-values are lower, giving me more significant protein spots at that level. Is this correct?

The other factor (ft) had only 3 weakly significant protein spots. When I run fdrtool, all 266 q-values are 1.

The interaction effect (ft x vr) produced 14 significant p-values (mostly p < 0.05, a couple are p < 0.01). fdrtool produces q-values ranging between 0.87 and 0.99, and they rise with rising p-values, so I lose the significant results here.

Best wishes,
Catherine
Re: [R] Floating point precision causing undesireable behaviour when printing as.POSIXlt times with microseconds?
FAQ 7.31. Also, if you are using POSIXct for current dates, the resolution is down to about a millisecond.

On Apr 24, 2013, at 13:57, O'Hanlon, Simon J simon.ohan...@imperial.ac.uk wrote:

Dear list,

When using as.POSIXlt with times measured down to microseconds, the default format.POSIXlt seems to cause some possibly undesirable behaviour. According to the code in format.POSIXlt, the maximum accuracy of printing fractional seconds is 1 microsecond, but if I do:

options( digits.secs = 6 )
as.POSIXlt( 1.000002 , tz="", origin="1970-01-01")
as.POSIXlt( 1.999998 , tz="", origin="1970-01-01")
as.POSIXlt( 1.999999 , tz="", origin="1970-01-01")

I get, respectively:

[1] "1970-01-01 01:00:01.000002 BST"
[1] "1970-01-01 01:00:01.999998 BST"
[1] "1970-01-01 01:00:01 BST"

If options( digits.secs = 6 ) is set, should I not expect to be able to print 1.999999 seconds? This seems to be caused by the following code fragment in format.POSIXlt:

np <- getOption("digits.secs")
if (is.null(np)) np <- 0L else np <- min(6L, np)
if (np >= 1L)
  for (i in seq_len(np) - 1L)
    if (all(abs(secs - round(secs, i)) < 1e-06)) {
      np <- i
      break
    }

Specifically:

for (i in seq_len(np) - 1L)
  if (all(abs(secs - round(secs, i)) < 1e-06))

which in the case of 1.999999 seconds will give:

options( scipen = 10 )
np <- 6
sapply( seq_len(np) - 1L , function(x) abs(1.999999 - round(1.999999, x)) )
         [,1]     [,2]     [,3]     [,4]     [,5]     [,6]
[1,] 0.000001 0.000001 0.000001 0.000001 0.000001 0.000001

The logical test all( ... < 1e-06) should evaluate to FALSE, but due to floating point precision it evaluates to TRUE:

sprintf( "%.20f" , abs(1.999999 - round(1.999999,5)))
[1] "0.00000099999999991773"

If instead of:

for (i in seq_len(np) - 1L)
  if (all(abs(secs - round(secs, i)) < 1e-06))

format.POSIXlt had a comparison value that was half the minimum increment:

for (i in seq_len(np) - 1L)
  if (all(abs(secs - round(secs, i)) < 5e-07))

this behaviour disappears:

mod.format.POSIXlt( as.POSIXlt( 1.999999 , tz="", origin="1970-01-01") )
[1] "1970-01-01 01:00:01.999999"

But I am unsure whether the original behaviour is what I should expect given the documentation (I have read it and I can't see a reason to expect 1.999999 to round down to 1), and also whether changing the formatting function would have other undesirable consequences.

My sessionInfo():

R version 3.0.0 (2013-04-03)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

Thank you,
Simon

Simon O'Hanlon
Postgraduate Researcher
Helminth Ecology Research Group
Department of Infectious Disease Epidemiology
Imperial College London
St. Mary's Hospital, Norfolk Place, London, W2 1PG, UK
Office: +44 (0) 20 759 43229
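The comparison at the heart of Simon's question can be reproduced in isolation, without any date-time objects. This is only a sketch: it assumes the value in his example is 1.999999 (some digits are lost in the archived text), and the names secs and diff5 are introduced here for illustration.

```r
secs <- 1.999999

# Distance between the value and its rounding to 5 decimal places.
# Mathematically this is exactly 1e-06, but 1.999999 has no exact
# binary representation, so the computed difference is slightly off.
diff5 <- abs(secs - round(secs, 5))

# Print with many digits to see the representation error.
print(sprintf("%.20f", diff5))

# The threshold from the format.POSIXlt fragment quoted in the message ...
print(diff5 < 1e-06)

# ... and the half-increment threshold Simon proposes instead.
print(diff5 < 5e-07)
```

This is the situation R FAQ 7.31 describes: a test against a threshold that the exact value would sit right on can go either way once rounding error is involved, which is why a threshold well away from the boundary (such as 5e-07) behaves more predictably here.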
Re: [R] Bootstrapping in R
On Apr 25, 2013, at 7:02, Preetam Pal lordpree...@gmail.com wrote:

Hi all,

1. I have 3 vectors a, b and c, each of length 25. I want to define a new data frame z such that z[1] = (a[1] b[1] c[1]), z[2] = (a[2] b[2] c[2]), and so on. How do I do this in R?

z <- data.frame(a, b, c)

2. Then I want to draw bootstrap samples from z.

Look at the boot package.

MW

Kindly suggest how I can do this in R.

Thanks,
Preetam
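For readers who want to see the idea before reaching for the boot package, here is a minimal base-R sketch of row-wise bootstrapping. The data are random placeholders, and the statistic (column means), the number of replicates B, and the name boot_means are all assumptions made for illustration; they are not part of the original question.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only

# Placeholder data standing in for the three vectors of length 25.
a <- rnorm(25)
b <- rnorm(25)
c <- rnorm(25)
z <- data.frame(a, b, c)         # row i holds (a[i], b[i], c[i])

B <- 1000                        # number of bootstrap replicates (assumed)

# Each replicate resamples whole rows with replacement, so the triples
# (a[i], b[i], c[i]) stay together, then computes the statistic.
boot_means <- t(replicate(B, {
  idx <- sample(nrow(z), replace = TRUE)
  colMeans(z[idx, ])
}))

# Bootstrap standard errors of the three column means.
print(apply(boot_means, 2, sd))
```

The boot package wraps the same resampling loop and adds confidence-interval methods, so it is usually the better choice beyond a quick check like this.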
[R] Decomposing a List
Greetings!

For some reason I am not managing to work out how to do this (in principle) simple task! As a result of applying strsplit() to a vector of character strings, I have a long list L (N elements), where each element is a vector of two character strings, like:

L[1] = c("A1","B1")
L[2] = c("A2","B2")
L[3] = c("A3","B3")
[etc.]

From L, I wish to obtain (as directly as possible, e.g. avoiding a loop) two vectors, each of length N, where one contains the strings that are first in the pair and the other contains the strings that are second, i.e. from L (as above) I would want to extract:

V1 = c("A1","A2","A3",...)
V2 = c("B1","B2","B3",...)

Suggestions?

With thanks,
Ted.

-
E-Mail: (Ted Harding) ted.hard...@wlandres.net
Date: 25-Apr-2013 Time: 11:16:46
This message was sent by XFMail
Re: [R] Decomposing a List
Dear Dr. Harding,

Try

sapply(L, "[", 1)
sapply(L, "[", 2)

HTH,
Jorge.-

On Thu, Apr 25, 2013 at 8:16 PM, Ted Harding ted.hard...@wlandres.net wrote:

Greetings!

For some reason I am not managing to work out how to do this (in principle) simple task! As a result of applying strsplit() to a vector of character strings, I have a long list L (N elements), where each element is a vector of two character strings, like:

L[1] = c("A1","B1")
L[2] = c("A2","B2")
L[3] = c("A3","B3")
[etc.]

From L, I wish to obtain (as directly as possible, e.g. avoiding a loop) two vectors, each of length N, where one contains the strings that are first in the pair and the other contains the strings that are second, i.e. from L (as above) I would want to extract:

V1 = c("A1","A2","A3",...)
V2 = c("B1","B2","B3",...)

Suggestions?

With thanks,
Ted.
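A small self-contained run of Jorge's suggestion, with two common variations, might look like the sketch below. The input strings and the "-" separator are made up for illustration, since the original data are not shown.

```r
# Hypothetical input; in Ted's case L already exists as strsplit() output.
x <- c("A1-B1", "A2-B2", "A3-B3")
L <- strsplit(x, "-", fixed = TRUE)

# Jorge's approach: apply the extraction function `[` to every element.
V1 <- sapply(L, `[`, 1)
V2 <- sapply(L, `[`, 2)

# Stricter variant: vapply() checks that every result is a single string.
V1_strict <- vapply(L, `[`, character(1), 1)

# Alternative: stack the pairs into an N x 2 matrix and take its columns.
M <- do.call(rbind, L)

print(V1)
print(V2)
print(M)
```

The matrix route relies on every element having exactly two parts; if some strings split into a different number of pieces, rbind() recycles values (with a warning), so the sapply() or vapply() forms are safer for irregular input.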
Re: [R] identify object that causes Error in loadNamespace(name) : there is no package called ‘R.utils’
On 13-04-25 3:46 AM, Liviu Andronic wrote:

Dear Duncan,

On Wed, Apr 24, 2013 at 11:04 PM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

What I've done sometimes in debugging is to change that error to a warning in the getNamespace() function, and add some tracing code to the serialization code to print the names of objects as they are loaded. (This goes in ReadItem in src/main/serialize.c.) I wouldn't expect Liviu to make those changes, but perhaps a verbose option could be added to load(), so that it could be available to users.

I have added this in R-devel. The format of the printed output may well change before this is ever released, but it should be enough to identify the bad item already. You'll need a build of R-devel from r62658 or newer to see this. Then

load("/tmp/a.rda", verbose=TRUE)

will print the names of objects as they are read (the names are read after the attributes and before the value). If you want to see reams of mostly useless information, you can try verbose=n (for some number n=2 or more); this prints names and component numbers to a greater depth.

Thank you for adding this in R. I will likely test this feature when it gets released.

That will be about a year from now...

Duncan Murdoch
Re: [R] Missing data
I read your data into a data frame

x <- read.table( "clipboard" )

and renamed the only column

colnames( x )[1] <- "orig"

With a loop, I created a 2nd column "miss" in which the observation in every 10th row is set to NA:

for( i in 1 : length( x$orig ) ) {
  if( as.integer( rownames( x )[ i ] ) %% 10 == 0 ) {
    x$miss[i] <- NA
  } else {
    x$miss[i] <- x$orig[i]
  }
}

This is probably the least elegant of all possible solutions, but it works...

Rgds,
Rainer

On Wednesday 24 April 2013 23:41:21 Roslina Zakaria wrote:

Dear r-users,

I would like to investigate how to fill in missing data. I start with a complete data series and introduce missing values into it. Then I use some method to fill in the missing values and compare the result with the original data to see how good the method is.

My question is: how do I introduce missing data into my complete data systematically, for example so that every 10th value is erased and treated as missing?

Here are some rainfall data:

125 130.3 327.2 252.2 33.8 6.1 5.1 0.5 0.5 0 2.3 0 0 0 0 0 0 0 0 0 0.8 5.1 0 0.3 0 0 0 0 0 0 45.7 43.4 0 0 0 0 0

Thank you so much for any help given. I hope my question is clear.
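The same result can be had without a loop by indexing every 10th position directly. The sketch below types in the rainfall values from the question as a plain vector (read here as a single series in the order given, which is an assumption about the original layout); the names rain and miss are chosen for illustration.

```r
# Rainfall values from the question, read as one series.
rain <- c(125, 130.3, 327.2, 252.2, 33.8, 6.1, 5.1, 0.5, 0.5, 0,
          2.3, 0, 0, 0, 0, 0, 0, 0, 0, 0,
          0.8, 5.1, 0, 0.3, 0, 0, 0, 0, 0, 0,
          45.7, 43.4, 0, 0, 0, 0, 0)

# Copy the series, then blank out positions 10, 20, 30, ...
miss <- rain
miss[seq(10, length(miss), by = 10)] <- NA

# Keep both versions side by side for the later comparison.
x <- data.frame(orig = rain, miss = miss)
print(x)
```

Changing the by argument (or replacing seq() with sample()) gives other missingness patterns, for instance a random 10% of values rather than every 10th one.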
Re: [R] mgcv: how select significant predictor vars when using gam(...select=TRUE) using automatic optimization
Juliet, here are the diagnostic plots for you. Just to recall, the first model was this:

    fit <- gam(target ~ s(mgs) + s(gsd) + s(mud) + s(ssCmax), family = quasi(link = "log"), data = wspe1, method = "REML", select = F)
    summary(fit)

    Parametric coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   -4.724      7.462  -0.633    0.527

    Approximate significance of smooth terms:
                edf Ref.df      F p-value
    s(mgs)    3.118  3.492  0.099   0.974
    s(gsd)    6.377  7.044 15.596  <2e-16 ***
    s(mud)    8.837  8.971 18.832  <2e-16 ***
    s(ssCmax) 3.886  4.051  2.342   0.052 .
    ---
    R-sq.(adj) = 0.403   Deviance explained = 40.6%
    REML score = 33186   Scale est. = 8.7812e+05   n = 4511

(I slightly shortened the output.)

Also of interest, the model error as root mean squared error (RMSE):

    sqrt(mean(residuals.gam(fit, type = "response")^2))
    [1] 934.6647

Here are the diagnostic plots:
http://r.789695.n4.nabble.com/file/n4665370/screen-capture-1.png
http://r.789695.n4.nabble.com/file/n4665370/screen-capture-2.png

Here is Simon's comment on this particular model from Apr 18, 2013; 5:25pm (see above):

> The p-value computations are based on the approximation that things are approximately normal on the linear predictor scale, but actually they are no where close to normal in this case, which is why the p-values look inconsistent. The reason that the approximate normality assumption doesn't hold is that the model is quite a poor fit. If you take a look at gam.check(fit) you'll see that the constant variance assumption of quasi(link=log) is violated quite badly, and the residual distribution is really quite odd (plot residuals against fitted as well). Also see plot(fit,pages=1,scale=0) - it shows ballooning confidence intervals and smooth estimates that are so low in places that they might as well be minus infinity (given log link) - clearly something is wrong with this model!

Following Simon's advice (quote):

> try Tweedie(p=1.5,link=log) as the family. Also the predictor variables are very skewed which is giving leverage problems, so I would transform them to give less skew. e.g. Something like

    fit <- gam(target ~ s(log(mgs)) + s(I(gsd^.5)) + s(I(mud^.25)) + s(log(ssCmax)),
               family = Tweedie(p = 1.6, link = "log"), data = wspe1, method = "REML")
    summary(fit)

    Parametric coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)  4.02654    0.05231   76.97   <2e-16 ***

    Approximate significance of smooth terms:
                     edf Ref.df     F p-value
    s(log(mgs))    6.067  7.292 12.58  <2e-16 ***
    s(I(gsd^0.5))  4.009  5.138 18.25  <2e-16 ***
    s(I(mud^0.25)) 7.210  8.240 58.54  <2e-16 ***
    s(log(ssCmax)) 8.407  8.764 74.87  <2e-16 ***

    R-sq.(adj) = 0.303   Deviance explained = 51%
    REML score = 14355   Scale est. = 27.702   n = 4511

(I slightly shortened the output.)

RMSE did not improve:

    sqrt(mean(residuals.gam(fit, type = "response")^2))
    [1] 1009.268

The diagnostic plots are in the following:
http://r.789695.n4.nabble.com/file/n4665370/screen-capture-3.png
http://r.789695.n4.nabble.com/file/n4665370/screen-capture-4.png

which look much better. The QQ-plot is closer to the identity line, and the residuals are more evenly spread and much smaller. Still, the correlation between response and fitted values seems pretty low.

Hope this helps,
Jan
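The workflow described in this message (fit a Tweedie GAM on a transformed predictor, then report RMSE on the response scale) can be sketched on simulated data. This is only an illustration, not Jan's model: the data, the variable names x and y, and the mean function are made up, and it assumes the mgcv package is installed.

```r
library(mgcv)

set.seed(1)
n  <- 500
x  <- runif(n, 0.1, 10)                   # hypothetical predictor
mu <- exp(1 + sin(log(x)))                # hypothetical true mean
y  <- rTweedie(mu, p = 1.5, phi = 1)      # simulated non-negative response

# Tweedie family with log link, smooth of the log-transformed predictor
fit <- gam(y ~ s(log(x)), family = Tweedie(p = 1.5, link = "log"),
           method = "REML")

# RMSE on the response scale, as computed in the message above
rmse <- sqrt(mean(residuals(fit, type = "response")^2))
rmse

# gam.check(fit) would draw the QQ-plot and residual plots discussed above
```

RMSE is only one summary; as the thread notes, a model can have better-behaved residuals without a lower RMSE.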
[R] How are R version types named ? Any convention (like Hurricanes etc)
With reference to R News:

R version 3.0.0 (Masked Marvel) has been released on 2013-04-03.
R version 2.15.3 (Security Blanket) has been released on 2013-03-01.
R version 2.15.2 (Trick or Treat)
R version 2.15.1 (Roasted Marshmallows)
...
R version 2.15.0 (Easter Beagle)
R version 2.14.0 (Great Pumpkin)

Dear R help List,

How are these version types named? Masked Marvel comes after Security Blanket comes after Trick or Treat comes after Roasted Marshmallows. Is it some convention like that for Hurricanes in the West? It is totally incomprehensible to me as I am in India.

Sincerely,
Ajay Ohri
Author- R for Business Analytics http://www.amazon.com/R-Business-Analytics-A-Ohri/dp/1461443423
Founder- Decisionstats.com http://decisionstats.com
Re: [R] Missing data
Hello,

Something like this?

    x <- scan(text = "125 130.3 327.2 252.2 33.8 6.1 5.1 0.5 0.5 0 2.3 0 0 0 0 0 0 0 0 0 0.8 5.1 0 0.3 0 0 0 0 0 0 45.7 43.4 0 0 0 0 0")

    putMissing <- function(x, by){
        idx <- by*seq_along(x)               # candidate positions: by, 2*by, ...
        idx <- idx[which(idx <= length(x))]  # keep only positions inside x
        x[idx] <- NA
        x
    }

    putMissing(x, 10)
    putMissing(x, 5)

Hope this helps,
Rui Barradas

On 25-04-2013 07:41, Roslina Zakaria wrote:
> Dear r-users,
> I would like to investigate how to fill in missing data. I started with complete data and try to introduce missing data into the data series. Then I would use some method to fill in the missing data and compare with the original data to see how good it is. My question is, how do I introduce missing data in my complete data systematically, for example so that every 10th value is erased and assumed missing?
> Here are some rainfall data:
> 125 130.3 327.2 252.2 33.8 6.1 5.1 0.5 0.5 0 2.3 0 0 0 0 0 0 0 0 0 0.8 5.1 0 0.3 0 0 0 0 0 0 45.7 43.4 0 0 0 0 0
> Thank you so much for any help given. I hope my question is clear.
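The second half of the question (fill the gaps, then compare with the original) could be sketched as follows. Linear interpolation with approx() is just one possible fill-in method chosen here for illustration; it is not something the poster specified, and the names x_na, filled and rmse are made up.

```r
x <- c(125, 130.3, 327.2, 252.2, 33.8, 6.1, 5.1, 0.5, 0.5, 0,
       2.3, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0.8, 5.1, 0, 0.3, 0, 0, 0, 0, 0, 0,
       45.7, 43.4, 0, 0, 0, 0, 0)

# erase every 10th observation
miss <- seq(10, length(x), by = 10)
x_na <- x
x_na[miss] <- NA

# fill the gaps by linear interpolation over the observed positions
obs    <- which(!is.na(x_na))
filled <- approx(obs, x_na[obs], xout = seq_along(x), rule = 2)$y

# compare the filled-in values with the values that were erased
rmse <- sqrt(mean((filled[miss] - x[miss])^2))
rmse
```

Any other imputation method can be dropped in at the approx() step and scored the same way.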
Re: [R] Decomposing a List
Thanks, Jorge, that seems to work beautifully! (Now to try to understand why ... but that's for later).

Ted.

On 25-Apr-2013 10:21:29 Jorge I Velez wrote:
> Dear Dr. Harding,
>
> Try
>
>     sapply(L, "[", 1)
>     sapply(L, "[", 2)
>
> HTH,
> Jorge.-
>
> On Thu, Apr 25, 2013 at 8:16 PM, Ted Harding ted.hard...@wlandres.net wrote:
>> Greetings!
>> For some reason I am not managing to work out how to do this (in principle) simple task!
>> As a result of applying strsplit() to a vector of character strings, I have a long list L (N elements), where each element is a vector of two character strings, like:
>>     L[1] = c("A1","B1")
>>     L[2] = c("A2","B2")
>>     L[3] = c("A3","B3")
>>     [etc.]
>> From L, I wish to obtain (as directly as possible, e.g. avoiding a loop) two vectors each of length N where one contains the strings that are first in the pair, and the other contains the strings which are second, i.e. from L (as above) I would want to extract:
>>     V1 = c("A1","A2","A3",...)
>>     V2 = c("B1","B2","B3",...)
>> Suggestions?
>> With thanks,
>> Ted.
>> -
>> E-Mail: (Ted Harding) ted.hard...@wlandres.net
>> Date: 25-Apr-2013 Time: 11:16:46
>> This message was sent by XFMail

-
E-Mail: (Ted Harding) ted.hard...@wlandres.net
Date: 25-Apr-2013 Time: 11:31:57
This message was sent by XFMail
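As for why this works: in R the subsetting operator is itself a function named "[", so sapply(L, "[", 1) calls "["(element, 1) on every list element. A small sketch, using made-up input strings and an underscore separator as assumptions:

```r
# a list like the one strsplit() produces
L <- strsplit(c("A1_B1", "A2_B2", "A3_B3"), "_")

# "[" is an ordinary function: `[`(v, 1) is the same as v[1]
V1 <- sapply(L, "[", 1)

# the same idea written with an explicit anonymous function
V2 <- sapply(L, function(pair) pair[2])

# vapply() does the same but checks that each result is one string
V1_checked <- vapply(L, `[`, character(1), 1)

V1
V2
```

The extra argument after the function name (here 1 or 2) is passed on to "[" as the index.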
Re: [R] Stochastic Frontier: Finding the optimal scale/scale efficiency by frontier package
Dear Miao,

On 25 April 2013 03:26, jpm miao miao...@gmail.com wrote:
> I am trying to find out the scale efficiency and optimal scale of banks by stochastic frontier analysis given the panel data of bank. I am free to choose any model of stochastic frontier analysis. The only approach I know to work with R is to estimate a translog production function by sfa or other related function in frontier package, and then use the Ray 1998 formula to find the scale efficiency. However, as the textbook Coelli et al 2005 point out that the concavity may not be satisfied, one needs to impose the nonpositive definiteness condition so that the scale efficiency <= 1.

It might be that the true technology is not concave and that the elasticity of scale is larger than one. Indeed, most empirical studies find increasing returns to scale (in many different sectors). Therefore, it is probably inappropriate to impose concavity.

> How can I do it with frontier package?

The frontier package cannot impose concavity on a Translog production function, and I am not aware of any software that can do this in a stochastic frontier estimation -- probably because imposing concavity usually does not make sense.

> Is there any other SFA model/function in R recommended to find out the scale efficiency and optimal scale?

I suggest plotting the elasticity of scale against the firm size. If the elasticity of scale decreases with firm size, then the most productive firm size is the size at which the elasticity of scale is one. However, there are some problems with using the Translog production function (and the Translog distance function) for determining the optimal firm size [1].

[1] http://econpapers.repec.org/RePEc:foi:wpaper:2012_12

If you have further questions regarding the frontier package, I suggest that you use the help forum at frontier's R-Forge site [2].

[2] https://r-forge.r-project.org/projects/frontier/

... and please do not forget to cite the R packages that you use in your analysis in your publications. Thanks!

Best wishes,
Arne

--
Arne Henningsen
http://www.arne-henningsen.name
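The suggestion to look at the elasticity of scale against firm size could be sketched like this. Everything here is an assumption made for illustration: the data are simulated, the two-input Translog coefficients are invented, and ordinary least squares via lm() stands in for the stochastic frontier estimation (with the frontier package one would take the slope coefficients from the fitted frontier instead).

```r
set.seed(1)
n   <- 200
lx1 <- log(runif(n, 1, 100))   # log input 1 (hypothetical)
lx2 <- log(runif(n, 1, 100))   # log input 2 (hypothetical)

# second-order Translog terms
q11 <- 0.5 * lx1^2
q22 <- 0.5 * lx2^2
q12 <- lx1 * lx2

# hypothetical technology plus noise
ly <- 0.5 + 0.7 * lx1 + 0.6 * lx2 - 0.10 * q11 - 0.08 * q22 +
      0.02 * q12 + rnorm(n, sd = 0.1)

# stand-in estimation by OLS (not a frontier model)
fit <- lm(ly ~ lx1 + lx2 + q11 + q22 + q12)
b   <- coef(fit)

# output elasticities: derivative of log output w.r.t. each log input
e1 <- b["lx1"] + b["q11"] * lx1 + b["q12"] * lx2
e2 <- b["lx2"] + b["q22"] * lx2 + b["q12"] * lx1

# elasticity of scale is the sum of the output elasticities
eps <- unname(e1 + e2)

# observation whose elasticity of scale is closest to one
closest <- which.min(abs(eps - 1))
c(log_output = ly[closest], elasticity_of_scale = eps[closest])

# plot(ly, eps) would show elasticity of scale against (log) firm size
```

Whether the elasticity actually crosses one within the observed size range depends on the data, which is part of the caveat raised in reference [1].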
Re: [R] Assigning a variable value based on multiple columns
Thanks Patrick--I think this solution will work perfectly.

Jason

Jason Stout, MD, MHS
Box 102359-DUMC
Durham, NC 27710
FAX 919-681-7494

From: Patrick Coulombe [patrick.coulo...@gmail.com]
Sent: Thursday, April 25, 2013 1:53 AM
To: Jason Stout, M.D.
Cc: r-help@r-project.org
Subject: Re: [R] Assigning a variable value based on multiple columns

Hi Jason,

I think that the easiest for you would be to keep your current ifelse statements as they are, but change your NA into something else (e.g., -999, or anything else). To do this in one line, you can use the package gdata. In this code, I assume that your data are stored in the variable dataset:

    ###
    #install package gdata if not yet installed
    install.packages("gdata")
    #load package gdata
    library(gdata)
    #change NA into -999
    dataset <- NAToUnknown(dataset, -999)
    #do your ifs/ifelses here...
    #...
    #...
    #change -999 back into NA
    dataset <- unknownToNA(dataset, -999)

And that should do it.

Hope this helps,
Patrick

2013/4/24 Jason Stout, M.D. jason.st...@duke.edu
> Hi All,
> I'm hoping someone can help me with a relatively simple problem. Take the following dataset:
>
>     ID Diabetes ESRD HIV Contact
>      1        0    0  NA       0
>      2        1    0  NA       0
>      3       NA    1   0       0
>      4        0   NA   0       1
>      5        1    1   1       0
>
> I want to generate a column called TSTcutoff based on the values in the row. TSTcutoff would be the lower of 15 (if Diabetes=ESRD=HIV=Contact=0), 10 (if Diabetes or ESRD=1 AND HIV=Contact=0), or 5 (if HIV OR Contact=1). I was thinking this could be done with a series of IFELSE statements, but the NA values make this more challenging. I want to ignore NA values when calculating TSTcutoff. So the final dataset should look like this:
>
>     ID Diabetes ESRD HIV Contact TSTcutoff
>      1        0    0  NA       0        15
>      2        1    0  NA       0        10
>      3       NA    1   0       0        10
>      4        0   NA   0       1         5
>      5        1    1   1       0         5
>
> Thanks for any suggestions.
> Jason Stout, MD, MHS
> Box 102359-DUMC
> Durham, NC 27710
> FAX 919-681-7494
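For comparison, here is a base-R sketch of the rule described in the question that treats NA as "not 1" directly, without recoding to -999. It is an alternative to the gdata approach above, not a replacement endorsed in the thread; the helper name any_one is made up.

```r
dataset <- data.frame(
  ID       = 1:5,
  Diabetes = c(0, 1, NA, 0, 1),
  ESRD     = c(0, 0, 1, NA, 1),
  HIV      = c(NA, NA, 0, 0, 1),
  Contact  = c(0, 0, 0, 1, 0)
)

# TRUE for rows where any of the given columns equals 1;
# na.rm = TRUE means an NA is simply ignored
any_one <- function(df, cols) {
  rowSums(df[, cols, drop = FALSE] == 1, na.rm = TRUE) > 0
}

hiv_or_contact   <- any_one(dataset, c("HIV", "Contact"))
diabetes_or_esrd <- any_one(dataset, c("Diabetes", "ESRD"))

# lowest applicable cutoff wins: 5, then 10, otherwise 15
dataset$TSTcutoff <- ifelse(hiv_or_contact, 5,
                     ifelse(diabetes_or_esrd, 10, 15))
dataset
```

On the five example rows this gives the TSTcutoff column 15, 10, 10, 5, 5, matching the table in the question.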
Re: [R] How are R version types named ? Any convention (like Hurricanes etc)
On 04/25/2013 07:46 PM, Ajay Ohri wrote: With reference to R News News: R version 3.0.0 (Masked Marvel) has been released on 2013-04-03. R version 2.15.3 (Security Blanket) has been released on 2013-03-01 R version 2.15.2 (Trick or Treat) R version 2.15.1 (Roasted Marshmallows) ... R version 2.15.0 (Easter Beagle) R version 2.14.0 (Great Pumpkin) Dear R help List, How are these version types named? Masked Marvel comes after Security Blanket comes after Trick or Treat comes after Roasted Marshmallows. Is it some convention like that for Hurricanes in the West. It is totally incomprehensible to me as I am in India. Hi Ajay, My guess is that these correspond roughly to the levels of enlightenment that can be attained by mortals. We begin with the being having no concept of enlightenment. A pumpkin, however great, is as Freud might have said, only a pumpkin. A beagle, despite its lowly concept of nirvana, which it considers to be a place full of bones and interesting things to sniff, has begun to climb that long, long ladder. At first we might think of a marshmallow as a step backward on the road to supernal knowledge, but if we consider it as the plight of a being impaled upon a black birch twig, its essence floating upward with the smoke from the campfire, it is easy to see that this spasm of suffering is the prelude to its own spiritual ascent. Trick or treat signifies the problem of the initiate. So many ways are open, so many promises made. Which way leads to the goal? The seeker may cast about for a Security Blanket, some apparently firm basis upon which to regain one's bearings. The Masked Marvel is the true way, always concealed from all but those who have cast off the earthly delights of black box statistical packages and devoted their lives to the study of R. In future versions, we will no doubt see further occult signs that will lead us in the right direction if only we remain true to our noble and transcendent mission. 
Okay, I think they are mostly from comic book characters.

Jim
Re: [R] How are R version types named ? Any convention (like Hurricanes etc)
I am not sure but it looks suspiciously like a set of references to the comic strip Peanuts by Charles Schulz. http://en.wikipedia.org/wiki/Peanuts

John Kane
Kingston ON Canada

-Original Message-
From: ohri2...@gmail.com
Sent: Thu, 25 Apr 2013 15:16:17 +0530
To: r-help@r-project.org
Subject: [R] How are R version types named ? Any convention (like Hurricanes etc)

> With reference to R News:
> R version 3.0.0 (Masked Marvel) has been released on 2013-04-03.
> R version 2.15.3 (Security Blanket) has been released on 2013-03-01.
> R version 2.15.2 (Trick or Treat)
> R version 2.15.1 (Roasted Marshmallows)
> ...
> R version 2.15.0 (Easter Beagle)
> R version 2.14.0 (Great Pumpkin)
>
> Dear R help List,
> How are these version types named? Masked Marvel comes after Security Blanket comes after Trick or Treat comes after Roasted Marshmallows. Is it some convention like that for Hurricanes in the West? It is totally incomprehensible to me as I am in India.
>
> Sincerely,
> Ajay Ohri
> Author- R for Business Analytics http://www.amazon.com/R-Business-Analytics-A-Ohri/dp/1461443423
> Founder- Decisionstats.com http://decisionstats.com
Re: [R] Regression and FMMs with flexmix
Robin,

On Wed, Apr 24, 2013 at 11:24 AM, Robin Tviet robintv...@outlook.com wrote:
> I am trying to understand how to use the flexmix package. I have read the Leisch paper but am very unclear what is needed for the M-step driver. I am just fitting a simple linear regression model. The documentation is far from clear what the FLXmclust function does, but it in principle could do all I need. However, I do not get sensible results; if I try the following, the result is poor:
>
>     x <- c()
>     for(i in 0:99){x$y[2*i]=(0+i);x$x[2*i]=i;x$x[2*i+1]=i;x$y[2*i+1]=i+1000;x$g[2*i]=1;x$g[2*i+1]=2}
>     m1 <- flexmix(y~x, data=x, k=2)
>     table(x$g, m1@cluster)
>
>          1  2
>       1 25 74
>       2 67 33

There is no correlation between x and y, neither within groups nor between groups, so I am not sure why your model would make sense. The following model runs just fine (although it also depends on the starting values whether the result is the 2 expected clusters or 1 large cluster of all the data):

    set.seed(1)
    m1 <- flexmix(y~1, data=x, k=2)
    m1

    Call:
    flexmix(formula = y ~ 1, data = x, k = 2)

    Cluster sizes:
      1   2
     99 100

    convergence after 2 iterations

hth,
Ingmar

> It all depends on the randomised starting values. So I think I need a better driver, but I cannot find a spec for what I have to do in the driver. Where is FLXmclust documented? Can anyone assist?
[R] Selecting and then joining data blocks
Hi all,

I have 4 matrices, each having 5 columns and 4 rows, denoted by B1, B2, B3, B4. I have generated a vector of 7 indices, say (1,2,4,3,2,3,1), which refers to the index of the matrices to be chosen and then appended one on top of the next. In this case, I wish to have the following mega matrix: B1 over B2 over B4 over B3 over B2 over B3 over B1.

1) How can I achieve this?
2) I don't want to manually identify and arrange the matrices for each vector of index values generated (for which the code I used is: index=sample(4,7,replace=T)). How can I automate the process?

Basically, I am doing bootstrapping, but the observations are actually 4x5 matrices. Appreciate your help.

Thanks,
Preetam

---
Preetam Pal
(+91)-9432212774
M-Stat 2nd Year, Room No. N-114
Statistics Division, C.V.Raman Hall
Indian Statistical Institute, B.H.O.S. Kolkata.
Re: [R] Selecting and then joining data blocks
Hi,

    set.seed(24)
    #creating the four matrices in a list
    lst1 <- lapply(1:4, function(x) matrix(sample(1:40, 20, replace=TRUE), ncol=5))
    names(lst1) <- paste0("B", 1:4)
    vec <- c(1,2,4,3,2,3,1)
    res <- do.call(rbind, lapply(vec, function(i) lst1[[i]]))
    dim(res)
    #[1] 28  5

    #or
    B1 <- lst1[[1]]
    B2 <- lst1[[2]]
    B3 <- lst1[[3]]
    B4 <- lst1[[4]]
    res2 <- do.call(rbind, lapply(vec, function(i) get(paste0("B", i))))
    identical(res, res2)
    #[1] TRUE

A.K.

- Original Message -
From: Preetam Pal lordpree...@gmail.com
To: r-help@r-project.org
Cc:
Sent: Thursday, April 25, 2013 7:51 AM
Subject: [R] Selecting and then joining data blocks

> Hi all,
> I have 4 matrices, each having 5 columns and 4 rows, denoted by B1, B2, B3, B4. I have generated a vector of 7 indices, say (1,2,4,3,2,3,1), which refers to the index of the matrices to be chosen and then appended one on top of the next. In this case, I wish to have the following mega matrix: B1 over B2 over B4 over B3 over B2 over B3 over B1.
> 1) How can I achieve this?
> 2) I don't want to manually identify and arrange the matrices for each vector of index values generated (for which the code I used is: index=sample(4,7,replace=T)). How can I automate the process?
> Basically, I am doing bootstrapping, but the observations are actually 4x5 matrices. Appreciate your help.
> Thanks,
> Preetam
> ---
> Preetam Pal
> (+91)-9432212774
> M-Stat 2nd Year, Room No. N-114
> Statistics Division, C.V.Raman Hall
> Indian Statistical Institute, B.H.O.S. Kolkata.
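Since the stated goal is bootstrapping with whole matrices as the observations, the same stacking can be repeated for many resamples. A minimal sketch, assuming the blocks are kept in a list; the constant-valued blocks and the names blocks, index and boot_samples are made up so the result is easy to check:

```r
set.seed(24)

# four 4x5 blocks; block k is filled with the value k for easy checking
blocks <- lapply(1:4, function(k) matrix(k, nrow = 4, ncol = 5))

# one resample: indexing the list with the index vector picks the blocks
# in order (repeats allowed), and rbind stacks them
index <- sample(4, 7, replace = TRUE)
mega  <- do.call(rbind, blocks[index])
dim(mega)

# many resamples: a list of B stacked matrices
B <- 100
boot_samples <- replicate(
  B,
  do.call(rbind, blocks[sample(length(blocks), 7, replace = TRUE)]),
  simplify = FALSE
)
length(boot_samples)
```

Indexing the list as blocks[index] avoids both the inner lapply() and the get(paste0(...)) lookup.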
Re: [R] Decomposing a List
Hi,

Maybe this helps.

    L <- list(c("A1","B1"), c("A2","B2"), c("A3","B3"))
    simplify2array(L)[1,]
    #[1] "A1" "A2" "A3"
    simplify2array(L)[2,]
    #[1] "B1" "B2" "B3"

    #or
    library(stringr)
    word(sapply(L, paste, collapse=" "), 1)
    #[1] "A1" "A2" "A3"

A.K.

- Original Message -
From: ted.hard...@wlandres.net ted.hard...@wlandres.net
To: r-help@r-project.org
Cc:
Sent: Thursday, April 25, 2013 6:16 AM
Subject: [R] Decomposing a List

> Greetings!
> For some reason I am not managing to work out how to do this (in principle) simple task!
> As a result of applying strsplit() to a vector of character strings, I have a long list L (N elements), where each element is a vector of two character strings, like:
>     L[1] = c("A1","B1")
>     L[2] = c("A2","B2")
>     L[3] = c("A3","B3")
>     [etc.]
> From L, I wish to obtain (as directly as possible, e.g. avoiding a loop) two vectors each of length N where one contains the strings that are first in the pair, and the other contains the strings which are second, i.e. from L (as above) I would want to extract:
>     V1 = c("A1","A2","A3",...)
>     V2 = c("B1","B2","B3",...)
> Suggestions?
> With thanks,
> Ted.
> -
> E-Mail: (Ted Harding) ted.hard...@wlandres.net
> Date: 25-Apr-2013 Time: 11:16:46
> This message was sent by XFMail
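A closely related option, shown here only as a sketch, is to bind the pairs into an N x 2 matrix with do.call(rbind, L) and read the two vectors off as columns; simplify2array() gives the same values with the pairs as columns instead of rows.

```r
L <- list(c("A1", "B1"), c("A2", "B2"), c("A3", "B3"))

# one row per list element, one column per position in the pair
m  <- do.call(rbind, L)
V1 <- m[, 1]
V2 <- m[, 2]

# simplify2array() puts each pair in a column, so its rows are V1 and V2
a <- simplify2array(L)

V1
V2
```

Both forms rely on every element of L having the same length.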
Re: [R] installing package
On 04/25/2013 12:19 AM, Gitte Brinch Andersen wrote:
> Hi
> I am trying to install a package (bioconductor) but every time I try to install it I get this message:
>
>     source("http://bioconductor.org/biocLite.R")
>     Warning in install.packages("BiocInstaller", repos = a["BioCsoft", "URL"]) :
>       'lib = "C:/Program Files/R/R-3.0.0/library"' is not writable
>     Error in install.packages("BiocInstaller", repos = a["BioCsoft", "URL"]) :
>       unable to install packages

Hi Gitte -- this and your Mac path problems are really a question for the Bioconductor mailing list

http://bioconductor.org/help/mailing-list/

I don't know the answer to your path problem, but the package author monitors that list and will be able to help.

I would have expected the attempt to run the biocLite.R script to result in a dialog that asks 'Would you like to use a personal library instead?', to which you should answer 'yes'. If for some reason you do not want to answer 'yes', then read the help page ?.libPaths

Hope that helps, and please ask your questions about Bioconductor packages on the Bioconductor mailing list.

Martin

> I normally use mac computers, but I cannot get the right path for the folders I should use, so now I am trying with a windows platform instead. But now I cannot install one of the packages my pipeline needs.
> Can anyone help? I know it is probably a simple problem, but I have never used R before and don't know how to solve problems in it.
> Best
> Gitte Andersen
> E-mail: gitt...@hum-gen.au.dk

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
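What ?.libPaths describes can be sketched as follows: create a directory you can write to and put it at the front of the library search path, so that installs go there instead of the read-only system library. The directory used here (a folder under tempdir()) is only a placeholder for illustration; a real personal library would live somewhere permanent of your choosing.

```r
# a writable directory to act as a personal library (placeholder location)
user_lib <- file.path(tempdir(), "R-personal-library")
dir.create(user_lib, recursive = TRUE, showWarnings = FALSE)

# put it first on the library search path for this session
.libPaths(c(user_lib, .libPaths()))

# the first entry is where install.packages() will install by default
.libPaths()[1]

# install.packages("somePackage")  # hypothetical package name; would now
#                                  # install into user_lib
```

The change lasts for the current session only; ?.libPaths and ?Startup describe how to make it permanent.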
[R] Make R 3.0 open .RData files
Hello!

I have Windows 7 Enterprise and two versions of R installed: 2.15.3 and 3.0.0. Before I had R 3.0, I set things up so that all .RData files, when I double-click on them, were opened by R 2.15.3. Now I want them to be opened by R 3.0 instead of R 2.15.3 (but I don't want to remove R 2.15.3 yet).

I right-click on some .RData file, select Open with -> Choose default program and then click on Browse. I browse to the folder where my R 3.0 is installed, then to the folder bin, then to the folder x64, and select Rgui.exe. However, when R opens, or after I shut R down and then double-click on some .RData file and R opens, it is again R 2.15.3, not R 3.0.

What am I doing wrong? Of course, when I open R 3.0 directly, it opens with no problem.

Thank you!

--
Dimitri Liakhovitski
Re: [R] Linear Interpolation : Missing rates
Katherine,

Split the rate names into their currency and tenor parts and assign a numeric value to each tenor. Choose a model to do your approximations (I used linear regression in the example below). Use this model to generate estimates for all combinations of currency and tenor. For example:

    # split the rate names into currency and tenor
    splitnames <- do.call(rbind, strsplit(as.character(df$rate_name), "_"))
    df$currency <- as.factor(splitnames[, 1])
    df$tenor <- splitnames[, 2]

    # assign numeric value to each tenor
    uniquetenors <- c("1w", "2w", "1m", "2m")
    uniquedays <- c(7, 14, 30.5, 61)
    df$tenordays <- uniquedays[match(df$tenor, uniquetenors)]

    # fit a linear model of rate on tenordays for each currency
    fit <- lm(rates ~ currency*tenordays, data=df)

    # estimate rates for all combinations of currency and tenor
    fulldf <- expand.grid(tenordays=unique(df$tenordays), currency=unique(df$currency))
    fulldf$est.rates = predict(fit, newdata=fulldf)

    # merge observed rates with estimated rates
    dfwithest <- merge(df, fulldf, all=TRUE)

Jean

On Thu, Apr 25, 2013 at 12:33 AM, Katherine Gobin katherine_go...@yahoo.com wrote:
> Dear R forum
> I have data.frame as
>
>     df = data.frame(rate_name = c("USD_1w", "USD_1w", "USD_1w", "USD_1w", "USD_1m", "USD_1m", "USD_1m", "USD_1m", "USD_2m", "USD_2m", "USD_2m", "USD_2m", "GBP_1w", "GBP_1w", "GBP_1w", "GBP_1w", "GBP_1m", "GBP_1m", "GBP_1m", "GBP_1m", "GBP_2m", "GBP_2m", "GBP_2m", "GBP_2m", "EURO_1w", "EURO_1w", "EURO_1w", "EURO_1w", "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2m", "EURO_2m", "EURO_2m", "EURO_2m"),
>                     rates = c(2.05, 2.07, 2.06, 2.06, 2.22, 2.24, 2.23, 2.23, 2.31, 2.33, 2.33, 2.31, 1.06, 1.08, 1.08, 1.08, 1.21, 1.21, 1.23, 1.21, 1.41, 1.39, 1.39, 1.37, 1.82, 1.82, 1.81, 1.80, 1.98, 1.98, 1.97, 1.97, 2.1, 2.09, 2.09, 2.11))
>     currency = c("EURO", "GBP", "USD")
>     tenor = c("1w", "2w", "1m", "2m", "3m")
>
>     # _
>     > df
>        rate_name rates
>     1     USD_1w  2.05
>     2     USD_1w  2.07
>     3     USD_1w  2.06
>     4     USD_1w  2.06
>     5     USD_1m  2.22
>     6     USD_1m  2.24
>     7     USD_1m  2.23
>     8     USD_1m  2.23
>     9     USD_2m  2.31
>     10    USD_2m  2.33
>     11    USD_2m  2.33
>     12    USD_2m  2.31
>     13    GBP_1w  1.06
>     14    GBP_1w  1.08
>     15    GBP_1w  1.08
>     16    GBP_1w  1.08
>     17    GBP_1m  1.21
>     18    GBP_1m  1.21
>     19    GBP_1m  1.23
>     20    GBP_1m  1.21
>     21    GBP_2m  1.41
>     22    GBP_2m  1.39
>     23    GBP_2m  1.39
>     24    GBP_2m  1.37
>     25   EURO_1w  1.82
>     26   EURO_1w  1.82
>     27   EURO_1w  1.81
>     28   EURO_1w  1.80
>     29   EURO_2w  1.98
>     30   EURO_2w  1.98
>     31   EURO_2w  1.97
>     32   EURO_2w  1.97
>     33   EURO_2m  2.10
>     34   EURO_2m  2.09
>     35   EURO_2m  2.09
>     36   EURO_2m  2.11
>
> As can be seen, USD_2w, GBP_2w and EURO_1m are missing and I need to INTERPOLATE these rates, which can be done using approx or approxfun. In reality I can have many currencies with many tenors.
> The problem is that when the data.frame df is read or accessed in R, I am not aware which tenor is missing. For a given currency, it is possible that more than 1 consecutive tenor may be missing, e.g. in case of EURO, I may have EURO_1w, EURO_2w and then EURO_4m. So EURO_1m, EURO_2m and EURO_3m are missing.
> I understand it's a sort of vague question from me and do apologize for the same. Any suggestion please.
> Regards
> Katherine
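If piecewise-linear interpolation between neighbouring tenors is wanted (as the question suggests with approx) rather than a regression fit, the same tenor-to-days mapping can be reused. This is a sketch on a cut-down, made-up table with one rate per currency and tenor; the day counts are the ones assumed in the reply above, and the names obs, tenor_days and full are hypothetical.

```r
# one observed rate per currency and tenor (illustrative values)
obs <- data.frame(
  currency = c("USD", "USD", "USD", "EURO", "EURO", "EURO"),
  tenor    = c("1w", "1m", "2m", "1w", "2w", "2m"),
  rate     = c(2.06, 2.23, 2.32, 1.81, 1.98, 2.10),
  stringsAsFactors = FALSE
)

# numeric position of every tenor that should appear in the result
tenor_days <- c("1w" = 7, "2w" = 14, "1m" = 30.5, "2m" = 61)
obs$days   <- unname(tenor_days[obs$tenor])

# for each currency, interpolate linearly at every tenor in tenor_days;
# tenors that were observed simply get their own value back
full <- do.call(rbind, lapply(split(obs, obs$currency), function(d) {
  data.frame(
    currency = d$currency[1],
    tenor    = names(tenor_days),
    rate     = approx(d$days, d$rate, xout = tenor_days, rule = 2)$y,
    observed = names(tenor_days) %in% d$tenor,
    stringsAsFactors = FALSE
  )
}))
full
```

Because the target grid is fixed in tenor_days, there is no need to know in advance which tenors are missing for which currency; rule = 2 holds the end values constant for tenors outside the observed range.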
Re: [R] Bootstrapping in R
First you should read some introductory manuals on R. There are many to choose from at

http://cran.r-project.org/other-docs.html

For example, your first question is very simple:

    z <- data.frame(a, b, c)

To draw a single random sample (with replacement) from z:

    z1 <- z[sample(1:nrow(z), nrow(z), replace=TRUE),]

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Michael Weylandt
Sent: Thursday, April 25, 2013 4:36 AM
To: Preetam Pal
Cc: r-help@r-project.org
Subject: Re: [R] Bootstrapping in R

On Apr 25, 2013, at 7:02, Preetam Pal lordpree...@gmail.com wrote:
> Hi all,
> 1> i have 3 vectors a,b and c, each of length 25... i want to define a new data frame z such that z[1] = (a[1] b[1] c[1]), z[2] = (a[2] b[2] c[2]) and so on...how do i do it in R

    z <- data.frame(a, b, c)

> 2> Then i want to draw bootstrap samples from z.

Look at the boot package.

MW

> Kindly suggest how i can do this in R.
> Thanks,
> Preetam
> --
> Preetam Pal
> (+91)-9432212774
> M-Stat 2nd Year, Room No. N-114
> Statistics Division, C.V.Raman Hall
> Indian Statistical Institute, B.H.O.S. Kolkata.
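Repeating the single resample many times gives a bootstrap distribution. A minimal base-R sketch, with simulated vectors and the column means as the statistic; both are assumptions made for illustration, since the thread does not say which statistic is of interest:

```r
set.seed(1)
a <- rnorm(25)
b <- rnorm(25)
c <- rnorm(25)
z <- data.frame(a, b, c)

B <- 1000  # number of bootstrap resamples

# each replicate resamples whole rows of z with replacement
# and computes the statistic on the resampled data frame
boot_means <- t(replicate(
  B,
  colMeans(z[sample(nrow(z), replace = TRUE), ])
))

# bootstrap standard errors of the three column means
boot_se <- apply(boot_means, 2, sd)
boot_se
```

Rows are resampled as units, so the pairing of a[i], b[i] and c[i] is preserved. The boot package mentioned above wraps the same idea, with the statistic written as a function of the data and a vector of row indices.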
Re: [R] Missing data
Another approach:

    x[1:length(x) %% 10 == 0] <- NA

Just replace 10 by the interval you want. Or to add 5 missing values randomly:

    x[sample(1:length(x), 5)] <- NA

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rainer Schuermann
Sent: Thursday, April 25, 2013 5:45 AM
To: r-help@r-project.org
Cc: Roslina Zakaria
Subject: Re: [R] Missing data

I read your data into a dataframe

    x <- read.table("clipboard")

and renamed the only column

    colnames(x)[1] <- "orig"

With a loop, I created a 2nd column miss where in every 10th row the observation is set to NA:

    for( i in 1 : length( x$orig ) ) {
        if( as.integer( rownames( x )[ i ] ) %% 10 == 0 ) {
            x$miss[i] <- NA
        } else {
            x$miss[i] <- x$orig[i]
        }
    }

This is probably the least elegant of all possible solutions but it works...

Rgds,
Rainer

On Wednesday 24 April 2013 23:41:21 Roslina Zakaria wrote:
> Dear r-users,
> I would like to investigate how to fill in missing data. I started with complete data and try to introduce missing data into the data series. Then I would use some method to fill in the missing data and compare with the original data to see how good it is. My question is, how do I introduce missing data in my complete data systematically, for example so that every 10th value is erased and assumed missing?
> Here are some rainfall data:
> 125 130.3 327.2 252.2 33.8 6.1 5.1 0.5 0.5 0 2.3 0 0 0 0 0 0 0 0 0 0.8 5.1 0 0.3 0 0 0 0 0 0 45.7 43.4 0 0 0 0 0
> Thank you so much for any help given. I hope my question is clear.
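The two one-liners above can be wrapped as small helpers that return a modified copy and leave the original series intact, which is convenient when the original is needed later for comparison. The function names and the stand-in series are made up for this sketch.

```r
# set every k-th element to NA (systematic missingness)
every_kth_na <- function(x, k) {
  x[seq_along(x) %% k == 0] <- NA
  x
}

# set n randomly chosen elements to NA (random missingness)
random_na <- function(x, n) {
  x[sample(seq_along(x), n)] <- NA
  x
}

x <- as.numeric(1:37)          # stand-in series of the same length

x10 <- every_kth_na(x, 10)
which(is.na(x10))

set.seed(1)
x5 <- random_na(x, 5)
sum(is.na(x5))
```

Using seq_along(x) instead of 1:length(x) also behaves sensibly for a zero-length vector.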
Re: [R] Make R 3.0 open .RData files
On 13-04-25 8:33 AM, Dimitri Liakhovitski wrote:
> Hello!
>
> I have Windows 7 Enterprise and two versions of R installed: 2.15.3 and 3.0.0. Before I had R 3.0 I made it a setting that all .RData files - when I double-click on them - were opened by R 2.15.3. Now I want them to be opened by R 3.0 instead of R 2.15.3 (but I don't want to remove R 2.15.3 yet).
>
> I right-click on some .RData file, select "Open with" -> "Choose default program" and then click on "Browse". I browse to the folder where my R 3.0 is installed, then to the folder "bin", then to the folder "x64" and select Rgui.exe. However, when R opens - or after I shut R down and then double-click on some .RData file and R opens - it is again R 2.15.3, not R 3.0.
>
> What am I doing wrong? Of course, when I open R 3.0 directly, then it opens no problem.

This is really a question about Windows 7, not about R, but I would guess you aren't telling it to make your choice permanent, or perhaps you are not allowed by your administrator to make permanent changes to file associations. You should ask for local help.

Duncan Murdoch
Re: [R] Trouble Computing Type III SS in a Cox Regression
Hi Dr. Therneau,

Thanks for your reply to my question. I'm aware that many on the list do not like type III SS. I'm not particularly attached to the idea of using them but often produce output for others who see value in type III SS.

You mention the problems with type III SS when testing interactions. I don't think we'll be doing that here though. So my type III SS could just as easily be called type II SS, I think. If the SS I'm calculating are essentially type II SS, is that still problematic for a Cox model?

People using type III SS generally want a measure of whether or not a variable is contributing something to their model or if it could just as easily be discarded. Is there a better way of addressing this question than by using type III (or perhaps type II) SS?

A series of model comparisons using an LRT might be the answer. If it is, is there an efficient way of implementing this approach when there are many predictors? Another approach might be to run models through step or stepAIC in order to determine which predictors are useful and to discard the rest. Is that likely to be any good?

Thanks,

Paul

--- On Wed, 4/24/13, Terry Therneau <thern...@mayo.edu> wrote:

From: Terry Therneau <thern...@mayo.edu>
Subject: Re: Trouble Computing Type III SS in a Cox Regression
To: r-help@r-project.org, Paul Miller <pjmiller...@yahoo.com>
Received: Wednesday, April 24, 2013, 5:55 PM

I should hope that there is trouble, since "type III" is an undefined concept for a Cox model. Since SAS Inc fostered the cult of type III they have recently added it as an option for phreg, but I am not able to find any hints in the phreg documentation of what exactly they are doing when you invoke it. If you can unearth this information, then I will be happy to tell you whether
  a. using the test (whatever it is) makes any sense at all for your data set
  b. if a is true, how to get it out of R

I use the word "cult" on purpose -- an entire generation of users who believe in the efficacy of this incantation without having any idea what it actually does. In many particular instances the SAS type III corresponds to a survey sampling question, i.e., reweight the data so that it is balanced wrt factor A and then test factor B in the new sample. The three biggest problems with type III are that
  1: the particular test has been hyped as "better" when in fact it sometimes is sensible and sometimes not,
  2: SAS implemented it as a computational algorithm which unfortunately often works even when the underlying rationale does not hold, and
  3: they explain it using a notation that completely obscures the actual question. This last leads to the nonsense phrase "test for main effects in the presence of interactions".

There is a survey reweighted approach for Cox models, very closely related to the work on causal inference ("marginal structural models"), but I'd bet dollars to donuts that this is not what SAS is doing.

(Per 2 -- type III was a particular order of operations of the sweep algorithm for linear models, and for backwards compatibility that remains the core definition even as computational algorithms have left sweep behind. But Cox models can't be computed using the sweep algorithm.)

Terry Therneau

On 04/24/2013 12:41 PM, r-help-requ...@r-project.org wrote:

Hello All,

Am having some trouble computing Type III SS in a Cox Regression using either drop1 or Anova from the car package. Am hoping that people will take a look to see if they can tell what's going on.
Here is my R code:

    cox3grp <- subset(survData,
        Treatment %in% c("DC", "DA", "DO"),
        c(PTNO, Treatment, PFS_CENSORED, PFS_MONTHS, AGE, PS2))
    cox3grp <- droplevels(cox3grp)
    str(cox3grp)

    coxCV <- coxph(Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2, data=cox3grp, method = "efron")
    coxCV

    drop1(coxCV, test="Chisq")

    require(car)
    Anova(coxCV, type="III")

And here are my results:

    > cox3grp <- subset(survData,
    +     Treatment %in% c("DC", "DA", "DO"),
    +     c(PTNO, Treatment, PFS_CENSORED, PFS_MONTHS, AGE, PS2))
    > cox3grp <- droplevels(cox3grp)
    > str(cox3grp)
    'data.frame':   227 obs. of  6 variables:
     $ PTNO        : int  1195997 104625 106646 1277507 220506 525343 789119 817160 824224 82632 ...
     $ Treatment   : Factor w/ 3 levels "DC","DA","DO": 1 1 1 1 1 1 1 1 1 1 ...
     $ PFS_CENSORED: int  1 1 1 0 1 1 1 1 0 1 ...
     $ PFS_MONTHS  : num  1.12 8.16 6.08 1.35 9.54 ...
     $ AGE         : num  72 71 80 65 72 60 63 61 71 70 ...
     $ PS2         : Ord.factor w/ 2 levels "Yes"<"No": 2 2 2 2 2 2 2 2 2 2 ...
    > coxCV <- coxph(Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2, data=cox3grp, method = "efron")
    > coxCV
    Call:
    coxph(formula = Surv(PFS_MONTHS, PFS_CENSORED == 1) ~ AGE + PS2, data = cox3grp, method = "efron")

              coef exp(coef) se(coef)     z     p
    AGE    0.00492     1.005  0.00789 0.624 0.530
    PS2.L -0.34523
Re: [R] Make R 3.0 open .RData files
Weird, because I was able to do this when I installed earlier R versions and moved from an older version to a newer one. I have never had any problems making permanent changes to file associations in other programs either.

On Thu, Apr 25, 2013 at 9:00 AM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
> This is really a question about Windows 7, not about R, but I would guess you aren't telling it to make your choice permanent, or perhaps you are not allowed by your administrator to make permanent changes to file associations. You should ask for local help.
>
> Duncan Murdoch

--
Dimitri Liakhovitski
[R] Weighted Principle Components analysis
Hello!

I am doing Principal Components Analysis using the psych package:

    mypc <- principal(mydata, 5, scores=TRUE)

However, I was asked to run a case-weighted PCA, using an individual weight for each case. I could use corr from the boot package to calculate the case-weighted intercorrelation matrix. But if I use the intercorrelation matrix as input (instead of the raw data), I am not going to get factor scores, which I do need.

Any advice? Thank you very much!

--
Dimitri Liakhovitski
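One possible way around the "no scores from a correlation matrix" problem is to compute the scores by hand from the weighted correlation matrix and the weighted-standardised raw data. The sketch below uses only base R (cov.wt and eigen) rather than psych, so it is an unrotated PCA and is not expected to reproduce a rotated principal() solution. The data, the weights, and the names mydata, w and k are all made up for illustration.

```r
set.seed(1)
n <- 200
mydata <- as.data.frame(matrix(rnorm(n * 6), ncol = 6))  # hypothetical data
w <- runif(n, 0.5, 2)                                    # hypothetical case weights
w <- w / sum(w)                                          # normalise to sum to 1

k <- 5                                                   # number of components to keep

# Case-weighted means, covariance and correlation matrix
cw <- cov.wt(mydata, wt = w, cor = TRUE)

# Principal components of the weighted correlation matrix
eig <- eigen(cw$cor, symmetric = TRUE)

# Standardise the raw data with the weighted means and weighted SDs,
# then project onto the first k eigenvectors to obtain component scores
z <- scale(as.matrix(mydata), center = cw$center, scale = sqrt(diag(cw$cov)))
scores <- z %*% eig$vectors[, 1:k]
colnames(scores) <- paste0("PC", 1:k)

head(scores)
```

Under the same weights, these scores are uncorrelated and have variances equal to the leading eigenvalues, which is a quick sanity check that the weighting was applied consistently.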
Re: [R] Make R 3.0 open .RData files
On 25/04/2013 14:00, Duncan Murdoch wrote:
> This is really a question about Windows 7, not about R, but I would guess you aren't telling it to make your choice permanent, or perhaps you are not allowed by your administrator to make permanent changes to file associations. You should ask for local help.

We've encountered this for our student accounts, and think it is a bug in Windows 7. If you remove the relevant old Registry entries first it should work.

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Re: [R] Linear Interpolation : Missing rates
Dear Mr Adams,

Thanks a lot for your solution. I understand it was very tricky and needed a lot of work. Thanks again; I do appreciate your efforts.

Regards

Katherine

--- On Thu, 25/4/13, Adams, Jean <jvad...@usgs.gov> wrote:

From: Adams, Jean <jvad...@usgs.gov>
Subject: Re: [R] Linear Interpolation : Missing rates
To: Katherine Gobin <katherine_go...@yahoo.com>
Cc: R help <r-help@r-project.org>
Date: Thursday, 25 April, 2013, 2:23 PM

Katherine,

Split the rate names into their currency and tenor parts and assign a numeric value to each tenor. Choose a model to do your approximations (I used linear regression in the example below). Use this model to generate estimates for all combinations of currency and tenor. For example:

    # split the rate names into currency and tenor
    # (as.character() guards against rate_name being stored as a factor)
    splitnames <- do.call(rbind, strsplit(as.character(df$rate_name), "_"))
    df$currency <- as.factor(splitnames[, 1])
    df$tenor <- splitnames[, 2]

    # assign numeric value to each tenor
    uniquetenors <- c("1w", "2w", "1m", "2m")
    uniquedays <- c(7, 14, 30.5, 61)
    df$tenordays <- uniquedays[match(df$tenor, uniquetenors)]

    # fit a linear model of rate on tenordays for each currency
    fit <- lm(rates ~ currency*tenordays, data=df)

    # estimate rates for all combinations of currency and tenor
    fulldf <- expand.grid(tenordays=unique(df$tenordays), currency=unique(df$currency))
    fulldf$est.rates = predict(fit, newdata=fulldf)

    # merge observed rates with estimated rates
    dfwithest <- merge(df, fulldf, all=TRUE)

Jean

On Thu, Apr 25, 2013 at 12:33 AM, Katherine Gobin <katherine_go...@yahoo.com> wrote:

Dear R forum

I have a data.frame as

    df = data.frame(rate_name = c("USD_1w", "USD_1w", "USD_1w", "USD_1w",
                                  "USD_1m", "USD_1m", "USD_1m", "USD_1m",
                                  "USD_2m", "USD_2m", "USD_2m", "USD_2m",
                                  "GBP_1w", "GBP_1w", "GBP_1w", "GBP_1w",
                                  "GBP_1m", "GBP_1m", "GBP_1m", "GBP_1m",
                                  "GBP_2m", "GBP_2m", "GBP_2m", "GBP_2m",
                                  "EURO_1w", "EURO_1w", "EURO_1w", "EURO_1w",
                                  "EURO_2w", "EURO_2w", "EURO_2w", "EURO_2w",
                                  "EURO_2m", "EURO_2m", "EURO_2m", "EURO_2m"),
                    rates = c(2.05, 2.07, 2.06, 2.06, 2.22, 2.24, 2.23, 2.23,
                              2.31, 2.33, 2.33, 2.31, 1.06, 1.08, 1.08, 1.08,
                              1.21, 1.21, 1.23, 1.21, 1.41, 1.39, 1.39, 1.37,
                              1.82, 1.82, 1.81, 1.80, 1.98, 1.98, 1.97, 1.97,
                              2.1, 2.09, 2.09, 2.11))

    currency = c("EURO", "GBP", "USD")
    tenor = c("1w", "2w", "1m", "2m", "3m")

    # _

    > df
       rate_name rates
    1     USD_1w  2.05
    2     USD_1w  2.07
    3     USD_1w  2.06
    4     USD_1w  2.06
    5     USD_1m  2.22
    6     USD_1m  2.24
    7     USD_1m  2.23
    8     USD_1m  2.23
    9     USD_2m  2.31
    10    USD_2m  2.33
    11    USD_2m  2.33
    12    USD_2m  2.31
    13    GBP_1w  1.06
    14    GBP_1w  1.08
    15    GBP_1w  1.08
    16    GBP_1w  1.08
    17    GBP_1m  1.21
    18    GBP_1m  1.21
    19    GBP_1m  1.23
    20    GBP_1m  1.21
    21    GBP_2m  1.41
    22    GBP_2m  1.39
    23    GBP_2m  1.39
    24    GBP_2m  1.37
    25   EURO_1w  1.82
    26   EURO_1w  1.82
    27   EURO_1w  1.81
    28   EURO_1w  1.80
    29   EURO_2w  1.98
    30   EURO_2w  1.98
    31   EURO_2w  1.97
    32   EURO_2w  1.97
    33   EURO_2m  2.10
    34   EURO_2m  2.09
    35   EURO_2m  2.09
    36   EURO_2m  2.11

As can be seen, USD_2w, GBP_2w and EURO_1m are missing, and I need to INTERPOLATE these rates, which can be done using approx or approxfun. In reality I can have many currencies with many tenors.

The problem is that when the data.frame df is read or accessed in R, I do not know which tenor is missing. For a given currency, it is possible that more than one consecutive tenor is missing; e.g. in the case of EURO, I may have EURO_1w, EURO_2w and then EURO_4m, so EURO_1m, EURO_2m and EURO_3m are missing.

I understand it's a rather vague question and I apologize for that. Any suggestions, please.

Regards

Katherine
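The original question mentions approx, while the reply above fits a regression. For comparison, a piecewise-linear version can be sketched as follows, using the data from the question and the same tenor-to-days mapping as the reply. The column names (days, observed) and the decision to average the four repeated quotes per tenor before interpolating are assumptions made for this illustration. The 3m tenor is left out because it lies beyond the longest observed tenor and approx() does not extrapolate by default.

```r
# Data from the original question
df <- data.frame(
  rate_name = rep(c("USD_1w", "USD_1m", "USD_2m",
                    "GBP_1w", "GBP_1m", "GBP_2m",
                    "EURO_1w", "EURO_2w", "EURO_2m"), each = 4),
  rates = c(2.05, 2.07, 2.06, 2.06, 2.22, 2.24, 2.23, 2.23, 2.31, 2.33, 2.33, 2.31,
            1.06, 1.08, 1.08, 1.08, 1.21, 1.21, 1.23, 1.21, 1.41, 1.39, 1.39, 1.37,
            1.82, 1.82, 1.81, 1.80, 1.98, 1.98, 1.97, 1.97, 2.10, 2.09, 2.09, 2.11),
  stringsAsFactors = FALSE
)

# Tenor lengths in days (same values as in the reply above)
tenor_days <- c("1w" = 7, "2w" = 14, "1m" = 30.5, "2m" = 61)

# Split "USD_1w" into currency and tenor, and attach the day count
parts <- do.call(rbind, strsplit(df$rate_name, "_", fixed = TRUE))
df$currency <- parts[, 1]
df$days <- unname(tenor_days[parts[, 2]])

# One average rate per currency and tenor (an assumption: the repeats are averaged)
avg <- aggregate(rates ~ currency + days, data = df, FUN = mean)

# For each currency, interpolate linearly over the full tenor grid;
# no advance knowledge of which tenors are missing is needed
interp <- do.call(rbind, lapply(split(avg, avg$currency), function(d) {
  data.frame(
    currency = d$currency[1],
    tenor    = names(tenor_days),
    rate     = approx(d$days, d$rates, xout = unname(tenor_days))$y,
    observed = unname(tenor_days) %in% d$days,
    stringsAsFactors = FALSE
  )
}))
rownames(interp) <- NULL
print(interp)
```

Rows with observed equal to FALSE are the interpolated tenors; several consecutive missing tenors are handled the same way, as long as observed tenors exist on both sides.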
Re: [R] Decomposing a List
Well, what you really want to do is convert the list to a matrix, and that can be done directly and considerably faster than with the (implicit) looping of sapply:

    f1 <- function(l) sapply(l, "[", 1)
    f2 <- function(l) matrix(unlist(l), nrow=2)
    l <- strsplit(paste(sample(LETTERS, 1e6, replace=TRUE),
                        sample(1:10, 1e6, replace=TRUE), sep="+"),
                  "+", fixed=TRUE)

    ## Then you get these results:

    > system.time(x1 <- f1(l))
       user  system elapsed
       1.92    0.01    1.95
    > system.time(x2 <- f2(l))
       user  system elapsed
       0.06    0.02    0.08
    > system.time(x2 <- f2(l)[1,])
       user  system elapsed
        0.1     0.0     0.1
    > identical(x1, x2)
    [1] TRUE

Cheers,
Bert

On Thu, Apr 25, 2013 at 3:32 AM, Ted Harding <ted.hard...@wlandres.net> wrote:
> Thanks, Jorge, that seems to work beautifully!
> (Now to try to understand why ... but that's for later).
> Ted.
>
> On 25-Apr-2013 10:21:29 Jorge I Velez wrote:
>> Dear Dr. Harding,
>>
>> Try
>>
>>     sapply(L, "[", 1)
>>     sapply(L, "[", 2)
>>
>> HTH,
>> Jorge.-
>>
>> On Thu, Apr 25, 2013 at 8:16 PM, Ted Harding <ted.hard...@wlandres.net> wrote:
>>> Greetings!
>>> For some reason I am not managing to work out how to do this (in principle) simple task!
>>>
>>> As a result of applying strsplit() to a vector of character strings, I have a long list L (N elements), where each element is a vector of two character strings, like:
>>>
>>>     L[1] = c("A1","B1")
>>>     L[2] = c("A2","B2")
>>>     L[3] = c("A3","B3")
>>>     [etc.]
>>>
>>> From L, I wish to obtain (as directly as possible, e.g. avoiding a loop) two vectors each of length N, where one contains the strings that are first in the pair and the other contains the strings which are second, i.e. from L (as above) I would want to extract:
>>>
>>>     V1 = c("A1","A2","A3",...)
>>>     V2 = c("B1","B2","B3",...)
>>>
>>> Suggestions?
>>>
>>> With thanks,
>>> Ted.
>>>
>>> -------------------------------------------------
>>> E-Mail: (Ted Harding) <ted.hard...@wlandres.net>
>>> Date: 25-Apr-2013  Time: 11:16:46
>>> This message was sent by XFMail
>>> -------------------------------------------------
>
> -------------------------------------------------
> E-Mail: (Ted Harding) <ted.hard...@wlandres.net>
> Date: 25-Apr-2013  Time: 11:31:57
> This message was sent by XFMail
> -------------------------------------------------

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
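For readers who want to see the two approaches from this thread side by side on a small input, here is a minimal sketch. The three-element list is made up to mirror the example in the question; vapply is used in place of sapply only because it also checks the type of each result, which is a stylistic choice and not something the thread asked for.

```r
# A small list shaped like the one in the question
L <- strsplit(c("A1+B1", "A2+B2", "A3+B3"), "+", fixed = TRUE)

# Element-wise extraction, as in Jorge's suggestion
V1 <- vapply(L, `[`, character(1), 1)
V2 <- vapply(L, `[`, character(1), 2)

# Matrix route, as in Bert's reply. This relies on every element
# having exactly two strings, so check that first.
stopifnot(all(lengths(L) == 2))
m <- matrix(unlist(L), nrow = 2)

print(V1)
print(V2)
print(m)
```

Row 1 of m holds the first strings and row 2 the second ones, because unlist() concatenates the pairs in order and matrix() fills column by column.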
Re: [R] Trouble Computing Type III SS in a Cox Regression
Please take this discussion offlist. It is **not** about R.

-- Bert
[R] problem with geom_point in ggplot using a different column
I want to draw a boxplot where the geom_points are displayed based on the ERBB2.MUT subset, and they should be displayed in the right box (based on both the ERBB2.2064 field and ERBB2_Status). However, with my current command I only see the red points corresponding to the MUT subset in one straight line, corresponding only to the ERBB2.2064 stratification on the x-axis. It doesn't take into account the ERBB2_Status stratification. Can anyone help me?

    Call  ERBB2|2064   ERBB2_Status  ERBB2-MUT
    A      7.214E-01   CHANGE        MUT
    B     -4.208E-02   NEUTRAL       MUT
    D      1.080E+00   NEUTRAL       MUT
    C      2.347E-01   NEUTRAL       MUT

    ggplot(data=testdata, aes(x=Call, y=ERBB2.2064)) +
      geom_boxplot(aes(fill=ERBB2_Status), width=0.8) +
      theme_bw() +
      geom_point(data=subset(testdata, ERBB2.MUT=="MUT"), aes(shape=Call, color="Red"))
Re: [R] Trouble Computing Type III SS in a Cox Regression
You've missed the point of my earlier post, which is that type III is not an answerable question.

1. There are lots of ways to compare Cox models; the LRT is normally considered the most reliable by serious authors. There is usually not much difference between the score, Wald, and LRT tests though, and the other two are more convenient in many situations.

2. Type III is a question that can't be addressed. SAS prints something out with that label, but since they don't document what it is, and people with in-depth knowledge of Cox models (like me) cannot figure out what a sensible definition could actually be, there is nowhere to go. How to do this in R can't be answered. (It has nothing to do with interactions.)

3. If you have customers who think that the earth is flat, global warming is a conspiracy, or that type III has special meaning, this is a re-education issue, and I can't much help with that.

Terry T.
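On the practical question of running likelihood ratio tests when there are many predictors, one straightforward pattern is to refit the model once per predictor and compare log-likelihoods. The sketch below is an assumption-laden illustration, not the method endorsed in the thread: it uses the lung data set shipped with the survival package instead of the poster's data, and it restricts the data to complete cases so that every model is fitted to the same rows.

```r
library(survival)

# Complete cases only, so the full and reduced models use identical rows
d <- na.omit(lung[, c("time", "status", "age", "sex", "ph.ecog")])

# In the lung data, status == 2 marks an observed death
full <- coxph(Surv(time, status == 2) ~ age + sex + ph.ecog, data = d)

# Likelihood ratio test for each predictor:
# drop the term, refit, and compare partial log-likelihoods
lrt <- do.call(rbind, lapply(attr(terms(full), "term.labels"), function(term) {
  reduced <- update(full, as.formula(paste(". ~ . -", term)))
  stat <- 2 * (full$loglik[2] - reduced$loglik[2])
  df   <- length(coef(full)) - length(coef(reduced))
  data.frame(term = term, LRT = stat, df = df,
             p = pchisq(stat, df, lower.tail = FALSE))
}))
print(lrt)
```

Each row answers the question "does this variable add anything, given the others?", which is what the poster describes wanting. It says nothing about type III tests, and with interactions in the model the list of terms to drop would need more thought.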
[R] tables: proper use of Hline() in tabular()
Dear all,

I am unable to understand how Hline() works in tabular(). I've read the vignette and the help page, and here this example compiles perfectly fine:

    latex( tabular( Species + Hline() + 1 ~ Heading()*mean*All(iris), data=iris) )

However, if I try it on my own data it fails. Consider this:

    set.seed(1)
    Xa <- data.frame(p=rep(c("First group","Second group","Third group"),each=10,len=30),
        a=sample(c("Some long label","Some other long label", "Yet another label"), 30, replace=TRUE),
        id=seq(30), b=round(runif(30,10,20)), c=round(runif(30,40,70)))
    (x <- tabular(((p=factor(p))*(a=factor(a))+1) ~ (N = 1) + (b + c)* (mean+sd),data=Xa))

    p            a                      N  mean  sd    mean  sd
    First group  Some long label        3 15.67 2.082 64.33  3.786
                 Some other long label  4 14.75 2.630 52.25  8.461
                 Yet another label      3 15.67 3.215 50.67  3.055
    Second group Some long label        2 17.00 1.414 57.50 10.607
                 Some other long label  3 17.00 1.000 60.00  8.888
                 Yet another label      5 15.00 3.082 58.20  8.672
    Third group  Some long label        4 13.75 3.594 58.75  5.909
                 Some other long label  4 13.50 1.732 46.50  3.786
                 Yet another label      2 16.00 1.414 50.00  4.243
    All                                30 15.13 2.501 55.37  8.045

I would like to place an Hline() between rows 3:4, rows 6:7 and rows 9:10. But whichever way I place it, I get something that doesn't compile in LaTeX ("! Misplaced \noalign." error). For example,

    x <- tabular(((p=factor(p)))*(a=factor(a)) +(Hline() + 1) ~ (N = 1) + (b + c)* (mean+sd),data=Xa)
    latex(x)

    \begin{tabular}{llc}
    \hline
    \multicolumn{2}{c}{b} \multicolumn{2}{c}{c} \\
    p a N mean sd mean \multicolumn{1}{c}{sd} \\
    \hline
    First group Some long label $\phantom{0}3$ $15.67$ $2.082$ $64.33$ $\phantom{0}3.786$ \\
    Some other long label $\phantom{0}4$ $14.75$ $2.630$ $52.25$ $\phantom{0}8.461$ \\
    Yet another label $\phantom{0}3$ $15.67$ $3.215$ $50.67$ $\phantom{0}3.055$ \\
    Second group Some long label $\phantom{0}2$ $17.00$ $1.414$ $57.50$ $10.607$ \\
    Some other long label $\phantom{0}3$ $17.00$ $1.000$ $60.00$ $\phantom{0}8.888$ \\
    Yet another label $\phantom{0}5$ $15.00$ $3.082$ $58.20$ $\phantom{0}8.672$ \\
    Third group Some long label $\phantom{0}4$ $13.75$ $3.594$ $58.75$ $\phantom{0}5.909$ \\
    Some other long label $\phantom{0}4$ $13.50$ $1.732$ $46.50$ $\phantom{0}3.786$ \\
    Yet another label $\phantom{0}2$ $16.00$ $1.414$ $50.00$ $\phantom{0}4.243$ \\
    \hline %\\
    All $30$ $15.13$ $2.501$ $55.37$ $\phantom{0}8.045$ \\
    \hline
    \end{tabular}

Please advise how to use Hline() in the example above.

Regards,
Liviu

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
[R] [SQL]
Hi,

The data for my new project are in a bunch of .sql files, instead of the classic csv files that I'm used to working with. Could someone explain to me how to read these files into R?

Thanks,

-Ignacio
Re: [R] Predictions with missing inputs
Hi Bill,

Very clear response. How about when the missing values are in the response variable being predicted (y)? That is, the model is fitted only to complete cases, but then I want predictions for every individual y (including those that are missing). Can I use the mean of that variable 'y'?

EXAMPLE:

    mynewdata <- mydata
    mynewdata$y <- mean(mydata$y)
    mypred <- predict(mymodel, mynewdata)

Thanks,
Manuel

--
View this message in context: http://r.789695.n4.nabble.com/Predictions-with-missing-inputs-tp3302303p4665411.html
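For an ordinary lm() fit, it may help to know that predict() builds its predictions from the predictor columns of newdata only, so the y column does not have to be filled in at all. The sketch below illustrates this with made-up data; the names mydata, mymodel and mypred follow the example above, and whether the same holds for the poster's actual model class is an assumption that would need checking.

```r
set.seed(1)
mydata <- data.frame(x = rnorm(20))
mydata$y <- 2 + 3 * mydata$x + rnorm(20, sd = 0.5)
mydata$y[c(3, 8, 15)] <- NA           # responses that were never observed

# lm() drops the incomplete rows by default (na.action = na.omit)
mymodel <- lm(y ~ x, data = mydata)

# Predict for every row, including those whose y is missing;
# the NA values in y are simply not used
mypred <- predict(mymodel, newdata = mydata)

# The same predictions result when the y column is removed altogether
mypred_no_y <- predict(mymodel, newdata = mydata["x"])

print(cbind(mydata, pred = mypred)[c(3, 8, 15), ])
```

In this setting, replacing the missing y values with the mean has no effect on the predictions.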
Re: [R] Make R 3.0 open .RData files
Brian, how do I remove the relevant old Registry entries?

Thank you!
Dimitri

On Thu, Apr 25, 2013 at 10:29 AM, Prof Brian Ripley <rip...@stats.ox.ac.uk> wrote:
> We've encountered this for our student accounts, and think it is a bug in Windows 7. If you remove the relevant old Registry entries first it should work.

--
Dimitri Liakhovitski
Re: [R] Make R 3.0 open .RData files
On 25/04/2013 17:15, Dimitri Liakhovitski wrote:
> Brian, how do I remove the relevant old Registry entries?

That is not an R question. Our sysadmins do it.

> Thank you!
> Dimitri

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Re: [R] Make R 3.0 open .RData files
a) See FAQ 2.17. b) Methods for configuring operating systems are off topic here. I will say there is a REGEDIT program in Windows, but there are potential permissions complications (you may not have them) and possible collateral damage (don't touch it if you don't understand it) that mean you should study up on this topic with an appropriate resource (book, forum, expert, system administrator, etc.) before attempting it.

--- Jeff Newmiller, DCN: jdnew...@dcn.davis.ca.us, Research Engineer (Solar/Batteries/Software/Embedded Controllers) --- Sent from my phone. Please excuse my brevity.

Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Brian, how do I remove the relevant old Registry entries? Thank you! Dimitri
Re: [R] connecting matrices
Thanks arun, the second one looks OK. Thanks indeed. Elisa

Date: Thu, 25 Apr 2013 07:37:25 -0700 From: smartpink...@yahoo.com Subject: Re: connecting matrices To: eliza_bo...@hotmail.com CC: r-help@r-project.org

Hi Elisa, I guess there is a mistake. Check whether this is what you wanted.

indx <- sort(el1, index.return=TRUE)$ix[1:3]
list(el[,indx], indx)
#[[1]]
#     [,1] [,2] [,3]
#[1,]   41   21   11
#[2,]   42   22   12
#[3,]   43   23   13
#[4,]   44   24   14
#[5,]   45   25   15
#
#[[2]]
#[1] 9 5 3

A.K.

----- Original Message ----- From: arun smartpink...@yahoo.com To: eliza botto eliza_bo...@hotmail.com Cc: R help r-help@r-project.org Sent: Thursday, April 25, 2013 10:09 AM Subject: Re: connecting matrices

Dear Elisa, Try this:

el <- matrix(1:100, ncol=20)
set.seed(25)
el1 <- matrix(sample(1:100, 20, replace=TRUE), ncol=1)

In the example you showed, there were no column names.

list(el[,sort(el1)[1:3]], sort(el1, index.return=TRUE)$ix[1:3])
#[[1]]
#     [,1] [,2] [,3]
#[1,]   31   61   71
#[2,]   32   62   72
#[3,]   33   63   73
#[4,]   34   64   74
#[5,]   35   65   75
#
#[[2]]
#[1] 9 5 3

A.K.

From: eliza botto eliza_bo...@hotmail.com To: smartpink...@yahoo.com Sent: Thursday, April 25, 2013 9:54 AM Subject: connecting matrices

Dear Arun, [text file contains the exact format] Although the last codes were absolutely correct and worked the way I wanted them to, I have an additional follow-up question. Suppose I have a matrix el. Here I show you only part of that matrix so that the code can run faster.

el
     [,595586] [,595587] [,595588] [,595589] [,595590] [,595591] [,595592] [,595593] [,595594] [,595595] [,595596] [,595597] [,595598] [,595599] [,595600] [,595601]
[1,]        55        55        55        55        55        55        55        55        55        55        56        56        56        56        56        56
[2,]        59        59        59        59        59        59        60        60        60        61        57        57        57        57        57        57
[3,]        60        60        60        61        61        62        61        61        62        62        58        58        58        58        58        59
[4,]        61        62        63        62        63        63        62        63        63        63        59        60        61        62        63        60
     [,595602] [,595603] [,595604] [,595605] [,595606] [,595607] [,595608] [,595609] [,595610] [,595611] [,595612] [,595613] [,595614] [,595615] [,595616] [,595617]
[1,]        56        56        56        56        56        56        56        56        56        56        56        56        56        56        56        56
[2,]        57        57        57        57        57        57        57        57        57        58        58        58        58        58        58        58
[3,]        59        59        59        60        60        60        61        61        62        59        59        59        59        60        60        60
[4,]        61        62        63        61        62        63        62        63        63        60        61        62        63        61        62        63

In connection with this matrix, there is another matrix, el1, which contains a coordination value for each column of matrix el:

el1
[595586,]    5.67
[595587,]   55.90
[595588,]  515
[595589,]  755
[595590,]  955
[595591,]    5.95
[595592,]  575
[595593,]  505
[595594,]  505
[595595,]  515
[595596,] 5612
[595597,]  506
[595598,]  576
[595599,] 5126
[595600,] 5216
[595601,] 5666
[595602,]  526
[595603,]    5.6
[595604,]  156
[595605,] 4556
[595606,] 5556
[595607,] 1256
[595608,] 1256
[595609,] 8756
[595610,] 5906
[595611,]  789
[595612,] 5006
[595613,] 1256
[595614,] 3356
[595615,] 7756
[595616,] 4456
[595617,] 3356

What I want in the end is a list of two elements containing the 10 columns of el which have the lowest values in matrix el1. More precisely (showing the first three):

[[1]]
     [,595603] [,595586] [,595591]
[1,]        56        55        55
[2,]        57        59        59
[3,]        59        60        62
[4,]        62        61        63

[[2]]
[1] 5.6 5.67 5.95

Is it possible to carry out such an operation? Thanks for your help, Elisa
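For anyone finding this thread later: the same selection can be written with order(), which returns the positions directly and avoids the earlier slip of indexing columns by the sorted values. This is only a minimal sketch on the toy data from the thread; the helper name lowest_columns and its return fields are made up for illustration and are not from any package.

```r
# Toy data mirroring the thread: 20 columns, one score per column.
el  <- matrix(1:100, ncol = 20)
set.seed(25)
el1 <- matrix(sample(1:100, 20, replace = TRUE), ncol = 1)

# Hypothetical helper (not from the thread): pick the k columns of `mat`
# whose entries in `score` are smallest, lowest score first.
lowest_columns <- function(mat, score, k) {
  stopifnot(ncol(mat) == length(score), k <= ncol(mat))
  idx <- order(score)[seq_len(k)]          # positions of the k smallest scores
  list(columns = mat[, idx, drop = FALSE], # the matching columns of mat
       scores  = score[idx],               # the scores themselves
       index   = idx)                      # where they sit in mat
}

res <- lowest_columns(el, el1, k = 3)
res$index
res$scores
```

With the real data, k = 10 would give the ten columns asked for. If el carries column names, they are kept by the mat[, idx, drop = FALSE] subsetting.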
Re: [R] [SQL]
With so little information, one can only guess. I would guess your .sql files contain scripts written in the SQL language, in which case you will need some local database support to help you run those scripts in whatever database has the data. Perhaps the scripts will output csv files. If it turns out that you need to run the SQL scripts from within R, then I'd suggest asking for help on R-sig-db. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062

On 4/25/13 9:09 AM, Ignacio Martinez ignaci...@gmail.com wrote: Hi, The data for my new project are in a bunch of .sql files, instead of the classic csv files that I'm used to working with. Could someone explain to me how to read these files into R? Thanks, -Ignacio
Re: [R] [SQL]
The format of files with a SQL extension is not necessarily well-defined. In most cases I have found, they are text files that contain SQL Data Definition Language statements (CREATE TABLE) and possibly Data Manipulation Language statements (INSERT INTO). You may be able to extract the portions of the files that contain data using read.csv and judicious use of the skip and nrows arguments, but you will first have to become familiar with the contents of the file using a text editor. If they are binary files, you may need to consult with the source of the data to identify the format used more precisely.

--- Jeff Newmiller, DCN: jdnew...@dcn.davis.ca.us, Research Engineer (Solar/Batteries/Software/Embedded Controllers) --- Sent from my phone. Please excuse my brevity.

Ignacio Martinez ignaci...@gmail.com wrote: Hi, The data for my new project are in a bunch of .sql files, instead of the classic csv files that I'm used to working with. Could someone explain to me how to read these files into R? Thanks, -Ignacio
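To make the "extract the data portion" idea concrete, here is a rough base-R sketch. Everything about the file layout is an assumption: the table name scores, its columns, and the one-row-per-INSERT style with no spaces after the commas are invented for illustration, and a real dump may look quite different (multi-row INSERTs, escaped quotes, NULLs), so inspect the file in a text editor first.

```r
# Hypothetical dump contents; a real file would be read with
# readLines("path/to/file.sql") after inspecting it in a text editor.
sql_lines <- c(
  "CREATE TABLE scores (id INTEGER, name TEXT, score REAL);",
  "INSERT INTO scores VALUES (1,'ann',3.5);",
  "INSERT INTO scores VALUES (2,'bob',4.25);",
  "INSERT INTO scores VALUES (3,'cy',2);"
)

# Keep only the single-row INSERT statements for the table of interest.
ins <- grep("^INSERT INTO scores VALUES", sql_lines, value = TRUE)

# Strip everything outside the parentheses, leaving comma-separated values.
vals <- sub("^INSERT INTO scores VALUES \\((.*)\\);\\s*$", "\\1", ins)

# Parse the values as CSV; SQL string literals use single quotes.
scores <- read.csv(text = vals, header = FALSE, quote = "'",
                   stringsAsFactors = FALSE,
                   col.names = c("id", "name", "score"))
scores
```

If the files turn out to be full scripts meant to be run against a database, loading them into that database and querying it from R is likely the more reliable route, as suggested earlier in the thread.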
Re: [R] problem with geom_point in ggplot using a different column
https://github.com/hadley/devtools/wiki/Reproducibility John Kane Kingston ON Canada

-----Original Message----- From: angerusso1...@gmail.com Sent: Thu, 25 Apr 2013 11:09:18 -0400 To: r-help@r-project.org, r-help-requ...@r-project.org Subject: [R] problem with geom_point in ggplot using a different column

I want to draw a boxplot where the geom_points are displayed based on the ERBB2.MUT subset, and they should be displayed in the right box (based on both the ERBB2.2064 field and ERBB2_Status). However, given my command I currently only see red points corresponding to the MUT subset in one straight line corresponding only to the ERBB2.2064 stratification on the x-axis. It doesn't take into account the ERBB2.Status stratification. Can anyone help me?

Is this supposed to represent your data?

Call ERBB2|2064 ERBB2_Status ERBB2-MUT
A 7.214E-01 CHANGE MUT
B -4.208E-02 NEUTRAL MUT
D 1.080E+00 NEUTRAL MUT
C 2.347E-01 NEUTRAL MUT

ggplot(data=testdata, aes(x=Call, y=ERBB2.2064)) + geom_boxplot(aes(fill=ERBB2_Status), width=0.8) + theme_bw() + geom_point(data=subset(testdata, ERBB2.MUT=="MUT"), aes(shape=Call, color="Red"))
[R] Problem with package RNetCDF when attached
I have a problem with the RNetCDF package on Mac OS X 10.8.3, R 3.0.0. If you have a solution, it would be great! Thanks a lot. Marc Girondot

install.packages("RNetCDF")
trying URL 'http://cran.at.r-project.org/bin/macosx/contrib/3.0/RNetCDF_1.6.1-2.tgz'
Content type 'application/x-gzip' length 2071758 bytes (2.0 Mb)
opened URL
==
downloaded 2.0 Mb

The downloaded binary packages are in /var/folders/6f/w2t25jws2ng_qqnvl4xgnnc0gn/T//Rtmphm82to/downloaded_packages

library(RNetCDF, lib.loc="/Library/Frameworks/R.framework/Versions/3.0/Resources/library")
Error : .onLoad failed in loadNamespace() for 'RNetCDF', details:
  call: NULL
  error: I/O error (udunits)
Error: package or namespace load failed for 'RNetCDF'

-- Marc Girondot, Pr Laboratoire Ecologie, Systématique et Evolution Equipe de Conservation des Populations et des Communautés CNRS, AgroParisTech et Université Paris-Sud 11, UMR 8079 Bâtiment 362 91405 Orsay Cedex, France Tel: 33 1 (0)1.69.15.72.30 Fax: 33 1 (0)1.69.15.73.53 e-mail: marc.giron...@u-psud.fr Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html Skype: girondot
[R] RStudio.. text editor
Dear Rxperts/RStudio users, Is there a way to set tabs (the TAB key) in the text editor of RStudio, similar to the way customization can be done in Tinn-R? Thanks and regards, Santosh
Re: [R] RStudio.. text editor
On 25/04/2013 3:04 PM, Santosh wrote: Dear Rxperts/RStudio users, Is there a way to set tabs (the TAB key) in the text editor of RStudio, similar to the way customization can be done in Tinn-R?

You're asking on the wrong list. RStudio has its own support forums. Start on their web site... Duncan Murdoch
Re: [R] pglm package: fitted values and residuals
On Wed, Apr 24, 2013 at 4:37 PM, Achim Zeileis achim.zeil...@uibk.ac.at wrote:

On Wed, 24 Apr 2013, Paul Johnson wrote:

On Wed, Apr 24, 2013 at 3:11 AM, alfonso.carf...@uniparthenope.it wrote: I'm using the package pglm and I have estimated a random probit model. I need to save the fitted values and the residuals of the model in a vector, but I cannot do it. I tried the command fitted.values using the following procedure, without results:

This is one of those "ask the pglm authors" questions. You should take it up with the authors of the package. There is a specialized email list, R-sig-mixed, where you will find more people working on this exact same thing. pglm looks like fun to me, but it is not quite done, so far as I can tell.

I'm sure that there are many. One of my attempts to write up a list is in Table 1 of vignette("betareg", package = "betareg").

Yes! That's exactly the list I was thinking of. It was driving me crazy that I could not find it. Thanks for the explanation. I don't think I should have implied that the pglm author must actually implement all the methods; it is certainly acceptable to leverage the methods that exist. It just happened that the ones I tested were not implemented by any of the affiliated packages.

But this thread leads me to one question I've wondered about recently. Suppose I run somebody's regression function and out comes an object. Do we have a way to ask that object "what are all of the methods that might apply to you?" Here's why I wondered. You've noticed that predict.lm has the interval="confidence" argument, but predict.glm does not. So if I receive a regression model, I'd like to say to it "do you have a predict method?" and, if I could get that predict method, I could check whether there is a formal argument interval. If there is not, maybe I'd craft one for them. pj

Personally, I don't write anova() methods for my model objects because I can leverage lrtest() and waldtest() from lmtest and linearHypothesis() and deltaMethod() from car as long as certain standard methods are available, including coef(), vcov(), logLik(), etc. Similarly, an AIC() method is typically not needed as long as logLik() is available. And BIC() works if nobs() is available in addition. Best, Z

pj

library(pglm)
m1_S <- pglm(Feed ~ Cons_PC_1 + imp_gen_1 + LGDP_PC_1 + lnEI_1 + SH_Ren_1, data, family=binomial("probit"), model="random", method="bfgs", index=c("Year","IDCountry"))
m1_S$fitted.values
residuals(m1)

Can someone help me about it? Thanks

-- Paul E. Johnson, Professor, Political Science, 1541 Lilac Lane, Room 504, University of Kansas, http://pj.freefaculty.org; Assoc. Director, Center for Research Methods, University of Kansas, http://quant.ku.edu
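As a small illustration of the point that AIC() comes for free once logLik() exists, here is a sketch with an invented toy class. The class name myfit, its fields, and the numbers are all made up; nothing here is taken from pglm or any other package.

```r
# Hypothetical toy class "myfit" (not from pglm or any package):
# it stores only a log-likelihood value and a parameter count.
fit <- structure(list(loglik = -120.5, npar = 3L), class = "myfit")

# The only method we write: logLik(), returning a "logLik" object whose
# "df" attribute is what the default AIC() method reads.
logLik.myfit <- function(object, ...) {
  structure(object$loglik, df = object$npar, class = "logLik")
}

# Register explicitly so dispatch also works when this code is sourced
# into a non-global environment.
registerS3method("logLik", "myfit", logLik.myfit)

# There is no AIC.myfit; the default method falls back on logLik():
# -2 * (-120.5) + 2 * 3
AIC(fit)   # 247
```

The same pattern is why a package author can often skip writing AIC() or anova() methods and rely on the generic tools, provided the handful of standard extractors behave as those tools expect.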
[R] [R-pkgs] glmnet webinar Friday May 3 at 10am PDT
I will be giving a webinar on glmnet on Friday May 3, 2013 at 10am PDT (Pacific Daylight Time). The one-hour webinar will consist of:
- Intro to lasso and elastic net regularization, and coefficient paths
- Why glmnet is so efficient and flexible
- New features of the latest version of glmnet
- Live glmnet demonstration
- Question and answer period
To sign up for the webinar, please go to https://www3.gotomeeting.com/register/77950 The webinar is hosted by the Orange County R User Group, and will be moderated by its president, Ray DiGiacomo.

Trevor Hastie has...@stanford.edu Professor, Department of Statistics, Stanford University Phone: (650) 725-2231 Fax: (650) 725-8977 URL: http://www.stanford.edu/~hastie address: room 104, Department of Statistics, Sequoia Hall 390 Serra Mall, Stanford University, CA 94305-4065

R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] RStudio.. text editor
I have not used Tinn-R in a while, but Tools > Options > Code Editing perhaps? John Kane Kingston ON Canada

-----Original Message----- From: santosh2...@gmail.com Sent: Thu, 25 Apr 2013 12:04:17 -0700 To: r-help@r-project.org Subject: [R] RStudio.. text editor

Dear Rxperts/RStudio users, Is there a way to set tabs (the TAB key) in the text editor of RStudio, similar to the way customization can be done in Tinn-R? Thanks and regards, Santosh
Re: [R] pglm package: fitted values and residuals
On Thu, Apr 25, 2013 at 3:14 PM, Paul Johnson pauljoh...@gmail.com wrote: But this thread leads me to one question I've wondered about recently. Suppose I run somebody's regression function and out comes an object. Do we have a way to ask that object "what are all of the methods that might apply to you?"

Yes, minus the "might":

library(pglm)
example(pglm) # produces an object named la
sapply(class(la), function(x) methods(class=x)) # lists functions with methods for objects of this class

Best, Ista
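The same idea can be tried without installing pglm. Below is a minimal sketch using an lm fit on the built-in cars data as a stand-in for "somebody's regression object"; the variable names fit and available are just for illustration.

```r
# A small model fit with base R, standing in for any regression object.
fit <- lm(dist ~ speed, data = cars)

# For every class the object carries, list the S3 methods defined for it.
available <- lapply(class(fit), function(cl) methods(class = cl))
names(available) <- class(fit)

# The entries are full method names such as "predict.lm", so a simple
# membership test answers "is there a predict method for this class?".
"predict.lm" %in% as.character(available[["lm"]])
```

Note that this only reports methods for packages that are currently loaded, so the answer can change after a library() call.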
Re: [R] RStudio.. text editor
Great! Thanks so much!

On Thu, Apr 25, 2013 at 12:30 PM, John Kane jrkrid...@inbox.com wrote: I have not used Tinn-R in a while, but Tools > Options > Code Editing perhaps? John Kane Kingston ON Canada
[R] Error in validObject(.Object) :
Hi all, I am trying to run the R NDVITS package, and I am getting the following error:

Error in validObject(.Object) : invalid class "GridTopology" object: cells.dim has incorrect dimension

Can you please suggest how to understand this error and solve it? Regards, Vahe
Re: [R] pglm package: fitted values and residuals
On Thu, 25 Apr 2013, Ista Zahn wrote:

On Thu, Apr 25, 2013 at 3:14 PM, Paul Johnson pauljoh...@gmail.com wrote: But this thread leads me to one question I've wondered about recently. Suppose I run somebody's regression function and out comes an object. Do we have a way to ask that object "what are all of the methods that might apply to you?"

Yes, minus the "might":

library(pglm)
example(pglm) # produces an object named la
sapply(class(la), function(x) methods(class=x)) # lists functions with methods for objects of this class

Well, this shows you the methods that are available for the class but not necessarily what arguments are supported. And even if the arguments are available, they do not necessarily mean the same thing. And some things may or may not work via inheritance... So coming back to Paul's question: Yes, I think it would be nice to have support for this, and in fact I have thought about similar infrastructure. But so far I haven't had a good idea for a sufficiently robust/reliable implementation. There are just so many details in the different model objects that can be handled differently. Best, Z
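For the narrower question of whether a given predict method has an interval argument, something along these lines may be a starting point. It is only a sketch: method_has_arg is a made-up helper, it looks at S3 methods only, and (as noted above) finding an argument with the right name says nothing about whether it means the same thing.

```r
# Hypothetical helper (not part of any package): does the S3 method of
# `generic` for the first matching class of `object` have argument `arg`?
method_has_arg <- function(object, generic, arg) {
  for (cl in class(object)) {
    m <- getS3method(generic, cl, optional = TRUE)
    if (!is.null(m)) return(arg %in% names(formals(m)))
  }
  NA  # no S3 method found for any class of the object
}

fit_lm  <- lm(dist ~ speed, data = cars)
fit_glm <- glm(dist ~ speed, data = cars, family = gaussian())

method_has_arg(fit_lm,  "predict", "interval")   # TRUE
method_has_arg(fit_glm, "predict", "interval")   # FALSE
```

Walking class(object) in order means the glm fit is checked against predict.glm before predict.lm, which matches how dispatch would pick the method.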
Re: [R] Trouble Computing Type III SS in a Cox Regression
On 26/04/13 03:40, Terry Therneau wrote: (In response to a question about computing type III sums of squares in a Cox regression): SNIP If you have customers who think that the earth is flat, global warming is a conspiracy, or that type III has special meaning this is a re-education issue, and I can't much help with that.

Fortune nomination! cheers, Rolf
[R] Transferring R to another computer, R_HOME_DIR
Hello, I was looking at the R (installed on RHEL6) shell script and saw R_HOME_DIR=/usr/lib64/R. Nowhere (and I could have got it wrong) does it read in the environment value R_HOME_DIR. I need to rsync the entire folder below /usr/lib64/R to another computer, into another directory location. Without changing the R shell script, how can I force it to read in R_HOME_DIR? Or maybe I misunderstood the bash source? (Note, I cannot recompile on the target machine.) Cheers, Saptarshi

1. I also realize Rscript will not work (I think the path is hard coded in the source).

Beginning of /usr/lib64/R/bin/R:

R_HOME_DIR=/usr/lib64/R
if test ${R_HOME_DIR} = /usr/lib64/R; then
  case linux-gnu in
  linux*)
    run_arch=`uname -m`
    case $run_arch in
      x86_64|mips64|ppc64|powerpc64|sparc64|s390x)
        libnn=lib64
        libnn_fallback=lib
        ;;
      *)
        libnn=lib
        libnn_fallback=lib64
        ;;
    esac
    if [ -x /usr/${libnn}/R/bin/exec/R ]; then
      R_HOME_DIR=/usr/lib64/R
    elif [ -x /usr/${libnn_fallback}/R/bin/exec/R ]; then
      R_HOME_DIR=/usr/lib64/R
    ## else -- leave alone (might be a sub-arch)
    fi
    ;;
  esac
fi
Re: [R] Transferring R to another computer, R_HOME_DIR
Quoting Saptarshi Guha saptarshi.g...@gmail.com:

> Hello, I was looking at the R (installed on RHEL6) shell script and saw R_HOME_DIR=/usr/lib64/R. Nowhere (and I could have got it wrong) does it read in the environment value R_HOME_DIR.
>
> I have the need to rsync the entire folder below /usr/lib64/R to another computer into another directory location. Without changing the R shell script, how can i force it read in R_HOME_DIR? Or maybe i misunderstood the bash source? (Note, i cannot recompile on target machine)

If you can't compile on the target machine, that indicates that you wouldn't have access to /usr/lib64/R anyway, so you need a different approach. Fortunately, it's easy to compile into your home directory, where you do have write access. The INSTALL file in the distributed tar.gz file shows you how to compile where you want and what link you need to make it accessible. Even though the file is called INSTALL, it explains that it's not necessary to install R in order to use it.

HTH

> Cheers
> Saptarshi
>
> 1. I also realize Rscript will not work (i think path is hard coded in the source)
>
> Beginning of /usr/lib64/R/bin/R
> <SNIP>
[R] [newbie] how to find and combine geographic maps with particular features?
SUMMARY:

Specific problem: I'm regridding biomass-burning emissions from a global/unprojected inventory to a regional projection (LCC over North America). I need to have boundaries for Canada, Mexico, and US (including US states), but also Caribbean and Atlantic nations (notably the Bahamas). I would also like to add Canadian provinces and Mexican states. How to put these together?

General problem: are there references regarding

* sources for different geographical and political features?
* combining maps for the different R graphics packages?

DETAILS:

(Apologies if this is a FAQ, but googling has not helped me with this.)

I'd appreciate help with a specific problem, as well as guidance (e.g., pointers to docs) regarding the larger topic of combining geographical maps (especially projected ones, i.e., not just lon-lat) on plots of regional data (i.e., data that is multinational but not global).

My specific problem is

https://bitbucket.org/tlroche/gfed-3.1_global_to_aqmeii-na/downloads/GFED-3.1_2008_N2O_monthly_emissions_regrid_20130404_1344.pdf

which plots N2O concentrations from a global inventory of fire emissions (GFED) regridded to a North American projection. (See

https://bitbucket.org/tlroche/gfed-3.1_global_to_aqmeii-na

for details.) The plot currently includes boundaries for Canada, Mexico, and US (including US states, since this is being done for a US agency), which are obtained by calling code from package=M3

http://cran.r-project.org/web/packages/M3/

like

https://bitbucket.org/tlroche/gfed-3.1_global_to_aqmeii-na/src/95484c5d63502ab146402cedc3612dcdaf629bd7/vis_regrid_vis.r?at=master

  ## get projected North American map
  NorAm.shp <- project.NorAm.boundaries.for.CMAQ(
    units='m',
    extents.fp=template_input_fp,
    extents=template.extents,
    LCC.parallels=c(33,45),
    CRS=out.crs)

https://bitbucket.org/tlroche/gfed-3.1_global_to_aqmeii-na/src/95484c5d63502ab146402cedc3612dcdaf629bd7/visualization.r?at=master

  # database: Geographical database to use. Choices include "state"
  #           (default), "world", "worldHires", "canusamex", etc. Use
  #           "canusamex" to get the national boundaries of the Canada, the
  #           USA, and Mexico, along with the boundaries of the states.
  #           The other choices ("state", "world", etc.) are the names of
  #           databases included with the ‘maps’ and ‘mapdata’ packages.
  project.M3.boundaries.for.CMAQ <- function(
    database='state', # see `?M3::get.map.lines.M3.proj`
    units='m',        # or 'km': see `?M3::get.map.lines.M3.proj`
    extents.fp,       # path to extents file
    extents,          # raster::extent object
    LCC.parallels=c(33,45), # LCC standard parallels: see https://github.com/TomRoche/cornbeltN2O/wiki/AQMEII-North-American-domain#wiki-EPA
    CRS               # see `sp::CRS`
  ) {
    library(M3)
    ## Will replace raw LCC map's coordinates with:
    metadata.coords.IOAPI.list <- M3::get.grid.info.M3(extents.fp)
    metadata.coords.IOAPI.x.orig <- metadata.coords.IOAPI.list$x.orig
    metadata.coords.IOAPI.y.orig <- metadata.coords.IOAPI.list$y.orig
    metadata.coords.IOAPI.x.cell.width <- metadata.coords.IOAPI.list$x.cell.width
    metadata.coords.IOAPI.y.cell.width <- metadata.coords.IOAPI.list$y.cell.width

    library(maps)
    map.lines <- M3::get.map.lines.M3.proj(
      file=extents.fp, database=database, units="m")
    # dimensions are in meters, not cells. TODO: take argument
    map.lines.coords.IOAPI.x <-
      (map.lines$coords[,1] - metadata.coords.IOAPI.x.orig)
    map.lines.coords.IOAPI.y <-
      (map.lines$coords[,2] - metadata.coords.IOAPI.y.orig)
    map.lines.coords.IOAPI <-
      cbind(map.lines.coords.IOAPI.x, map.lines.coords.IOAPI.y)

    # # start debugging
    # class(map.lines.coords.IOAPI)
    # # [1] "matrix"
    # summary(map.lines.coords.IOAPI)
    # #  map.lines.coords.IOAPI.x map.lines.coords.IOAPI.y
    # #  Min.   : 283762          Min.   : 160844
    # #  1st Qu.:2650244          1st Qu.:1054047
    # #  Median :3469204          Median :1701052
    # #  Mean   :3245997          Mean   :1643356
    # #  3rd Qu.:4300969          3rd Qu.:2252531
    # #  Max.   :4878260          Max.   :2993778
    # #  NA's   :168              NA's   :168
    # # end debugging

    # Note above is not zero-centered, like our extents:
    # extent : -2556000, 2952000, -1728000, 186 (xmin, xmax, ymin, ymax)
    # So gotta add (xmin, ymin) below.

    ## Get LCC state map
    # see http://stackoverflow.com/questions/14865507/how-to-display-a-projected-map-on-an-rlatticelayerplot
    map.IOAPI <- maps::map(
      database="state", projection="lambert", par=LCC.parallels, plot=FALSE)
    #                  parameters to lambert: ^
    #                  see mapproj::mapproject
    map.IOAPI$x <- map.lines.coords.IOAPI.x + extents.xmin
    map.IOAPI$y <- map.lines.coords.IOAPI.y +
Re: [R] Decomposing a List
On Apr 25, 2013, at 7:53 AM, Bert Gunter wrote:

> Well, what you really want to do is convert the list to a matrix, and it can be done directly and considerably faster than with the (implicit) looping of sapply:
>
>   f1 <- function(l) sapply(l, "[", 1)
>   f2 <- function(l) matrix(unlist(l), nr=2)
>   l <- strsplit(paste(sample(LETTERS,1e6,rep=TRUE), sample(1:10,1e6,rep=TRUE), sep="+"), "+", fix=TRUE)

Consider this alternative:

  L = list( c("A1","B1"), c("A2","B2"), c("A3","B3") )
  simplify2array(L)
       [,1] [,2] [,3]
  [1,] "A1" "A2" "A3"
  [2,] "B1" "B2" "B3"

-- David.

> ## Then you get these results:
>
>   system.time(x1 <- f1(l))
>      user  system elapsed
>      1.92    0.01    1.95
>   system.time(x2 <- f2(l))
>      user  system elapsed
>      0.06    0.02    0.08
>   system.time(x2 <- f2(l)[1,])
>      user  system elapsed
>       0.1     0.0     0.1
>   identical(x1,x2)
>   [1] TRUE
>
> Cheers,
> Bert
>
> On Thu, Apr 25, 2013 at 3:32 AM, Ted Harding ted.hard...@wlandres.net wrote:
>> Thanks, Jorge, that seems to work beautifully! (Now to try to understand why ... but that's for later).
>> Ted.
>>
>> On 25-Apr-2013 10:21:29 Jorge I Velez wrote:
>>> Dear Dr. Harding,
>>>
>>> Try
>>>
>>>   sapply(L, "[", 1)
>>>   sapply(L, "[", 2)
>>>
>>> HTH,
>>> Jorge.-
>>>
>>> On Thu, Apr 25, 2013 at 8:16 PM, Ted Harding ted.hard...@wlandres.net wrote:
>>>> Greetings!
>>>> For some reason I am not managing to work out how to do this (in principle) simple task!
>>>>
>>>> As a result of applying strsplit() to a vector of character strings, I have a long list L (N elements), where each element is a vector of two character strings, like:
>>>>
>>>>   L[1] = c("A1","B1")
>>>>   L[2] = c("A2","B2")
>>>>   L[3] = c("A3","B3")
>>>>   [etc.]
>>>>
>>>> From L, I wish to obtain (as directly as possible, e.g. avoiding a loop) two vectors each of length N where one contains the strings that are first in the pair, and the other contains the strings which are second, i.e. from L (as above) I would want to extract:
>>>>
>>>>   V1 = c("A1","A2","A3",...)
>>>>   V2 = c("B1","B2","B3",...)
>>>>
>>>> Suggestions?
>>>>
>>>> With thanks,
>>>> Ted.
>>>>
>>>> E-Mail: (Ted Harding) ted.hard...@wlandres.net
>>>> Date: 25-Apr-2013 Time: 11:16:46
>>>> This message was sent by XFMail
>>
>> E-Mail: (Ted Harding) ted.hard...@wlandres.net
>> Date: 25-Apr-2013 Time: 11:31:57
>> This message was sent by XFMail
>
> --
> Bert Gunter
> Genentech Nonclinical Biostatistics
> Internal Contact Info: Phone: 467-7374
> Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

David Winsemius
Alameda, CA, USA
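For readers who want to try the approaches from this thread side by side, here is a minimal sketch on a three-element toy list. The list L mirrors Ted's example; the names V1, V2, M and M2 are only illustrative, and the matrix approach assumes every element holds exactly two strings.

```r
# Toy list shaped like the output of strsplit(): each element is a pair.
L <- list(c("A1", "B1"), c("A2", "B2"), c("A3", "B3"))

# Jorge's suggestion: apply the extraction function `[` to every element.
V1 <- sapply(L, `[`, 1)
V2 <- sapply(L, `[`, 2)

# Bert's suggestion: flatten the list and refill it column-wise into a
# 2-row matrix, so row 1 holds the first strings and row 2 the second.
M <- matrix(unlist(L), nrow = 2)

# David's suggestion builds the same matrix.
M2 <- simplify2array(L)

# All three routes should agree.
stopifnot(identical(V1, M[1, ]),
          identical(V2, M[2, ]),
          identical(M, M2))
```

If the elements could have differing lengths, the matrix(unlist(...)) route would silently misalign the pairs, so the sapply form is the more defensive choice in that case.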
[R] advert: courses in R use, programming in Seattle
There are three courses in R at the Summer Institute in Statistical Genetics, in Seattle this July, ranging from completely introductory to advanced programming. The intermediate and advanced courses are taught by me and Ken Rice, the (new) introductory course by Ken and Tim Thornton.

More information at http://www.biostat.washington.edu/suminst/sisg/schedule

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
Re: [R] Decomposing a List
Well... With the same list, l, as before:

  system.time(x3 <- simplify2array(l))
     user  system elapsed
     2.11    0.05    2.20
  system.time(x2 <- f2(l))  ## the matrix(unlist(...)) one
     user  system elapsed
     0.11    0.00    0.11
  identical(x2,x3)
  [1] TRUE

So kind of a big difference if you care about efficiency... (and I can't remember all those specialized functions, anyway!)

-- Bert

On Thu, Apr 25, 2013 at 8:53 PM, David Winsemius dwinsem...@comcast.net wrote:

> On Apr 25, 2013, at 7:53 AM, Bert Gunter wrote:
>
>> Well, what you really want to do is convert the list to a matrix, and it can be done directly and considerably faster than with the (implicit) looping of sapply:
>>
>>   f1 <- function(l) sapply(l, "[", 1)
>>   f2 <- function(l) matrix(unlist(l), nr=2)
>>   l <- strsplit(paste(sample(LETTERS,1e6,rep=TRUE), sample(1:10,1e6,rep=TRUE), sep="+"), "+", fix=TRUE)
>
> Consider this alternative:
>
>   L = list( c("A1","B1"), c("A2","B2"), c("A3","B3") )
>   simplify2array(L)
>        [,1] [,2] [,3]
>   [1,] "A1" "A2" "A3"
>   [2,] "B1" "B2" "B3"
>
> -- David.
>
>> ## Then you get these results:
>>
>>   system.time(x1 <- f1(l))
>>      user  system elapsed
>>      1.92    0.01    1.95
>>   system.time(x2 <- f2(l))
>>      user  system elapsed
>>      0.06    0.02    0.08
>>   system.time(x2 <- f2(l)[1,])
>>      user  system elapsed
>>       0.1     0.0     0.1
>>   identical(x1,x2)
>>   [1] TRUE
>>
>> Cheers,
>> Bert
>
> <SNIP>
>
> David Winsemius
> Alameda, CA, USA

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info: Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
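The comparison above can be rerun with a small harness like the following sketch. It uses a shorter list (1e5 elements rather than 1e6) so it finishes quickly; the seed, the size n, and the names f_unlist, t_unlist and t_s2a are assumptions made for illustration. Timings depend on the machine and R version, so none are quoted here.

```r
set.seed(1)  # arbitrary seed, only so the toy data are reproducible
n <- 1e5     # smaller than the 1e6 used in the thread

# Same construction as in the thread: strings like "Q+7" split on "+".
l <- strsplit(paste(sample(LETTERS, n, replace = TRUE),
                    sample(1:10, n, replace = TRUE), sep = "+"),
              "+", fixed = TRUE)

# The matrix(unlist(...)) approach.
f_unlist <- function(x) matrix(unlist(x), nrow = 2)

t_unlist <- system.time(x2 <- f_unlist(l))
t_s2a    <- system.time(x3 <- simplify2array(l))

# Both approaches should return the same 2 x n character matrix.
stopifnot(identical(x2, x3))

# Show user, system and elapsed times next to each other.
print(rbind(unlist = t_unlist, simplify2array = t_s2a)[, 1:3])
```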
[R] Input Chinese characters not correctly echoed in ESS
I have a weird encoding issue in my Emacs and R environment. Chinese characters display fine with the .Rprofile setting Sys.setlocale("LC_ALL","zh_CN.utf-8"); the one exception is the echo of characters I type in.

  > linkTexts[5]
        font
  "使用帮助"
  > functionNotExist()
  错误: 没有"functionNotExist"这个函数
  > fire <- "你好"
  > fire
  [1]

As we can see, the Chinese characters contained in the vector linkTexts, the Chinese error message (which says that there is no function functionNotExist), and the Chinese characters I type are all shown perfectly, yet the echo of the input characters is shown only as blank placeholders.

sessionInfo() is here, and it is as expected given the Sys.setlocale("LC_ALL","zh_CN.utf-8"); setting:

  > sessionInfo()
  R version 2.15.2 (2012-10-26)
  Platform: i386-apple-darwin9.8.0/i386 (32-bit)

  locale:
  [1] zh_CN.utf-8/zh_CN.utf-8/zh_CN.utf-8/C/zh_CN.utf-8/C

  attached base packages:
  [1] stats     graphics  grDevices utils     datasets  methods   base

  other attached packages:
  [1] XML_3.96-1.1

  loaded via a namespace (and not attached):
  [1] compiler_2.15.2 tools_2.15.2

I have no locale settings in my .emacs file. To me, this seems to be an Emacs encoding issue, but I just don't know how to correct it. Any idea or suggestion? Thanks.
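One way to narrow such a problem down is to check, from inside R, whether the string itself is intact; if it is, the fault is more likely in how the terminal or editor sends and displays the bytes. The sketch below is only a diagnostic, not a fix for ESS, and the variable name x is illustrative. The string is written with \u escapes so the snippet does not depend on the encoding of the file or buffer it is pasted from.

```r
# "\u4f60\u597d" is the two-character string from the post, written with
# ASCII-only escapes.
x <- "\u4f60\u597d"

Sys.getlocale("LC_CTYPE")   # character-type locale R is running under
Encoding(x)                 # encoding R has declared for the string
nchar(x, type = "chars")    # number of characters R sees in the string
charToRaw(enc2utf8(x))      # the underlying UTF-8 bytes
```

Comparing these values for the escaped string and for the same text typed directly at the prompt shows whether the typed input reaches R with different bytes.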
Re: [R] getting started in parallel computing on a windows OS
Thanks for this, Martin. I'll start retooling and let you know how it goes.

Ben Caldwell
Graduate fellow

On Apr 24, 2013 4:34 PM, Martin Morgan mtmor...@fhcrc.org wrote:

> On 04/24/2013 02:50 PM, Benjamin Caldwell wrote:
>> Dear R help,
>>
>> I've what I think is a fairly simple parallel problem, and am getting bogged down in documentation and packages for much more complex situations.
>>
>> I have a big matrix [30^5, 5]. I have a function that will act on each row of that matrix sequentially and output the 'best' result from the whole matrix (it compares the result from each row to the last and keeps the 'better' result). I would like to divide that first large matrix into chunks equal to the number of cores I have available to me, and work through each chunk, then output the results from each chunk.
>>
>> I'm really having trouble making head or tail of how to do this on a windows machine - lots of different false starts on several different packages now.
>>
>> Basically, I have the function, and I can of course easily divide the matrix into chunks. I just need a way to process each chunk in parallel (other than opening new R sessions for each core manually). Any help much appreciated - after two days of trying to get this to work I'm pretty burnt out.
>
> Hi Ben -- in your code from this morning you had a function
>
>   fitting <- function(ndx.grd=two, dt.grd=one, ind.vr='ind', rsp.vr='res') {
>       ## ... setup
>       for(i in 1:length(ndx.grd[,1])){
>           ## ... do work
>       }
>       ## ... collate results
>   }
>
> that you're trying to run in parallel. Obviously the ## ... represent lines I've removed. When you say something like
>
>   y <- foreach(icount(length(two))) %dopar% fitting()
>
> it's saying that you want to run fitting() length(two) times. So you're actually doing the same thing length(two) times, whereas you really want to divide the work that's inside fitting() into chunks, and do those on separate cores!
>
> Conceptually what you'd like to do is
>
>   fit_one <- function(idx, ndx.grd, dt.grd, ind.vr, rsp.vr) {
>       ## ... do work on row idx _ONLY_
>   }
>
> and then evaluate with
>
>   ## ... setup
>   y <- foreach(idx = icount(nrow(two))) %dopar% fit_one(idx, two, one, 'ind', 'res')
>   ## ... collate
>
> so that fit_one fits just one of your combinations. foreach will worry about distributing the work.
>
> Make sure that fit_one works first, before trying to run this in parallel; your use of try(), trying to fit different data types (character, integer, numeric) into a matrix rather than data.frame, and the type coercions all indicate that you're fighting with R rather than working with it.
>
> Hope that helps,
>
> Martin
>
>> Thanks
>>
>> Ben Caldwell
>
> --
> Computational Biology / Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N.
> PO Box 19024 Seattle, WA 98109
> Location: Arnold Building M1 B861
> Phone: (206) 667-2793
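The chunk-and-collate idea described in this thread can be sketched with the parallel package that ships with R, as an alternative to the foreach approach Martin outlines. Everything specific here is a stand-in: score_row is a hypothetical placeholder for the real per-row fit, grid is a small random matrix instead of the real one, "best" is assumed to mean the smallest score, and two workers are assumed.

```r
library(parallel)  # ships with R, so nothing extra needs installing

# Hypothetical stand-in for the real per-row fit: one score per row.
score_row <- function(row) sum(row^2)

# Hypothetical data: a small matrix standing in for the large grid.
set.seed(42)
grid <- matrix(runif(1000 * 5), ncol = 5)

# Find the best (here: smallest) score within one chunk of row indices.
best_in_chunk <- function(rows, grid, score_row) {
  scores <- vapply(rows, function(i) score_row(grid[i, ]), numeric(1))
  k <- which.min(scores)
  c(row = rows[k], score = scores[k])
}

n_workers <- 2  # assumption; detectCores() reports what the machine has
chunks <- splitIndices(nrow(grid), n_workers)  # one index vector per worker

cl <- makeCluster(n_workers)  # socket cluster, which also works on Windows
per_chunk <- parLapply(cl, chunks, best_in_chunk,
                       grid = grid, score_row = score_row)
stopCluster(cl)

# Collate: keep the best of the per-chunk winners.
per_chunk <- do.call(rbind, per_chunk)
best <- per_chunk[which.min(per_chunk[, "score"]), ]
print(best)
```

Passing grid and score_row as explicit arguments means the workers receive everything they need, which matters on Windows because the worker processes start as fresh R sessions and do not inherit the caller's workspace.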
Re: [R] Transferring R to another computer, R_HOME_DIR
Well, to my understanding, you plan to rsync the original compiled folder from one machine to some location on another machine, and work with it there. In that case, how about creating a symbolic link at /usr/lib64/R on the second machine that points to the copied folder? Or maybe I misunderstand your purpose?

On Thu, Apr 25, 2013 at 5:57 PM, Saptarshi Guha saptarshi.g...@gmail.com wrote:

> Hello, I was looking at the R (installed on RHEL6) shell script and saw R_HOME_DIR=/usr/lib64/R. Nowhere (and I could have got it wrong) does it read in the environment value R_HOME_DIR.
>
> I have the need to rsync the entire folder below /usr/lib64/R to another computer into another directory location. Without changing the R shell script, how can i force it read in R_HOME_DIR? Or maybe i misunderstood the bash source? (Note, i cannot recompile on target machine)
>
> Cheers
> Saptarshi
>
> 1. I also realize Rscript will not work (i think path is hard coded in the source)
>
> Beginning of /usr/lib64/R/bin/R
> <SNIP>
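Whichever route is taken (a link, a rebuilt copy, or an edited launcher), it can help to confirm from inside a running session which installation tree R is actually using. The following is a small sketch of such a check; the variable names are illustrative, and it only reports paths without changing anything.

```r
# Home directory of the R installation this session is running from.
r_home_fn <- R.home()

# R_HOME as seen in the session's environment (set by the R launcher).
r_home_env <- Sys.getenv("R_HOME")

# Directory holding the R and Rscript launchers for this installation.
r_bin <- R.home("bin")

cat("R.home():          ", r_home_fn, "\n")
cat("R_HOME environment:", r_home_env, "\n")
cat("launcher directory:", r_bin, "\n")
```

On a relocated copy, these paths should point at the new location rather than at /usr/lib64/R; if they still show the old path, the launcher script is most likely still using its hard-coded value.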