Re: [R] How to make this for() loop memory efficient?
Ray, your solution works and is indeed faster than mine! It looks like it is still going to take a few days to do 400,000 rows, which is unfortunate.

Steve, thanks for your help, I'll definitely self-teach plyr and data.table.

- Isaac
Research Assistant
Quantitative Finance Faculty, UTS

--
View this message in context: http://r.789695.n4.nabble.com/How-to-make-this-for-loop-memory-efficient-tp4283594p4284716.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] R problem: unable to read data in the xls-format in the PerformanceAnalytics package
Hello,

I have the following problem.

1) Problem: I am unable to read data in the xls format in the PerformanceAnalytics package. While it works well for several commands, e.g.

    t(table.Stats(msci_ret))

it does not work for other commands, e.g.

    x <- msci_ret[, c("CH"), drop = FALSE]
    table.Drawdowns(x)

The error message is:

    Error in checkData(R) :
      The data cannot be converted into a time series. If you are trying to pass in names from a data object with one column, you should use the form 'data[rows, columns, drop = FALSE]'. Rownames should have standard date formats, such as '1985-03-15'.

2) I guess it is the German date format after I transform the data from ts -> xls:

    Jan 1970 -0.025317808 -0.0488751680 -0.006219300 -0.0737541890 -0.015215166
    Feb 1970 -0.016650677 -0.0289782710 -0.053743041  0.0548771330  0.012912517
    Mrz 1970 -0.000312907  0.0094675260  0.025474411  0.0060957210  0.040465110
    ...
    Okt 2010  0.029227924  0.0572494500  0.024199223  0.0386233850 -0.016248884
    Nov 2010 -0.023411677  0.0133836160 -0.024089614  0.0011141310  0.060007812
    Dez 2010  0.019613741  0.0375536810  0.065130745  0.0647369970  0.041147678
    ...

but I am not sure, perhaps it is something else.

3) What I did: I saved the Excel data (6 raw time series with a header line) in csv format and read it into R, first as ts, and then converted it to xts:

    msci_ret = ts(msci_ret, start = 1970, frequency = 12)
    msci_ret <- as.xts(msci_ret)

Apparently, I do not understand exactly how to generate a date format for monthly data which can be read by PerformanceAnalytics. I attach my csv data.

Thanks for your help!
yvonne
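The error above asks for row names in a standard date format such as '1985-03-15', while the posted rows are labelled "Jan 1970", "Mrz 1970", "Okt 2010" and so on. A minimal base-R sketch of one way to turn such labels into dates follows. The lookup table `german_months` and the helper `parse_german_month` are hypothetical names introduced here, and the choice of the first day of each month is an assumption, not something taken from the thread.

```r
# Hypothetical lookup table for German month abbreviations (assumed spelling,
# matching the "Mrz", "Okt", "Dez" labels in the posted data)
german_months <- c(Jan = 1, Feb = 2, Mrz = 3, Apr = 4, Mai = 5, Jun = 6,
                   Jul = 7, Aug = 8, Sep = 9, Okt = 10, Nov = 11, Dez = 12)

# Convert labels such as "Mrz 1970" into Date objects (first day of the month)
parse_german_month <- function(labels) {
  parts <- strsplit(trimws(labels), "\\s+")
  mon <- vapply(parts, function(p) german_months[[p[1]]], numeric(1))
  yr  <- vapply(parts, function(p) as.numeric(p[2]), numeric(1))
  as.Date(sprintf("%04d-%02d-01", as.integer(yr), as.integer(mon)))
}

labels <- c("Jan 1970", "Feb 1970", "Mrz 1970", "Okt 2010", "Dez 2010")
dates <- parse_german_month(labels)
format(dates)
```

Dates built this way could then be used as the row names or as the time index when constructing the time series object, which is the form the error message asks for.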
Re: [R] Fwd: Sum of a couple of variables of which a few have NA values
Dear Petra,

I think the easiest way, because it is the most flexible, would be to have an object containing the indexes of the variables you want to use:

    indx <- c(2, 3, 4, 6, 35)  # The first column is id, right?
    dat$sums <- rowSums(dat[indx], na.rm = TRUE)

See what I mean? There are probably other solutions, but this is what I would do.

HTH,
Ivan

On 10/01/12 19:47, PetraOpic wrote:
> Dear Ivan,
>
> Thank you very much for your help. How do I use rowSums if I need to skip a variable from summing? (Example: sum var1, var2, var3, var5, var34 only.)
>
> Thanks in advance,
> Petra Opic

-------- Original message --------
Subject: Fwd: [R] Sum of a couple of variables of which a few have NA values
Date: Tue, 10 Jan 2012 17:44:44 +0100
From: Ivan Calandra ivan.calan...@u-bourgogne.fr
Reply-To: ivan.calan...@u-bourgogne.fr
To: R list r-help@r-project.org

Hi Petra,

Try this:

    dat$sums <- rowSums(dat[3:5], na.rm = TRUE)

I think this should do what you're looking for.

HTH,
Ivan

-------- Original message --------
Subject: [R] Sum of a couple of variables of which a few have NA values
Date: Tue, 10 Jan 2012 17:25:21 +0100
From: Petra Opic petrao...@gmail.com
To: r-help@r-project.org

Dear everyone,

I have looked all over the internet but I cannot find a way to solve my problem. In my data I want to sum a couple of variables. Some of these variables have NA values, and when I add them together, the result is NA.

    dat <- data.frame(
      id = gl(5, 1),
      var1 = rnorm(5, 10),
      var2 = rnorm(5, 7),
      var3 = rnorm(5, 6),
      var4 = rnorm(5, 3),
      var5 = rnorm(5, 8)
    )
    dat[3, 3] <- NA
    dat[4, 5] <- NA
    dat

      id      var1     var2     var3     var4     var5
    1  1  9.371328 7.830814 5.032541 3.491053 7.626418
    2  2 10.413516 7.333630 6.557178 1.465597 8.591770
    3  3 10.967073       NA 6.674079 3.946451 7.251263
    4  4  9.900380 7.727111 5.059698       NA 6.632962
    5  5  9.191068 7.901271 6.652410 2.734856 8.484757

    attach(dat)
    dat$sum <- var2 + var3 + var4  # I think I'm doing this wrong, but I don't know what command to use
    dat

      id      var1     var2     var3     var4     var5      sum
    1  1  9.371328 7.830814 5.032541 3.491053 7.626418 16.35441
    2  2 10.413516 7.333630 6.557178 1.465597 8.591770 15.35640
    3  3 10.967073       NA 6.674079 3.946451 7.251263       NA
    4  4  9.900380 7.727111 5.059698       NA 6.632962       NA
    5  5  9.191068 7.901271 6.652410 2.734856 8.484757 17.28854

I would like to omit the NA values and just sum the rest. I tried to use rowSums() but that sums an entire row and I only need a few variables. Does anyone know how to do this?

Thanks in advance,
Petra

--
Ivan CALANDRA
Université de Bourgogne
UMR CNRS 5561 Biogéosciences
6 Boulevard Gabriel
21000 Dijon, FRANCE
ivan.calan...@u-bourgogne.fr
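Ivan's suggestion selects columns by position. A small sketch of the same idea using column names instead is given below, which may be easier to keep correct when columns are added or reordered. The data are the values printed in Petra's post; the vector name `cols` and the column name `sum_sel` are hypothetical.

```r
# Data as printed in the original post (values copied from the listing above)
dat <- data.frame(
  id   = 1:5,
  var1 = c(9.371328, 10.413516, 10.967073, 9.900380, 9.191068),
  var2 = c(7.830814, 7.333630, NA, 7.727111, 7.901271),
  var3 = c(5.032541, 6.557178, 6.674079, 5.059698, 6.652410),
  var4 = c(3.491053, 1.465597, 3.946451, NA, 2.734856),
  var5 = c(7.626418, 8.591770, 7.251263, 6.632962, 8.484757)
)

# Pick the variables to sum by name; any subset works, e.g. skipping var4
cols <- c("var2", "var3", "var4")

# na.rm = TRUE drops the NA entries row by row instead of returning NA
dat$sum_sel <- rowSums(dat[cols], na.rm = TRUE)
dat
```

One thing to keep in mind with this approach: a row in which all selected variables are NA gets a sum of 0 rather than NA.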
Re: [R] 64bit R under 32bit winxp
You cannot install 64-bit R on a 32-bit OS, but you can install a 32-bit R on a 64-bit OS, and you can later install 64-bit R as well. That is, installing 32-bit R does not interfere with your option to later install a 64-bit R.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

孟欣 lm_meng...@163.com wrote:
> Hi all:
>
> My OS is 32-bit WinXP, but I want to install 64-bit R 2.14.1.
>
> The following website says "You can also go back and add 64-bit components to a 32-bit install, or vice versa":
> http://cran.r-project.org/bin/windows/rw-FAQ.html#Can-both-32_002d-and-64_002dbit-R-be-installed-on-the-same-machine_003f
>
> Does it mean that I can install and run 64-bit R 2.14.1 under 32-bit WinXP? If so, how can I add 64-bit components to a 32-bit install?
>
> Many thanks for your help.
>
> My best
Re: [R] help
On Jan 11, 2012, at 02:37 , R. Michael Weylandt wrote:

> That sort of name is allowed but not advised because it can lead to confusion in certain non-standard evaluation functions like subset().

Standard evaluation too:

    data1$1G
    attach(data1)
    1G

Nowadays, backquoting (`1G`) solves the issue (in all cases?), but it wasn't always so.

> If you really want the name like that add the check.names = FALSE argument to read.table()
>
> [snip]
>> data1 <- read.table(file = "filename.txt", header = FALSE, col.names = c("class", "P", "1G"))
>>
>> but in the output I get an X in front of 1G, which disappears when I run it with the name 'G' instead of '1G'. Am I not allowed to use numerical values?
> [snap]

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
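The behaviour discussed in this exchange can be reproduced without a file. The sketch below is only an illustration: the inline text `txt` is a hypothetical stand-in for filename.txt, and it uses the `text =` argument of read.table() so that it is self-contained.

```r
# Hypothetical two-row input standing in for filename.txt
txt <- "a 1 2\nb 3 4"

# Default (check.names = TRUE): names are passed through make.names(),
# so the non-syntactic name "1G" gets an "X" prefix
d1 <- read.table(text = txt, header = FALSE,
                 col.names = c("class", "P", "1G"))
names(d1)

# check.names = FALSE keeps the name exactly as supplied
d2 <- read.table(text = txt, header = FALSE,
                 col.names = c("class", "P", "1G"),
                 check.names = FALSE)
names(d2)

# d2$1G would be a syntax error; backquotes make the name usable
g_values <- d2$`1G`
g_values
```

Keeping the syntactic name (for example G1) avoids the need for backquotes everywhere the column is used, which is the trade-off behind the "allowed but not advised" remark.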
[R] Reg : Capture.output not supporting UTF-8 data
Hi,

I generated PMML using the rpart algorithm. The input contained UTF-8 characters, and they appeared correctly in the PMML. But when I tried to use capture.output on that PMML, it did not retain the UTF-8 encoding. We are using this PMML in our application and we need to retain the UTF-8 characters.

Is there any method by which I can retain the UTF-8 characters in the PMML and still use capture.output?

Thanks,
Raji
Re: [R] Vegan(ordistep) error: Error in if (aod[1, 5] <= Pin) { : missing value where TRUE/FALSE needed
Nevil Amos <nevil.amos at monash.edu> writes:

> I am getting the following error message in ordistep. I have a number of similarly structured datasets using ordistep in a loop, and the message only occurs for some of the datasets.
>
> I cannot include a reproducible sample - the specific datasets where this is occurring are fairly large and there are several pcnm's in the rhs of the formula.
>
> Error in if (aod[1, 5] <= Pin) { : missing value where TRUE/FALSE needed

Nevil,

It seems to me that the source of the problem appears in this table:

                  Inertia Proportion Rank
    Total          1.8110     1.0000
    Conditional    0.8681     0.4793   32
    Constrained    0.0000     0.0000    0
    Unconstrained  0.9429     0.5207   29
    Inertia is variance
    Some constraints were aliased because they were collinear (redundant)

The key point is that the Constrained component is completely aliased (Inertia 0, Rank 0) and therefore it cannot be analysed in permutation tests. You get the same error message with this model:

    mod <- rda(dune ~ Moisture + Condition(Moisture), dune.env)

and for the same reason. In your case, PCNM's seem to explain everything and there is nothing left for other variables, and therefore you cannot analyse them.

Cheers, Jari Oksanen

I can see how to fix this in vegan. All I can do is to handle these cases smoothly and with comprehensible error messages, though. They cannot be handled with permutation tests since there is nothing to do if the Constrained component is zeroed.

Cheers, Jari Oksanen
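The error text itself comes from base R: an `if` condition that evaluates to NA cannot be decided. The sketch below reproduces that mechanism in isolation. It does not use vegan; `p_value` and `Pin` are stand-in variables for the quantities compared inside ordistep, and the isTRUE() guard is one possible way of handling the case, not necessarily what vegan does.

```r
# Stand-ins: a test statistic's p-value that could not be computed, and a threshold
p_value <- NA_real_
Pin <- 0.05

# An NA condition makes if() stop with "missing value where TRUE/FALSE needed"
unguarded <- tryCatch(
  if (p_value <= Pin) "add term" else "skip term",
  error = function(e) conditionMessage(e)
)
unguarded

# isTRUE() turns NA into FALSE, so the branch is well defined
guarded <- if (isTRUE(p_value <= Pin)) "add term" else "skip term"
guarded
```

In a loop over many datasets, wrapping each call in tryCatch() in a similar way would let the loop continue past the datasets where the constrained component is empty.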
[R] problems with glht for ancova
I've run an ANCOVA; edadysexo is a factor with 3 levels, and log(lcc) is the covariate (continuous variable). I get these results:

    ancova <- aov(log(peso) ~ edadysexo * log(lcc))
    summary(ancova)

                        Df Sum Sq Mean Sq  F value Pr(>F)
    edadysexo            2 31.859 15.9294 803.9843 <2e-16 ***
    log(lcc)             1 11.389 11.3887 574.8081 <2e-16 ***
    edadysexo:log(lcc)   2  0.063  0.0317   1.6021 0.2025
    Residuals          509 10.085  0.0198

Then I tried to do a post-hoc test using glht from the multcomp package but, because the interaction is not significant, I did it just for the factor:

    summary(glht(ancova, linfct = mcp(edadysexo = "Tukey")))

         Simultaneous Tests for General Linear Hypotheses

    Multiple Comparisons of Means: Tukey Contrasts

    Fit: aov(formula = log(peso) ~ edadysexo * log(lcc))

    Linear Hypotheses:
                 Estimate Std. Error t value Pr(>|t|)
    M - H == 0    -1.8218     1.      -1.640    0.230
    SUB - H == 0  -0.9298     1.1957  -0.778    0.717
    SUB - M == 0   0.8921     1.1130   0.801    0.702
    (Adjusted p values reported -- single-step method)

    Warning message:
    In mcp2matrix(model, linfct = linfct) :
      covariate interactions found -- default contrast might be inappropriate

Two questions:

1) Why don't I get differences in edadysexo here when I got them in the ANCOVA?
2) About the warning message: why does R say "covariate interactions found" if the interaction was not significant in the ANCOVA?

Next, I tried to run glht with the interaction but, obviously, I got an error because Tukey only works with factors.

    summary(glht(ancova, linfct = mcp(edadysexo*log(lcc) = "Tukey")))
    Error: unexpected '=' in "summary(glht(ancova, linfct=mcp(edadysexo*log(lcc)="

What should I do? Thanks in advance.

-
Mario Garrido Escudero
PhD student
Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca. Agrícola
Universidad de Salamanca
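On the second question, the warning is about the model formula: the fitted model still contains the edadysexo:log(lcc) term, whether or not that term is significant. A hedged sketch of one common way to proceed, which is an assumption and not an answer given in this thread, is to compare the model with and without the interaction and, if the interaction is dropped, run the comparisons on the additive fit. The data below are simulated; `grp`, `x` and `y` are hypothetical stand-ins for edadysexo, log(lcc) and log(peso).

```r
set.seed(1)

# Simulated stand-in data: three groups, one covariate, parallel slopes
n <- 90
grp <- factor(rep(c("H", "M", "SUB"), each = n / 3))
x <- rnorm(n, mean = 5, sd = 0.3)
shift <- c(H = 0, M = -0.4, SUB = -0.8)[as.character(grp)]
y <- 1 + shift + 2 * x + rnorm(n, sd = 0.15)

# Model with and without the factor-by-covariate interaction
full     <- lm(y ~ grp * x)
additive <- lm(y ~ grp + x)

# F test for the two extra interaction parameters
cmp <- anova(additive, full)
cmp

# In the additive model the group coefficients are differences between
# parallel lines, so they do not depend on the value of the covariate
coef(additive)
```

With the interaction in the model, contrasts between factor levels refer to the point where the covariate equals zero, which can lie far outside the data and give very large standard errors. That may explain the non-significant glht results, though this is a reading of the output rather than something confirmed in the thread.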
[R] Plot maps with R
Dear all,

I would like to use R to make some maps. I want to have strict control over the details of the produced map, such as removing borders and city names, and adding markers and labels. Is there any package apart from RgoogleMaps that can do something like that?

B.R
Alex
[R] Max value of an integer
Hi.

Is there any constant that represents the maximum value of an integer? If I need to set it up myself, what is the maximum value?

Best,
Rui
Re: [R] runif with condition
On Jan 10, 2012, at 18:11 , AlanM wrote:

> I have to disagree with what's been posted, but I think some very interesting points have been addressed. I'd like to add my two cents.
>
> Consider the pair {X, 1-X} where X is sampled from a uniform(0,1) distribution. The quantity 1-X also comes from a uniform(0,1) distribution and therefore is probabilistic and not deterministic.
>
> The sum of independent random variables is itself a random variable.

Also of non-independent ones, provided you allow the possibility of a degenerate distribution, as in the case above.

> If X1, X2, X3 are independent and uniformly distributed, then the distribution of Y = X1 + X2 + X3 can be determined (i.e. Y is probabilistic and NOT deterministic). Y is a random variable, but it is correlated with X1, X2 and X3.
>
> The set {X1, X2, X3, 100 - (X1 + X2 + X3)} contains 4 random variables, however they are neither independent nor identically distributed.

Yes. You can achieve various properties like X1+X2+X3+X4=100, X1,...,X4 identically distributed, but not independent and not uniform. (Generate 4 independent variables from some distribution on the positive axis and rescale to the required sum.)

You can't have X1,...,X4 all uniform on (0,100), even if non-independent, with a sum of 100, because the mean of the sum would be the sum of the means, i.e., 200!

Whether you can have X1,...,X4 exchangeable and uniform on (0,50) is, er, an interesting question. (I would say probably not, but I can't think of an argument.)

> If you are curious, check this out.
>
> Deriving the Probability Density for Sums of Uniform Random Variables
> Edward J. Lusk and Haviland Wright
> The American Statistician
> Vol. 36, No. 2 (May, 1982), pp. 128-130
>
> Thanks to the OP. This has become an interesting thread.
>
> -Alan Mitchell

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
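The rescaling construction mentioned in parentheses above can be sketched in a few lines. The choice of the exponential distribution, the seed and the sample size are assumptions made for illustration; any distribution on the positive axis would serve.

```r
set.seed(42)

# 1000 draws of 4 independent positive variables (exponential chosen arbitrarily)
raw <- matrix(rexp(4 * 1000), ncol = 4)

# Rescale each row so that the four values sum to exactly 100
x <- 100 * raw / rowSums(raw)

# Every row sums to 100 and all entries are positive
range(rowSums(x))

# Columns are identically distributed with mean 100/4 = 25,
# but they are negatively correlated, so not independent
colMeans(x)
cor(x)[1, 2]

# The mean argument from the post: four uniform(0, 100) variables
# would have a sum with mean 4 * 50 = 200, which cannot equal 100
4 * mean(c(0, 100))
```

The columns produced this way are not uniform on any interval, which matches the "not independent and not uniform" caveat in the reply.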
Re: [R] 2 sample wilcox.test != kruskal.test
Hi,

thanks for your answer. Unfortunately I cannot reproduce your results. In my example the results still differ when I use your approach:

    > x <- c(10,11,15,8,16,12,20)
    > y <- c(10,14,18,25,28,30,35)
    > f <- as.factor(c(rep("a",7), rep("b",7)))
    > d <- c(x,y)

    > kruskal.test(x,y)
            Kruskal-Wallis rank sum test
    data:  x and y
    Kruskal-Wallis chi-squared = 6, df = 6, p-value = 0.4232

    > kruskal.test(x~y)
            Kruskal-Wallis rank sum test
    data:  x by y
    Kruskal-Wallis chi-squared = 6, df = 6, p-value = 0.4232

    > kruskal.test(d~f)
            Kruskal-Wallis rank sum test
    data:  d by f
    Kruskal-Wallis chi-squared = 3.6816, df = 1, p-value = 0.05502

    > kruskal.test(f~d)
            Kruskal-Wallis rank sum test
    data:  f by d
    Kruskal-Wallis chi-squared = 11.1429, df = 12, p-value = 0.5167

I know the last one, kruskal.test(f~d), is not correct, as the factor is always placed second in the formula, but I still tried it that way just to be sure...

Cheers
Re: [R] problem installing packages
On 10.01.2012 22:40, R. Michael Weylandt michael.weyla...@gmail.com wrote:
> What lists are you referring to when you state: "there are many packages that do not show up in the list of binaries. They do in the list of sources"? CRAN?
>
> To see all packages installed on your machine try
>
>     rownames(installed.packages())
>
> I think available.packages() will give packages available from your local CRAN mirror.
>
> Just a hunch, but if you are seeing different behaviors between home and work it may well be that a work firewall is blocking packages which contain pre-compiled DLLs/SOs while allowing the source code through.
>
> Also, if you aren't on Windows the vast majority of packages are quite easy to compile yourself.
>
> Michael
>
> On Jan 10, 2012, at 4:01 PM, natalia norden natnor...@gmail.com wrote:
>
>> Thank you very much for your answers. I could do it by downloading the package I needed manually and then installing it through the Terminal. Yet the fundamental problem remains. I downloaded R 2.14.1 several times from different mirrors

I think you should change the mirror for the installation of packages, not for downloading R.

Uwe Ligges

>> and there are many packages that do not show up in the list of binaries. They do in the list of sources, but then I have a problem compiling the package...
>> I did not have this problem from my home computer when I installed R 2.14.0.
>>
>> Best,
>> Natalia
>>
>> On 10/01/12 14:52, Ken Hutchison vicvoncas...@gmail.com wrote:
>>> Maybe check your proxy settings in your browser and make sure you're connecting to the mirror.
>>> Ken
>>>
>>> Sent from my iPhone
>>>
>>> On Jan 10, 2012, at 8:35 AM, natalia norden natnor...@gmail.com wrote:
>>>
>>>> Hello,
>>>> I was using version 2.13.2 and I have just downloaded the latest version 2.14.1. However, I'm trying to install the packages I was using and when I look for them in the packages list, I can't find many in the CRAN binaries (e.g. vegan). I do find them in the CRAN sources but the installation fails. I tried downloading version 2.14.0 and I had the same problem. I re-installed the old version, and now it works again.
>>>> Is this a problem with 2.14?
>>>> Thank you for your help.
>>>> Natalia Norden
>>>>
>>>> Natalia Norden
>>>> Profesor Asistente
>>>> Departamento de Ecología y Territorio
>>>> Facultad de Estudios Ambientales y Rurales
>>>> Pontificia Universidad Javeriana
>>>> Bogotá, Colombia
>>>> Tel: 320 83 20 Ext: 2448
>>>> www.phylodiversity.net/nnorden/
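The two suggestions in this thread (listing what is installed, and changing the mirror used for package installation rather than the one used to download R) might look as follows in a session. The mirror URL is only an example choice, and the install.packages() calls are left commented out because they need network access.

```r
# Names of all packages installed in the current library paths
installed <- rownames(installed.packages())
has_stats <- "stats" %in% installed
has_stats

# Point package installation at a specific CRAN mirror
# (example URL; any mirror from the CRAN mirror list could be used)
options(repos = c(CRAN = "https://cloud.r-project.org"))
getOption("repos")[["CRAN"]]

# Not run here (requires network access):
# avail <- rownames(available.packages())      # what the chosen mirror offers
# install.packages("vegan")                    # binary where one exists
# install.packages("vegan", type = "source")   # force a build from source
```

If a package is listed by available.packages() for sources but not for binaries, the binary for that R version may simply not have been built on that mirror yet, which would fit the advice to try another mirror.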
Re: [R] Max value of an integer
rmx wrote
> Hi. Is there any constant that represents the maximum value of an integer? If I need to set it up myself, what is the maximum value?

    ?.Machine

i.e.

    .Machine$integer.max

Berend
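A short sketch of what that constant holds and what happens just beyond it; the variable names are hypothetical.

```r
# Largest value representable by R's integer type
max_int <- .Machine$integer.max
max_int

# The constant is itself stored as an integer
is_int <- is.integer(max_int)

# Integer arithmetic past the limit yields NA (with a warning)
overflow <- suppressWarnings(max_int + 1L)
overflow

# Adding a double instead promotes the result to double, so nothing overflows
as_double <- max_int + 1
as_double
```

For whole numbers beyond this range, doubles represent integers exactly up to 2^53, which is usually enough for counting purposes.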
Re: [R] 2 sample wilcox.test != kruskal.test
2012/1/11 syrvn ment...@gmx.net

> Hi,
>
> thanks for your answer. Unfortunately I cannot reproduce your results. In my example the results still differ when I use your approach:
>
> > x <- c(10,11,15,8,16,12,20)
> > y <- c(10,14,18,25,28,30,35)
> > f <- as.factor(c(rep("a",7), rep("b",7)))
> > d <- c(x,y)
> > kruskal.test(x,y)

Try comparing wilcox.test with the right formula in kruskal.test. I got:

    all.equal(wilcox.test(x, y, correct = F, exact = F)$p.value, kruskal.test(d~f)$p.value)
    [1] TRUE

--
Miłego dnia
Re: [R] 64bit R under 32bit winxp
On 11/01/2012 08:55, Jeff Newmiller wrote:
> You cannot install 64-bit R on 32-bit OS,

Technically, you can (but by default the packaged installers will refuse to do so). What you cannot do is run 64-bit R on a 32-bit version of Windows: the OS will refuse to run the executables (and if it is old enough, not even recognize them).

The only reason for mentioning this is for the record. People do drag up R-help postings from years past (someone recently sent me a response to one from the 1990s). We do plan to allow both versions of R to be installed on 32-bit Windows in the future (with a warning). You still will not be able to run them, but you will be able to make binary packages for both versions: also this could be useful for system-wide installations from a 32-bit server OS.

> but you can install a 32-bit R on a 64-bit OS, and you can later install 64-bit R as well. That is, installing 32-bit R does not interfere with your option to later install a 64-bit R.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] 2 sample wilcox.test != kruskal.test
The devil is in the details (and in the arguments in Lukasz's code). The defaults for the two functions are different: for your data, wilcox.test uses an exact test (which is not available in kruskal.test, afaik), and it uses the continuity correction if the normal approximation is requested (also not available in kruskal.test). See the manual (in particular ?wilcox.test) for details, or the pertinent literature for the theoretical background.

HTH, Michael

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Lukasz Reclawowicz
> Sent: Wednesday, January 11, 2012 12:00
> To: syrvn
> Cc: r-help@r-project.org
> Subject: Re: [R] 2 sample wilcox.test != kruskal.test
>
> Try comparing wilcox.test with the right formula in kruskal.test. I got:
>
> all.equal(wilcox.test(x, y, correct = F, exact = F)$p.value, kruskal.test(d~f)$p.value)
> [1] TRUE
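The points made in this thread can be put together in one runnable sketch using the data posted earlier. The variable names `p_wilcox`, `p_kruskal` and `df_xy` are hypothetical; the function arguments are the ones discussed above.

```r
x <- c(10, 11, 15, 8, 16, 12, 20)
y <- c(10, 14, 18, 25, 28, 30, 35)

# Stack the two samples and record which sample each value came from
d <- c(x, y)
f <- factor(rep(c("a", "b"), each = 7))

# Switch off the exact test and the continuity correction in wilcox.test
# so that it uses the same normal approximation as kruskal.test
p_wilcox  <- wilcox.test(x, y, exact = FALSE, correct = FALSE)$p.value
p_kruskal <- kruskal.test(d ~ f)$p.value
all.equal(p_wilcox, p_kruskal)

# kruskal.test(x, y) is a different analysis: y is taken as the grouping
# variable, so each of its 7 distinct values becomes its own group (df = 6)
df_xy <- unname(kruskal.test(x, y)$parameter)
df_xy
```

With the default arguments wilcox.test would instead try an exact test or apply the continuity correction, which is why the p-values first appeared to disagree.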
[R] Constructing a data.frame from csv files
Dear R helpers,

Following is my R code, where I am trying to calculate returns and then create a data.frame. Since I do not know how many instruments I will be dealing with, I have written a function. My R code is as follows:

    library(plyr)

    mydata <- data.frame(
      instru_name = c(rep("instru_A", 14), rep("instru_B", 14)),
      date = rep(c("10-Jan-12", "9-Jan-12", "8-Jan-12", "7-Jan-12", "6-Jan-12",
                   "5-Jan-12", "4-Jan-12", "3-Jan-12", "2-Jan-12", "1-Jan-12",
                   "31-Dec-11", "30-Dec-11", "29-Dec-11", "28-Dec-11"), 2),
      price = c(11.9, 10.5, 13, 14.5, 14.4, 14.8, 10.1, 12, 14.3, 10.7, 11.2, 10.2, 10.2, 10.8,
                41.9, 40.5, 43, 44.5, 44.4, 48.8, 42.1, 44, 46.3, 48.7, 46.2, 44.2, 42.2, 40.8))

    attach(mydata)

    opt_return_volatilty = function(price, instru_name)
    {
      price_returns = matrix(data = NA, nrow = (length(price) - 1), ncol = 1)
      for (i in (1:(length(price) - 1)))
      {
        price_returns[i] = log(price[i] / price[i + 1])
      }
      volatility = sd(price_returns)
      entity_returns = unique(instru_name)
      colnames(price_returns) = entity_returns
      write.csv(price_returns, file = paste(entity_returns, ".csv", sep = ""), row.names = FALSE)
      return(data.frame(list(volatility = volatility)))
    }

    entity_volatility <- ddply(.data = mydata, .variables = "instru_name",
      .fun = function(x) opt_return_volatilty(price = x$price, instru_name = x$instru_name))

    > entity_volatility
      instru_name volatility
    1    instru_A 0.17746897
    2    instru_B 0.06565341

    > fileNames <- list.files(pattern = "instru.*.csv")
    > fileNames
    [1] "instru_A.csv" "instru_B.csv"

# _
# MY QUERY

I need to construct a data frame consisting of all the returns, i.e. I need to have a data.frame like

        instru_A     instru_B
     0.125163143  0.033983853
    -0.2135741   -0.059898142
    -0.109199292 -0.034289073
     0.006920443  0.00224972
    -0.027398974 -0.094490843

I am using the following code:

    input <- do.call(rbind, lapply(fileNames, function(.name)
    {
      .data <- read.csv(.name, header = TRUE, as.is = TRUE)
      .data$file <- .name
      .data
    }))

I get the following error:

    Error in match.names(clabs, names(xi)) :
      names do not match previous names

Kindly guide.

Regards,
Vincy
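The error arises because rbind() on data frames requires matching column names, while each file here has a single, differently named column (instru_A, instru_B). Since the desired result places the instruments side by side, one possible approach is to bind by column instead. The sketch below is an illustration under stated assumptions: it writes two small files into a temporary directory (`out_dir`, a hypothetical name) using the first two returns from the post, and it assumes every file has the same number of rows.

```r
# Temporary directory with two small files mimicking instru_A.csv / instru_B.csv
out_dir <- tempfile("returns_")
dir.create(out_dir)
write.csv(data.frame(instru_A = c(0.125163143, -0.2135741)),
          file.path(out_dir, "instru_A.csv"), row.names = FALSE)
write.csv(data.frame(instru_B = c(0.033983853, -0.059898142)),
          file.path(out_dir, "instru_B.csv"), row.names = FALSE)

# Find the files; list.files() returns them in alphabetical order
fileNames <- list.files(out_dir, pattern = "^instru.*\\.csv$", full.names = TRUE)

# Read each file and bind the resulting one-column data frames side by side
returns <- do.call(cbind, lapply(fileNames, read.csv, header = TRUE))
returns
```

If the instruments had different numbers of observations, cbind() would fail or recycle, so the series would first need to be aligned, for example by merging on a date column.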
Re: [R] Problem with segmented
Dear Phil,

I am not able to read the error message... did you forget it? However: does x exist in the workspace?

The following lines work:

    myreg2 = lm(y ~ x, data = xy)
    mysegmented = segmented(myreg2, seg.Z = ~x, psi = c(245000))

    myreg2 = lm(xy$y ~ xy$x)
    x <- xy$x
    mysegmented = segmented(myreg2, seg.Z = ~x, psi = c(245000))

The following lines do *not* work (as specified in ?segmented, argument seg.Z):

    myreg2 = lm(xy$y ~ xy$x)
    mysegmented = segmented(myreg2, seg.Z = ~xy$x, psi = c(245000))  # error

I hope this is clear,
vito

On 10/01/2012 17:17, Filoche wrote:
> Hi everyone.
>
> I'm trying to use the segmented function with the following data:
>
> For instance, I use the segmented package as follows:
>
>     myreg2 = lm(xy$y ~ xy$x)
>     mysegmented = segmented(myreg2, seg.Z = ~x, psi = c(245000), control = seg.control(display = FALSE))
>
> Which gets me the following error:
>
> As a break point, a starting guess of 245000 seems fair.
>
> Does anyone have an idea why I'm getting such an error?
>
> Regards,
> Phil

--
Vito M.R. Muggeo
Dip.to Sc Statist e Matem `Vianelli'
Università di Palermo
viale delle Scienze, edificio 13
90128 Palermo - ITALY
tel: 091 23895240 fax: 091 485726
http://dssm.unipa.it/vmuggeo
Re: [R] runif with condition
On 12-01-11 5:12 AM, peter dalgaard wrote:
> On Jan 10, 2012, at 18:11 , AlanM wrote:
>> I have to disagree with what's been posted, but I think some very interesting points have been addressed. I'd like to add my two cents.
>>
>> Consider the pair {X, 1-X} where X is sampled from a uniform(0,1) distribution. The quantity 1-X also comes from a uniform(0,1) distribution and therefore is probabilistic and not deterministic. The sum of independent random variables is itself a random variable.
>
> Also of non-independent ones, provided you allow the possibility of a degenerate distribution, as in the case above.
>
>> If X1, X2, X3 are independent and uniformly distributed, then the distribution of Y = X1 + X2 + X3 can be determined (i.e. Y is probabilistic and NOT deterministic). Y is a random variable, but it is correlated with X1, X2 and X3. The set {X1, X2, X3, 100 - (X1 + X2 + X3)} contains 4 random variables, however they are neither independent nor identically distributed.
>
> Yes. You can achieve various properties like X1+X2+X3+X4=100, X1,...,X4 identically distributed, but not independent and not uniform. (Generate 4 independent variables from some distribution on the positive axis and rescale to the required sum.)
>
> You can't have X1,...,X4 all uniform on (0,100), even if non-independent, with a sum of 100, because the mean of the sum would be the sum of the means, i.e., 200!
>
> Whether you can have X1,...,X4, exchangeable and uniform on (0,50) is, er, an interesting question. (I would say probably not, but I can't think of an argument.)

I think it probably is possible -- it's basically a 4 dimensional copula, and those are pretty flexible. We have a construction for 2D, and I think I have one in 3D. (In 3D, the intersection of the plane X+Y+Z=100 with the cube [0,50]^3 is a regular hexagon; you just need to spread the mass over the hexagon in the right way to get uniform marginals.)

I think the shape W+X+Y+Z=100 makes when it intersects with the 4-cube is a regular octahedron in some projection, but I don't know how to distribute mass over it for uniform marginals.

Duncan Murdoch

>> If you are curious, check this out.
>>
>> Deriving the Probability Density for Sums of Uniform Random Variables
>> Edward J. Lusk and Haviland Wright
>> The American Statistician, Vol. 36, No. 2 (May, 1982), pp. 128-130
>>
>> Thanks to the OP. This has become an interesting thread.
>>
>> -Alan Mitchell
>>
>> --
>> View this message in context: http://r.789695.n4.nabble.com/runif-with-condition-tp4278704p4282600.html
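The rescaling construction mentioned in the thread (generate four independent positive variables, then rescale to the required sum) might look roughly like the sketch below. The exponential distribution, the seed, and the names `raw` and `scaled` are assumptions chosen only for illustration; any distribution on the positive axis would do.

```r
set.seed(1)                            # arbitrary seed, only for reproducibility
n <- 1000                              # number of simulated quadruples (assumed)

# Four independent positive draws per row; rexp() is an assumed choice
raw <- matrix(rexp(4 * n), ncol = 4)

# Rescale each row so that X1 + X2 + X3 + X4 = 100
scaled <- 100 * raw / rowSums(raw)

range(rowSums(scaled))                 # every row sums to 100 (up to rounding)
colMeans(scaled)                       # the four columns share one distribution
```

The columns are identically distributed by symmetry, but they are neither independent nor uniform, which is consistent with the point made in the thread.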
Re: [R] 2 sample wilcox.test != kruskal.test
Hi Michael and Miłego dnia,

yes, right. I get identical results now! Thanks a lot!

--
View this message in context: http://r.789695.n4.nabble.com/2-sample-wilcox-test-kruskal-test-tp4282888p4285325.html
[R] spplot : help with legend
Hi!

I am looking for some examples of how to use spplot to plot an overlay of polygon + point + point, where the two point layers come from different files with different coordinates. I am planning on doing something like this:

polygon.file = readShapeSpatial("polygons")
l1 = list("sp.points", points.file1, pch = 19, col = "red")
l2 = list("sp.points", points.file2, pch = 19, col = "green")
spplot(polygon.file, "common", col = "black", sp.layout = list(l1, l2))

However, I didn't find any help on how to do the legend, except for these examples:

http://www.nceas.ucsb.edu/scicomp/usecases/CreateMapsWithRGraphics
http://casoilresource.lawr.ucdavis.edu/drupal/node/962

It seems there is a way to do it, but it is too confusing for me (mapLegendGrob2).

Any help is appreciated!
[R] get the percentage rank of a value based on an empirical data vector
Hi,

I have a vector with values:

x <- rnorm(1000, 5, 2)

and one single value:

y <- 6.2

Now I would like to know the percent rank of y based on the 'population' vector x. Is there a convenient function that calculates the percent rank of y for the given vector x?

Thanks!
Re: [R] Constructing a data.frame from csv files
The error message says it all: the data frames that you are creating, and then trying to 'rbind', do not have the same columns. You need to at least show what the first couple of lines of each of your input files are, or output the names of the columns as you are reading the files. This is some elementary debugging that you will have to learn.

On Wed, Jan 11, 2012 at 7:38 AM, Vincy Pyne vincy_p...@yahoo.ca wrote:
> Dear R helpers,
>
> Following is my R code where I am trying to calculate returns and then trying to create a data.frame. Since I do not know how many instruments I will be dealing with, I have constructed a function. My R code is as follows -
>
> library(plyr)
>
> mydata <- data.frame(
>   instru_name = c("instru_A","instru_A","instru_A","instru_A","instru_A","instru_A","instru_A","instru_A","instru_A","instru_A","instru_A","instru_A","instru_A","instru_A","instru_B","instru_B","instru_B","instru_B","instru_B","instru_B","instru_B","instru_B","instru_B","instru_B","instru_B","instru_B","instru_B","instru_B"),
>   date = c("10-Jan-12","9-Jan-12","8-Jan-12","7-Jan-12","6-Jan-12","5-Jan-12","4-Jan-12","3-Jan-12","2-Jan-12","1-Jan-12","31-Dec-11","30-Dec-11","29-Dec-11","28-Dec-11","10-Jan-12","9-Jan-12","8-Jan-12","7-Jan-12","6-Jan-12","5-Jan-12","4-Jan-12","3-Jan-12","2-Jan-12","1-Jan-12","31-Dec-11","30-Dec-11","29-Dec-11","28-Dec-11"),
>   price = c(11.9,10.5,13,14.5,14.4,14.8,10.1,12,14.3,10.7,11.2,10.2,10.2,10.8,41.9,40.5,43,44.5,44.4,48.8,42.1,44,46.3,48.7,46.2,44.2,42.2,40.8))
>
> attach(mydata)
>
> opt_return_volatilty = function(price, instru_name)
> {
>   price_returns = matrix(data = NA, nrow = (length(price)-1), ncol = 1)
>   for (i in (1:(length(price)-1)))
>   {
>     price_returns[i] = log(price[i]/price[i+1])
>   }
>   volatility = sd(price_returns)
>   entity_returns = unique(instru_name)
>   colnames(price_returns) = entity_returns
>   write.csv(price_returns, file = paste(entity_returns, ".csv", sep = ""), row.names = FALSE)
>   return(data.frame(list(volatility = volatility)))
> }
>
> entity_volatility <- ddply(.data = mydata, .variables = "instru_name", .fun = function(x) opt_return_volatilty(price = x$price, instru_name = x$instru_name))
>
> entity_volatility
>   instru_name volatility
> 1    instru_A 0.17746897
> 2    instru_B 0.06565341
>
> fileNames <- list.files(pattern = "instru.*.csv")
> fileNames
> [1] "instru_A.csv" "instru_B.csv"
>
> # _
> # MY QUERY
> # I need to construct the data frame consisting of all the returns, i.e. I need to have
> # a data.frame like
>
>     instru_A     instru_B
>  0.125163143  0.033983853
> -0.2135741   -0.059898142
> -0.109199292 -0.034289073
>  0.006920443  0.00224972
> -0.027398974 -0.094490843
>
> I am using the following code:
>
> input <- do.call(rbind, lapply(fileNames, function(.name)
> {
>   .data <- read.csv(.name, header = TRUE, as.is = TRUE)
>   .data$file <- .name
>   .data
> }))
>
> # I get the following error.
> Error in match.names(clabs, names(xi)) :
>   names do not match previous names
>
> Kindly guide
>
> Regards
> Vincy

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
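The debugging step suggested above (print the column names as each file is read) could be sketched as follows. The temporary directory and the two tiny files are hypothetical stand-ins for the instru_A.csv and instru_B.csv files of the question, and using cbind() in place of rbind() is an assumption about what the poster wants, based on the side-by-side layout shown in the query; it only works when every file has the same number of rows.

```r
# Hypothetical stand-ins for the files written by the poster's function
dir <- tempfile("returns_")
dir.create(dir)
write.csv(data.frame(instru_A = c(0.125, -0.214, -0.109)),
          file.path(dir, "instru_A.csv"), row.names = FALSE)
write.csv(data.frame(instru_B = c(0.034, -0.060, -0.034)),
          file.path(dir, "instru_B.csv"), row.names = FALSE)

fileNames <- list.files(dir, pattern = "^instru.*\\.csv$", full.names = TRUE)

# Read each file and print its column names while doing so
pieces <- lapply(fileNames, function(f) {
  d <- read.csv(f, header = TRUE)
  cat(basename(f), ":", names(d), "\n")
  d
})

# The names differ from file to file, which is why rbind() complains
rbind_attempt <- try(do.call(rbind, pieces), silent = TRUE)

# Binding by column puts one instrument per column instead
all_returns <- do.call(cbind, pieces)
```

Printing the names makes the mismatch visible at once: each file carries a single column named after its own instrument, so there are no common columns for rbind() to stack.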
Re: [R] How to make this for() loop memory efficient?
On 01/11/2012 12:09 AM, iliketurtles wrote:
> Ray, your solution works and is indeed faster than mine! It looks like it's going to take a few days to do 400,000 rows, still, which is unfortunate.
>
> Steve, thanks for your help, I'll definitely self-teach plyr and data.table.

I added a column with the first two digits of the module

data$XX <- substr(L[,2], 1, 2)

then created a data frame that summarized the first module of each call and the length of the phone call

df <- with(data, data.frame(FirstModule=tapply(XX, `phone calls`, `[[`, 1),
                            Length=tapply(XX, `phone calls`, length)))

then summarized the length of the phone calls associated with each module

with(df, tapply(Length, FirstModule, mean))

resulting in

> with(df, tapply(Length, FirstModule, mean))
  82   84   92   93   94   96   97
1.00 2.00 1.75 1.67 1.00 1.22 1.67

Martin

> -
> Isaac
> Research Assistant
> Quantitative Finance Faculty, UTS
> --
> View this message in context: http://r.789695.n4.nabble.com/How-to-make-this-for-loop-memory-efficient-tp4283594p4284716.html

--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
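Since the original data are not shown in this message, here is a minimal sketch of the same tapply() pattern on made-up data. The data frame `calls`, its columns `call` and `module`, and all values in it are hypothetical; only the three steps mirror the reply above.

```r
# Hypothetical toy data: one row per module visited during a phone call
calls <- data.frame(
  call   = c(1, 1, 2, 3, 3, 3, 4),
  module = c("8201", "9310", "8450", "9205", "8211", "9700", "8299"),
  stringsAsFactors = FALSE
)

# Step 1: first two digits of the module
calls$XX <- substr(calls$module, 1, 2)

# Step 2: per call, the first module seen and the number of modules visited
per_call <- with(calls, data.frame(
  FirstModule = tapply(XX, call, `[[`, 1),
  Length      = tapply(XX, call, length)
))

# Step 3: mean call length for each first module
mean_len <- with(per_call, tapply(Length, FirstModule, mean))
mean_len
```

The grouping is done once per step by tapply(), so no explicit loop over calls is needed.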
Re: [R] get the percentage rank of a value based on an empirical data vector
On Jan 11, 2012, at 8:12 AM, Martin Batholdy wrote:
> Hi,
>
> I have a vector with values:
>
> x <- rnorm(1000, 5, 2)
>
> and one single value:
>
> y <- 6.2
>
> Now I would like to know the percent rank of y based on the 'population' vector x. Is there a convenient function that calculates the percent rank of y for the given vector x?

Two options:

1) Sort x and use findInterval, divide the index by length(x) and multiply by 100. (It can all be done as a one-liner.)

2) I generally reach for the `ecdf` function-making machine when I see sample quantile problems and see if I can cast the problem in terms for which it applies.

For my random draw I get:

> findInterval(6.2, sort(x))
[1] 704
> xecdf <- ecdf(x)
> xecdf(6.2)
[1] 0.704

--
David Winsemius, MD
West Hartford, CT
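As a sketch of the one-liner alluded to in option 1, set next to option 2: the seed below is arbitrary (so the numbers will differ from the 704 and 0.704 quoted above), and the names `pct_findInterval` and `pct_ecdf` are made up for illustration.

```r
set.seed(42)               # arbitrary seed, not the draw used in the reply
x <- rnorm(1000, 5, 2)
y <- 6.2

# Option 1: position of y within sorted x, expressed as a percentage
pct_findInterval <- 100 * findInterval(y, sort(x)) / length(x)

# Option 2: the empirical CDF evaluated at y, expressed as a percentage
pct_ecdf <- 100 * ecdf(x)(y)

c(pct_findInterval, pct_ecdf)
```

Both count the share of x values that are less than or equal to y, so the two numbers agree.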
[R] Help with speed (replacing the loop?)
Dear R-ers,

I have a loop below that loops through my numeric variables in data frame x and through levels of the factor group, and multiplies (group by group) the values of numeric variables in x by the corresponding group-specific values from data frame y.

In reality, dim(x) is 300,000 rows by 100 variables, and dim(y) is 120 levels of group by 100 variables. So, my huge data frame x takes up a lot of space in memory. This is why I am actually replacing values of a and b in x with newly calculated values, rather than adding them.

The code does what I need, but it takes forever. Is there maybe a speedier way to achieve what I need?

Thanks a lot!
Dimitri

# Example data:
x <- data.frame(group=c(rep("group1",5),rep("group2",5)), a=1:10, b=seq(10,100,by=10))
x$group <- as.factor(x$group)
y <- data.frame(group=c("group1","group2"), a=c(10,20), b=c(2,3))
y$group <- as.factor(y$group)
(x);(y)

# My code:
myvars <- c("a","b")
for(var in myvars){
  for(group in levels(y$group)){
    temp <- x[x$group %in% group, var]
    temp <- temp * y[y$group %in% group, var]
    x[x$group %in% group, var] <- temp
  }
}
(x)

--
Dimitri Liakhovitski
Re: [R] R problem: unable to read data in the xls-format in the PerformAnalytics package
On Wed, Jan 11, 2012 at 2:41 AM, ysei...@bluewin.ch ysei...@bluewin.ch wrote:
> Hello,
>
> I have the following problem.
>
> 1) Problem: I am unable to read data in the xls format in the PerformanceAnalytics package. While it works well for several commands, e.g.
>
> t(table.Stats(msci_ret))
>
> it does not work for other commands, e.g.
>
> x <- msci_ret[, c("CH"), drop = FALSE]
> table.Drawdowns(x)
>
> The error message is:
>
> Error in checkData(R) : The data cannot be converted into a time series. If you are trying to pass in names from a data object with one column, you should use the form 'data[rows, columns, drop = FALSE]'. Rownames should have standard date formats, such as '1985-03-15'.
>
> 2) I guess it is the German date format after I transform the data from ts --> xls.
>
> Jan 1970 -0.025317808 -0.0488751680 -0.006219300 -0.0737541890 -0.015215166
> Feb 1970 -0.016650677 -0.0289782710 -0.053743041  0.0548771330  0.012912517
> Mrz 1970 -0.000312907  0.0094675260  0.025474411  0.0060957210  0.040465110
> ...
> Okt 2010  0.029227924  0.0572494500  0.024199223  0.0386233850 -0.016248884
> Nov 2010 -0.023411677  0.0133836160 -0.024089614  0.0011141310  0.060007812
> Dez 2010  0.019613741  0.0375536810  0.065130745  0.0647369970  0.041147678
> ...
>
> but I am not sure, perhaps it is something else.
>
> 3) What I did: I saved the Excel data (6 raw time series with a header line) in csv format and read it into R, first as ts, and then converting it to csv:
>
> msci_ret = ts(msci_ret, start=1970, frequency=12)
> msci_ret <- as.xts(msci_ret)

Please provide a reproducible example (your file was not attached). Use dput() to include a _small_ sample of your data.

My guess is that Excel is not saving the dates to the CSV in a standard date format.

> Apparently, I do not understand exactly how to generate a date format for monthly data which can be read by PerformanceAnalytics.
>
> I attach my csv data.
>
> Thanks for your help!
> yvonne

Best,
--
Joshua Ulrich | FOSS Trading: www.fosstrading.com
Re: [R] get the percentage rank of a value based on an empirical data vector
If performance is an issue, I think

mean(x < y)

will be as quick as it can be done in R alone. (You could do it in C in a single pass if needed, which might be a good first exercise in using compiled code.)

Michael

On Jan 11, 2012, at 8:58 AM, David Winsemius dwinsem...@comcast.net wrote:
> On Jan 11, 2012, at 8:12 AM, Martin Batholdy wrote:
>> Hi,
>>
>> I have a vector with values:
>>
>> x <- rnorm(1000, 5, 2)
>>
>> and one single value:
>>
>> y <- 6.2
>>
>> Now I would like to know the percent rank of y based on the 'population' vector x. Is there a convenient function that calculates the percent rank of y for the given vector x?
>
> Two options:
>
> 1) Sort x and use findInterval, divide the index by length(x) and multiply by 100. (It can all be done as a one-liner.)
>
> 2) I generally reach for the `ecdf` function-making machine when I see sample quantile problems and see if I can cast the problem in terms for which it applies.
>
> For my random draw I get:
>
> > findInterval(6.2, sort(x))
> [1] 704
> > xecdf <- ecdf(x)
> > xecdf(6.2)
> [1] 0.704
>
> --
> David Winsemius, MD
> West Hartford, CT
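One detail worth checking with the mean() approach is how ties are counted. The small vector below is hypothetical and deliberately contains a value equal to y; the variable names are made up for the sketch.

```r
x <- c(1, 2, 2, 3, 6.2, 7)   # hypothetical data with one value tied with y
y <- 6.2

below       <- mean(x <  y)  # share of x strictly below y
at_or_below <- mean(x <= y)  # share of x at or below y
via_ecdf    <- ecdf(x)(y)    # the empirical CDF counts ties as "at or below"

c(below, at_or_below, via_ecdf)
```

With continuous data such as the rnorm() draw in the question, exact ties are unlikely and the two definitions will normally give the same answer; with rounded or discrete data they can differ.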
[R] general question on Spotfire
Dear R users,

I have been using R for 10 years, and I love it very much. But in my daily job in drug discovery, some people use Spotfire. I tried Spotfire on a couple of data sets. It seems I still need to do some data manipulation before plotting figures. For example, I cannot plot a figure with data arranged in rows (is this true, or am I being stupid?). So far I don't see any benefit that Spotfire provides over R. I am wondering whether this is just because I am new to Spotfire, or whether it is true that Spotfire is not a good tool for statisticians. Also, could anyone give me any suggestions on how to learn Spotfire?

Thanks
John
Re: [R] Help with speed (replacing the loop?)
Hi,

On Wed, Jan 11, 2012 at 9:57 AM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:
> Dear R-ers,
>
> I have a loop below that loops through my numeric variables in data frame x and through levels of the factor group, and multiplies (group by group) the values of numeric variables in x by the corresponding group-specific values from data frame y.
>
> In reality, dim(x) is 300,000 rows by 100 variables, and dim(y) is 120 levels of group by 100 variables. So, my huge data frame x takes up a lot of space in memory. This is why I am actually replacing values of a and b in x with newly calculated values, rather than adding them.
>
> The code does what I need, but it takes forever. Is there maybe a speedier way to achieve what I need?
>
> Thanks a lot!

Here's an all-middle-steps-included way to do so using data.table. If you use more data.table-centric idioms (using the `:=` operator and other ways to `merge`) you can likely eke out lower memory use and higher speed, but I'll leave it like so for pedagogical purposes ;-)

library(data.table)

## your data
xx <- data.table(group=c(rep("group1",5),rep("group2",5)), a=1:10, b=seq(10,100,by=10), key="group")
yy <- data.table(group=c("group1","group2"), a=c(10,20), b=c(2,3), key="group")

## temp data.table to get your ducks in a row
m <- merge(xx, yy, by="group", suffixes=c(".x", ".y"))

## your answers will be in the aa and bb columns
result <- transform(m, aa=a.x * a.y, bb=b.x * b.y)

Truth be told, if you use normal data.frames, the code will look very similar to the above, so you can try that, too.

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
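For readers without data.table installed, the "normal data.frames" variant mentioned at the end of the reply might look like the sketch below. It uses only base R and the example data from the question; it is an illustration of the same merge-then-multiply idea, not a tuned solution.

```r
# Example data from the question, as plain data frames
x <- data.frame(group = rep(c("group1", "group2"), each = 5),
                a = 1:10, b = seq(10, 100, by = 10))
y <- data.frame(group = c("group1", "group2"), a = c(10, 20), b = c(2, 3))

# Line up each row of x with the multipliers for its group
m <- merge(x, y, by = "group", suffixes = c(".x", ".y"))

# The products end up in the aa and bb columns
result <- transform(m, aa = a.x * a.y, bb = b.x * b.y)
result[, c("group", "aa", "bb")]
```

As in the data.table version, the merged table holds both sets of columns at once, so it needs noticeably more memory than x alone.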
Re: [R] general question on Spotfire
On 12-01-11 10:13 AM, John Smith wrote:
> Dear R users,
>
> I have been using R for 10 years, and I love it very much. But in my daily job in drug discovery, some people use Spotfire. I tried Spotfire on a couple of data sets. It seems I still need to do some data manipulation before plotting figures. For example, I cannot plot a figure with data arranged in rows (is this true, or am I being stupid?). So far I don't see any benefit that Spotfire provides over R. I am wondering whether this is just because I am new to Spotfire, or whether it is true that Spotfire is not a good tool for statisticians. Also, could anyone give me any suggestions on how to learn Spotfire?

Shouldn't you be asking this question to Spotfire users?

Duncan Murdoch
Re: [R] plotOHLC(alpha3): Error in plotOHLC(alpha3) : x is not a open/high/low/close time series
Hi Ted,

On Tue, Jan 10, 2012 at 1:59 PM, Ted Byers r.ted.by...@gmail.com wrote:
> R version 2.12.0, 64 bit on Windows.
>
> Here is a short script that illustrates the problem:
>
> library(tseries)
> library(xts)
> setwd('C:\\cygwin\\home\\Ted\\New.Task\\NKs-01-08-12\\NKs\\tests')
> x = read.table("quotes_h.2.dat", header = FALSE, sep="\t", skip=0)
> str(x)
> y <- data.frame(as.POSIXlt(paste(x$V2,substr(x$V4,4,8),sep=" "),format='%Y-%m-%d %H:%M'),x$V5)
> colnames(y) <- c("tickdate","price")
> str(y)
> plot(y)
> z <- as.irts(y)
> str(z)
> plot(z)
> > str(alpha3)
> List of 2
>  $ time : POSIXt[1:98865], format: "2010-06-30 15:47:00" "2010-06-30 15:53:00" "2010-06-30 17:36:00" ...
>  $ value: num [1:98865, 1:4] 9215 9220 9205 9195 9195 ...
>   ..- attr(*, "dimnames")=List of 2
>   .. ..$ : NULL
>   .. ..$ : chr [1:4] "z.Open" "z.High" "z.Low" "z.Close"
>  - attr(*, "class")= chr "ts"
>  - attr(*, "tsp")= num [1:3] 1 2 1

This is a big clue. Your alpha3 object is a list with two elements: 1) the datetime, and 2) the OHLC values as a ts object. There's no as.xts() method for this type of object.

> alpha3 <- as.xts(to.minutes3(z,OHLC = TRUE))

Look at str(z). Why did you convert z to an irts object instead of directly to an xts object? to.minutes() expects an xts object. Try something like this instead:

z <- xts(y[,2], y[,1])
alpha3 <- to.minutes3(z, OHLC=TRUE)

> plotOHLC(alpha3)
> Error in plotOHLC(alpha3) : x is not a open/high/low/close time series
>
> The file quotes_h.2.dat contains real time tick data for futures contracts, so the above manipulation is my attempt to just get a time series with one column being a date/time and the other being tick price. I believe I have to use read.table to make a data frame, and then the manipulations to combine the date and time fields from that feed, along with the price.
>
> My first attempt at using to.minutes3 (and I am interested in the other 'to.period' functions too) is to get a regular time series to which I can apply rollapply, along with a function in which I use various autoregression methods, along with forecasting for as long as the 95% confidence interval is reasonably close - I want to know how far into the future the forecast contains useful information. And then, I want to create a plot in which I do the autoregression, and then plot the actual and forecast prices (along with the confidence interval) as a function of time, embed that in a function which rollapply works with, so I can have a plot comprised of all those individual plots (plotting only the comparison of actual and forecast values).
>
> It seems everything works adequately until I try the plotOHLC function itself, which gives me the error in the subject line.
>
> I would ask for two things:
> 1) what the fix is to get rid of that error plotOHLC gives me
> 2) some tips on the 'walk-forward' method I am looking at using.
>
> Thanks
>
> Ted

--
Joshua Ulrich | FOSS Trading: www.fosstrading.com
Re: [R] Help with speed (replacing the loop?)
Thanks a lot, Steve. I have one question (below):

> library(data.table)
>
> ## your data
> xx <- data.table(group=c(rep("group1",5),rep("group2",5)), a=1:10, b=seq(10,100,by=10), key="group")
> yy <- data.table(group=c("group1","group2"), a=c(10,20), b=c(2,3), key="group")
>
> ## temp data.table to get your ducks in a row
> m <- merge(xx, yy, by="group", suffixes=c(".x", ".y"))

Dimitri: The step above (merge) - I was thinking of it but decided against it because my xx already fills up tons of memory. When I merge xx and yy, that doubles the number of variables - I am afraid my memory won't hold that much stuff...

> ## your answers will be in the aa and bb columns
> result <- transform(m, aa=a.x * a.y, bb=b.x * b.y)
>
> Truth be told, if you use normal data.frames, the code will look very similar to the above, so you can try that, too.
>
> HTH,
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact

--
Dimitri Liakhovitski
marketfusionanalytics.com
Re: [R] get the percentage rank of a value based on an empirical data vector
> > findInterval(6.2, sort(x))
> [1] 704
> > xecdf <- ecdf(x)
> > xecdf(6.2)
> [1] 0.704

Thanks, that helped a lot!

On 11.01.2012, at 14:58, David Winsemius wrote:
> On Jan 11, 2012, at 8:12 AM, Martin Batholdy wrote:
>> Hi,
>>
>> I have a vector with values:
>>
>> x <- rnorm(1000, 5, 2)
>>
>> and one single value:
>>
>> y <- 6.2
>>
>> Now I would like to know the percent rank of y based on the 'population' vector x. Is there a convenient function that calculates the percent rank of y for the given vector x?
>
> Two options:
>
> 1) Sort x and use findInterval, divide the index by length(x) and multiply by 100. (It can all be done as a one-liner.)
>
> 2) I generally reach for the `ecdf` function-making machine when I see sample quantile problems and see if I can cast the problem in terms for which it applies.
>
> For my random draw I get:
>
> > findInterval(6.2, sort(x))
> [1] 704
> > xecdf <- ecdf(x)
> > xecdf(6.2)
> [1] 0.704
>
> --
> David Winsemius, MD
> West Hartford, CT
Re: [R] CairoPDF and greek letter spacing
Thanks for the input, but I can confirm that either form of the paste command, i.e.

expression(paste("Length (", mu*m, ")"))

or my original, which had

expression(paste("Length (", mu, "m)"))

and the expression command alone all have the same effect. It may be limited to Linux machines (which I did state I used in my first post), as when I previously used an OS X machine I didn't see the bug. I don't use OS X anymore; I've implemented a completely Linux-based lab and would like to get this sorted out. Any suggestions? Rolf, if you can reproduce it then I think we have a bug report, but to whom? R or Cairo?

John

From: David Winsemius [dwinsem...@comcast.net]
Sent: Monday, January 09, 2012 10:36 PM
To: Rolf Turner
Cc: Walker, John Stephen; r-help@r-project.org
Subject: Re: [R] CairoPDF and greek letter spacing

On Jan 10, 2012, at 12:22 AM, Rolf Turner wrote:
> On 10/01/12 15:25, David Winsemius wrote:
>> On Jan 9, 2012, at 8:19 PM, Rolf Turner wrote:
>>> On 10/01/12 11:40, John Walker wrote:
>>>> I have a small problem with R graphics output. When I use the lattice package and CairoPDF to generate publication quality graphs I often use expression to create an axis title that has microlitres or micrometers as a unit. I use something like the following
>>>>
>>>> expression(paste("Length (", mu, "m )"))
>>>>
>>>> as an argument to the xlabel function. The command works but the mu and 'm' have a space between them. It looks like 'u m' rather than 'um'. It only seems to happen with the CairoPDF output on my linux machine; it's fine on the X11 device. I've fixed it in the past by importing the pdf into inkscape and manually adjusting the spacing (it's more difficult than it sounds because I can't actually adjust the spacing but have to delete the mu and re-enter it). Is there something I'm doing wrong? Is this a known bug? How can I fix it?
>>>
>>> I can't help, but I can confirm the problem, for what *that* is worth. It seems to be an unfortunate interaction between lattice graphics and the cairo_pdf() device. The space between the mu and the m does not appear with ``ordinary'' R graphics, irrespective of device, nor does it appear with lattice graphics and, e.g., the pdf() device. But it does appear with lattice graphics *and* the cairo_pdf() device. That probably means that the problem is subtle and will be difficult to impossible to fix. :-(
>>
>> Doubtful. Do either of you realize that `paste` is a plotmath function that is misused more often than correctly used (at least as judged by the number of errors submitted to r-help)?
>>
>> I see no workable example, but if I did I would be trying instead:
>>
>> expression(Length~mu*m)
>
> You meant expression(Length==mu*m).

I thought what was wanted:

main=expression(Length~group("(",mu*m,")"))
# Or
main=expression(Length~"("*mu*m*")")

And on a Mac (Leopard, R 2.14.1 Patched) I could not get a different display on the screen device and either pdf or cairo_pdf, so your experience may have something to do with the as yet unstated OSes.

> And yes that helps a bit, but there's still a bit more space between the mu and the m than one would like. Compare:
>
> require(lattice)
> cairo_pdf(file="mung.pdf")
> print(xyplot(y~x,data=data.frame(x=1:10,y=1:10),main=expression(Length==mu*m)))
> dev.off()
>
> and
>
> pdf(file="gorp.pdf")
> print(xyplot(y~x,data=data.frame(x=1:10,y=1:10),main=expression(Length==mu*m)))
> dev.off()
>
> There's not much in it, but there's just enough to be annoying.
>
> cheers,
>
> Rolf
>
> P. S. And I have never been able to figure out *anything* about how paste() and expression() interact. It is a complete mystery, and a matter of trying things more or less at random until something more or less works.
>
> R. T.
>
> P^2. S. I just realised that there's more space between the letters of Length in the cairo_pdf version, as well. Which is, I guess, The Explanation. Is it a font thing?
>
> R. T.

David Winsemius, MD
West Hartford, CT
Re: [R] general question on Spotfire
I am struggling with whether I should learn Spotfire or not. I just want some input from statisticians.

Thanks

On Wed, Jan 11, 2012 at 10:28 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:
> On 12-01-11 10:13 AM, John Smith wrote:
>> Dear R users,
>>
>> I have been using R for 10 years, and I love it very much. But in my daily job in drug discovery, some people use Spotfire. I tried Spotfire on a couple of data sets. It seems I still need to do some data manipulation before plotting figures. For example, I cannot plot a figure with data arranged in rows (is this true, or am I being stupid?). So far I don't see any benefit that Spotfire provides over R. I am wondering whether this is just because I am new to Spotfire, or whether it is true that Spotfire is not a good tool for statisticians. Also, could anyone give me any suggestions on how to learn Spotfire?
>
> Shouldn't you be asking this question to Spotfire users?
>
> Duncan Murdoch
[R] Generating unque patient IDs
Dear group,

I am trying to prepare a NONMEM-friendly dataset for population PK analysis. My patient IDs are 10 digits long, and NONMEM is losing precision and rounding the last couple of digits. I need to generate unique patient IDs from the current 10-digit IDs. I have a total of 250 subjects, so I would appreciate it if anybody could suggest a way to code this in R.

Regards,
Ayyappa
Re: [R] Help with speed (replacing the loop?)
Hi,

On Wed, Jan 11, 2012 at 10:50 AM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:
> Thanks a lot, Steve. I have one question (below):
>
>> library(data.table)
>>
>> ## your data
>> xx <- data.table(group=c(rep("group1",5),rep("group2",5)), a=1:10, b=seq(10,100,by=10), key="group")
>> yy <- data.table(group=c("group1","group2"), a=c(10,20), b=c(2,3), key="group")
>>
>> ## temp data.table to get your ducks in a row
>> m <- merge(xx, yy, by="group", suffixes=c(".x", ".y"))
>
> Dimitri: The step above (merge) - I was thinking of it but decided against it because my xx already fills up tons of memory. When I merge xx and yy, that doubles the number of variables - I am afraid my memory won't hold that much stuff...

Fair enough ... how about just using `match`, then, i.e.:

R> aa <- x$a * y$a[match(x$group, y$group)]

Should do the trick, no?

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
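Extending the `match` idea to every variable, and overwriting the columns of x in place as the original loop did, might look like the sketch below. It uses the example data from the question; the index name `idx` is made up for illustration, and the approach assumes every group in x also appears in y.

```r
# Example data from the question
x <- data.frame(group = rep(c("group1", "group2"), each = 5),
                a = 1:10, b = seq(10, 100, by = 10))
y <- data.frame(group = c("group1", "group2"), a = c(10, 20), b = c(2, 3))

myvars <- c("a", "b")

# For each row of x, the row of y holding its group's multipliers (computed once)
idx <- match(x$group, y$group)

# Overwrite each column of x with the product, one column at a time
for (v in myvars) {
  x[[v]] <- x[[v]] * y[[v]][idx]
}
x
```

Only one extra column-sized vector exists at any moment, so the memory overhead stays far below that of a full merge, and the inner loop over group levels disappears.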
Re: [R] general question on Spotfire
On Jan 11, 2012, at 16:28 , Duncan Murdoch wrote:
> On 12-01-11 10:13 AM, John Smith wrote:
>> Dear R users,
>>
>> I have been using R for 10 years, and I love it very much. But in my daily job in drug discovery, some people use Spotfire. I tried Spotfire on a couple of data sets. It seems I still need to do some data manipulation before plotting figures. For example, I cannot plot a figure with data arranged in rows (is this true, or am I being stupid?). So far I don't see any benefit that Spotfire provides over R. I am wondering whether this is just because I am new to Spotfire, or whether it is true that Spotfire is not a good tool for statisticians. Also, could anyone give me any suggestions on how to learn Spotfire?
>
> Shouldn't you be asking this question to Spotfire users?

Just to clue in the casual reader, Spotfire embeds a version of S+, which is, er, sort of, like, a predecessor to R, so John is not completely off target.

Documents comparing R and S+ should be useful to him. There are books that are bilingual, such as Venables and Ripley's MASS and S Programming, but I also spotted this on TIBCO's own site:

http://spotfire.tibco.com/community/blogs/stn/archive/2010/11/04/differences-between-r-and-spotfire-s.aspx

Also, there are (claimed to be) facilities to integrate R itself in Spotfire, which could be a rather expedient solution.

> Duncan Murdoch

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] Help with speed (replacing the loop?)
It is common that performance problems are addressed by using more memory. If your algorithm needs to join those tables and do calculations, then you can either pay the piper in memory (usually the most appropriate answer) or you can reinvent those optimized algorithms in a compiled language and figure out how to thread them together to minimize memory use (a dangerous course of action). Try working on chunks of xx? Or getting more memory in your computer?

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:

> Thanks a lot, Steve. I have one question (below):
>
>>     library(data.table)
>>     ## your data
>>     xx <- data.table(group=c(rep("group1",5),rep("group2",5)), a=1:10, b=seq(10,100,by=10), key="group")
>>     yy <- data.table(group=c("group1","group2"), a=c(10,20), b=c(2,3), key="group")
>>     ## temp data.table to get your ducks in a row
>>     m <- merge(xx, yy, by="group", suffixes=c(".x", ".y"))
>
> Dimitri: The step above (merge) - I was thinking of it but decided against it because my xx already fills up tons of memory. When I merge xx and yy, that doubles the number of variables - I am afraid my memory won't hold that much stuff...
>
>>     ## your answers will be in the aa and bb columns
>>     result <- transform(m, aa=a.x * a.y, bb=b.x * b.y)
>>
>> Truth be told, if you use normal data.frames, the code will look very similar to above, so you can try that, too.
>>
>> HTH, -steve
>> -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
>
> -- Dimitri Liakhovitski marketfusionanalytics.com
[R] Odp: Generating unque patient IDs
Hi

> Dear group, I am trying to prepare a NONMEM-friendly dataset for population PK analysis. My patient IDs are 10 digits long, and NONMEM is losing precision and rounding the last couple of digits. I need to generate unique patient IDs from the current 10-digit IDs. I have a total of 250 subjects, so I would appreciate it if anybody could suggest a way to code this in R.

I would start with ?abbreviate and check uniqueness with ?unique or ?duplicated.

Regards
Petr

> Regards, Ayyappa
[R] test for condition during whole r session or after each command
Dear List, is there any way to test for certain conditions during the whole R session or after the execution of each command? I am debugging my code, and sometimes a logical error causes a program error much later in the script or function, so, especially with loops, it is hard to trace back the condition or situation that actually created the error. Any ideas?

Jannis
Re: [R] Odp: Generating unque patient IDs
Does this do it for you:

    > sprintf("%010.0f", seq(1000000000.0, length = 250, by = 1.0))
      [1] "1000000000" "1000000001" "1000000002" "1000000003" "1000000004" "1000000005" "1000000006"
      [8] "1000000007" "1000000008" "1000000009" "1000000010" "1000000011" "1000000012" "1000000013"
     [15] "1000000014" "1000000015" "1000000016" "1000000017" "1000000018" "1000000019" "1000000020"
     [22] "1000000021" "1000000022" "1000000023" "1000000024" "1000000025" "1000000026" "1000000027"
     [29] "1000000028" "1000000029" "1000000030" "1000000031" "1000000032" "1000000033" "1000000034"
     [36] "1000000035" "1000000036" "1000000037" "1000000038" "1000000039" "1000000040" "1000000041"
     [43] "1000000042" "1000000043" "1000000044" "1000000045" "1000000046" "1000000047" "1000000048"
     [50] "1000000049" "1000000050" "1000000051" "1000000052" "1000000053" "1000000054" "1000000055"
     [57] "1000000056" "1000000057" "1000000058" "1000000059" "1000000060" "1000000061" "1000000062"
     [64] "1000000063" "1000000064" "1000000065" "1000000066" "1000000067" "1000000068" "1000000069"
     [71] "1000000070" "1000000071" "1000000072" "1000000073" "1000000074" "1000000075" "1000000076"
     [78] "1000000077" "1000000078" "1000000079" "1000000080" "1000000081" "1000000082" "1000000083"
     [85] "1000000084" "1000000085" "1000000086" "1000000087" "1000000088" "1000000089" "1000000090"
     [92] "1000000091" "1000000092" "1000000093" "1000000094" "1000000095" "1000000096" "1000000097"

-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
Re: [R] Generating unque patient IDs
On Jan 11, 2012, at 11:12 AM, Ayyappa Chaturvedula wrote:

> Dear group, I am trying to prepare a NONMEM-friendly dataset for population PK analysis. My patient IDs are 10 digits long, and NONMEM is losing precision and rounding the last couple of digits.

Are you sure?

> I need to generate unique patient IDs from the current 10-digit IDs. I have a total of 250 subjects, so I would appreciate it if anybody could suggest a way to code this in R.

They should not be input or managed as numbers but rather as character variables. If you want them output without quotes, then fine, R can do that when specified.

-- David Winsemius, MD West Hartford, CT
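A minimal sketch of the "keep them as character" advice, for anyone following along. The IDs, the DV column and the temporary file are all invented for illustration; only read.csv(), write.csv() and their colClasses and quote arguments are standard R.

```r
# Hypothetical 10-digit patient IDs, held as character strings (assumed data).
ids <- c("1234567890", "1234567891", "9876543210")
dat <- data.frame(ID = ids, DV = c(1.2, 3.4, 5.6), stringsAsFactors = FALSE)

# quote = FALSE writes the IDs without surrounding quotes.
tmp <- tempfile(fileext = ".csv")
write.csv(dat, tmp, row.names = FALSE, quote = FALSE)

# A named colClasses entry keeps the ID column as character on the way
# back in, instead of letting read.csv() convert it to a number.
back <- read.csv(tmp, colClasses = c(ID = "character"))
str(back)
```

Whether the downstream program then handles 10-digit IDs correctly is a separate question that this sketch does not address.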
Re: [R] Help with speed (replacing the loop?)
Thanks a lot, Steve! match sounds very promising - that means I only need a loop across predictors. As far as the "get more memory" advice is concerned: I already have more memory :)

On Wed, Jan 11, 2012 at 11:14 AM, Steve Lianoglou mailinglist.honey...@gmail.com wrote:

> Hi,
>
> On Wed, Jan 11, 2012 at 10:50 AM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote:
>> Thanks a lot, Steve. I have one question (below):
>>
>>>     ## temp data.table to get your ducks in a row
>>>     m <- merge(xx, yy, by="group", suffixes=c(".x", ".y"))
>>
>> Dimitri: The step above (merge) - I was thinking of it but decided against it because my xx already fills up tons of memory. When I merge xx and yy, that doubles the number of variables - I am afraid my memory won't hold that much stuff...
>
> Fair enough ... how about just using `match`, then, i.e.:
>
>     aa <- xx$a * yy$a[match(xx$group, yy$group)]
>
> Should do the trick, no?
>
> -steve
> -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact

-- Dimitri Liakhovitski marketfusionanalytics.com
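For the archive, a rough sketch of the "match plus a loop across predictors" idea. It uses plain data frames and the toy data from earlier in the thread; the output column names (aa, bb) follow Steve's transform() call, and the loop itself is my assumption about what Dimitri has in mind, not code anyone posted.

```r
# Toy data from the thread: per-row values in xx, per-group multipliers in yy.
xx <- data.frame(group = rep(c("group1", "group2"), each = 5),
                 a = 1:10, b = seq(10, 100, by = 10),
                 stringsAsFactors = FALSE)
yy <- data.frame(group = c("group1", "group2"), a = c(10, 20), b = c(2, 3),
                 stringsAsFactors = FALSE)

# One lookup of row positions in yy, reused for every predictor.
# No merged copy of xx is created.
idx <- match(xx$group, yy$group)

# Loop across predictors only; each step is vectorised over rows.
for (v in c("a", "b")) {
  xx[[paste0(v, v)]] <- xx[[v]] * yy[[v]][idx]
}
head(xx)
```

The extra memory here is one integer vector (idx) plus the result columns, rather than a second copy of every predictor as with merge().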
Re: [R] Odp: Generating unque patient IDs
Unfortunately, the rounding effect (which I assumed was related to the automatic conversion from integer to numeric) is only going to show up above 2147483647L, so I question whether you really demonstrated a solution to what I understood was the fundamental problem.

-- David.

On Jan 11, 2012, at 11:50 AM, jim holtman wrote:

> Does this do it for you:
>
>     sprintf("%010.0f", seq(1000000000.0, length = 250, by = 1.0))

David Winsemius, MD West Hartford, CT
[R] rgl/ x11 problem
Hello,

I am having problems with plot3d. I keep receiving the following messages. When I attempt to load the package with library(rgl, pos=4), I get this error message:

    [6] WARNING: Warning in rgl.init(initValue) :
    Warning in rgl.init(initValue) :
    Warning in rgl.init(initValue) :
    Warning in rgl.init(initValue) :
    RGL: no suitable visual available
    Warning in fun(...) : error in rgl_init

When I attempt to use the plot3d function, I get this message:

    [7] ERROR: rgl_dev_getcurrent

This is a new problem; I have been using these packages and functions without problems for at least a year. I am using Rcommander on Ubuntu 11.10. I have read that this may be an X11 problem; if so, I don't know how to fix it. Any help would be very much appreciated.

Thanks for your time,
-jack
Re: [R] CairoPDF and greek letter spacing
As a workaround you could use escape characters, then adjust the font style as necessary.

    library(lattice)
    cairo_pdf(file = "zend.pdf")
    print(xyplot(y ~ x, data = data.frame(x = 1:10, y = 1:10), main = "Length (\u03BCm)"))
    dev.off()

Regards
Chris Campbell
MANGO SOLUTIONS Data Analysis that Delivers +44 1249 767700

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Walker, John Stephen
Sent: 11 January 2012 16:01
To: David Winsemius; Rolf Turner
Cc: r-help@r-project.org
Subject: Re: [R] CairoPDF and greek letter spacing

Thanks for the input, but I can confirm that either form of the paste command, i.e. expression(paste("Length (", mu*m, ")")) or my original, which had expression(paste("Length (", mu, "m)")), and the expression command alone have the same effect. It may be limited to Linux machines (which I did state I used in my first post), as when I previously used an OS X machine I didn't see the bug. I don't use OS X anymore; I've implemented a completely Linux-based lab and would like to get this sorted out. Any suggestions? Rolf, if you can reproduce it, then I think we have a bug report, but to whom? R or Cairo?

John

From: David Winsemius [dwinsem...@comcast.net]
Sent: Monday, January 09, 2012 10:36 PM
To: Rolf Turner
Cc: Walker, John Stephen; r-help@r-project.org
Subject: Re: [R] CairoPDF and greek letter spacing

On Jan 10, 2012, at 12:22 AM, Rolf Turner wrote:

> On 10/01/12 15:25, David Winsemius wrote:
>> On Jan 9, 2012, at 8:19 PM, Rolf Turner wrote:
>>> On 10/01/12 11:40, John Walker wrote:
>>>> I have a small problem with R graphics output. When I use the lattice package and CairoPDF to generate publication-quality graphs, I often use expression to create an axis title that has microlitres or micrometers as a unit. I use something like 'expression(paste("Length (", mu,"m )"))' as an argument to the xlabel function. The command works, but the mu and 'm' have a space between them. It looks like 'u m' rather than 'um'. It only seems to happen with the CairoPDF output on my Linux machine; it's fine on the X11 device. I've fixed it in the past by importing the pdf into Inkscape and manually adjusting the spacing (it's more difficult than it sounds because I can't actually adjust the spacing but have to delete the mu and re-enter it). Is there something I'm doing wrong? Is this a known bug? How can I fix it?
>>>
>>> I can't help, but I can confirm the problem, for what *that* is worth. It seems to be an unfortunate interaction between lattice graphics and the cairo_pdf() device. The space between the mu and the m does not appear with "ordinary" R graphics, irrespective of device, nor does it appear with lattice graphics and, e.g., the pdf() device. But it does appear with lattice graphics *and* the cairo_pdf() device. That probably means that the problem is subtle and will be difficult to impossible to fix. :-(
>>
>> Doubtful. Do either of you realize that `paste` is a plotmath function that is misused more often than correctly used (at least as judged by the number of errors submitted to r-help)? I see no workable example, but if I did I would be trying instead:
>>
>>     expression(Length~mu*m)
>
> You meant expression(Length==mu*m).

I thought what was wanted:

    main=expression(Length~group("(",mu*m,")"))
    # Or
    main=expression(Length~"("*mu*m*")")

And on a Mac (Leopard, R 2.14.1 Patched) I could not get a different display on the screen device and either pdf or cairo_pdf, so your experience may have something to do with the as yet unstated OSes.

> And yes that helps a bit, but there's still a bit more space between the mu and the m than one would like. Compare:
>
>     require(lattice)
>     cairo_pdf(file="mung.pdf")
>     print(xyplot(y~x, data=data.frame(x=1:10,y=1:10), main=expression(Length==mu*m)))
>     dev.off()
>
> and
>
>     pdf(file="gorp.pdf")
>     print(xyplot(y~x, data=data.frame(x=1:10,y=1:10), main=expression(Length==mu*m)))
>     dev.off()
>
> There's not much in it, but there's just enough to be annoying.
>
> cheers, Rolf
>
> P. S. And I have never been able to figure out *anything* about how paste() and expression() interact. It is a complete mystery, and a matter of trying things more or less at random until something more or less works. R. T.
>
> P^2. S. I just realised that there's more space between the letters of Length in the cairo_pdf version, as well. Which is, I guess, The Explanation. Is it a font thing? R. T.

David Winsemius, MD West Hartford, CT
Re: [R] Odp: Generating unque patient IDs
One of the reasons that I specified the 'seq' command as it was, was to make sure it used numerics:

    > x <- seq(123456789012.0, length = 10, by = 1.0)
    > x
     [1] 123456789012 123456789013 123456789014 123456789015 123456789016 123456789017 123456789018
     [8] 123456789019 123456789020 123456789021
    > str(x)
     num [1:10] 123456789012 123456789013 123456789014 123456789015 123456789016 ...

On Wed, Jan 11, 2012 at 12:14 PM, David Winsemius dwinsem...@comcast.net wrote:

> Unfortunately, the rounding effect (which I assumed was related to the automatic conversion from integer to numeric) is only going to show up above 2147483647L, so I question whether you really demonstrated a solution to what I understood was the fundamental problem.
>
> -- David.

-- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it.
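A small sketch of the numeric facts behind this exchange, for readers of the archive. The ID value is invented; the limits are standard properties of R's integer and double types. It concerns R only and says nothing about how NONMEM stores IDs.

```r
.Machine$integer.max          # 2147483647, the largest R integer

id <- 9876543210              # hypothetical 10-digit ID; too big for integer,
                              # so R holds it as a double
id > .Machine$integer.max     # TRUE
(id + 1) - id == 1            # TRUE: neighbouring 10-digit values stay distinct

# Doubles represent every whole number exactly up to 2^53 (about 9e15);
# only beyond that do consecutive integers start to collide.
2^53 + 1 == 2^53              # TRUE

# Printing all the digits rather than scientific notation.
sprintf("%.0f", id)
format(id, scientific = FALSE)
```

So within R, a 10-digit or 12-digit ID held as a double loses nothing; any loss would have to come from a narrower type or from how the value is printed.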
Re: [R] Odp: Generating unque patient IDs
Dear all, I am sorry if I misstated the problem. The rounding issue is with the NONMEM software, not with R. But the suggestions are helpful.

Regards,
Ayyappa Chaturvedula
Re: [R] Generating unque patient IDs
Dear Ayyappa

Unique identifiers can be created from numbers using factor. These are coded as integers in R, which you could use to relabel your dataset.

    > x <- rep(16:18, each = 2)
    > x
    [1] 16 16 17 17 18 18
    > y <- factor(x)
    > levels(y)
    [1] "16" "17" "18"
    > z <- as.numeric(y)
    > z
    [1] 1 1 2 2 3 3

Regards,
Chris Campbell
MANGO SOLUTIONS Data Analysis that Delivers +44 1249 767700

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ayyappa Chaturvedula
Sent: 11 January 2012 16:12
To: r-help@r-project.org
Subject: [R] Generating unque patient IDs

Dear group, I am trying to prepare a NONMEM-friendly dataset for population PK analysis. My patient IDs are 10 digits long, and NONMEM is losing precision and rounding the last couple of digits. I need to generate unique patient IDs from the current 10-digit IDs. I have a total of 250 subjects, so I would appreciate it if anybody could suggest a way to code this in R.

Regards, Ayyappa
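A possible variation on the relabelling idea, sketched with invented IDs: keep an explicit key table so the short IDs can be mapped back to the originals later. The names old_id, new_id and key are hypothetical. Note that unique() numbers subjects in order of first appearance, whereas factor() numbers them in sorted order.

```r
# Hypothetical 10-digit IDs, several rows per subject, kept as character.
old_id <- c("1000000016", "1000000016", "1000000017", "1000000018", "1000000017")

# Key table: one row per subject, with a short sequential ID.
key <- data.frame(old_id = unique(old_id), stringsAsFactors = FALSE)
key$new_id <- seq_len(nrow(key))

# Relabel every row by looking up its original ID in the key.
new_id <- key$new_id[match(old_id, key$old_id)]
new_id

# Sanity check: each original ID appears exactly once in the key.
stopifnot(!anyDuplicated(key$old_id))
```

Saving key alongside the dataset gives a way back from the short IDs to the 10-digit ones after the analysis.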
Re: [R] 2 sample wilcox.test != kruskal.test
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of syrvn
Sent: Tuesday, January 10, 2012 10:28 AM
To: r-help@r-project.org
Subject: [R] 2 sample wilcox.test != kruskal.test

> Hello, I think I am right in saying that a 2 sample wilcox.test is equal to a 2 sample kruskal.test and a 2 sample t.test is equal to a 2 sample anova. This is also stated in the ?kruskal.test man page:
>
> The Wilcoxon rank sum test (wilcox.test) as the special case for two samples; lm together with anova for performing one-way location analysis under normality assumptions; with Student's t test (t.test) as the special case for two samples.
>
> From this example it seems like it doesn't, but I cannot figure out what I am doing wrong.
>
>     x <- c(10,11,15,8,16,12,20)
>     y <- c(10,14,18,25,28,30,35)
>     f <- c(rep("a",7), rep("b",7))
>     d <- c(x,y)
>     wilcox.test(x,y)
>     kruskal.test(x,y)
>     kruskal.test(x~y)
>     kruskal.test(f~d)
>     t.test(x,y)
>     anova(lm(x~y))
>     summary(aov(lm(x~y)))
>
> And why does kruskal.test(x~y) differ from kruskal.test(f~d)??

You have received answers about the kruskal.test. But, to make a final point, if your purpose for these statements

    t.test(x,y)
    anova(lm(x~y))
    summary(aov(lm(x~y)))

was to compare the t.test results with anova, you have misspecified the call to lm(). To get comparable results you should look at

    t.test(x,y)
    summary(lm(d ~ as.factor(f)))

The difference between the two is that t.test() by default uses the Welch adjustment to the degrees of freedom.

Hope this is helpful,

Dan

Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204
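To make the equivalences concrete, here is a sketch using the data from the original post. The argument choices are my own additions, not something stated in the thread: var.equal = TRUE switches off the Welch adjustment, and exact = FALSE with correct = FALSE makes wilcox.test() use the same uncorrected normal approximation that underlies the Kruskal-Wallis chi-squared statistic.

```r
x <- c(10, 11, 15, 8, 16, 12, 20)
y <- c(10, 14, 18, 25, 28, 30, 35)
d <- c(x, y)                              # response: all 14 values
f <- factor(rep(c("a", "b"), each = 7))   # grouping: sample membership

# Pooled-variance t test versus one-way anova / lm on the stacked data.
p_t   <- t.test(x, y, var.equal = TRUE)$p.value
p_lm  <- summary(lm(d ~ f))$coefficients["fb", "Pr(>|t|)"]
p_aov <- anova(lm(d ~ f))[["Pr(>F)"]][1]
all.equal(p_t, p_lm)
all.equal(p_t, p_aov)

# Rank-sum test versus Kruskal-Wallis, both on the same stacked data.
p_w <- wilcox.test(x, y, exact = FALSE, correct = FALSE)$p.value
p_k <- kruskal.test(d ~ f)$p.value
all.equal(p_w, p_k)
```

In each case the formula has the response on the left and the grouping factor on the right (d ~ f), which is the opposite of the f ~ d call in the original post.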
Re: [R] general question on Spotfire
Peter et al.:

1. I agree with Duncan: wrong list.
2. AFAIK, Spotfire **already** can interface with R.

-- Bert

-- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] stacked barplot colour coding
acacia21 wrote on 01/09/2012 07:01:28 PM:

> Hi all, I'm fairly new to R and its graphing, but having unsuccessfully googled and checked this forum to find an answer to my problem, I'm posting my question here. I'm trying to plot a stacked barplot. I have simple data that looks like this:
>
>     bg    ag
>     0.41  2.81
>     0.37  2.91
>     0.31  2.06
>     0.32  2.39
>
> Every row indicates a factor (1, 2, 3, 4; see names.arg below). Now when I plot this using the following function for stacked barplots:
>
>     plot<-barplot(t(data), main=txt, ylim=c(0,10), col=c("white", "grey90"), ylab="Total biomass (g)", space=0.1, names.arg=c(1, 2, 3, 4))
>
> I get a lovely stacked graph, and it color-codes with white (bg) and grey90 (ag) in each individual stacked bar. I have a total of 4 stacked bars as I have 4 factors. Now here is the question: I would like to add density lines across the entire stacked bar, or any other graphic feature, to distinguish between bars 1 and 2, as they indicate the same factor, and 3 and 4, which indicate a different factor. Is there any way to do that? Surely it's possible, but not so obvious for the beginner =) Thank you very much.

One workaround is to create a second matrix of data, with zeroes for the bars where you don't want to add emphasis. Then, overlay a second bar plot with density lines for emphasis.

    df <- data.frame(bg=c(0.41, 0.37, 0.31, 0.32), ag=c(2.81, 2.91, 2.06, 2.39))
    # matrix for full barplot
    m <- t(df)
    # matrix for subset of barplot to add angled lines to
    mlines <- m
    mlines[, 3:4] <- 0
    barplot(m, space=0.1, col=c("white", "grey90"), ylim=c(0, 10), ylab="Total biomass (g)", names.arg=c(1, 2, 3, 4))
    barplot(mlines, space=0.1, col="black", density=10, angle=35, add=TRUE)

Jean

P.S. You should avoid creating objects that have the same names as currently existing functions, e.g., data and plot.
Re: [R] general question on Spotfire
Hello,

I am a Product Manager at Spotfire, focused on integrating statistical capabilities from R and S+ into Spotfire, so I will make a few comments:

1. We have quite a few customers who use Spotfire and R side-by-side for doing ad hoc data analysis. Sometimes by the same user, sometimes by different users collaborating in a group. Each fills a different niche (with Spotfire focusing on highly interactive visualizations), and often different users are more comfortable with one or the other. Sometimes users will do their initial data manipulation and analysis in R, and then move the data into Spotfire for further interaction, and presentation to other, non-R users.

2. Our focus at Spotfire has been on integrating R and S+ (as Peter mentions below), so that it's easy to create and share interactive Spotfire applications that leverage analytics from R and S+. We want to help customers put the power of R and S+ into the hands of more users, in applications that are friendly and familiar to them. If you'd like more info on that, check out the info on the Statistics Services product at spotfire.tibco.com, or this recorded webcast: http://www.screencast.com/t/So5Kz7gJI4

Regards
Lou
--
Lou Bajuk-Yorgan
Sr. Director, Product Management
Spotfire, TIBCO Software
206-802-2328
lba...@tibco.com
http://spotfire.tibco.com

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of peter dalgaard
Sent: Wednesday, January 11, 2012 8:17 AM
To: Duncan Murdoch
Cc: r-help@r-project.org
Subject: Re: [R] general question on Spotfire

On Jan 11, 2012, at 16:28 , Duncan Murdoch wrote:

On 12-01-11 10:13 AM, John Smith wrote:

Dear R users, I have been using R for 10 years, and I love it very much. But in my daily job for drug discovery, some people use Spotfire. I tried Spotfire on couple of data sets. It sounds I still need do some data manipulation before plot figures. For example, I can not plot figure with data arranged in rows (is this true, or I am stupid?). So far I don't feel any benefit Spotfire can provide over R. I am just wondering whether it just because I am new to Spotfire, or it's true that Spotfire is not a good tool for statistician. Also could anyone give me any suggestion how to learn Spotfire?

Shouldn't you be asking this question to Spotfire users?

Just to clue in the casual reader, Spotfire embeds a version of S+, which is, er, sort of, like, a predecessor to R, so John is not completely off target. Documents comparing R and S+ should be useful to him. There are books that are bilingual, such as Venables and Ripley MASS and S Programming, but I also spotted this on TIBCO's own site: http://spotfire.tibco.com/community/blogs/stn/archive/2010/11/04/differences-between-r-and-spotfire-s.aspx

Also, there are (claimed to be) facilities to integrate R itself in Spotfire, which could be a rather expedient solution.

Duncan Murdoch

--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] Plot maps with R
Dear Alex,

Two other packages that create maps are:

  maps
  mapproj

alaios wrote

Dear all, I would like to use R to make some maps. I want to have strict control over the details of the produced map, like removing borders and city names, and adding markers and labels. Is there any package apart from RgoogleMaps that can do something like that? B.R Alex

---
Heather A. Wright, PhD candidate
Ecology and Evolution of Plankton
Stazione Zoologica Anton Dohrn
Villa Comunale 80121 - Napoli, Italy
--
View this message in context: http://r.789695.n4.nabble.com/Plot-maps-with-R-tp4284932p4285112.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] plotOHLC(alpha3): Error in plotOHLC(alpha3) : x is not a open/high/low/close time series
Hi Joshua,

Thanks. I had used irts because I thought I had to. The tick data I have has some minutes in which there is no data, and others in which there are hundreds, or even thousands, of ticks. If xts supports irregular data, then that is one less step for me to worry about. Alas, your suggestion didn't help:

  > z <- xts(y[,2], y[,1])
  > alpha3 <- to.minutes3(z, OHLC=TRUE)
  > plotOHLC(alpha3)
  Error in plotOHLC(alpha3) : x is not a open/high/low/close time series
  > str(alpha3)
  An ‘xts’ object from 2010-06-30 15:47:00 to 2011-10-31 15:14:00 containing:
    Data: num [1:98865, 1:4] 9215 9220 9205 9195 9195 ...
   - attr(*, "dimnames")=List of 2
    ..$ : NULL
    ..$ : chr [1:4] "z.Open" "z.High" "z.Low" "z.Close"
    Indexed by objects of class: [POSIXct,POSIXt] TZ:
    xts Attributes:
   NULL

Is there anything else I might try?

Thanks again,
Ted
--
View this message in context: http://r.789695.n4.nabble.com/plotOHLC-alpha3-Error-in-plotOHLC-alpha3-x-is-not-a-open-high-low-close-time-series-tp4283217p4286124.html
Sent from the R help mailing list archive at Nabble.com.
[R] Assist me on how I can arrange trend data of rainfall and temperature for analysis
I am a student doing my MSc in Research Methods. I am working on my thesis research on analysing and modelling crop failure risks due to drought in selected districts in Malawi. The analysis and modelling will focus on two crop stages of development: just after planting, and flowering. I have rainfall, temperature, relative humidity, wind speed and coordinate location data for the past 30 years for each of the districts I have chosen based on the Malawi National Program action against climate change. I wish to model using extreme value statistical theory in R with the extRemes package. How do I arrange this data? How do I produce maps using R?

I am comfortable using R for analysing designed experimental data, but I have only recently been exposed to this wonderful and powerful open statistical software. Would anyone with experience in the area I have described help me? My email address is mukha...@yahoo.co.uk. I appreciate it in advance.

-
Collins Tamonde Mukhala
MSc Research Methods student
Jomo Kenyatta University of Agriculture and Technology
P.O. Box 62000-00200 City Square, Nairobi Kenya
Main Campus-Juja
--
View this message in context: http://r.789695.n4.nabble.com/Assist-me-on-how-I-can-arrange-trend-data-of-rainfall-and-temperature-for-analysis-tp4285032p4285032.html
Sent from the R help mailing list archive at Nabble.com.
[R] Accomplishing a loop on multiple columns
Hello,

I have a question concerning 'for loops' on multiple columns. I made 91 columns with results (all made together with a for loop) and I want to use lm to fit the model. I want to compare the results of all these calculated columns (91) with one column of observed values. I use the function lm to fit the model and calculate r.squared. I manage to do this for each column separately. For example, my calculated results are in the data frame 'results6' and my observed results are in data (data$observed).

  #To calculate R2 for column 1:
  lm.modelobs1 <- lm(results6[,c(1)] ~ data$observed)
  R2.1 <- summary(lm.modelobs1)["r.squared"]

  #To calculate R2 for column 91:
  lm.modelobs91 <- lm(results6[,c(91)] ~ data$observed)
  R2.91 <- summary(lm.modelobs91)["r.squared"]

But I think there has to be a method to do this automatically and not 91 times. I tried to use a for loop:

  ###(length(C) = 91)
  results7 <- data.frame(lm.modelobs=rep(NA,length(C)))
  for (i in (1:91)) {
    results7$lm.modelobs[i] <- lm(results6[i] ~ data$observed)
    R2.[i] <- summary(lm.modelobs[i])["r.squared"]
  }

I also tried just calculating results7$lm.modelobs[i] without directly calculating r.squared, but that didn't work either. It seems it is not possible to refer to a column inside a for loop or a function. (If I just ask R for the data in column 5 with 'results6[5]', that works. 'results6[,c(5)]' gives the same, but replacing results6[i] by results6[,c([i])] in the for loop is apparently also not a solution.) I'm looking for a way to repeat a calculation/function on several columns. I need this further on in my script as well, not only in this part.

I would greatly appreciate any suggestions! Thanks!
Nerak
--
View this message in context: http://r.789695.n4.nabble.com/Accomplishing-a-loop-on-multiple-columns-tp4284974p4284974.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] 64bit R under 32bit winxp
Thanks

At 2012-01-11 16:55:32, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:

You cannot install 64-bit R on a 32-bit OS, but you can install a 32-bit R on a 64-bit OS, and you can later install 64-bit R as well. That is, installing 32-bit R does not interfere with your option to later install a 64-bit R.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

孟欣 lm_meng...@163.com wrote:

Hi all: My OS is 32-bit WinXP, but I want to install 64-bit R 2.14.1. The following website says "You can also go back and add 64-bit components to a 32-bit install, or vice versa":
http://cran.r-project.org/bin/windows/rw-FAQ.html#Can-both-32_002d-and-64_002dbit-R-be-installed-on-the-same-machine_003f
Does it mean that I can install and run 64-bit R 2.14.1 under 32-bit WinXP? If so, how can I add 64-bit components to a 32-bit install? Many thanks for your help. My best
Re: [R] Max value of an integer
On 01/11/2012 12:01 PM, Rui Esteves wrote:

Is there any constant that represents the maximum value of an integer?

Yes, there is (assuming you refer to the 'integer' type). See ?.Machine.

  > .Machine$integer.max
  [1] 2147483647
  > as.integer(2147483647)
  [1] 2147483647
  > as.integer(2147483648)
  [1] NA
  Warning message:
  NAs introduced by coercion

Double-precision numbers (i.e. the 'numeric' type) are able to represent a larger range of integer values, from -(2^.Machine$double.digits) to +(2^.Machine$double.digits). If you go outside that range, some integers are not exactly representable.

  > options(digits = 22)
  > print(max.num <- 2 ^ .Machine$double.digits)
  [1] 9007199254740992
  > (max.num) - (max.num - 1)
  [1] 1
  > (max.num + 1) - (max.num)
  [1] 0
  > (max.num + 2) - (max.num + 1)
  [1] 2

If I need to setup by myself what is the maximum value?

I don't understand what you mean.

--
Mikko Korpela
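As a small complement to the reply above, here is a sketch of what happens at those two limits during ordinary arithmetic. The variable names are illustrative only, and the comments assume the usual 32-bit integer and IEEE 754 double types.

```r
# Largest value the 'integer' type can hold
int_max <- .Machine$integer.max

# Integer arithmetic past the limit yields NA (R also emits a warning)
overflow <- suppressWarnings(int_max + 1L)

# Adding a double constant promotes the result to 'numeric', which is exact here
as_double <- int_max + 1

# Beyond 2^double.digits, consecutive integers are no longer all representable
big <- 2^.Machine$double.digits
gap_below <- big - (big - 1)   # spacing just below the limit
gap_above <- (big + 1) - big   # big + 1 rounds back to big
```

So if values may exceed .Machine$integer.max, storing them as doubles (write 1 rather than 1L) is the simplest way to avoid the NA.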
Re: [R] Getting Sphericity Tests for Within Subject Repeated Measure Anova (using car package)
Dear John,

thanks for your help, and sorry for answering this late. My question was a follow-up to an older thread posted several months ago, but your statement helped a lot.

Thanks,
Max
-
Maximilian Mueller
PhD-Student
Department of Business Studies
Leuphana Universität Lüneburg
--
View this message in context: http://r.789695.n4.nabble.com/Getting-Sphericity-Tests-for-Within-Subject-Repeated-Measure-Anova-using-car-package-tp841030p4285034.html
Sent from the R help mailing list archive at Nabble.com.
[R] Problem concerning withRestarts and R2WinBUGS
Dear R-users,

I have a question regarding the withRestarts function in R. I'm running a simulation in which I analyse data using both lme and R2WinBUGS. I want to run this code for 1000 replications; however, the model I'm using is a little 'sensitive', so sometimes the WinBUGS analysis crashes. In that case I want the code to ignore the current results and redo all steps. This means drawing new data, analysing the data with lme, and then analysing the data using WinBUGS. So basically, I want my code to continue until I get 1000 successful replications, and to ignore the replications that failed.

I figured I might be able to do so by putting the data generation and analysis steps in a second function embedded in the overall one, and calling this second function using withRestarts and invokeRestart, but I can't get this to work. Can somebody tell me whether this is even possible and, if so, how withRestarts and invokeRestart should be called to do this? Alternatively, if any of you know another way to achieve the desired effect, I'm also very interested.

Kind regards,
Joran Jongerling
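One possible approach, sketched under assumptions: instead of withRestarts, wrap a single replication in tryCatch and keep looping until enough replications have succeeded. The function run_one_replication below is a hypothetical stand-in for the data generation, lme and WinBUGS steps; it fails at random to mimic an occasional crash. This only helps if the failure surfaces as an R error; a WinBUGS session that hangs without returning would not be caught this way.

```r
# Hypothetical stand-in for one full replication (draw data, lme, WinBUGS).
# It raises an error at random to imitate an occasional WinBUGS failure.
run_one_replication <- function() {
  x <- rnorm(20)
  if (runif(1) < 0.3) stop("simulated WinBUGS crash")
  mean(x)
}

set.seed(1)
n_target <- 1000                      # number of successful replications wanted
results <- vector("list", n_target)
n_ok <- 0
n_failed <- 0

while (n_ok < n_target) {
  # On error, return NULL instead of aborting the whole simulation
  res <- tryCatch(run_one_replication(), error = function(e) NULL)
  if (is.null(res)) {
    n_failed <- n_failed + 1          # discard this attempt; the loop redoes all steps
  } else {
    n_ok <- n_ok + 1
    results[[n_ok]] <- res
  }
}
```

Counting the failures (n_failed) is worth keeping, since discarding crashed replications can bias the simulation results if crashes depend on the data drawn.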
Re: [R] RGL- Drawing Circle
Thanks Duncan. I think I want my circle to be in user coordinates. I tried the first code you gave me and it gives me a circle. But how can I:

1) change the radius?
2) place the circle at a given x,y,z coordinate?
3) turn it 90 degrees upright, like the circular plate on this bus stop sign? http://www.geocities.co.jp/yuganatabi/bus-stop-aba.html
4) and finally, set the orientation angle of the circle plate?

I would be grateful if you could give me some directions, or at least let me know whether it is possible or not.

graham
--
View this message in context: http://r.789695.n4.nabble.com/RGL-Drawing-Circle-tp4278717p4285503.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Accomplishing a loop on multiple columns
Lists are the answer.

  LIST <- list()
  for(i in 1:ncol(results6)) {
    LIST[[i]] <- lm(results6[,i] ~ data$observed)
  }

You'll now have a 91-entry list of lm() fits. You can then do something like this (note that r.squared is a component of summary(), not of the lm object itself):

  LIST2 <- list()
  for(i in 1:length(LIST)) {
    LIST2[[i]] <- summary(LIST[[i]])$r.squared
  }

This should now be a list of 91 R-squared values, which you can unlist() and save in matrix form if you want.

-
Isaac
Research Assistant
Quantitative Finance Faculty, UTS
--
View this message in context: http://r.789695.n4.nabble.com/Accomplishing-a-loop-on-multiple-columns-tp4284974p4285136.html
Sent from the R help mailing list archive at Nabble.com.
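The same idea can be written more compactly with sapply, which loops over the columns of a data frame and collects one number per column. This is only a sketch: dat and results6 below are small simulated stand-ins (5 columns rather than 91) for the poster's objects.

```r
set.seed(42)

# Simulated stand-ins for the poster's data (names and sizes are assumptions)
dat <- data.frame(observed = rnorm(50))
results6 <- as.data.frame(replicate(5, dat$observed + rnorm(50)))

# Fit one regression per column and extract R-squared from its summary
r2 <- sapply(results6, function(col) {
  fit <- lm(col ~ dat$observed)
  summary(fit)$r.squared
})
```

The result r2 is a named numeric vector with one entry per column of results6, so no unlist() step is needed afterwards.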
Re: [R] odfWeave UTF-8 error and latin characters
Hello,

I am using R and LibreOffice on Ubuntu 11.10 (64-bit) and have been experiencing similar problems with character encoding (Swedish, UTF-8) in odfWeave. Here is an example of what it looks like:

Should be: Hör Ärland dåligt?
Appears as: Hör Ãrland dÃ¥ligt?

I found a (pretty clumsy) solution, which I post below. Has anyone been able to solve this in a more elegant way?

Setup:

  > sessionInfo()
  R version 2.13.1 (2011-07-08)
  Platform: x86_64-pc-linux-gnu (64-bit)
  locale:
  [1] LC_CTYPE=sv_SE.UTF-8 LC_NUMERIC=C
  other attached packages:
  [1] odfWeave_0.7.17 XML_3.2-0 lattice_0.19-30

Problem: I have some R syntax for tables in the file in.odt:

  <<vl5, echo=FALSE, results=xml>>=
  irre <- xtabs(~Species, data=iris)
  irre <- data.frame(irre)
  colnames(irre) <- c("växt", "antal")
  row.names(irre) <- c("å", "ä", "ö")
  odfTable(irre)
  odfTableCaption("Tabell åäö")
  @

Running odfWeave on this with

  odfWeave("in.odt", "out.odt")

yields lots of output, ending with this:

  Warning message:
  ‘content.Rnw’ has unknown encoding: assuming Latin-1.

On opening the output file (out.odt), Swedish characters appear jumbled. I had a look at the content.Rnw file, which was correctly encoded as UTF-8. The same was true for the content.xml file in the odt source (this had to be unzipped). I then tried downgrading to XML 3.2, as suggested elsewhere. This didn't help. I then looked for tools for converting an odt file from one encoding to another, again to no avail.

Solution: Save the odt file in flat XML format (in LibreOffice's "Save as" dialog, the second-to-last option). Convert the resulting .fodt file FROM UTF-8 TO Latin-1 (aka ISO_8859-1) with iconv from a bash terminal:

  iconv -t ISO_8859-1 -f UTF-8 -o converted.fodt out.fodt

This produces a correctly encoded file!

--
View this message in context: http://r.789695.n4.nabble.com/odfWeave-UTF-8-error-and-latin-characters-tp2544333p4285335.html
Sent from the R help mailing list archive at Nabble.com.
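For readers wondering where the jumbled characters come from, the following sketch reproduces the symptom in plain R. It is an illustration of the general mechanism (UTF-8 bytes interpreted as Latin-1), not a diagnosis of what odfWeave does internally; the variable names are made up for the example.

```r
# "dåligt" written with an explicit Unicode escape, so the string is UTF-8
original <- "d\u00e5ligt"

# Treat its UTF-8 bytes as if they were Latin-1 and convert "again" to UTF-8.
# The two bytes of 'å' (C3 A5) are then read as two separate characters.
garbled <- iconv(original, from = "latin1", to = "UTF-8")

n_original <- nchar(original)   # number of characters in the correct string
n_garbled  <- nchar(garbled)    # one more, since 'å' became two characters
```

Here garbled contains "Ã¥" in place of "å", which matches the "dÃ¥ligt" shown in the post and is consistent with the warning that the input was assumed to be Latin-1.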
Re: [R] Constructing a data.frame from csv files
Dear Sir,

Thanks a lot for your guidance. I have understood my mistake. It was the naming of the columns, viz.

  colnames(price_returns) = entity_returns

which was creating the problems. The code runs excellently once I got rid of this particular line. I will use melt from reshape etc. to get the required data.frame. Thanks again.

With warm regards
Vincy

--- On Wed, 1/11/12, jim holtman jholt...@gmail.com wrote:

From: jim holtman jholt...@gmail.com
Subject: Re: [R] Constructing a data.frame from csv files
To: Vincy Pyne vincy_p...@yahoo.ca
Cc: r-help@r-project.org
Received: Wednesday, January 11, 2012, 1:49 PM

The error message says it all: the dataframes that you are creating, and then trying to 'rbind', do not have the same columns. You need to at least show what the first couple of lines of each of your input files are, or output the names of the columns as you are reading the files. This is some elementary debugging that you will have to learn.

On Wed, Jan 11, 2012 at 7:38 AM, Vincy Pyne vincy_p...@yahoo.ca wrote:

Dear R helpers, Following is my R code where I am trying to calculate returns and then trying to create a data.frame. Since I am not aware how many instruments I will be dealing with, I have constructed a function.
My R code is as follows -

  library(plyr)

  mydata <- data.frame(
    instru_name = c("instru_A","instru_A","instru_A","instru_A","instru_A","instru_A","instru_A",
                    "instru_A","instru_A","instru_A","instru_A","instru_A","instru_A","instru_A",
                    "instru_B","instru_B","instru_B","instru_B","instru_B","instru_B","instru_B",
                    "instru_B","instru_B","instru_B","instru_B","instru_B","instru_B","instru_B"),
    date = c("10-Jan-12","9-Jan-12","8-Jan-12","7-Jan-12","6-Jan-12","5-Jan-12","4-Jan-12",
             "3-Jan-12","2-Jan-12","1-Jan-12","31-Dec-11","30-Dec-11","29-Dec-11","28-Dec-11",
             "10-Jan-12","9-Jan-12","8-Jan-12","7-Jan-12","6-Jan-12","5-Jan-12","4-Jan-12",
             "3-Jan-12","2-Jan-12","1-Jan-12","31-Dec-11","30-Dec-11","29-Dec-11","28-Dec-11"),
    price = c(11.9,10.5,13,14.5,14.4,14.8,10.1,12,14.3,10.7,11.2,10.2,10.2,10.8,
              41.9,40.5,43,44.5,44.4,48.8,42.1,44,46.3,48.7,46.2,44.2,42.2,40.8))

  attach(mydata)

  opt_return_volatilty = function(price, instru_name) {
    price_returns = matrix(data = NA, nrow = (length(price)-1), ncol = 1)
    for (i in (1:(length(price)-1))) {
      price_returns[i] = log(price[i]/price[i+1])
    }
    volatility = sd(price_returns)
    entity_returns = unique(instru_name)
    colnames(price_returns) = entity_returns
    write.csv(price_returns, file = paste(entity_returns, ".csv", sep = ""), row.names = FALSE)
    return(data.frame(list(volatility = volatility)))
  }

  entity_volatility <- ddply(.data=mydata, .variables = "instru_name",
    .fun=function(x) opt_return_volatilty(price = x$price, instru_name = x$instru_name))

  > entity_volatility
    instru_name volatility
  1    instru_A 0.17746897
  2    instru_B 0.06565341

  > fileNames <- list.files(pattern = "instru.*.csv")
  > fileNames
  [1] "instru_A.csv" "instru_B.csv"

# MY QUERY

# I need to construct the data frame consisting of all the returns, i.e. I need to have a data.frame like

  instru_A       instru_B
   0.125163143    0.033983853
  -0.2135741     -0.059898142
  -0.109199292   -0.034289073
   0.006920443    0.00224972
  -0.027398974   -0.094490843

I am using the following code:

  input <- do.call(rbind, lapply(fileNames, function(.name) {
    .data <- read.csv(.name, header = TRUE, as.is = TRUE)
    .data$file <- .name
    .data
  }))

# I get following error.
  Error in match.names(clabs, names(xi)) : names do not match previous names

Kindly guide

Regards
Vincy

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
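Since the desired result has one column per instrument, another option is to bind the files side by side with cbind rather than stacking them with rbind. The sketch below is an assumption-laden illustration, not the method settled on in the thread: it writes two small files with made-up values to a temporary directory, and it only works when every file has the same number of rows.

```r
# Write two small illustrative files to a temporary directory
tmp <- tempfile("returns_")
dir.create(tmp)
write.csv(data.frame(instru_A = c(0.125, -0.214, -0.109)),
          file.path(tmp, "instru_A.csv"), row.names = FALSE)
write.csv(data.frame(instru_B = c(0.034, -0.060, -0.034)),
          file.path(tmp, "instru_B.csv"), row.names = FALSE)

# Read every matching file; each one becomes a one-column data frame
fileNames <- list.files(tmp, pattern = "^instru.*\\.csv$", full.names = TRUE)
pieces <- lapply(fileNames, read.csv, header = TRUE)

# Bind the columns side by side; column names come from each file's header
returns <- do.call(cbind, pieces)
```

With rbind, the column names of every piece must match, which is what triggers the match.names error when each file carries its own instrument name as the header.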
Re: [R] Problem with segmented
Hi there. Here's the error message:

  Error in seg.lm.fit(y, XREG, Z, PSI, weights, offs, opz) : (Some) estimated psi out of its range

I have tried many ways to specify the arguments, but apparently the error message is related to the estimated break point being invalid. However, my estimates fall within the range of my data.

Thanks for your help,
Phil
--
View this message in context: http://r.789695.n4.nabble.com/Problem-with-segmented-tp4282398p4285505.html
Sent from the R help mailing list archive at Nabble.com.
[R] New to R, Curious about Project Idea
Good morning,

I am a student who is currently working on a term project for my GIS program. I am looking for a software package that can aid me in my project, and I was curious whether R would be able to address my goals. My project includes power outage data from a hydro company (point data, with UTM coordinates attached), which is available in an Access database or in a shapefile. I would like to take this power outage data and perform a spatial analysis of it, perhaps as a hot-spot analysis, or in a points-per-raster-square style analysis. With the completed analysis, I would like to use an open source web mapping platform to display it for the 'company' I am performing this for as part of my project. Any insight you could provide would be greatly appreciated.

Thanks,
Phil
--
View this message in context: http://r.789695.n4.nabble.com/New-to-R-Curious-about-Project-Idea-tp4285576p4285576.html
Sent from the R help mailing list archive at Nabble.com.
[R] meta-analysis normal quantile plot metafor
Hello,

I once used the metawin software to perform a meta-analysis (see metawinsoft, Rosenberg et al.) and produced a normal QQ plot to test for a potential bias in the dataset. I now want to re-use the same dataset with the package metafor by W. Viechtbauer (great package, btw). I run the qqnorm.rma.uni function, and I use standardized effect sizes as in metawin. The QQ plot generated with metafor differs from the plot obtained with metawin: most of the data points fall outside the confidence envelope (using the same confidence level). I don't understand very well how the pseudo confidence envelope is created in metafor. Is it more conservative than that from metawin, or created using the package envelope? Unfortunately I do not have access to metawin's code, so I cannot compare implementations, but the manual leads me to think that metawin prints classical confidence intervals...

Thanks for your input!
Ricc

More details:
R version 2.13.1 (2011-07-08)
Platform: x86_64-pc-linux-gnu (64-bit)
metafor_1.6-0
Re: [R] fix and edit don't work: unable to open X Input Method-segfault
On Sun, 08-Jan-2012 at 03:32PM -0600, Paul Johnson wrote:

| I can't run fix() or edit() anymore. Did I break my system?
|
| I'm running Debian Linux with R-2.14.1. As far as I can tell, the R
| packages came from Debian's testing wheezy repository. I would
| like to know if users on other types of systems see the same
| problem. If no, then, obviously, it is a Debian-only issue and I

I compiled R-2.14.1 on CentOS and on Kubuntu (64 bit) without any problem. Since Kubuntu is a Debian based distro, I don't think it's a Debian problem. However, if R-2.14.0 still works for you but R-2.14.1 does not, that's an indication that it would be the debs you used.

| can approach it from that point of view. And if no other Debian
| users see same, it means it is a me-only problem, and that's
| discouraging :)

[...]

| sessionInfo()
| R version 2.14.1 (2011-12-22)
| Platform: x86_64-pc-linux-gnu (64-bit)
|
| locale:
| [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
| [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
| [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
| [7] LC_PAPER=C                 LC_NAME=C
| [9] LC_ADDRESS=C               LC_TELEPHONE=C
| [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
|
| attached base packages:
| [1] grid stats graphics grDevices utils datasets methods
| [8] base
|
| other attached packages:
| [1] ggplot2_0.8.9 proto_0.3-9.2 reshape_0.8.4 plyr_1.6
| fix(mpg)
| Error in dataentry(datalist, modes) : invalid device
| In addition: Warning message:
| In edit.data.frame(get(subx, envir = parent), title = subx, ...) :
| unable to open X Input Method

That looks like an OS issue (assuming it's not a problem with the deb).

|
| Same happens no matter what packages are loaded, so far as I can tell.
| Here it is without ggplot2, in case you were suspicious of those
| particular datasets.
|
|
| library(datasets)
| datasets()
| Error: could not find function datasets

No surprise there. Everyone will get that message since there is no datasets function.

| help(package=datasets)
| fix(CO2)
| Error in dataentry(datalist, modes) : invalid device
| In addition: Warning message:
| In edit.data.frame(get(subx, envir = parent), title = subx, ...) :
| unable to open X Input Method

There's that X message again.

[...]

HTH

--
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
   ___    Patrick Connolly
 {~._.~}                   Great minds discuss ideas
 _( Y )_                 Average minds discuss events
(:_~*~_:)                  Small minds discuss people
 (_)-(_)                              ..... Eleanor Roosevelt
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
Re: [R] rjags installation trouble
It looks like you are cross-linking to an earlier version of the JAGS library at run time. Check

  /sbin/ldconfig -p | grep jags

When compiling rjags, you can hard-code the location of the jags library using the [GNU-specific] configure option --enable-rpath.

Martyn

On Tue, 2012-01-10 at 10:40 -0500, Ben Bolker wrote:

Trying to install latest rjags (3-5) from CRAN with JAGS 3.2.0 installed on Ubuntu 10.04, with r-devel ... the bottom line is that it fails while loading with

  /libs/rjags.so: undefined symbol: _ZN7Console15checkAdaptationERb

Has anyone else seen this or is it a glitch somewhere in my system?

thanks
Ben Bolker

==
bolker@ubuntu-10-new:~/R/pkgs/rjags$ jags
Welcome to JAGS 3.2.0 on Tue Jan 10 10:38:31 2012
JAGS is free software and comes with ABSOLUTELY NO WARRANTY
Loading module: basemod: ok
Loading module: bugs: ok
.
===

install.packages("rjags") starts out fine:

Installing package(s) into ‘/mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'http://probability.ca/cran/src/contrib/rjags_3-5.tar.gz'
Content type 'application/x-gzip' length 66429 bytes (64 Kb)
opened URL
==
downloaded 64 Kb

* installing *source* package ‘rjags’ ...
** package ‘rjags’ successfully unpacked and MD5 sums checked
checking for prefix by checking for jags... /usr/bin/jags
[snip snip]
** libs
g++ -I/usr/local/lib/R/include -DNDEBUG -I/usr/include/JAGS -I/usr/local/include -fpic -g -O2 -c jags.cc -o jags.o
g++ -I/usr/local/lib/R/include -DNDEBUG -I/usr/include/JAGS -I/usr/local/include -fpic -g -O2 -c parallel.cc -o parallel.o
g++ -shared -L/usr/local/lib -o rjags.so jags.o parallel.o -L/usr/lib -ljags
installing to /mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags/libs
** R
[snip snip]

but fails at:

** building package indices
Error : .onLoad failed in loadNamespace() for 'rjags', details:
  call: dyn.load(file, DLLpath = DLLpath, ...)
  error: unable to load shared object '/mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags/libs/rjags.so':
  /mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags/libs/rjags.so: undefined symbol: _ZN7Console15checkAdaptationERb
ERROR: installing package indices failed
* removing ‘/mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags’
* restoring previous ‘/mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags’

The downloaded source packages are in
‘/tmp/RtmpSKHIz5/downloaded_packages’
Warning message:
In install.packages("rjags") :
  installation of package ‘rjags’ had non-zero exit status

> sessionInfo()
R Under development (unstable) (2012-01-01 r58032)
Platform: i686-pc-linux-gnu (32-bit)

locale:
[1] LC_CTYPE=en_CA.utf8       LC_NUMERIC=C
[3] LC_TIME=en_CA.utf8        LC_COLLATE=en_CA.utf8
[5] LC_MONETARY=en_CA.utf8    LC_MESSAGES=en_CA.utf8
[7] LC_PAPER=C                LC_NAME=C
[9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_CA.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] tools_2.15.0

---
This message and its attachments are strictly confidenti...{{dropped:8}}
[R] turning a list of vectors into a data.frame (as rows of the DF)?
As a newer R practitioner, it seems I stump myself weekly (at least) with issues that have me spinning my wheels. Here is yet another... I'm trying to turn a list of numeric vectors (of unequal length) into a data frame. Each vector held in the list represents a row, and some rows are shorter than others. I would like NAs as placeholders for missing data in the shorter vectors. I think I'm missing something quite basic.

    v1 <- c(1,2,3,4)
    v2 <- c(1,2)
    lst1 <- list(v1,v2)

Of course there is the intuitive:

    as.data.frame(lst1)

However, the recycling rule (expectedly) recycles 1,2 rather than using NAs as placeholders. Then, looking into Teetor's R Cookbook, there is a piece of code that looked (from the description) like it might do the trick:

    do.call(rbind, Map(as.data.frame,lst1))

But I get the error:

    Error in match.names(clabs, names(xi)) : names do not match previous names

Thinking the source of the error had to do with the vectors of unequal length, I tried Hadley's rbind.fill thusly:

    library(reshape)
    do.call(rbind.fill, Map(as.data.frame,lst1))

Which produced a dataset, but again, not in the desired format. Thanks in advance to anyone that can bring my frustrations to an end!

C
Re: [R] general question on Spotfire
Roughly 5 years ago, a Spotfire rep at the Joint Statistical Meetings told me they routinely interfaced with both R and S-Plus. I'm not 100% certain, but I believe they have many customers who use that facility today. Spencer On 1/11/2012 10:37 AM, Bert Gunter wrote: Peter et. al: 1. I agree with Duncan: wrong list. 2. AFAIK, Spotfire **already** can interface with R. -- Bert On Wed, Jan 11, 2012 at 8:17 AM, peter dalgaardpda...@gmail.com wrote: On Jan 11, 2012, at 16:28 , Duncan Murdoch wrote: On 12-01-11 10:13 AM, John Smith wrote: Dear R users, I have been using R for 10 years, and I love it very much. But in my daily job for drug discovery, some people use Spotfire. I tried Spotfire on couple of data sets. It sounds I still need do some data manipulation before plot figures. For example, I can not plot figure with data arranged in rows (is this true, or I am stupid?). So far I don't feel any benefit Spotfire can provide over R. I am just wondering whether it just because I am new to Spotfire, or it's true that Spotfire is not a good tool for statistician. Also could anyone give me any suggestion how to learn Spotfire? Shouldn't you be asking this question to Spotfire users? Just to clue in the casual reader, Spotfire embeds a version of S+, which is, er, sort of, like, a predecessor to R, so John is not completely off target. Documents comparing R and S+ should be useful to him. There are books that are bilingual, such as Venables and Ripley MASS and S Programming, but I also spotted this on TIBCO's own site: http://spotfire.tibco.com/community/blogs/stn/archive/2010/11/04/differences-between-r-and-spotfire-s.aspx Also, there are (claimed to be) facilities to integrate R itself in Spotfire, which could be a rather expedient solution. 
Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Peter Dalgaard, Professor Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Spencer Graves, PE, PhD President and Chief Technology Officer Structure Inspection and Monitoring, Inc. 751 Emerson Ct. San José, CA 95126 ph: 408-655-4567 web: www.structuremonitoring.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Confidence Interval from Moments?
Hi all, I'm wondering whether it is possible to construct a confidence interval using only the mean, variance, skewness and kurtosis, i.e. without any of the population? If anyone could help with this it'd be much appreciated (even if just a confirmation of it being impossible!). Thanks. -- View this message in context: http://r.789695.n4.nabble.com/Confidence-Interval-from-Moments-tp4284937p4284937.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Getting a root edge error when trying to read the phylocom megatree into R
Hi, I'm trying to use the picante package in R to build phylogenetic trees, based on a list of taxa I have data for and the phylocom APG3 megatree (version R20091110; http://www.phylodiversity.net/phylomatic/). However, trying to read the tree in R using: read.tree(R20091110.new.txt) Gives the error message: Error in if (sum(obj[[i]]$edge[, 1] == ROOT) == 1 dim(obj[[i]]$edge)[1] : missing value where TRUE/FALSE needed This seems like such a basic question, but I don't know what the problem with the root edge is. Are there any modifications you have to make to the megatree in order to read it in R? Thanks so much for your help! Best, Megan Bartlett [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Restricting R session
On 10.01.2012 20:30, Antonio Rodriges wrote:
> Hello,
> Is it possible to use R on a public server where each user has their own restricted R session?

This entirely depends on the definition of "restricted"; otherwise the answer is yes.

> In particular, how to prohibit some set of functions, for example, from the base package?

You can't: R is free software. Well, of course you could build your own version of R that did not ship those functions. Anyway, typical restrictions are handled by setting permissions on the file system and/or arranging quotas for space / memory / CPU resources.

> How to limit session operating memory and CPU time?

Ask the manual of your operating system.

Best,
Uwe Ligges

> What additional security considerations must be taken care of?
[R] Storing/Restoring R objects
One of my projects has generated quite a few objects (data frames) related to one portion of this project. They can be listed with the ls() function. What I would like to do is move them to another directory so that data frames for other portions of the project can be more easily seen and used. .RData is a binary file. Are there tools that let me work with this file? What if I rename it and start a new .RData file when I next invoke R? Could I then specify which .RData file should be available on demand? I have not seen (or remembered if I did) discussions on this topic. Please advise. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotOHLC(alpha3): Error in plotOHLC(alpha3) : x is not a open/high/low/close time series
On Wed, Jan 11, 2012 at 11:10 AM, Ted Byers r.ted.by...@gmail.com wrote:
> Hi Joshua,
>
> Thanks. I had used irts because I thought I had to. The tick data I have has some minutes in which there is no data, and others when there are hundreds, or even thousands. If xts supports irregular data, then that is one less step for me to worry about.
>
> Alas, your suggestion didn't help:
>
>     z <- xts(y[,2], y[,1])
>     alpha3 <- to.minutes3(z, OHLC=TRUE)
>     plotOHLC(alpha3)
>     Error in plotOHLC(alpha3) : x is not a open/high/low/close time series

plotOHLC requires a 4-column ts object with columns explicitly named Open, High, Low, Close, in that order. alpha3, as I've defined above, will have 4 columns but may not have those explicit column names. You would have to set them yourself. Then you could run:

    plotOHLC(as.ts(alpha3))

For example, this works:

    library(tseries)   # provides plotOHLC
    library(xts)
    data(sample_matrix)
    x <- as.ts(sample_matrix)
    plotOHLC(x)

> str(alpha3)
> An ‘xts’ object from 2010-06-30 15:47:00 to 2011-10-31 15:14:00 containing:
>   Data: num [1:98865, 1:4] 9215 9220 9205 9195 9195 ...
>  - attr(*, "dimnames")=List of 2
>   ..$ : NULL
>   ..$ : chr [1:4] "z.Open" "z.High" "z.Low" "z.Close"
>   Indexed by objects of class: [POSIXct,POSIXt] TZ:
>   xts Attributes:
>  NULL
>
> Is there anything else I might try?

You could try quantmod::chartSeries(to.minutes3(z, OHLC=TRUE)). I'm not familiar with the charting capabilities in the tseries package, but those in quantmod are quite extensive. See also www.quantmod.com.

> Thanks again,
> Ted

Best,
--
Joshua Ulrich | FOSS Trading: www.fosstrading.com
Re: [R] 2 sample wilcox.test != kruskal.test
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of syrvn
Sent: Wednesday, January 11, 2012 2:36 AM
To: r-help@r-project.org
Subject: Re: [R] 2 sample wilcox.test != kruskal.test

> Hi, thanks for your answer. Unfortunately I cannot reproduce your results. In my example the results still differ when I use your approach:
>
>     x <- c(10,11,15,8,16,12,20)
>     y <- c(10,14,18,25,28,30,35)
>     f <- as.factor(c(rep("a",7), rep("b",7)))
>     d <- c(x,y)
>
>     kruskal.test(x,y)
>     Kruskal-Wallis rank sum test
>     data: x and y
>     Kruskal-Wallis chi-squared = 6, df = 6, p-value = 0.4232
>
>     kruskal.test(x~y)
>     Kruskal-Wallis rank sum test
>     data: x by y
>     Kruskal-Wallis chi-squared = 6, df = 6, p-value = 0.4232
>
>     kruskal.test(d~f)
>     Kruskal-Wallis rank sum test
>     data: d by f
>     Kruskal-Wallis chi-squared = 3.6816, df = 1, p-value = 0.05502
>
>     kruskal.test(f~d)
>     Kruskal-Wallis rank sum test
>     data: f by d
>     Kruskal-Wallis chi-squared = 11.1429, df = 12, p-value = 0.5167
>
> I know the last kruskal.test(f~d) is not correct, as the factor is always placed as the second bit, but I still tried it that way just to be sure...
>
> Cheers

You don't provide context, so I don't know what results you can't reproduce. But you need to reread the help page for kruskal.test and look at the examples given. If x and y are the results for two independent groups, then the call to kruskal.test should be either

    kruskal.test(list(x,y))
    # or,
    kruskal.test(d~f)

Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA
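A minimal sketch of the two calls recommended in the reply above, using the x and y from the quoted message. The comparison with wilcox.test() is an addition that does not appear in the thread: it assumes that "the same result" means the normal approximation without continuity correction (exact = FALSE, correct = FALSE), which is the setting under which the two-sample Wilcoxon test and the Kruskal-Wallis test agree.

```r
# Data from the quoted message
x <- c(10, 11, 15, 8, 16, 12, 20)
y <- c(10, 14, 18, 25, 28, 30, 35)

# Stacked values plus a grouping factor, as in the thread
d <- c(x, y)
f <- factor(rep(c("a", "b"), each = 7))

# The two equivalent ways of calling kruskal.test for two independent groups
kwList    <- kruskal.test(list(x, y))
kwFormula <- kruskal.test(d ~ f)

# Assumption (not from the thread): turn off the exact test and the
# continuity correction so wilcox.test uses the same approximation
wx <- wilcox.test(x, y, exact = FALSE, correct = FALSE)

# Collect the three p-values side by side
pvals <- c(list = kwList$p.value,
           formula = kwFormula$p.value,
           wilcox = wx$p.value)
print(pvals)
```

By contrast, kruskal.test(x, y) treats y as the grouping variable for x, which is why it reports df = 6 in the quoted output.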
[R] 2D filter in R?
Hi all, I am looking for a command for doing 2D filtering (rectangular or Gaussian) in R... I have looked at ksmooth, filter and convolve but they seem to be 1D... Any thoughts? Thanks a lot! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with segmented
We really need the small reproducible example requested in the posting guide, including sample data, the actual R commands you used, the libraries required, and your OS and version of R. Sarah On Wed, Jan 11, 2012 at 9:08 AM, Filoche pmassico...@hotmail.com wrote: Hi there. Here's the error message. Error in seg.lm.fit(y, XREG, Z, PSI, weights, offs, opz) : (Some) estimated psi out of its range I have tried many ways to specify the arguments, but apparently the error message is related to the estimated break point being invalid. However, my estimation fall within range of my data. Thank for your help, Phil -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] stacked barplot colour coding
This graph would be easier under lattice graphics.

    library(lattice)
    biomass <- data.frame(bg=c(0.41, 0.37, 0.31, 0.32),
                          ag=c(2.81, 2.91, 2.06, 2.39))
    b2 <- stack(biomass)
    names(b2) <- c("mass", "type")
    b2$type <- factor(b2$type, levels=c("bg","ag"))
    b2$AB <- rep(c("A","A","B","B"), 2)
    b2$location <- rep(1:4, 2)
    b2
    barchart(mass ~ location | AB, group=type, data=b2,
             horizontal=FALSE, stack=TRUE,
             col=c("white", "grey90"),
             scales=list(x=list(relation="sliced")),
             ylim=c(0, 10), ylab="Total biomass (g)")

I did several things above.

First, I gave the data frame a name that has some semblance of what the data is (I don't know more about it, but you can do even better with an appropriate name).

Second, I recognize that there are three factors, using the technical definition of a factor. You were using the term factor very informally and inconsistently. You said that there are four factors and also that there are two factors. I interpret that as the factor location with 4 levels, and the factor AB with two levels. You also have the factor type with two levels; these are the columns of your original data organization. I invented the names location, AB, and type. You can pick more appropriate names.

Third, since you want to distinguish the AB levels, I did it graphically by putting them in separate panels.

I suggest you learn lattice graphics.

Rich

On Mon, Jan 9, 2012 at 8:01 PM, acacia21 chriss...@hotmail.com wrote:
> Hi all,
> I'm fairly new to R and its graphing, but having unsuccessfully 'googled' and checked this forum to find an answer to my problem, I'm posting my question here. I'm trying to plot a stacked barplot. I have simple data that looks like this:
>
>     bg    ag
>     0.41  2.81
>     0.37  2.91
>     0.31  2.06
>     0.32  2.39
>
> Every row indicates a factor (1,2,3,4, see below in names.arg). Now when I plot this using the following function for stacked barplots:
>
>     plot<-barplot(t(data), main=txt, ylim=c(0,10), col=c("white", "grey90"), ylab="Total biomass (g)", space=0.1, names.arg=c(1, 2, 3, 4))
>
> I get a lovely stacked graph and it color-codes with white (bg) and grey90 (ag) in each individual stacked bar. I have a total of 4 stacked bars as I have 4 factors. Now here is the question: I would like to add density lines across the entire stacked bar, or any other graphic feature, to distinguish between bars 1 and 2, as they indicate the same factor, and 3 and 4, which indicate a different factor. Is there any way to do that? Surely it's possible, but not so obvious for the beginner =)
> thank you very much
> -- View this message in context: http://r.789695.n4.nabble.com/stacked-barplot-colour-coding-tp4280685p4280685.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] New to R, Curious about Project Idea
It sounds quite possible but you'll probably get more specialized help if you ask on the r-sig-geo mailing list. Michael On Wed, Jan 11, 2012 at 9:36 AM, arbeaupg parb...@gmail.com wrote: Good morning, I am a student whom is currently working on a term project for my GIS Program. I am looking for a software package which can aid me in my project, and I was curious if R would be able to address my goals. My project includes power outage data from a hydro company (point data, with UTM coordinates attached), which is available in an Access database, or in a Shapefile. I would like to be able to take this poweroutage data, and then perform a spatial analysis of this data, perhaps as a hot-spot analysis, or in a points per raster square style analysis. With the completed analysis, I would like to be able to use an open source web mapping platform to display it for the 'company' I am performing this for as part of my project. Any insight you could provide me would be greatly, greatly appreciated. Thanks, Phil -- View this message in context: http://r.789695.n4.nabble.com/New-to-R-Curious-about-Project-Idea-tp4285576p4285576.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Restricting R session
Thank you, Uwe, below are my comments.

>> In particular, how to prohibit some set of functions, for example, from the base package?
> You can't: R is free software.

This does not imply it must be inflexible and unsuitable for cloud services.

> Well, of course you could build your own version of R that did not ship those functions. Anyway, typical restrictions are handled by setting permissions on the file system and/or arranging quotas for space / memory / CPU resources.

Disk quotas and permissions are helpful when an R function depends on disk access. For other functions something more special must be devised.

>> How to limit session operating memory and CPU time?
> Ask the manual of your operating system.

The most straightforward Linux command is ulimit, which limits CPU/memory per shell, per user, or system wide. However, sessions are child processes of R, and this requires testing whether the limits will cover all sessions together or each session separately.

--
Kind regards,
Antonio Rodriges
Re: [R] turning a list of vectors into a data.frame (as rows of the DF)?
Most methods take the rows of data.frame()s to be very significant (indicating multiple values from a single observation), so what you're doing seems like it may be against the spirit of R, but if you want a simple NA padding at the end, this should do it:

    listToDF <- function(inputList, fill = NA){
        # Use fill = NULL for regular recycling behavior
        maxLen = max(sapply(inputList, length))
        for(i in seq_along(inputList))
            inputList[[i]] <- c(inputList[[i]], rep(fill, maxLen - length(inputList[[i]])))
        return(as.data.frame(inputList))
    }

    listToDF(list(x = 1:4, y = 1:2))

Michael

On Wed, Jan 11, 2012 at 2:40 PM, Chris Conner connerpha...@yahoo.com wrote:
> As a newer R practitioner, it seems I stump myself weekly (at least) with issues that have me spinning my wheels. Here is yet another... I'm trying to turn a list of numeric vectors (of unequal length) into a data frame. Each vector held in the list represents a row, and some rows are shorter than others. I would like NAs as placeholders for missing data in the shorter vectors. I think I'm missing something quite basic.
>
>     v1 <- c(1,2,3,4)
>     v2 <- c(1,2)
>     lst1 <- list(v1,v2)
>
> Of course there is the intuitive:
>
>     as.data.frame(lst1)
>
> However, the recycling rule (expectedly) recycles 1,2 rather than using NAs as placeholders. Then, looking into Teetor's R Cookbook, there is a piece of code that looked (from the description) like it might do the trick:
>
>     do.call(rbind, Map(as.data.frame,lst1))
>
> But I get the error:
>
>     Error in match.names(clabs, names(xi)) : names do not match previous names
>
> Thinking the source of the error had to do with the vectors of unequal length, I tried Hadley's rbind.fill thusly:
>
>     library(reshape)
>     do.call(rbind.fill, Map(as.data.frame,lst1))
>
> Which produced a dataset, but again, not in the desired format. Thanks in advance to anyone that can bring my frustrations to an end!
>
> C
Re: [R] Storing/Restoring R objects
R is the natural tool to operate on a .RData file: you can name it however you wish, either with your OS or with the save() command in R. You can load any .RData file with the load() command, but the startup routine only looks for .RData (to my knowledge) unless you put specific instructions in your .Rprofile.

What I might suggest is (if there is a natural modularity to your project) saving the sets of relevant data frames in .RData files (which can hold multiple objects) and then writing a little prompt to ask your user which one he would like to load using readline(): e.g.,

    cat("Here are available data sets", dir(pattern = ".RData"), "\n")
    load(dir()[pmatch(readline("Which data set would you like to load"), dir())])

Does this help?

Michael

On Wed, Jan 11, 2012 at 2:56 PM, Rich Shepard rshep...@appl-ecosys.com wrote:
> One of my projects has generated quite a few objects (data frames) related to one portion of this project. They can be listed with the ls() function. What I would like to do is move them to another directory so that data frames for other portions of the project can be more easily seen and used.
>
> .RData is a binary file. Are there tools that let me work with this file? What if I rename it and start a new .RData file when I next invoke R? Could I then specify which .RData file should be available on demand?
>
> I have not seen (or remembered if I did) discussions on this topic. Please advise.
>
> Rich
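The save()/load() workflow described above can be sketched as follows. The object names (siteA, siteB), the file name part1.RData, and the use of tempdir() are all hypothetical placeholders, not anything from the original project; loading into a separate environment is one possible way to keep one portion's data frames out of the main workspace, not the only one.

```r
# Hypothetical data frames standing in for one portion of a project
siteA <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.8))
siteB <- data.frame(id = 1:2, value = c(7.0, 6.2))

# Save both objects into one named .RData file
# (tempdir() is used here only so the sketch leaves no files behind)
rdaFile <- file.path(tempdir(), "part1.RData")
save(siteA, siteB, file = rdaFile)

# Remove them from the workspace so ls() is less cluttered
rm(siteA, siteB)

# Later, on demand: load the file into its own environment
part1 <- new.env()
loaded <- load(rdaFile, envir = part1)   # load() returns the object names

print(loaded)
print(part1$siteA)
```

Calling load(rdaFile) without envir would instead restore the objects directly into the current workspace.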
Re: [R] Storing/Restoring R objects
On 11.01.2012 20:56, Rich Shepard wrote: One of my projects has generated quite a few objects (data frames) related to one portion of this project. They can be listed with the ls() function. What I would like to do is move them to another directory so that data frames for other portions of the project can be more easily seen and used. .RData is a binary file. Are there tools that let me work with this file? What if I rename it and start a new .RData file when I next invoke R? Could I then specify which .RData file should be available on demand? See ?save and ?load Uwe Ligges I have not seen (or remembered if I did) discussions on this topic. Please advise. Rich __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] RGL- Drawing Circle
On 11.01.2012 15:08, gli wrote:
> thanks Duncan. I think I want my circle to be in user coordinates. I tried the first code you gave me and it gives me a circle. But how can I:
> 1) change the radius?
> 2) place the circle at a given x,y,z coordinate?
> 3) turn it 90 degrees up, like these circular bus stop plates? http://www.geocities.co.jp/yuganatabi/bus-stop-aba.html
> 4) and finally, set the orientation angle of the circle plate?

What about reading the help pages for the functions Duncan cited?

Uwe Ligges

> Grateful if you could give me some directions, or at least let me know if it is possible or not.
> graham
> -- View this message in context: http://r.789695.n4.nabble.com/RGL-Drawing-Circle-tp4278717p4285503.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] New to R, Curious about Project Idea
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arbeaupg Sent: Wednesday, January 11, 2012 6:36 AM To: r-help@r-project.org Subject: [R] New to R, Curious about Project Idea Good morning, I am a student whom is currently working on a term project for my GIS Program. I am looking for a software package which can aid me in my project, and I was curious if R would be able to address my goals. My project includes power outage data from a hydro company (point data, with UTM coordinates attached), which is available in an Access database, or in a Shapefile. I would like to be able to take this poweroutage data, and then perform a spatial analysis of this data, perhaps as a hot-spot analysis, or in a points per raster square style analysis. With the completed analysis, I would like to be able to use an open source web mapping platform to display it for the 'company' I am performing this for as part of my project. Any insight you could provide me would be greatly, greatly appreciated. Thanks, Phil Not my area of expertise, but if you go to your favorite CRAN mirror and look at the task views, you will find info on spatial analysis. Also, Googling 'R GIS' brings up a ton of hits. Good luck, Dan Daniel Nordlund Bothell, WA USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] summarizing a complex dataframe
I need some help summarizing complex data frames (small example below): m1_1 m2_1 m3_1 m1_2 m2_2 m3_2 i1111222 i1211222 i2221222 For an arbitrary number of columns (say m1 …. m199) where the column names have variable patterns, and such that each set of columns is repeated (with potentially unique data) an arbitrary number of times (say _1 … _1000), I would like to summarize by row the mean values of (m1, m2, m3, … m199) over all replicates (_1, _2, _3, … _1000). I need to do this with a large number of dataframes of variable nrow, ncolumn, and colnames. I've tried various loops creating new dataframes and reassigning cell values in loops or using rbind and bind, but run into trouble in each case. Any ideas? Thanks, Chris __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
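One possible approach to the question above, as a sketch only: it assumes every column name has the form measure_replicate (for example m1_2), so that stripping the final underscore suffix recovers the measure name. The data frame df and its values are hypothetical and are not a reconstruction of the poster's table.

```r
# Hypothetical data: 3 measures (m1..m3), 2 replicates (_1, _2), 3 rows
df <- data.frame(
  m1_1 = c(1, 2, 2), m2_1 = c(1, 1, 2), m3_1 = c(1, 1, 1),
  m1_2 = c(2, 2, 2), m2_2 = c(2, 2, 2), m3_2 = c(2, 2, 2)
)

# Strip the replicate suffix ("_1", "_2", ...) to get the measure name
measure <- sub("_[^_]*$", "", names(df))

# For each measure, average its replicate columns row by row
means <- sapply(unique(measure), function(m) {
  rowMeans(df[, measure == m, drop = FALSE])
})

print(means)
```

Because the grouping is derived from the column names, the same lines apply unchanged to data frames with a different number of rows, measures, or replicates, as long as the naming assumption holds.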
Re: [R] Confidence Interval from Moments?
Assuming a distribution defined solely by those moments it is possible (e.g., z- or t-test confidence intervals) but this isn't really the place to discuss such things since there's no R content to your question: try stats.stackexchange.com Michael On Wed, Jan 11, 2012 at 4:56 AM, lambdatau lewis@rbs.com wrote: Hi all, I'm wondering whether it is possible to construct a confidence interval using only the mean, variance, skewness and kurtosis, i.e. without any of the population? If anyone could help with this it'd be much appreciated (even if just a confirmation of it being impossible!). Thanks. -- View this message in context: http://r.789695.n4.nabble.com/Confidence-Interval-from-Moments-tp4284937p4284937.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
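To give the reply above a little R content, here is a minimal sketch of a t-based confidence interval for the mean computed from summary moments alone. It rests on assumptions that are not in the thread: the sample size n is known (the question does not mention it), the data are roughly normal, and skewness and kurtosis are simply ignored. The function name ciFromMoments and the example numbers are hypothetical.

```r
# t-based confidence interval for a mean, from summary statistics only.
# m: sample mean, v: sample variance, n: sample size (assumed known)
ciFromMoments <- function(m, v, n, level = 0.95) {
  se    <- sqrt(v / n)                          # standard error of the mean
  tcrit <- qt(1 - (1 - level) / 2, df = n - 1)  # two-sided critical value
  c(lower = m - tcrit * se, upper = m + tcrit * se)
}

# Hypothetical example values
ci <- ciFromMoments(m = 10, v = 4, n = 25)
print(ci)
```

Replacing qt(..., df = n - 1) with qnorm(...) gives the corresponding z interval.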
Re: [R] turning a list of vectors into a data.frame (as rows of the DF)?
Perhaps the following does what you want. It extends each element of your list to a common length, converts that to a matrix, then to a data.frame:

    f <- function(data) {
        nCol <- max(vapply(data, length, 0))
        data <- lapply(data, function(row) c(row, rep(NA, nCol-length(row))))
        data <- matrix(unlist(data), nrow=length(data), ncol=nCol, byrow=TRUE)
        data.frame(data)
    }

E.g.,

    rawData <- list(c(1,2,3), c(11,12), integer(), 31)
    f(rawData)
      X1 X2 X3
    1  1  2  3
    2 11 12 NA
    3 NA NA NA
    4 31 NA NA

What sort of data is this? If it is longitudinal it might be more straightforward to store it as a three-column data.frame (columns subject, time, value).

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Chris Conner
Sent: Wednesday, January 11, 2012 11:41 AM
To: HelpR
Subject: [R] turning a list of vectors into a data.frame (as rows of the DF)?

As a newer R practitioner, it seems I stump myself weekly (at least) with issues that have me spinning my wheels. Here is yet another... I'm trying to turn a list of numeric vectors (of unequal length) into a data frame. Each vector held in the list represents a row, and some rows are shorter than others. I would like NAs as placeholders for missing data in the shorter vectors. I think I'm missing something quite basic.

    v1 <- c(1,2,3,4)
    v2 <- c(1,2)
    lst1 <- list(v1,v2)

Of course there is the intuitive:

    as.data.frame(lst1)

However, the recycling rule (expectedly) recycles 1,2 rather than using NAs as placeholders. Then, looking into Teetor's R Cookbook, there is a piece of code that looked (from the description) like it might do the trick:

    do.call(rbind, Map(as.data.frame,lst1))

But I get the error:

    Error in match.names(clabs, names(xi)) : names do not match previous names

Thinking the source of the error had to do with the vectors of unequal length, I tried Hadley's rbind.fill thusly:

    library(reshape)
    do.call(rbind.fill, Map(as.data.frame,lst1))

Which produced a dataset, but again, not in the desired format. Thanks in advance to anyone that can bring my frustrations to an end!

C
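The three-column layout suggested in the reply above (columns subject, time, value) could be built along these lines. This is a sketch under the assumption that each list element is one subject and that the position within the vector plays the role of time; the thread does not say what the data actually represent. It uses lengths(), which needs R 3.2.0 or later.

```r
# Same example list as in the reply: four "rows" of unequal length
rawData <- list(c(1, 2, 3), c(11, 12), integer(), 31)

# Number of values per list element (the third element is empty)
lens <- lengths(rawData)

# One row per observed value: no NA padding is needed in long format
longDF <- data.frame(
  subject = rep(seq_along(rawData), times = lens),  # which list element
  time    = sequence(lens),                         # position within it
  value   = unlist(rawData)
)

print(longDF)
```

Note that the empty third element contributes no rows at all here, whereas the wide layout represents it as a row of NAs.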
Re: [R] turning a list of vectors into a data.frame (as rows of the DF)?
On Jan 11, 2012, at 1:40 PM, Chris Conner wrote:

> As a newer R practitioner, it seems I stump myself weekly (at least) with issues that have me spinning my wheels. Here is yet another... I'm trying to turn a list of numeric vectors (of unequal length) into a dataframe. Each vector held in the list represents a row, and there are some rows of unequal length. I would like NAs as placeholders for missing data in the shorter vectors. I think I'm missing something quite basic.
>
>   v1 <- c(1,2,3,4)
>   v2 <- c(1,2)
>   lst1 <- list(v1,v2)
>
> Of course there is the intuitive:
>
>   as.data.frame(lst1)
>
> However, the recycling rule (expectedly) recycles 1,2 versus using NAs as placeholders.
>
> Then, looking into Teetor's R Cookbook, there is a piece of code that looked (from the description) like it might do the trick:
>
>   do.call(rbind, Map(as.data.frame,lst1))
>
> But I get the error:
>
>   Error in match.names(clabs, names(xi)) : names do not match previous names
>
> Thinking the source of the error had to do with the vectors of unequal length, I tried Hadley's rbind.fill thusly:
>
>   library(reshape)
>   do.call(rbind.fill, Map(as.data.frame,lst1))
>
> Which produced a dataset, but again, not in the desired format.
>
> Thanks in advance to anyone that can bring my frustrations to an end!
>
> C

There may be an easier way, but try this:

  list2df <- function(x) {
      # length of the longest vector
      MAX.LEN <- max(sapply(x, length), na.rm = TRUE)
      # pad each vector with NA; each vector becomes one column
      DF <- data.frame(lapply(x, function(x) c(x, rep(NA, MAX.LEN - length(x)))))
      colnames(DF) <- paste("V", seq(ncol(DF)), sep = "")
      DF
  }

  list2df(lst1)
    V1 V2
  1  1  1
  2  2  2
  3  3 NA
  4  4 NA

HTH,

Marc Schwartz
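The list2df() result above holds one vector per column. If one vector per row is wanted, as the original question describes, a possible base R variant is sketched below. It relies on the fact that indexing a vector past its end yields NA; the names max_len, padded and rows_df are made up for this illustration and are not from the thread.

```r
# Sketch only: pad by indexing past the end of each vector, then bind as rows.
lst1 <- list(c(1, 2, 3, 4), c(1, 2))

# Longest vector length in the list
max_len <- max(vapply(lst1, length, integer(1)))

# v[1:max_len] returns NA for positions beyond length(v)
padded <- lapply(lst1, function(v) v[seq_len(max_len)])

# Stack the padded vectors as rows of a matrix, then convert to a data.frame
rows_df <- as.data.frame(do.call(rbind, padded))

print(rows_df)
```

Transposing the list2df() result with t() would give the same shape as a matrix; the sketch just avoids the intermediate column-wise data.frame.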
Re: [R] rjags installation trouble
On 12-01-11 11:45 AM, Martyn Plummer wrote:
> It looks like you are cross-linking to an earlier version of the JAGS library at run time. Check
>
>   /sbin/ldconfig -p | grep jags
>
> When compiling rjags, you can hard-code the location of the jags library using the [GNU-specific] configure option --enable-rpath.
>
> Martyn

Thanks. I managed to sort it out by assiduously cleaning out all older versions of JAGS ...

Ben Bolker

> On Tue, 2012-01-10 at 10:40 -0500, Ben Bolker wrote:
>> Trying to install latest rjags (3-5) from CRAN with JAGS 3.2.0 installed on Ubuntu 10.04, with r-devel ... the bottom line is that it fails while loading with
>>
>>   /libs/rjags.so: undefined symbol: _ZN7Console15checkAdaptationERb
>>
>> Has anyone else seen this or is it a glitch somewhere in my system?
>>
>> thanks
>> Ben Bolker
>>
>> ==
>>
>> bolker@ubuntu-10-new:~/R/pkgs/rjags$ jags
>> Welcome to JAGS 3.2.0 on Tue Jan 10 10:38:31 2012
>> JAGS is free software and comes with ABSOLUTELY NO WARRANTY
>> Loading module: basemod: ok
>> Loading module: bugs: ok
>> .
>>
>> ===
>>
>> install.packages("rjags") starts out fine:
>>
>> Installing package(s) into ‘/mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library’
>> (as ‘lib’ is unspecified)
>> trying URL 'http://probability.ca/cran/src/contrib/rjags_3-5.tar.gz'
>> Content type 'application/x-gzip' length 66429 bytes (64 Kb)
>> opened URL
>> ==
>> downloaded 64 Kb
>>
>> * installing *source* package ‘rjags’ ...
>> ** package ‘rjags’ successfully unpacked and MD5 sums checked
>> checking for prefix by checking for jags... /usr/bin/jags
>>
>> [snip snip]
>>
>> ** libs
>> g++ -I/usr/local/lib/R/include -DNDEBUG -I/usr/include/JAGS -I/usr/local/include -fpic -g -O2 -c jags.cc -o jags.o
>> g++ -I/usr/local/lib/R/include -DNDEBUG -I/usr/include/JAGS -I/usr/local/include -fpic -g -O2 -c parallel.cc -o parallel.o
>> g++ -shared -L/usr/local/lib -o rjags.so jags.o parallel.o -L/usr/lib -ljags
>> installing to /mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags/libs
>> ** R
>>
>> [snip snip]
>>
>> but fails at:
>>
>> ** building package indices
>> Error : .onLoad failed in loadNamespace() for 'rjags', details:
>>   call: dyn.load(file, DLLpath = DLLpath, ...)
>>   error: unable to load shared object '/mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags/libs/rjags.so':
>>   /mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags/libs/rjags.so: undefined symbol: _ZN7Console15checkAdaptationERb
>> ERROR: installing package indices failed
>> * removing ‘/mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags’
>> * restoring previous ‘/mnt/hgfs/bolker/Documents/LOCAL/lib/R/site-library/rjags’
>>
>> The downloaded source packages are in
>> ‘/tmp/RtmpSKHIz5/downloaded_packages’
>> Warning message:
>> In install.packages("rjags") :
>>   installation of package ‘rjags’ had non-zero exit status
>>
>> sessionInfo()
>> R Under development (unstable) (2012-01-01 r58032)
>> Platform: i686-pc-linux-gnu (32-bit)
>>
>> locale:
>>  [1] LC_CTYPE=en_CA.utf8       LC_NUMERIC=C
>>  [3] LC_TIME=en_CA.utf8        LC_COLLATE=en_CA.utf8
>>  [5] LC_MONETARY=en_CA.utf8    LC_MESSAGES=en_CA.utf8
>>  [7] LC_PAPER=C                LC_NAME=C
>>  [9] LC_ADDRESS=C              LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_CA.utf8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats graphics grDevices utils datasets methods base
>>
>> loaded via a namespace (and not attached):
>> [1] tools_2.15.0
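The two suggestions from Martyn Plummer's reply can also be tried from inside R. The sketch below is an illustration under assumptions, not a tested recipe for this system: the helper name find_jags_libs and the sample linker-cache lines are made up, and the two commented-out calls assume a Linux machine with /sbin/ldconfig and a source install of rjags. install.packages() does take a configure.args argument, which is one way to hand --enable-rpath to the package's configure script.

```r
# Hypothetical helper: keep only linker-cache entries that mention JAGS.
find_jags_libs <- function(ldconfig_lines) {
  grep("jags", ldconfig_lines, value = TRUE, ignore.case = TRUE)
}

# Made-up sample lines standing in for `ldconfig -p` output
# (NOT taken from the poster's machine).
sample_lines <- c(
  "\tlibjags.so.3 (libc6) => /usr/local/lib/libjags.so.3",
  "\tlibjags.so.2 (libc6) => /usr/lib/libjags.so.2",
  "\tlibm.so.6 (libc6) => /lib/libm.so.6"
)

jags_hits <- find_jags_libs(sample_lines)
print(jags_hits)

# On a real system (assumption: /sbin/ldconfig exists), the same check would be:
# find_jags_libs(system2("/sbin/ldconfig", "-p", stdout = TRUE))

# Passing the configure option named in the reply when installing from source:
# install.packages("rjags", configure.args = "--enable-rpath")
```

More than one libjags entry in the real output would be consistent with the diagnosis in the reply, namely that an older JAGS library is being picked up at run time.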
Re: [R] Storing/Restoring R objects
?save
?load

You may need to create single-use environment objects to hold the whole file while you separate multiple objects.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

Rich Shepard rshep...@appl-ecosys.com wrote:

> One of my projects has generated quite a few objects (data frames) related to one portion of this project. They can be listed with the ls() function. What I would like to do is move them to another directory so that data frames for other portions of the project can be more easily seen and used.
>
> .RData is a binary file. Are there tools that let me work with this file?
>
> What if I rename it and start a new .RData file when I next invoke R? Could I then specify which .RData file should be available on demand?
>
> I have not seen (or remembered if I did) discussions on this topic. Please advise.
>
> Rich
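The single-use environment idea might be sketched as follows. All object and file names here (flows, soils, all_file, flows_file) are hypothetical, and temporary files stand in for the real .RData files so the example does not touch an actual workspace.

```r
# Sketch only: split one saved workspace into per-topic files via a scratch environment.
all_file   <- tempfile(fileext = ".RData")  # stands in for the existing .RData
flows_file <- tempfile(fileext = ".RData")  # hypothetical per-topic file

# Two made-up data frames saved together, as a workspace image would hold them
flows <- data.frame(site = c("s1", "s2"), cfs = c(10.5, 3.2))
soils <- data.frame(site = c("s1", "s2"), ph = c(6.8, 7.1))
save(flows, soils, file = all_file)

# Load the whole file into a throwaway environment, not the global workspace
scratch <- new.env()
loaded_names <- load(all_file, envir = scratch)   # load() returns the object names

# Write only the objects for one portion of the project to their own file
save(list = "flows", file = flows_file, envir = scratch)

# Check what the new file contains, again without touching the workspace
check <- new.env()
flows_names <- load(flows_file, envir = check)
print(loaded_names)
print(flows_names)
```

Because load() returns the names it restored, the scratch environment also gives a quick listing of what a given file holds before anything is copied out of it.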
[R] rjava on FreeBSD
Trying to install rJava on FreeBSD 9 and am getting the following error:

install.packages('rJava')
trying URL 'http://cran.cnr.Berkeley.edu/src/contrib/rJava_0.9-3.tar.gz'
Content type 'application/x-gzip' length 537153 bytes (524 Kb)
opened URL
==
downloaded 524 Kb

* installing *source* package 'rJava' ...
** package 'rJava' successfully unpacked and MD5 sums checked
checking for gcc... gcc46 -std=gnu99
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc46 -std=gnu99 accepts -g... yes
checking for gcc46 -std=gnu99 option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc46 -std=gnu99 -E
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for ANSI C header files... yes
checking for sys/wait.h that is POSIX.1 compatible... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for string.h... (cached) yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for unistd.h... (cached) yes
checking for an ANSI C-conforming const... yes
checking whether time.h and sys/time.h may both be included... yes
configure: checking whether gcc46 -std=gnu99 supports static inline...
yes
checking whether setjmp.h is POSIX.1 compatible... yes
checking whether sigsetjmp is declared... yes
checking whether siglongjmp is declared... yes
checking Java support in R... present:
interpreter : '/usr/local/bin/java'
archiver    : '/usr/local/bin/jar'
compiler    : '/usr/local/bin/javac'
header prep.: '/usr/local/bin/javah'
cpp flags   : '-I/usr/local/jdk1.6.0/jre/../include -I/usr/local/jdk1.6.0/jre/../include/freebsd'
java libs   : '-L/usr/local/jdk1.6.0/jre/lib/i386/server -L/usr/local/jdk1.6.0/jre/lib/i386 -L/usr/local/jdk1.6.0/jre/../lib/i386 -L -L/usr/java/packages/lib/i386 -L/lib -L/usr/lib -L/usr/local/lib -ljvm'
checking whether JNI programs can be compiled... yes
checking JNI data types... configure: error: One or more JNI types differ from the corresponding native type. You may need to use non-standard compiler flags or a different compiler in order to fix this.
ERROR: configuration failed for package 'rJava'
* removing '/usr/local/lib/R/library/rJava'
* removing '/usr/local/lib/R/library/rJava'

The downloaded packages are in
'/tmp/RtmpWtRfKe/downloaded_packages'
Updating HTML index of packages in '.Library'
Warning messages:
1: In install.packages("rJava") :
  installation of package 'rJava' had non-zero exit status
2: In file.create(f.tg) :
  cannot create file '/usr/local/share/doc/R/html/packages.html', reason 'Permission denied'
3: In make.packages.html(.Library) : cannot update HTML package index

OK, so I tried doing so as root, because it would solve the 'Permission denied' issue, and was rewarded with:

install.packages('rJava')
trying URL 'http://cran.cnr.Berkeley.edu/src/contrib/rJava_0.9-3.tar.gz'
Content type 'application/x-gzip' length 537153 bytes (524 Kb)
opened URL
==
downloaded 524 Kb

* installing *source* package 'rJava' ...
** package 'rJava' successfully unpacked and MD5 sums checked
checking for gcc... gcc46 -std=gnu99
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc46 -std=gnu99 accepts -g... yes
checking for gcc46 -std=gnu99 option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc46 -std=gnu99 -E
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for ANSI C header files... yes
checking for sys/wait.h that is POSIX.1 compatible... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for string.h... (cached) yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for unistd.h... (cached) yes
checking for an ANSI C-conforming const... yes
checking whether time.h and sys/time.h may both be included... yes
configure: checking whether gcc46 -std=gnu99
Re: [R] Storing/Restoring R objects
On Jan 11, 2012, at 2:56 PM, Rich Shepard wrote:

> One of my projects has generated quite a few objects (data frames) related to one portion of this project. They can be listed with the ls() function. What I would like to do is move them to another directory so that data frames for other portions of the project can be more easily seen and used.
>
> .RData is a binary file. Are there tools that let me work with this file?

The R executable would be one such tool. The code is there if you want to go through it, but the unserialization process is notoriously complex. There is no counterpart to SAS Proc CONTENTS.

> What if I rename it and start a new .RData file when I next invoke R?

Then R will not find it. (Actually there will be no .RData file until R executes save.image() at the end of the session. R does not create a fresh .RData at the beginning of a session.)

> Could I then specify which .RData file should be available on demand?

?Startup

You can have different .RData files in different directories.

?save
?load

> I have not seen (or remembered if I did) discussions on this topic. Please advise.

Please read more help pages. They generally have many useful links.

--
David Winsemius, MD
West Hartford, CT
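One way to make a particular .RData file "available on demand", in the spirit of keeping different files in different directories, is attach(), which accepts the path of a file created by save(). The sketch below is only an illustration: the directory portion_a, the file name and the flows data frame are hypothetical, and a temporary directory is used so that nothing in a real project is touched.

```r
# Sketch only: keep one portion's data frames in their own directory and file.
proj_dir <- file.path(tempdir(), "portion_a")      # hypothetical project directory
dir.create(proj_dir, showWarnings = FALSE)
rdata_file <- file.path(proj_dir, "portion_a.RData")

# A made-up data frame saved to that file, then removed from the workspace
flows <- data.frame(site = c("s1", "s2"), cfs = c(10.5, 3.2))
save(flows, file = rdata_file)
rm(flows)

# Attach the file: its objects become visible on the search path
# without being copied into the global workspace
attach(rdata_file, name = "portion_a")
on_path    <- exists("flows")
first_site <- as.character(flows$site[1])

# Detach again when this portion of the project is no longer needed
detach("portion_a")

print(on_path)
print(first_site)
```

Objects reached through attach() are read from the attached copy; changes assigned at the prompt go to the workspace, so anything modified still has to be written back with save().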
Re: [R] summarizing a complex dataframe
Hi,

On Wed, Jan 11, 2012 at 3:55 PM, Christopher G Oakley coak...@bio.fsu.edu wrote:
> I need some help summarizing complex data frames (small example below):
>
>      m1_1 m2_1 m3_1 m1_2 m2_2 m3_2
> i1      1    1    1    2    2    2
> i1      2    1    1    2    2    2
> i2      2    2    1    2    2    2
>
> For an arbitrary number of columns (say m1 ... m199) where the column names have variable patterns, and such that each set of columns is repeated (with potentially unique data) an arbitrary number of times (say _1 ... _1000),

[snip]

Perhaps your job would be easier if you change the layout of your data frame. For instance, you can have experiment.name and replicate columns, so your clean data.frame would look like:

  experiment.name replicate region count
  m1                      1     i1     1
  m2                      1     i1     1
  m3                      1     i1     1
  ...

You can use the reshape (or reshape2) package to help you whip your old table into a new one using a formula interface, if you like.

You can then use your favorite split-apply-combine[1] method (via plyr, data.table, sqldf, or even base::tapply) to calculate summary statistics over the values of interest in each group/subgroup, whatever.

HTH,
-steve

[1] The Split-Apply-Combine Strategy for Data Analysis: http://www.jstatsoft.org/v40/i01

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
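The relayout and summary described above can be sketched in base R alone, without reshape or plyr. Several things here are assumptions made for the illustration: the row labels i1/i2 are stored in a column called region, column names always have the form name_replicate, and the mean is used as the summary statistic (the thread does not say which statistic is wanted).

```r
# Sketch only: wide -> long relayout, then a grouped summary with tapply().
wide <- data.frame(
  region = c("i1", "i1", "i2"),
  m1_1 = c(1, 2, 2), m2_1 = c(1, 1, 2), m3_1 = c(1, 1, 1),
  m1_2 = c(2, 2, 2), m2_2 = c(2, 2, 2), m3_2 = c(2, 2, 2),
  stringsAsFactors = FALSE
)

value_cols <- setdiff(names(wide), "region")

# Stack the value columns: unlist() goes column by column, so region is
# repeated once per column and each column name once per row
long <- data.frame(
  region = rep(wide$region, times = length(value_cols)),
  column = rep(value_cols, each = nrow(wide)),
  count  = unlist(wide[value_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Split "m1_2" into experiment.name "m1" and replicate 2
parts <- strsplit(long$column, "_", fixed = TRUE)
long$experiment.name <- vapply(parts, `[`, character(1), 1)
long$replicate <- as.integer(vapply(parts, `[`, character(1), 2))
long$column <- NULL

# Split-apply-combine: mean count per experiment and region, over replicates
summary_tab <- tapply(long$count, list(long$experiment.name, long$region), mean)
print(summary_tab)
```

With the data in long form, swapping mean for another function, or grouping by replicate as well, only changes the tapply() call and not the relayout step.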