Re: [R] apply a function down each column
Dear Steve,

my solution looks like it should work, but it does not. I attached a text file with an extract of my data. Maybe you can try it yourself. I want to compare C1 with M1, C2 with M2, C3 with M3, and so on, for each column. I do not really know what the problem is. R complains about a syntax error.

The function I am applying counts the common strings between the two. Greg Hirson helped me to write it.

lettermatch <- function(a, b) {
  tb <- merge(as.data.frame(table(strsplit(a, ""))),
              as.data.frame(table(strsplit(b, ""))), by = "Var1")
  sum(apply(tb[-1], 1, min))
}

For example, for the second column I tried:

for (x in 1:(nrow(dat)-1)) {
  a <- as.character(dat[(2x-1),1])
  b <- as.character(dat[(2x),1])
  lettermatch(a,b)
}

or

a <- as.character(dat[seq(1, nrow(dat), by=2),2])
b <- as.character(dat[seq(2, nrow(dat), by=2), 2])
all.results <- lettermatch(a,b)

With dat <- read.delim("data_lgs.txt", stringsAsFactors=FALSE) I can leave out the as.character in the code above.

Laetitia

Individuals Seq1 Seq2 Seq3 Seq4
C1 AATTCCGGCTTT
M1
C2 AATTCCGGCTTT
M2 AGGGAACTCCGGCGTT
C3 AGGGAACTCCGGCGTT
M3 AGGGAACTCCGGCGTT
C4 AATTCCGGCCTT
M4 AAATCGGGCTTT
C5 AGGGACTTCCCGCTTT
M5 AGGGCTTTCCTT
C6 AGGGCTTTCCTT
M6 AAAGCCTTCTTT
C7 AAAGACCCCCCGGTTT
M7 AAGGAACCCCGG
C8 AATTCCGGCCTT
M8 AATTCCGGCCTT
C9
M9
C11 AGGGAAACCGGGGGTT
M11 AATTCCGGCCTT

On 11.01.2010 at 15:18, Steve Lianoglou wrote:

> Hi,
>
> On Mon, Jan 11, 2010 at 8:41 AM, Laetitia Schmid laeti...@gmt.su.se wrote:
>> Hello World,
>> I have a function that makes pairwise comparisons between two strings. I would like to apply this function to my data (which consists of columns with different strings) in the way that it compares the first with the second entry, and then the third with the fourth, and then the fifth with the sixth, and so on down each column... So (2x-1) and (2x) would be the different entries to be compared!
>> dat = my data:
>> for the first column: compare dat[(2x-1),1] with dat[(2x),1]
>> and x would be 1:i, i=length(dat[,1])
>>
>> I think the best way to do that is a loop:
>> a <- as.character(dat[(2x-1),1])
>> b <- as.character(dat[(2x),1])
>> for (i in 1:length(dat[,1]) my_function(a, b))
>>
>> Can somebody help me to apply a function with a loop in the way I want to a column?
>
> It seems as if you got it already, don't you?
>
> for (x in 1:(nrow(dat)-1)) {
>   a <- dat[(2x-1),1]
>   b <- dat[(2x), 1]
>   my_function(a,b)
> }
>
>> Is there a specification of tapply for that?
>
> I don't think so, but depending on what you want to do, the size of your data, and the amount of RAM you have, it might be faster to compare everything at once (assuming `my_function` can be vectorized), for instance:
>
> a <- dat[seq(1, nrow(dat), by=2),1]
> b <- dat[seq(2, nrow(dat), by=2), 1]
> all.results <- my_function(a,b)
>
> Also, as an aside, I see you keep calling as.character on your data when you extract it from your data.frame. Is your data being converted to factors? You can set stringsAsFactors=FALSE if this is the case and you are reading in data using read.table/delim/etc (see: ?read.table)
>
> Hope that helps,
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Drought Severity Index (DSI)
Does anyone have the code to calculate the drought severity index?

--
Muhammad Rahiz | Doctoral Student in Regional Climate Modeling
Climate Research Laboratory, School of Geography and the Environment
Oxford University Centre for the Environment, University of Oxford
South Parks Road, Oxford, OX1 3QY, United Kingdom
Tel: +44 (0)1865-285194 Mobile: +44 (0)7854-625974
Email: muhammad.ra...@ouce.ox.ac.uk
Re: [R] HoltWinters Forecasting
In situations where an ancient version and a less ancient version of R give different results for the same data, it's a good idea to check the NEWS file. For HoltWinters() you will find an entry for R VERSION 2.8.0 that indicates a change in the default number of start periods used to autodetect start values, from 3 to 2. While this may be good for 'tidy' series, your series has a pretty messy start.

Another way to look for obvious differences between an older version of a function and a newer one is to check the help page and note the arguments and their default values. Either way, you would find:

start.periods = 3 (R 2.7.2)
start.periods = 2 (R 2.9.1)

So, to get results more reasonable than the obvious junk produced with start.periods = 2, try start.periods = 3.

Do upgrade to 2.10.1, though.

-Peter Ehlers

RobertNZ wrote:
> Hi R-users,
>
> I have a question relating to the HoltWinters() function. I am trying to forecast a series using the Holt Winters methodology but I am getting some unusual results. I had previously been using R for Windows version 2.7.2 and have just started using R 2.9.1. While using version 2.7.2 I was getting reasonable results; however, upon changing versions I started to see unusual results. If anybody would be able to provide assistance with this it would be much appreciated!
>
> The series in question is 'x' below.
>
> x = c(18, 18, 16, 19, 12, 12, 13, 12, 7, 9, 9, 9, 12.5, 16, 20, 22, 22, 22, 22, 22, 22, 22, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23, 22, 17, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14.5, 15, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 6, 5, 5, 5)
> x.ts = ts(data = x, start = c(1999,1), frequency = 12)
>
> ## USING R 2.9.1 I get the following results using HoltWinters().
> ## I see that the smoothing parameters are greater than 1 for alpha and beta,
> ## which I believe is unusual, as I think the optim() function that is used to
> ## calculate these parameters in HoltWinters() should constrain the results to [0,1]?
>
> hw.ts = HoltWinters(x.ts)
> hw.ts
> Holt-Winters exponential smoothing with trend and additive seasonal component.
>
> Call:
> HoltWinters(x = x.ts)
>
> Smoothing parameters:
>  alpha: 1.046340
>  beta : 3.198345
>  gamma: 1
>
> Coefficients:
>              [,1]
> a   -6.491044e+44
> b   -3.313740e+46
> s1  -4.035877e+40
> s2   9.734997e+40
> s3  -2.348192e+41
> s4   5.664108e+41
> s5  -1.366248e+42
> s6   3.295545e+42
> s7  -7.949232e+42
> s8   1.917446e+43
> s9  -4.625098e+43
> s10  1.115626e+44
> s11 -2.691018e+44
> s12  6.491044e+44
>
> Subsequently using the predict() function, the following results are produced, which are quite unusual.
>
> x.pred = predict(hw.ts, n.ahead = 10,
>                  prediction.interval = T, conf.level = 0.8)
> x.pred
>                    fit           upr           lwr
> Jan 2010 -3.378655e+46 -3.102583e+46 -3.654726e+46
> Feb 2010 -6.692381e+46 -5.448602e+46 -7.936160e+46
> Mar 2010 -1.000615e+47 -7.533863e+46 -1.247845e+47
> Apr 2010 -1.331981e+47 -9.385468e+46 -1.725416e+47
> May 2010 -1.663375e+47 -1.103422e+47 -2.223327e+47
> Jun 2010 -1.994702e+47 -1.250080e+47 -2.739324e+47
> Jul 2010 -2.326189e+47 -1.380352e+47 -3.272025e+47
> Aug 2010 -2.657291e+47 -1.494943e+47 -3.819640e+47
> Sep 2010 -2.989320e+47 -1.596167e+47 -4.382472e+47
> Oct 2010 -3.319116e+47 -1.681697e+47 -4.956534e+47
>
> ### Applying the same code to time series x.ts using R 2.7.2 yields the following results
>
> hw.ts = HoltWinters(x.ts)
> hw.ts
> Holt-Winters exponential smoothing without trend and with additive seasonal component.
>
> Call:
> HoltWinters(x = x.ts)
>
> Smoothing parameters:
>  alpha: 0.8560487
>  beta : 0
>  gamma: 1
>
> Coefficients:
>           [,1]
> a    6.3820972
> s1  -1.6592841
> s2  -1.4172832
> s3  -0.3896275
> s4   1.1195576
> s5   1.3899338
> s6   1.8304666
> s7   1.3751008
> s8   0.5919732
> s9  -0.5971810
> s10 -0.7390197
> s11 -1.0104958
> s12 -1.3820972
>
> The subsequent forecast this time is more reasonable:
>
> x.pred
>               fit       upr        lwr
> Jan 2010 4.722813  7.943789  1.5018372
> Feb 2010 4.964814  9.204797  0.7248308
> Mar 2010 5.992470 11.050160  0.9347798
> Apr 2010 7.501655 13.262123  1.7411862
> May 2010 7.772031 14.158405  1.3856573
> Jun 2010 8.212564 15.168751  1.2563766
> Jul 2010 7.757198 15.239932  0.2744638
> Aug 2010 6.974070 14.948660 -1.0005194
> Sep 2010 5.784916 14.222739 -2.6529065
> Oct 2010 5.643078 14.519993 -3.2338377
>
> It would be much appreciated if anyone could help me understand why I am seeing these unusual results when using R 2.9.1 compared with R 2.7.2. I wonder if there is something that I have not considered, or if there are any remedies that I could take to fix this?
>
> Thanks in advance,
> Robert

--
Peter Ehlers
University of Calgary
403.202.3921
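The advice above (check the default of start.periods, then set it explicitly) can be sketched as follows. This is only an illustration under stated assumptions: it uses the monthly co2 series that ships with R as a stand-in for the poster's x.ts, and it calls predict() with the argument name `level`, which is what the current help page documents. The numbers it prints are not the ones from the post.

```r
# Inspect the default without opening the help page.
formals(HoltWinters)$start.periods

# co2 is a built-in monthly series, used here as a stand-in for x.ts.
fit2 <- HoltWinters(co2)                     # default number of start periods
fit3 <- HoltWinters(co2, start.periods = 3)  # the older default, set by hand

# Compare the estimated smoothing parameters of the two fits.
rbind(default = c(fit2$alpha, fit2$beta, fit2$gamma),
      three   = c(fit3$alpha, fit3$beta, fit3$gamma))

# Ten-step forecast with 80% prediction bounds from the second fit.
pred3 <- predict(fit3, n.ahead = 10, prediction.interval = TRUE, level = 0.8)
pred3
```

Replacing co2 with x.ts would show whether the explicit start.periods = 3 restores sensible coefficients for the series in the post.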
[R] Problems with betareg()
Hi, In using the betareg package, I encounter the following error message: Error in lm.wfit(x, linkfun(y), weights, offset = offset) : NA/NaN/Inf in foreign function call (arg 4) Any help will be most appreciated. Thanks in advance. Alex
Re: [R] apply a function down each column
See inline below.

Laetitia Schmid wrote:
> Dear Steve,
> my solution looks like it should work, but it does not. I attached a text file with an extract of my data. Maybe you can try it yourself. I want to compare C1 with M1, C2 with M2, C3 with M3, and so on, for each column. I do not really know what the problem is. R complains about a syntax error.
> The function I am applying counts the common strings between the two. Greg Hirson helped me to write it.
>
> lettermatch <- function(a, b) {
>   tb <- merge(as.data.frame(table(strsplit(a, ""))),
>               as.data.frame(table(strsplit(b, ""))), by = "Var1")
>   sum(apply(tb[-1], 1, min))
> }
>
> For example, for the second column I tried:
>
> for (x in 1:(nrow(dat)-1)) {
>   a <- as.character(dat[(2x-1),1])

Shouldn't that be 2*x-1??

-Peter Ehlers

>   b <- as.character(dat[(2x),1])
>   lettermatch(a,b)
> }
>
> or
>
> a <- as.character(dat[seq(1, nrow(dat), by=2),2])
> b <- as.character(dat[seq(2, nrow(dat), by=2), 2])
> all.results <- lettermatch(a,b)
>
> With dat <- read.delim("data_lgs.txt", stringsAsFactors=FALSE) I can leave out the as.character in the code above.
>
> Laetitia
>
> Individuals Seq1 Seq2 Seq3 Seq4
> C1 AATTCCGGCTTT
> M1
> C2 AATTCCGGCTTT
> M2 AGGGAACTCCGGCGTT
> C3 AGGGAACTCCGGCGTT
> M3 AGGGAACTCCGGCGTT
> C4 AATTCCGGCCTT
> M4 AAATCGGGCTTT
> C5 AGGGACTTCCCGCTTT
> M5 AGGGCTTTCCTT
> C6 AGGGCTTTCCTT
> M6 AAAGCCTTCTTT
> C7 AAAGACCCCCCGGTTT
> M7 AAGGAACCCCGG
> C8 AATTCCGGCCTT
> M8 AATTCCGGCCTT
> C9
> M9
> C11 AGGGAAACCGGGGGTT
> M11 AATTCCGGCCTT
>
> On 11.01.2010 at 15:18, Steve Lianoglou wrote:
>
>> Hi,
>>
>> On Mon, Jan 11, 2010 at 8:41 AM, Laetitia Schmid laeti...@gmt.su.se wrote:
>>> Hello World,
>>> I have a function that makes pairwise comparisons between two strings. I would like to apply this function to my data (which consists of columns with different strings) in the way that it compares the first with the second entry, and then the third with the fourth, and then the fifth with the sixth, and so on down each column... So (2x-1) and (2x) would be the different entries to be compared!
>>> dat = my data:
>>> for the first column: compare dat[(2x-1),1] with dat[(2x),1]
>>> and x would be 1:i, i=length(dat[,1])
>>>
>>> I think the best way to do that is a loop:
>>> a <- as.character(dat[(2x-1),1])
>>> b <- as.character(dat[(2x),1])
>>> for (i in 1:length(dat[,1]) my_function(a, b))
>>>
>>> Can somebody help me to apply a function with a loop in the way I want to a column?
>>
>> It seems as if you got it already, don't you?
>>
>> for (x in 1:(nrow(dat)-1)) {
>>   a <- dat[(2x-1),1]
>>   b <- dat[(2x), 1]
>>   my_function(a,b)
>> }
>>
>>> Is there a specification of tapply for that?
>>
>> I don't think so, but depending on what you want to do, the size of your data, and the amount of RAM you have, it might be faster to compare everything at once (assuming `my_function` can be vectorized), for instance:
>>
>> a <- dat[seq(1, nrow(dat), by=2),1]
>> b <- dat[seq(2, nrow(dat), by=2), 1]
>> all.results <- my_function(a,b)
>>
>> Also, as an aside, I see you keep calling as.character on your data when you extract it from your data.frame. Is your data being converted to factors? You can set stringsAsFactors=FALSE if this is the case and you are reading in data using read.table/delim/etc (see: ?read.table)
>>
>> Hope that helps,
>> -steve
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>> | Memorial Sloan-Kettering Cancer Center
>> | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact

--
Peter Ehlers
University of Calgary
403.202.3921
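The pairing discussed in this thread (row 2*x-1 against row 2*x) can be sketched as below, with the multiplication sign written out. This is a hedged illustration, not the posters' final code: the four-row data frame `dat` is made up, `lettermatch()` is re-typed from the post with `[[1]]` and `pmin()` added for clarity, and the loop bound is taken as the number of pairs (nrow(dat) %/% 2), since 1:(nrow(dat)-1) would index past the last row.

```r
# lettermatch() re-typed from the post; [[1]] picks the character vector
# out of the list that strsplit() returns.
lettermatch <- function(a, b) {
  ta <- as.data.frame(table(strsplit(a, "")[[1]]), stringsAsFactors = FALSE)
  tb <- as.data.frame(table(strsplit(b, "")[[1]]), stringsAsFactors = FALSE)
  m <- merge(ta, tb, by = "Var1")   # keeps letters present in both strings
  sum(pmin(m$Freq.x, m$Freq.y))     # shared count per letter, summed
}

# Hypothetical data in the C1/M1, C2/M2 layout described in the post.
dat <- data.frame(Individuals = c("C1", "M1", "C2", "M2"),
                  Seq1 = c("AATT", "AATC", "GGCC", "GGCC"),
                  stringsAsFactors = FALSE)

# One comparison per pair of consecutive rows.
n.pairs <- nrow(dat) %/% 2
result <- sapply(seq_len(n.pairs), function(x) {
  lettermatch(dat[2 * x - 1, "Seq1"], dat[2 * x, "Seq1"])
})
result
```

Because `lettermatch()` works on one pair of strings at a time, the "compare everything at once" variant from the thread would need `mapply(lettermatch, odd.rows, even.rows)` rather than a single call on two vectors.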
Re: [R] Problems with betareg()
On Tue, 12 Jan 2010, Al Leong wrote:

> Hi,
> In using the betareg package, I encounter the following error message:
>
> Error in lm.wfit(x, linkfun(y), weights, offset = offset) :
>   NA/NaN/Inf in foreign function call (arg 4)
>
> Any help will be most appreciated. Thanks in advance.

We can't possibly help you with this amount of information. Please provide a small reproducible example, preferably with a (small) artificial data set or with one available in R. (Also see the posting guide, linked at the end of this e-mail.)

Z

> Alex
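Without data the cause cannot be established, as the reply says. One thing a poster could check before sending a reproducible example is whether the link function applied to the response yields non-finite values, since the error is raised from a call on linkfun(y). The sketch below is an assumption-laden illustration only: it uses base R's qlogis() as a logit link and a made-up response vector, and it does not call betareg at all.

```r
# Hypothetical response with values on the boundary of [0, 1].
y <- c(0, 0.25, 0.5, 1)

# A logit link is infinite at exactly 0 and 1.
linked <- qlogis(y)
linked

# Flag the observations that would produce NA/NaN/Inf after the link.
bad <- !is.finite(linked)
which(bad)
```

If such a check comes back clean on the real data, the problem lies elsewhere and a reproducible example is still needed.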
Re: [R] find the corresponding mean y
Not very clear what you want, but perhaps...

?sortedXyData

...might help.

hth
KJ

luciferyan anniehyh...@googlemail.com wrote in message news:1263231740556-1011427.p...@n4.nabble.com...

> Hello,
> I have 49 paired data, x, y. I have sampled x (where replacement is true), and find its mean. How can I find the corresponding mean y, which is the paired data of above sample x?
> Thank you very much,
> Annie
[R] plot prices and dates in a nice way
Dear all,

I am currently having trouble plotting price data nicely against the dates.

Data <- read.csv("C:/IBM.csv", header = TRUE, sep = ",")
plot(Data[,1], Data[,2])

I cannot find a way to choose the number of breaks for the x axis (dates in this case).

Thanks a lot
Re: [R] warning inside loop
Dear William,

thank you kindly for this solution: it provides exactly what I need, especially because the encapsulating function returns a list, from which I can extract all the information I need.

kind regards,
Rense Nieuwenhuis

William Dunlap wrote:

>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Rense
>> Sent: Monday, January 11, 2010 3:07 PM
>> To: r-help@r-project.org
>> Subject: [R] warning inside loop
>>
>> Hi,
>>
>> I'm running some data simulations using (mixed effects)* regression models that have difficulty converging. Therefore, I seek a way of capturing warnings (of false convergence) inside a loop.
>>
>> Inside that loop, I modify data and estimate a model. I do so many times with slightly different modifications of the data. Next, I extract some of the model parameters and store these in a matrix. However, as some of the models do not converge well, some of the stored parameters are extracted from the ill-converged models. Therefore, I seek a way of automatically detecting whether the estimation procedure has resulted in a warning, so I can distinguish between the well- and ill-converged models.
>>
>> I have been trying to use functions such as warnings(), as well as the object last.warning, but unfortunately to no avail.
>
> Try withCallingHandlers(), as in the following function, which returns the value of the expression along with any warning messages as a list:
>
> withWarnings <- function(expr) {
>   warnings <- character()
>   retval <- withCallingHandlers(expr, warning = function(ex) {
>     warnings <<- c(warnings, conditionMessage(ex))
>     invokeRestart("muffleWarning")
>   })
>   list(Value = retval, Warnings = warnings)
> }
>
> Typical usage would be:
>
> lapply(-1:1, function(i) withWarnings(log(i)))
> [[1]]
> [[1]]$Value
> [1] NaN
>
> [[1]]$Warnings
> [1] "NaNs produced"
>
> [[2]]
> [[2]]$Value
> [1] -Inf
>
> [[2]]$Warnings
> character(0)
>
> [[3]]
> [[3]]$Value
> [1] 0
>
> [[3]]$Warnings
> character(0)
>
> Perhaps there is some encapsulation of this already in some package, as try() encapsulates error catching.
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>> Although I cannot provide a reproducible example, I schematically represent the procedure I seek to use below:
>>
>> for (i in 1:10) {
>>   modify data
>>   estimate model
>>   evaluate whether estimation produced warning
>>   extract model parameters, and store whether warning occurred
>> }
>>
>> I hope anyone can give some guidelines on how to deal with warnings inside a loop.
>>
>> With kind regards,
>> Rense
>>
>> *Although I use the lme4 package for the actual analysis, I sent my question to this mailing list (instead of the R mixed list) because I believe this is a general issue, rather than one associated exclusively with mixed models.
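The schematic loop from the original question can be combined with the withWarnings() function from the reply roughly as follows. This is a minimal sketch under stated assumptions: sqrt() of a made-up input stands in for "modify data, estimate model", and the `results` data frame and its column names are hypothetical.

```r
# withWarnings() as given in the thread: collect warning messages and
# muffle them, returning both the value and the messages.
withWarnings <- function(expr) {
  warnings <- character()
  retval <- withCallingHandlers(expr, warning = function(ex) {
    warnings <<- c(warnings, conditionMessage(ex))
    invokeRestart("muffleWarning")
  })
  list(Value = retval, Warnings = warnings)
}

# Stand-in for the simulation: sqrt() warns for negative input.
inputs <- c(4, -1, 9)
results <- data.frame(estimate = numeric(length(inputs)),
                      warned   = logical(length(inputs)))

for (i in seq_along(inputs)) {
  res <- withWarnings(sqrt(inputs[i]))             # "estimate model"
  results$estimate[i] <- res$Value                 # store the parameter
  results$warned[i]   <- length(res$Warnings) > 0  # store the warning flag
}
results
```

Rows with warned == TRUE can then be dropped or inspected separately, which is the distinction between well- and ill-converged fits the question asks for.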
[R] read.spss: option to.data.frame and string variables
Dear R-users,

I am using R version 2.10.1 and package foreign version 0.8-39 under Windows.

When reading .sav files (PASW Statistics 18.0.1) containing string variables, these are automatically converted to factors when using the option to.data.frame = TRUE (see example below). It's clear to me why this happens (the default behaviour of a call to as.data.frame). But this is not always what one might want (or even be aware of).

So maybe one of the following improvements could be made?
* Add a description of this behaviour in ?read.spss.
* Or (even better): Add an extra argument, like: read.spss("C:\\temp\\test.sav", to.data.frame = TRUE, stringsAsFactors = FALSE).

Just a suggestion; kind regards
Heinrich.

# EXAMPLE: Suppose there is a simple file test.sav, containing one variable (x) of type STRING with 3 values (a,b,c).

library(foreign)
test <- read.spss("C:\\temp\\test.sav")
test
$x
[1] "a" "b" "c"

attr(,"label.table")
attr(,"label.table")$x
NULL

attr(,"codepage")
[1] 1252

is.factor(test$x)
[1] FALSE
is.character(test$x)
[1] TRUE

# Ok, that's just fine. But things change when using option to.data.frame = TRUE:

test <- read.spss("C:\\temp\\test.sav", to.data.frame = TRUE)
test
  x
1 a
2 b
3 c
is.factor(test$x)
[1] TRUE
is.character(test$x)
[1] FALSE
Re: [R] R for windows 64 bit
Dear all,

I just downloaded and installed this new version of R. I am now trying to install the packages I need, which are SparseM and quantreg. I downloaded the quantreg package and put it into the library folder, and it seems to work. However, when I try to do the same with SparseM I get the following error message:

Loading required package: SparseM
Error in inDL(x, as.logical(local), as.logical(now), ...) :
  unable to load shared library 'C:/PROGRA~1/R/R-211~1.0DE/library/SparseM/libs/SparseM.dll':
  LoadLibrary failure: %1 is not a valid Win32 application.

Any help with this?
Thanks a lot
alessia

2010/1/11 Henrique Dallazuanna www...@gmail.com:
> Try this version (beta of development version):
>
> http://www.stats.ox.ac.uk/pub/RWin/Win64/R-2.11.0dev-win64.exe
>
> On Mon, Jan 11, 2010 at 2:29 PM, alessia matano alexis@gmail.com wrote:
>> Dear all,
>>
>> do you know if there is any particular version of R for 64-bit Windows that increases the amount of memory it can use? How should I increase the memory and, more importantly, set a higher max vector size? It still stops with the message
>>
>> Could not allocate vector of size 145
>>
>> thanks to all
>> alessia
>
> --
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40 S 49° 16' 22 O
[R] plot ylab on the right
Hello,

I have a graphic and I want to plot the y axis AND the ylab on the right. I manage to plot the axis on the right with axis(4), but I don't know how to write the ylab on the right.

barplot(data, names.arg=leg, xlab="Probability", ylab="Number of links", axes=F)
axis(4)

And another question: is there an easy way to specify that directly in the barplot command?

Thank you,
M
Re: [R] R for windows 64 bit
On 12.01.2010 11:33, alessia matano wrote:

> Loading required package: SparseM
> Error in inDL(x, as.logical(local), as.logical(now), ...) :
>   unable to load shared library 'C:/PROGRA~1/R/R-211~1.0DE/library/SparseM/libs/SparseM.dll':
>   LoadLibrary failure: %1 is not a valid Win32 application.

These packages have libraries that work for 32-bit R. You need to compile 64-bit versions from sources using the setup as described in Brian Ripley's message on R-devel and in the R Installation and Administration manual.

Package repositories for 64-bit versions are not yet online for that *experimental* 64-bit version, although I am currently syncing them. This means that an equally *experimental* repository of 64-bit packages for Windows may go online in the CRAN network within very few days. Note that you will need to set that repository manually for now.

In the meantime, you can get packages from my more or less private repository as follows:

install.packages(c("SparseM", "quantreg"),
    contriburl="http://www.statistik.tu-dortmund.de/~ligges/CRAN/bin/windows64/contrib/2.11",
    dependencies = TRUE)

Best,
Uwe Ligges
Re: [R] plot ylab on the right
On 01/12/2010 09:41 PM, Mister Vanhalen wrote:

> Hello,
>
> I have a graphic and I want to plot the y axis AND the ylab on the right. I manage to plot the axis on the right with axis(4), but I don't know how to write the ylab on the right.
>
> barplot(data, names.arg=leg, xlab="Probability", ylab="Number of links", axes=F)
> axis(4)

Hi Mister Vanhalen,
Try this:

mtext("Number of links", 4)

> And another question: is there an easy way to specify that directly in the barplot command?

A quick look at the axis command has not revealed it.

Jim
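Put together, the right-hand axis and label suggested above might look like the sketch below. The counts and bar labels are made up, the margin values are one plausible choice rather than a recommendation from the thread, and pdf(NULL) is used only so the example runs without opening a window.

```r
pdf(NULL)  # null device, so the sketch runs without a screen

# Hypothetical data standing in for the poster's `data` and `leg`.
counts <- c(5, 12, 7, 3)
labels <- c("0.1", "0.2", "0.3", "0.4")

# Widen the right margin so the label placed there is not clipped.
old.par <- par(mar = c(5, 2, 4, 5) + 0.1)

# Suppress the default axes and left-hand label, then draw on side 4.
mids <- barplot(counts, names.arg = labels, xlab = "Probability",
                ylab = "", axes = FALSE)
axis(4)
mtext("Number of links", side = 4, line = 3)

par(old.par)
dev.off()
```

The `line` argument of mtext() controls how far from the axis the label sits; with the default margins a label on side 4 can fall outside the plotting region, which is why the margin is enlarged first.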
Re: [R] drop1: warning message
No idea, you may want to tell us all your calls and give a reproducible example, as well as properly formatted code that is more readable than the stuff below.

I guess you omitted some other warnings that tell you about too few observations to fit the model (the "fitted probabilities numerically 0 or 1 occurred" warning is a weak indication that your model may be overspecified).

Best,
Uwe Ligges

On 12.01.2010 08:17, Martin Bulla wrote:

> Dear R colleagues,
>
> I am new to R and do not understand the warning messages which appear when I try to apply the type III procedure to my glm:
>
> inkm$Gamie glm.inkmZ
> drop1(glm.inkmZ, test="Ch")
> Single term deletions
>
> Model:
> InkMb ~ IncStart + Vol + Gamie + PC1 + PC2 + PC3 + SpotsN + PC1:SpotsN
>            Df Deviance    AIC     LRT   Pr(Chi)
> <none>          9.3879 27.388
> IncStart    1  14.6274 30.627  5.2395 0.0220791 *
> Vol         1  15.9723 31.972  6.5844 0.0102876 *
> Gamie       1  13.8659 29.866  4.4780 0.0343330 *
> PC2         1   9.6899 25.690  0.3020 0.5826585
> PC3         1  10.7326 26.733  1.3447 0.2462124
> PC1:SpotsN  1  21.6517 37.652 12.2638 0.0004618 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> Warning messages:
> 1: In glm.fit(x[, jj, drop = FALSE], y, wt, offset = object$offset, :
>   fitted probabilities numerically 0 or 1 occurred
> 2: In glm.fit(x[, jj, drop = FALSE], y, wt, offset = object$offset, :
>   fitted probabilities numerically 0 or 1 occurred
>
> Where did PC1 disappear to? What is the meaning of these warning messages?
>
> Similarly, I do not understand where the PCs disappeared to in the following calculations:
>
> inkm$Gamie glm.inkm0
> drop1(glm.inkm0, test="Ch")
> Single term deletions
>
> Model:
> InkMb ~ IncStart + Vol + Gamie + PC1 + PC2 + PC3 + SpotsN + Gamie:PC1 + Gamie:PC2 + Gamie:PC3
>           Df Deviance    AIC    LRT  Pr(Chi)
> <none>         18.368 40.368
> IncStart   1   26.078 46.078 7.7106 0.005490 **
> Vol        1   26.515 46.515 8.1475 0.004312 **
> SpotsN     1   18.663 38.664 0.2958 0.586537
> Gamie:PC1  1   20.418 40.418 2.0505 0.152155
> Gamie:PC2  1   18.460 38.460 0.0919 0.761818
> Gamie:PC3  1   19.832 39.832 1.4647 0.226180
> ---
>
> Any suggestions would help me greatly.
>
> Best regards,
> Martin
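On the question of where PC1 went: drop1() only offers terms that can be removed without violating marginality, so a main effect that also appears in an interaction is not listed on its own. The sketch below illustrates that behaviour on the built-in mtcars data with an ordinary linear model; the model and variables are stand-ins chosen for illustration and have nothing to do with the poster's data or glm.

```r
# wt and hp both appear in the interaction wt:hp; qsec does not.
fit <- lm(mpg ~ wt + hp + qsec + wt:hp, data = mtcars)

tab <- drop1(fit, test = "F")
tab

# Only qsec and wt:hp are candidates; wt and hp are held back
# because the interaction term contains them.
rownames(tab)
```

In the posted output the same pattern appears: PC1 and SpotsN are absent from the first table because PC1:SpotsN is in the model, and Gamie, PC1, PC2 and PC3 are absent from the second because of the Gamie:PC interactions.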
Re: [R] read.spss: option to.data.frame and string variables
It would be a significant undertaking to annotate all the places where the default behavior of strings-to-factors conversion might trip up the unwary. You are not the first by any means to complain. You might:

a) take the step that the Mayo Clinic has taken of setting the default in options() to FALSE, or
b) make your own read.spss with your desired arguments, and then put it in your .Rprofile.

--
David

On Jan 12, 2010, at 5:28 AM, RINNER Heinrich wrote:

> Dear R-users,
>
> I am using R version 2.10.1 and package foreign version 0.8-39 under Windows.
>
> When reading .sav files (PASW Statistics 18.0.1) containing string variables, these are automatically converted to factors when using the option to.data.frame = TRUE (see example below). It's clear to me why this happens (the default behaviour of a call to as.data.frame). But this is not always what one might want (or even be aware of).
>
> So maybe one of the following improvements could be made?
> * Add a description of this behaviour in ?read.spss.
> * Or (even better): Add an extra argument, like: read.spss("C:\\temp\\test.sav", to.data.frame = TRUE, stringsAsFactors = FALSE).
>
> Just a suggestion; kind regards
> Heinrich.
>
> # EXAMPLE: Suppose there is a simple file test.sav, containing one variable (x) of type STRING with 3 values (a,b,c).
>
> library(foreign)
> test <- read.spss("C:\\temp\\test.sav")
> test
> $x
> [1] "a" "b" "c"
>
> attr(,"label.table")
> attr(,"label.table")$x
> NULL
>
> attr(,"codepage")
> [1] 1252
>
> is.factor(test$x)
> [1] FALSE
> is.character(test$x)
> [1] TRUE
>
> # Ok, that's just fine. But things change when using option to.data.frame = TRUE:
>
> test <- read.spss("C:\\temp\\test.sav", to.data.frame = TRUE)
> test
>   x
> 1 a
> 2 b
> 3 c
> is.factor(test$x)
> [1] TRUE
> is.character(test$x)
> [1] FALSE
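Suggestion b) above, a personal wrapper, could take the form of a small post-processing step that turns factor columns back into character vectors. The sketch below is an assumption, not something from the thread: the helper name `chars_not_factors` is hypothetical, and a hand-built data frame stands in for the result of read.spss(..., to.data.frame = TRUE), since no .sav file is available here.

```r
# Hypothetical helper: convert every factor column to character.
chars_not_factors <- function(df) {
  is.fac <- vapply(df, is.factor, logical(1))
  df[is.fac] <- lapply(df[is.fac], as.character)
  df
}

# Stand-in for what read.spss(..., to.data.frame = TRUE) returned in the post.
test <- data.frame(x = c("a", "b", "c"), n = 1:3, stringsAsFactors = TRUE)
is.factor(test$x)

fixed <- chars_not_factors(test)
is.character(fixed$x)
str(fixed)
```

Note that this converts all factors, including any built from SPSS value labels, so a real wrapper might restrict the conversion to selected columns.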
Re: [R] plot prices and dates in a nice way
On 12.01.2010 10:39, Trafim Vanishek wrote:

> Dear all,
>
> I am currently having trouble plotting price data nicely against the dates.
>
> Data <- read.csv("C:/IBM.csv", header = TRUE, sep = ",")
> plot(Data[,1], Data[,2])
>
> I cannot find a way to choose the number of breaks for the x axis (dates in this case).

No idea, since we do not know what is in IBM.csv and hence we do not know what Data actually includes ...

Uwe Ligges

> Thanks a lot
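Since the contents of IBM.csv are unknown, the following is only a sketch of one way to control date breaks, under the assumption that the first column can be converted to class Date. The dates and prices are simulated, the tick spacing of two months is arbitrary, and pdf(NULL) is used only so the example runs without a display.

```r
pdf(NULL)  # null device, so the sketch runs without a screen

# Simulated stand-in for the two columns of the CSV file.
set.seed(42)
dates <- seq(as.Date("2009-01-01"), as.Date("2009-12-31"), by = "day")
price <- 100 + cumsum(rnorm(length(dates)))

# Suppress the default x axis, then place ticks where they are wanted.
plot(dates, price, type = "l", xaxt = "n", xlab = "Date", ylab = "Price")
ticks <- seq(min(dates), max(dates), by = "2 months")
axis.Date(1, at = ticks, format = "%b %Y")

dev.off()
```

With real data the date column would first need something like as.Date(Data[,1], format = ...), where the format string depends on how the dates are written in the file.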
Re: [R] Drought Severity Index (DSI)
On 01/12/2010 07:21 PM, Muhammad Rahiz wrote:

> Does anyone have the code to calculate the drought severity index?

Hi Muhammad,
If you mean "does anyone have the algorithm?", it seems pretty hard to find. Even that old standby Wikipedia didn't have a description of Palmer's algorithm. If you happen to have the algorithm but not the R code, perhaps mentioning that might get some responses.

Jim
[R] Beginer data.frame
Hello,

I use R 2.10 on XP, and I am new to R (I used to use SAS and lately Stata).

I have data in a data.frame called x.df (read from a csv file). I want to take from this data the observations for which the variable Code starts with an R. I took all such Codes and put them into a vector:

vec <- grep("R[A-Z][A-Z]", x.df$Code, value=TRUE)

Then I created a function that is supposed to take all the lines in my data x.df for which Code equals one value of vec. See the code below, where I created a loop to do that.

myfunc <- function(data, var2, var1)
{
  i = 1
  while (i < 632) {
    line <- subset(data, var2 == var1[i])
    if (i == 1) {
      df <- line
      df <- data.frame(df)
    }
    else {
      line <- data.frame(line)
      df <- rbind(df, line)
    }
    i <- i + 1
  }
  fix(df)
}

The results of my program highly depend on the last few lines of the program. If I put fix(df), as above, the function opens a window with my data and the result seems sensible (I have not checked in detail, but I more or less have what I am supposed to get).

myfunc <- function(data, var2, var1)
...
  }
  df <- data.frame(df)
  print(is.data.frame(df))
}
myfunc(x.df, x.df$Code, vec)
[1] TRUE
print(is.data.frame(df))
[1] FALSE

In the case above I ask whether or not df is a data.frame and the answer is true; when the program has ended, I ask again and the answer is false.

Could anyone tell me what to do to get this data, and could anyone tell me why the results differ?

as.data.frame(df)
Error in as.data.frame.default(df) :
  cannot coerce class "function" to a data.frame
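The difference the poster sees is consistent with how R scopes variables: a `df` created inside a function exists only there, and at top level the name `df` refers to the F-distribution density function from the stats package, which is why the error mentions class "function". The sketch below shows one way to get the rows back as a value. The data frame, the helper name `select_codes`, and the pattern "^R" (meaning "starts with R") are all assumptions for illustration.

```r
# Hypothetical stand-in for the poster's x.df.
x.df <- data.frame(Code  = c("RAB", "XYZ", "RCD", "ARB"),
                   value = 1:4,
                   stringsAsFactors = FALSE)

# Return the matching rows instead of editing a local copy with fix().
select_codes <- function(data, pattern) {
  data[grepl(pattern, data$Code), , drop = FALSE]
}

# Assign the returned value to a name at top level.
r.df <- select_codes(x.df, "^R")
r.df
is.data.frame(r.df)

# At top level, `df` is still the F density function from stats.
is.function(df)
```

Logical indexing with grepl() selects all matching rows in one step, so the while loop and repeated rbind() calls are not needed.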
Re: [R] Drought Severity Index (DSI)
Hello Jim,

Thanks for the response. I have the algorithm, based on the original paper (Bryant et al. 1994) and the subsequent modification by Philips and McGregor (1994). The latter gives the formula for calculating the index in MS Excel. I'm trying to translate it to R.

Muhammad

Muhammad Rahiz | Doctoral Student in Regional Climate Modeling
Climate Research Laboratory, School of Geography and the Environment
Oxford University Centre for the Environment
South Parks Road, Oxford, OX1 3QY, United Kingdom
Tel: +44 (0)1865-285194 Mobile: +44 (0)7854-625974
Email: muhammad.ra...@ouce.ox.ac.uk

Jim Lemon wrote:
> On 01/12/2010 07:21 PM, Muhammad Rahiz wrote:
>> Does anyone have the code to calculate the drought severity index?
> Hi Muhammad,
> If you mean "does anyone have the algorithm?", it seems pretty hard to find. Even that old standby Wikipedia didn't have a description of Palmer's algorithm. If you happen to have the algorithm but not the R code, perhaps mentioning that might get some responses.
> Jim
[R] Reading a file with mixed cyrillic/latin characters
Dear useRs,

I am trying to read a tab-delimited Unicode text file containing both Latin and Cyrillic characters and failing miserably. The file looks like this (I hope it comes across right):

A   B     C
3   foo   ФОО
5   bar   БАР

read.table("foo.txt", sep="\t", header=TRUE)

I am guessing that I can use the fileEncoding argument of read.table() to read this, but I can find no list of supported values of fileEncoding, and fileEncoding="Unicode" gives an error. The FAQ and the FAQ for Windows don't help. I have searched both the list archives and RSeek and am still seeking enlightenment.

I am running R 2.10.1 on Windows XP, sessionInfo() below.

Cheers
Stephan

R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
[5] LC_TIME=German_Germany.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base
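One thing that may be worth trying, sketched here under assumptions: files saved as "Unicode text" on Windows are often UTF-16LE, and read.table() accepts that name in fileEncoding. The names the platform's iconv understands can be listed with iconvlist(). The sketch below writes its own small UTF-16LE file (without a byte order mark) so that it is self-contained; the temporary file stands in for foo.txt, and it assumes a session locale that can represent Cyrillic (for example a UTF-8 locale).

```r
# Build a small UTF-16LE file (no BOM) to stand in for foo.txt.
lines <- c("A\tB\tC",
           "3\tfoo\t\u0424\u041e\u041e",
           "5\tbar\t\u0411\u0410\u0420")
txt <- paste0(paste(lines, collapse = "\n"), "\n")
bytes <- iconv(enc2utf8(txt), from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
tmp <- tempfile(fileext = ".txt")
writeBin(bytes, tmp)

# Read it back, declaring the encoding of the file explicitly.
dat <- read.table(tmp, sep = "\t", header = TRUE,
                  fileEncoding = "UTF-16LE", stringsAsFactors = FALSE)

# Encoding names known to this platform's iconv:
head(iconvlist())
```

If the real file turns out to be UTF-8 rather than UTF-16, fileEncoding="UTF-8" would be the value to try instead.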
Re: [R] R for windows 64 bit
Fine, it worked. I will try it this way. Just one last question and I won't bother you further today. My machine currently has just 6 GB of RAM (it will be increased to 16 in a few days), and I see that with this experimental version memory.limit is 6135. What is the command to raise the memory limit to the maximum I can use (5 GB?). If I write memory.limit(5000) I still get the error

don't be silly! Your machine has a 4Gb address limit

which is quite odd.

Many thanks
Best
A.

2010/1/12 alessia matano alexis@gmail.com:
ok, perfect! I will try with it... many, many thanks. Do you also have the quantreg package there, which actually has the same problem as SparseM (32-bit version)?
best
alessia

2010/1/12 Uwe Ligges lig...@statistik.tu-dortmund.de:
On 12.01.2010 12:09, alessia matano wrote:
I am sorry, I know it is an experimental version, and I was misleading in calling it a new version. Therefore, I will wait until they are officially available, since it is just a few days.

Or just use, today, my private repository that I indicated in the other mail.
Uwe Ligges

However, I also tried to go to the CRAN pages, download the packages and insert them into the library. For quantreg it worked; for SparseM it did not, probably because it's a win32 version, as you said.

2010/1/12 Prof Brian Ripley rip...@stats.ox.ac.uk:
On Tue, 12 Jan 2010, alessia matano wrote:
Dear all, I just downloaded and set up this new version of R. I am now trying to download the packages I need, which are SparseM and quantreg. I downloaded the quantreg package and inserted it into the library folder, and it seems to work. However, when I try to do the same with SparseM I get the following error message:

Loading required package: SparseM
Error in inDL(x, as.logical(local), as.logical(now), ...) :
  unable to load shared library 'C:/PROGRA~1/R/R-211~1.0DE/library/SparseM/libs/SparseM.dll':
  LoadLibrary failure: %1 is not a valid Win32 application.

Any help for it?

Please do refer to the posting referred to in that thread (and Henrique, please do not post just the URL without the explanations).
https://stat.ethz.ch/pipermail/r-devel/2010-January/056301.html
You cannot mix 32-bit Windows binary packages with this experimental port (it is not a 'new version'): you need to install from the package sources. If that is too difficult for you, please do not try to use unsupported experimental builds (and Uwe Ligges may have some binary packages available for test in a few days).

Thanks a lot
alessia

2010/1/11 Henrique Dallazuanna www...@gmail.com:
Try this version (beta of development version): http://www.stats.ox.ac.uk/pub/RWin/Win64/R-2.11.0dev-win64.exe

On Mon, Jan 11, 2010 at 2:29 PM, alessia matano alexis@gmail.com wrote:
Dear all, do you know if there is any particular version of R to use with 64-bit Windows, so as to increase the amount of memory it can use? How should I increase the memory and, more importantly, set a higher maximum vector size? It still stops me, saying: Could not allocate vector of size 145
thanks to all
alessia

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

-- 
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] Beginer data.frame
Hi Jean-Baptiste,

three points:

1) Your variable df is a *local* variable which you define in your function myfunc(), so it is not known outside myfunc(). When you ask is.data.frame(df) at the prompt, R looks at the global definition of df, which is the density function of the F distribution. Making your function work (especially interactively) will require a major rewrite.

2) Generally, using variable names that are already used for R objects (like df in your example) is a bad idea. For an example of the problems you can run into, see 1) above.

3) Loops are not the R way. Depending on what you want to do with the subset of your data.frame, you may want to do something like this:

x.df[substr(x.df$Code,1,1)=="R",]

Look at ?substr to learn more. This function is vectorized, meaning that it takes a vector input and returns a vector output. Look at section 2.7 in "An Introduction to R".

Good luck!
Stephan
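Point 1 above can be seen in a small, self-contained sketch. The function names make_local and keep_r_rows and the toy data are hypothetical, not from the original post; the sketch only illustrates that an object created inside a function is invisible outside it unless the function returns it and the caller assigns the result.

```r
# Toy stand-in for x.df (hypothetical data, for illustration only).
x.df <- data.frame(Code = c("RAB", "XYZ", "RCD"), value = 1:3,
                   stringsAsFactors = FALSE)

# 'df' created here is local to the function and vanishes on return.
make_local <- function(data) {
  df <- data[substr(data$Code, 1, 1) == "R", ]
  invisible(NULL)
}
make_local(x.df)
is.function(df)        # TRUE: at top level, 'df' is still stats::df

# Returning the value and assigning the result keeps it.
keep_r_rows <- function(data) {
  data[substr(data$Code, 1, 1) == "R", ]
}
r.rows <- keep_r_rows(x.df)
is.data.frame(r.rows)  # TRUE
```

Choosing a result name other than df (here r.rows) also avoids the masking problem described in point 2.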
Re: [R] Beginer data.frame
Hi!

Jean-Baptiste Combes wrote:
> Hello, I use R 2.10, and I am new in R (I used to use SAS and lately Stata), I am using XP. I have a data which has a data.frame format called x.df (read from a csv file). I want to take from this data observations for which the variable Code starts with an R. I took all the Code and put them into a vector
> vec <- grep("R[A-Z][A-Z]", x.df$Code, value=TRUE)

I am not sure if I understood you correctly, but could a simple

subset(x.df, substring(Code,1,1)=="R")

be an appropriate solution?

HTH,
Kimmo
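As a rough illustration of the one-line approach, the sketch below selects the rows whose Code starts with "R" in three equivalent ways. The data frame codes is hypothetical toy data, and startsWith() assumes R 3.3.0 or later.

```r
# Hypothetical toy data standing in for x.df.
codes <- data.frame(Code = c("RAB", "ARB", "RZZ", "QRS"),
                    n = c(10, 20, 30, 40),
                    stringsAsFactors = FALSE)

# 1) Compare the first character.
by_substring <- subset(codes, substring(Code, 1, 1) == "R")

# 2) Anchored regular expression ("^" means start of string).
by_regex <- subset(codes, grepl("^R", Code))

# 3) Prefix test (base R >= 3.3.0).
by_prefix <- codes[startsWith(codes$Code, "R"), ]

by_substring$Code   # "RAB" "RZZ"
```

All three keep rows 1 and 3 only; "ARB" and "QRS" contain an R but do not start with one.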
Re: [R] Beginer data.frame
On Jan 12, 2010, at 6:17 AM, Jean-Baptiste Combes wrote:

> Hello, I use R 2.10, and I am new in R (I used to use SAS and lately Stata), I am using XP. I have a data which has a data.frame format called x.df (read from a csv file). I want to take from this data observations for which the variable Code starts with an R. I took all the Code and put them into a vector
> vec <- grep("R[A-Z][A-Z]", x.df$Code, value=TRUE)

vec is going to be a vector of row numbers that can be used to address the data.frame

> Then I created a function that is supposed to take all the lines in the my data x.df for which Code equals one value of vec. See the code below where I created a loop to do that.

That seems to be a very short R one-liner:

data[vec, ]

?"["

-- 
David.

> myfunc <- function(data,var2,var1)
> + {
> + i=1
> + while (i<632){       #where does that come from ?
> + line <- subset(data,var2==var1[i])
> + if (i==1){
> + df <- line
> + df <- data.frame(df)
> + }
> + else {
> + line <- data.frame(line)
> + df <- rbind(df,line)
> + }
> + i <- i+1
> + }
> + fix(df)
> + }
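One detail that may be worth checking before using the one-liner: as far as I can tell, grep() returns positions only when value=TRUE is not set; with value=TRUE it returns the matching strings, which cannot be used directly as row numbers. A small sketch with hypothetical toy data (the pattern is anchored with "^" here, which the original pattern was not):

```r
# Hypothetical toy data, for illustration only.
toy <- data.frame(Code = c("RAB", "XYZ", "RCD"), n = 1:3,
                  stringsAsFactors = FALSE)

vals <- grep("^R[A-Z][A-Z]", toy$Code, value = TRUE)  # matching strings
idx  <- grep("^R[A-Z][A-Z]", toy$Code)                # matching positions

by_index <- toy[idx, ]                # index rows by position
by_value <- toy[toy$Code %in% vals, ] # or match on the values themselves
```

Either form selects rows 1 and 3 of the toy data.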
[R] time series analys by resting the effect of a covariate
Hi,

Does anyone know a way to test for a temporal trend (each sample unit is a count) after removing the possible effect of a covariate (e.g. a climatic factor)? I have periodic counts of several species of waterbirds over the last 13 years, and I want to know whether, after removing the effect of the flooded area available, there is a temporal trend. I know the flooded area is correlated with the time series of most of the species I am considering. I had a look at the time series section of http://cran.r-project.org but I didn't find anything on this issue.

Thanks for any response
[R] coerce vector into array - change filling sequence
Dear all,

When I coerce a vector into a multi-dimensional array, I would like R to start filling the array along the last dimension, then the second-to-last, and so on. Let's jump straight into an example.

x <- 1:24
y <- array(dim=c(2,2,6))

I would like to have:

y[1,1,1] = 1
y[1,1,2] = 2
...
y[1,1,6] = 6
y[1,2,1] = 7
y[1,2,2] = 8
...
y[2,1,1] = 13
...
y[2,2,1] = 19

If I do y <- array(x, dim=c(2,2,6)), I think I will get y[1,1,1] = 1, y[2,1,1] = 2 (or something else I do not want) instead. Of course, I need a fast solution, as I am actually dealing with arrays of much larger size.

Any input will be appreciated
Thanks a lot
Re: [R] coerce vector into array - change filling sequence
Hi,

you can permute array dimensions using aperm():

x <- 1:24
z <- array(x, dim=c(6,2,2))
y <- aperm(z, perm=c(3,2,1))
y[1,1,]

HTH,
Stephan
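The same idea can be wrapped up for arbitrary dimensions. This is only a sketch; the helper name fill_last_first is made up for illustration and is not part of base R.

```r
# Fill an array so that the LAST index varies fastest.
# Hypothetical helper: builds the array with reversed dimensions in R's
# usual first-index-fastest order, then reverses the dimension order.
fill_last_first <- function(x, d) {
  aperm(array(x, dim = rev(d)), perm = rev(seq_along(d)))
}

y <- fill_last_first(1:24, c(2, 2, 6))
dim(y)      # 2 2 6
y[1, 1, ]   # 1 2 3 4 5 6
y[1, 2, 1]  # 7
y[2, 1, 1]  # 13
y[2, 2, 1]  # 19
```

aperm() copies the data once, so the cost should grow linearly with the size of the array.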
[R] Conditional Sampling
Hi,

I am hoping someone can help me with a sampling question. I am using the following call to sample 10 unique observations:

x <- sample(1:100, 10, replace=F)

Given the first 10 observations, I need to sample another 5 unique observations from the remainder. I essentially want to do a Monte Carlo type analysis on the results.

I would appreciate any feedback.

Thanks
Re: [R] plot ylab on the right
It works perfectly! ;) Thank you very much!!

M A

On Tue, Jan 12, 2010 at 11:56 AM, Jim Lemon j...@bitwrit.com.au wrote:
> On 01/12/2010 09:41 PM, Mister Vanhalen wrote:
>> Hello, I have a graphic and I want to plot the y axis AND the ylab on the right. I manage to plot the axis on the right with axis(4), but I don't know how to write the ylab on the right.
>> barplot(data, name=leg, xlab="Probability", ylab="Number of links", axes=F)
>> axis(4)
> Hi Mister Vanhalen,
> Try this:
> mtext("Number of links",4)
>> And another question: is there an easy way to specify that directly in the barplot command?
> A quick look at the axis command has not revealed it.
> Jim
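Putting the pieces of this exchange together, a minimal sketch might look like the following. The vectors counts and labels are hypothetical stand-ins for the poster's data and leg, and the margin values are only a guess at what leaves enough room on the right.

```r
counts <- c(5, 12, 8)              # hypothetical bar heights
labels <- c("0.1", "0.5", "0.9")   # hypothetical bar labels

# Widen the right margin so the axis title fits (bottom, left, top, right).
old <- par(mar = c(5, 2, 2, 5) + 0.1)

# Draw the bars without the default left-hand axis.
mids <- barplot(counts, names.arg = labels, xlab = "Probability",
                axes = FALSE)
axis(4)                                        # y axis on the right
mtext("Number of links", side = 4, line = 3)   # axis title on the right

par(old)                                       # restore the margins
```

The ylab argument is left out of the barplot() call because it would be drawn on the left.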
Re: [R] Beginer data.frame
See help(grepl). Using the built-in data frame CO2, this gets the rows whose Plant column starts with Qn:

subset(CO2, grepl("^Qn", Plant))
[R] Pharmacokinetic and pharmacodynamic modeling and simulation
Pharmacokinetic and pharmacodynamic modeling and simulation
By Dr. Jan Freijer
March 18, 2010, Amsterdam, The Netherlands
http://www.can.nl/events/details.php?id=57

This course is aimed at users of R or S-PLUS in the bio-pharmaceutical sciences who would like to use R for clinical trial simulations.

Topics include:

Working with packages in R
- MASS
- odesolve

Random generation from univariate distributions
- density, distribution function, quantile function and random generation
- various distributions

Random generation from multivariate distributions
- normal distribution
- working with the covariance matrix
- simulating PK and PK-PD model parameters

Solving differential equations
- solving differential equations in R
- testing the numerical solution versus analytical solution
- PK models for single dose oral or IV administration
- PK models for multiple dose oral or IV administration
- implementing PD models

Clinical trial simulations
- combining the structural model with the random effects model
- uncertainty versus variability
- example: two compartment PK model with indirect response model

Location: Amsterdam
Date: March 18th
Time: 10:00h.-16:30h.
Price: EURO 395,- excluding VAT
Register:
- phone: +31-(0)20-560-8400
- Email: pau...@can.nl
- Web: http://www.can.nl/events/details.php?id=57

There is a maximum of 12 participants. You may register by replying to this email and providing us with the following information.

Name: M / F
Title:
Department:
Institute:
Address:
City:
Zip:
Telephone:
Fax:
Email:

Please let us know if you have any questions. Please feel free to send this message on to colleagues and friends for whom it might be interesting!

Kind regards,
Dick Verkerk
_
CANdiensten, Nieuwpoortkade 23-25, NL-1055 RX Amsterdam
tel: +31 20 5608410 fax: +31 20 5608448
verk...@candiensten.nl
_
Your Partner in Mathematics and Statistics!
Re: [R] Conditional Sampling
On 12-Jan-10 12:58:13, ehcpieterse wrote:
> Hi,
> I am hoping someone can help me with a sampling question. I am using the following function to sample 10 unique observations:
> x <- sample(1:100, 10, replace=F)
> Given the first 10 observations, I need to sample another 5 unique observations from the remainder. I essentially want to do a Monte Carlo type analysis on the results.
> I would appreciate any feedback.
> Thanks

If your second sampling is to be on the same footing as the first, i.e. a random subset of 5 out of the remaining 90, then this is equivalent to sampling 15 in the first place and using the first 10 of these as your first sample:

set.seed(54321)
x0 <- sample(1:100, 15, replace=F)
x <- x0[1:10]
y <- x0[-(1:10)]
x0
# [1] 43 50 18 27 21 83  5 20 32 34 13 61  4 57 86
x
# [1] 43 50 18 27 21 83  5 20 32 34
y
# [1] 13 61  4 57 86

set.seed(54321)
sample(1:100, 10, replace=F)
# [1] 43 50 18 27 21 83  5 20 32 34

However, if the manner of taking the second sample from the remainder will depend on the results of the first sample, then further consideration is necessary. If this is the case, can you indicate how the values in the first sample would influence how the second sample is to be obtained?

Another approach, which leaves more options, is to use sample.int():

set.seed(54321)
X <- (1:100) ## (or any other 100 values)
n <- sample.int(100,10,replace=FALSE) ## returns subset of (1:100)
x <- X[n]
Y <- X[-n]
y <- sample(Y,5,replace=FALSE)
x
# [1] 43 50 18 27 21 83  5 20 32 34 ## (as before)
y
# [1] 14 70  4 66 96 ## (as before)

Hoping this helps,
Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 12-Jan-10 Time: 13:21:21
-- XFMail --
[R] trouble with installing SJava
Colleagues, How i can solve this error when i install SJava package Thanks R CMD INSTALL -c /usr/local/lib/R/SJava_0.69-0.tar.gz * installing to library ‘/usr/local/lib/R/site-library’ * installing *source* package ‘SJava’ ... checking for java... /usr/lib/jvm/java-6-sun/bin/java Java VM /usr/lib/jvm/java-6-sun/bin/java checking for javah... /usr/lib/jvm/java-6-sun/bin/javah Looking in /usr/lib/jvm/java-6-sun/include Looking in /usr/lib/jvm/java-6-sun/include/linux checking for g++... g++ checking for C++ compiler default output... a.out checking whether the C++ compiler works... yes checking whether we are cross compiling... no checking for suffix of executables... checking for suffix of object files... o checking whether we are using the GNU C++ compiler... yes checking whether g++ accepts -g... yes checking for gcc... gcc checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ANSI C... none needed checking for Rf_initEmbeddedR in -lR... no No R shared library found configure: creating ./config.status config.status: creating Makevars config.status: creating src/Makevars config.status: creating src/RSJava/Makefile config.status: creating Makefile_rules config.status: creating inst/scripts/RJava.bsh config.status: creating inst/scripts/RJava.csh config.status: creating R/zzz.R config.status: creating cleanup config.status: creating inst/scripts/RJava Copying the cleanup script to the scripts/ directory Building libRSNativeJava.so in /tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava if test ! -d /usr/local/lib/R/site-library/SJava/libs ; then \ mkdir /usr/local/lib/R/site-library/SJava/libs ; \ fi gcc -std=gnu99 -g -O2 -D_R_ -I/usr/local/lib/R/include -I/usr/local/lib/R/include/R_ext -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava -I. 
-I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include -I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux -c CtoJava.c CtoJava.cweb:148: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'vm1_args' CtoJava.cweb:215: error: static declaration of 'std_env' follows non-static declaration CtoJava.cweb:195: error: previous declaration of 'std_env' was here CtoJava.cweb: In function 'create_Java_vm': CtoJava.cweb:256: error: 'vm1_args' undeclared (first use in this function) CtoJava.cweb:256: error: (Each undeclared identifier is reported only once CtoJava.cweb:256: error: for each function it appears in.) make: *** [CtoJava.o] Error 1 Generating JNI header files from Java classes. RForeignReference, RManualFunctionActionListener, ROmegahatInterpreter REvaluator * Warning: At present, to use the library you must set the LD_LIBRARY_PATH environment variable to /usr/local/lib/R/site-library/SJava/libs:/usr/lib/jvm/java-6-sun-1.6.0.14/jre/lib/i386/server:/usr/lib/jvm/java-6-sun-1.6.0.14/jre/lib/i386:/usr/lib/jvm/java-6-sun-1.6.0.14/jre/../lib/i386::/usr/java/packages/lib/i386:/lib:/usr/lib or use one of the RJava.bsh or RJava.csh scripts * ** libs gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include -I/usr/local/lib/R/include/R_ext -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava -I. 
-I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include -IRSJava -I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux -I/usr/local/include-fpic -g -O2 -c ConverterExamples.c -o ConverterExamples.o ConverterExamples.cweb: In function ‘RS_JAVA_setFunctionConverter’: ConverterExamples.cweb:213: warning: assignment discards qualifiers from pointer target type ConverterExamples.cweb: In function ‘RS_JAVA_toJavaFunctionConverter’: ConverterExamples.cweb:312: warning: passing argument 1 of ‘getOmegahatReferenceValue’ discards qualifiers from pointer target type gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include -I/usr/local/lib/R/include/R_ext -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava -I. -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include -IRSJava -I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux -I/usr/local/include-fpic -g -O2 -c Converters.c -o Converters.o Converters.cweb: In function ‘RS_JAVA_removeConverter’: Converters.cweb:399: warning: assignment discards qualifiers from pointer target type gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include -I/usr/local/lib/R/include/R_ext -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava -I. -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include -IRSJava -I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux -I/usr/local/include-fpic -g -O2 -c REmbed.c -o REmbed.o gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include -I/usr/local/lib/R/include/R_ext -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava -I. -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include
Re: [R] Drought Severity Index (DSI)
A few years ago, I worked with Stuart Gage, who had developed a Heat / Precipitation Index as a measure of drought severity. It works just as well as, if not better than, the Palmer Drought Index. You can find the formula in this online PDF report:

Climate Variability in the North Central Region: http://goes.msu.edu/publications/pdfs_ps/CGCEO%2085.pdf

Google search: Stuart Gage Heat Precipitation Index

If you cannot download the file, write to me individually and I'll send it directly.

Steve

Steve Friedman Ph. D.
Spatial Statistical Analyst
Everglades and Dry Tortugas National Park
950 N Krome Ave (3rd Floor)
Homestead, Florida 33034
steve_fried...@nps.gov
Office (305) 224 - 4282
Fax (305) 224 - 4147
Re: [R] Drought Severity Index (DSI)
Thanks Steve!

Muhammad Rahiz | Doctoral Student in Regional Climate Modeling
Climate Research Laboratory, School of Geography and the Environment
Oxford University Centre for the Environment
South Parks Road, Oxford, OX1 3QY, United Kingdom
Tel: +44 (0)1865-285194 Mobile: +44 (0)7854-625974
Email: muhammad.ra...@ouce.ox.ac.uk
Re: [R] Conditional Sampling
Thanks Ted, your solution does make perfect sense. The only question I still have is that I would like to sample the remaining 5 observations after I have randomly selected the first 10. Given the initial 10, I would like to sample the following 5, say, 1,000 times to get a simulated conditional sample, if that makes any sense. I want to build this into an iterative process to see how the first sample affects the resulting samples. Even though all the observations have the same probability of being sampled, they each have a different expected value.

--
View this message in context: http://n4.nabble.com/Conditional-Sampling-tp1012072p1012114.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] sparseM and kronecker product_R latest version
>>>>> "am" == alessia matano alexis@gmail.com
>>>>>     on Mon, 11 Jan 2010 16:20:57 +0100 writes:

    am> Many thanks for it.
    am> However it is strange that when I put the numbers rather than ncol(R)
    am> (a matrix with ncol=36698) it worked. Look below

    am> > dim(res2)
    am> [1] 170471  25822
    am> > D <- as.matrix.csr(0, nrow(tmpb), 25822)
    am> > D <- as.matrix.csr(0, nrow(tmpb), ncol(res2))
    am> Error in if (length(x) == nrow * ncol) x <- matrix(x, nrow, ncol) else { :
    am>   missing value where TRUE/FALSE needed
    am> In addition: Warning message:
    am> In nrow * ncol : NAs produced by integer overflow

    am> But probably it is true what you said, anyway.

Yes, it is true. The clue is that typeof(25822) is "double" and not "integer".

    am> So, do you suggest me to use directly the simple matrix command? or a
    am> kind of sparse matrix within your package?!?

The latter, e.g.,

  library(Matrix)
  D <- Matrix(0, nrow = 113289, ncol = 36698)
  ## or
  D. <- sparseMatrix(x = double(0), i = integer(0), j = integer(0),
                     dims = c(113289, 36698))
  identical(D, D.)
  ##--> TRUE

  ## and, e.g.,
  Dk <- kronecker(D, Diagonal(x = 5:2))

  > identical(Dk, D %x% Diagonal(x = 5:2))
  [1] TRUE
  > dim(D)
  [1] 113289  36698
  > dim(Dk)
  [1] 453156 146792

Regards,
Martin Maechler, ETH Zurich
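The overflow behind that error can be reproduced without SparseM. The following is a minimal sketch in base R only; the dimensions are the ones quoted in the thread, and the variable names (nr, nc, int_prod, dbl_prod) are illustrative assumptions, not anything taken from the package.

```r
# nrow() and ncol() return integers, so their product is computed in
# 32-bit integer arithmetic and overflows beyond .Machine$integer.max.
nr <- 113289L
nc <- 36698L

int_prod <- suppressWarnings(nr * nc)  # NA; R normally warns about integer overflow
dbl_prod <- as.numeric(nr) * nc        # double arithmetic, no overflow

max_int <- .Machine$integer.max        # largest representable integer
lit_type <- typeof(25822)              # a typed-in literal is a double, not an integer
```

This would explain why typing 25822 by hand worked while ncol(res2) did not: the literal forces the product into double arithmetic.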
Re: [R] Conditional Sampling
Would the following work, or is there a reason why it would not?

risk.set <- 1:100
first.10 <- sample(risk.set, 10)
remainder <- setdiff(risk.set, first.10)
for (i in 1:1000) {
    next.5 <- sample(remainder, 5)
    do.something.with(next.5)
}

Best,
Magnus

On 1/12/2010 9:00 AM, ehcpieterse wrote:
> Thanks Ted, your solution does make perfect sense. The only question I still have is that I would like to sample the remaining 5 observations after I have randomly selected the first 10. Given the initial 10, I would like to sample the following 5 say 1,000 times to get a simulated conditional sample, if that makes any sense. I want to build this into an iterative process to see how the first sample affects the resulting samples. Even though all the observations have the same probability to get sampled, they each have a different expected value.
Re: [R] Conditional Sampling
On 12-Jan-10 14:00:24, ehcpieterse wrote:
> Thanks Ted, your solution does make perfect sense. The only question I still have is that I would like to sample the remaining 5 observations after I have randomly selected the first 10. Given the initial 10, I would like to sample the following 5 say 1,000 times to get a simulated conditional sample, if that makes any sense. I want to build this into an iterative process to see how the first sample affects the resulting samples. Even though all the observations have the same probability to get sampled, they each have a different expected value.

OK, if I now understand you, you are interested in the properties of the remaining (90) observations, given that they do not include any of the (10) cases sampled in the first round.

In that case, I think you should adopt the sample.int() approach I also suggested:

  X <- (1:100)   ## (or any other 100 values)
  n <- sample.int(100, 10, replace=FALSE)   ## returns subset of (1:100)
  x <- X[n]
  Y <- X[-n]     ## The set remaining after the first 10 were taken
  ## Now you can sample repeatedly from Y until your eyes fall out.
  ## So build up a matrix of (say) 1000 samples from Y:
  M <- sample(Y, 5, replace=FALSE)
  for(i in (2:1000)){ M <- rbind(M, sample(Y, 5, replace=FALSE)) }

The repeated samples M of 5 from Y of course imply replacing each sample of 5 back in Y, so they are available at each turn. You cannot, of course, sample 1000*5 from 100 without replacement! (Each sample of 5 is obtained without replacement, however.)

I hope this is getting close!
Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 12-Jan-10 Time: 14:34:13
-- XFMail --
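The same matrix of conditional samples can be built without growing M row by row. This is a sketch of one alternative, assuming the same setup as above; the seed is an arbitrary choice for reproducibility, and replicate() is swapped in for the rbind() loop.

```r
set.seed(42)               # arbitrary seed, only so the run is repeatable

X <- 1:100                 # or any other 100 values
n <- sample.int(100, 10)   # indices of the first 10 (without replacement)
x <- X[n]                  # the first-round sample
Y <- X[-n]                 # the 90 values left over

# replicate() returns one draw per column, so transpose to get
# one row per simulated sample of 5.
M <- t(replicate(1000, sample(Y, 5)))
dim(M)
```

Each row of M is drawn without replacement from Y, and Y is left unchanged between rows, which matches the scheme described in the message.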
Re: [R] Solving graph theory problems with R ? (minimum vertex cover)
On 1/12/2010 12:12 AM, Johannes Hüsing wrote:
> Tal Galili wrote:
>> My specific problem is called: "Minimum vertex cover" for a hypergraph
>
> I know nothing about the problem at hand, but on the Wikipedia page it says that the problem can be formulated as an integer linear program. There is an R package that interfaces to a linear programming package (Rglpk), which may or may not help you.

There are also two graph/network analysis packages available for R, 'igraph' and 'sna'. I don't think either of them has formal support for hypergraphs, but it is possible that they could be jury-rigged to solve your problem. Even if not, the people involved may be able to help. For example, the igraph mailing list (igraph-h...@nongnu.org) is pretty active and the developers are very helpful.

Best,
Magnus
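For very small instances, the problem can also be checked by exhaustive search in base R. The sketch below is not the integer-programming formulation mentioned above and does not use Rglpk, igraph or sna; the toy hypergraph and the helper names covers() and min_vertex_cover() are hypothetical. Its running time grows exponentially with the number of vertices, so it is only useful for sanity-checking a solver on toy data.

```r
# Hypothetical toy hypergraph: each hyperedge is a set of vertex ids.
edges <- list(c(1, 2, 3), c(3, 4), c(4, 5, 6), c(1, 6))
vertices <- sort(unique(unlist(edges)))

# TRUE if every hyperedge contains at least one chosen vertex.
covers <- function(chosen, edges) {
  all(vapply(edges, function(e) any(e %in% chosen), logical(1)))
}

# Try all subsets of size 1, 2, ... and return the first one that covers.
min_vertex_cover <- function(vertices, edges) {
  for (k in seq_along(vertices)) {
    candidates <- combn(vertices, k, simplify = FALSE)
    hits <- Filter(function(s) covers(s, edges), candidates)
    if (length(hits) > 0) return(hits[[1]])
  }
  vertices
}

cover <- min_vertex_cover(vertices, edges)
```

Because subsets are tried in order of increasing size, the first covering subset found is a minimum cover (there may be several of the same size).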
[R] The TeX-source for the package manual.
I have noted that the later versions of Rcmd check clean out the directory pkg.Rcheck so that only package-manual.log and package-manual.pdf are left. Formerly the package-manual.tex was around too, which was very handy for various purposes.

Is there a way to generate the .tex version of the manual for a package?

br. Bendix

______________________________________________
Bendix Carstensen
Senior Statistician
Steno Diabetes Center
Niels Steensens Vej 2-4
DK-2820 Gentofte
Denmark
+45 44 43 87 38 (direct)
+45 30 75 87 38 (mobile)
b...@steno.dk http://www.biostat.ku.dk/~bxc
www.steno.dk
[R] List arguments from data frame columns in formula
Hi!

I'm trying to run logistic regression on a dataset which is contained in the data frame `data` (y is in the first column, followed by 28 parameters for the model). How can I write the formula for the function `glm` without explicitly listing all 28 parameters?

`glm(data[,1]~data[,2]+data[,3]+data[,4]+...,family=binomial)`

As an option I can use `glm.fit(data[,-1],data[,1],family = binomial(link="logit"))`. But the resulting object cannot be used in the function `predict.glm`.

Thanks,
Natalia

--
View this message in context: http://n4.nabble.com/List-arguments-from-data-frame-columns-in-formula-tp1012146p1012146.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] List arguments from data frame columns in formula
Try this:

glm(y ~ ., family = binomial, data = data, ...)

On Tue, Jan 12, 2010 at 9:45 AM, npobedina npobed...@gmail.com wrote:
> Hi!
>
> I'm trying to run logistic regression on a dataset which is contained in dataframe data (y is in the first col, and 28 parameters for the model). How can I write formula for function `glm` without listing explicitly all 28 parameters?
>
> `glm(data[,1]~data[,2]+data[,3]+data[,4]+...,family=binomial)`
>
> As an option I can use `glm.fit(data[,-1],data[,1],family = binomial(link="logit"))`. But the obtained object cannot be used in function `predict.glm`.
>
> Thanks,
> Natalia
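To illustrate the dot formula, here is a small self-contained sketch. The data are simulated and the column names y, x1, x2, x3 are assumptions; with the real data the response column would need to be named y, or the formula adjusted to whatever names(data)[1] is.

```r
set.seed(1)   # arbitrary seed for the simulated data

# Simulated stand-in for the real data frame: response first, predictors after.
n <- 200
data <- data.frame(y  = rbinom(n, 1, 0.5),
                   x1 = rnorm(n),
                   x2 = rnorm(n),
                   x3 = rnorm(n))

# "." expands to every column of `data` other than the response.
fit <- glm(y ~ ., family = binomial, data = data)

# Because the formula refers to column names rather than data[, i],
# predict() can match the columns of new data by name.
newdata <- data.frame(x1 = 0, x2 = 1, x3 = -1)
p <- predict(fit, newdata = newdata, type = "response")
```

A formula written as data[,1] ~ data[,2] + ... stores no usable variable names, which is why prediction on new data becomes awkward with that form.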
[R] Multiple symbols per single line in a legend
Hello everybody,

Is it possible to coax legend() into displaying more than one symbol per line in a legend?

I have a graph like the one attached to this mail; I would like to reorganize the legend in such a way that the duplicate text would be omitted, i.e., the first line would read "square triangledown increasing frequency" and the second one would read "circle triangleup decreasing frequency".

Before resorting to box() and text() I would like to check whether some clever method already exists that would solve my problem. :)

Thanks in advance.

All the best,
Primoz

attachment: example.png
[R] problem with bio3d package
Hello,

I have a problem: when I run read.fasta.pdb I get this error:

pdb/seq: 1 name: dcluster.1
Error: subscript out of bounds

My FASTA file is the sequence of bovine insulin.

Thanks in advance,
Rg

--
View this message in context: http://n4.nabble.com/problem-with-bio3d-package-tp1012116p1012116.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Conditional Sampling
Thanks Ted, it's exactly what I'm after. Thanks for the help.

--
View this message in context: http://n4.nabble.com/Conditional-Sampling-tp1012072p1012180.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] List arguments from data frame columns in formula
Thanks a lot for the help! :) I had invented a more complicated way to make it work:

f <- as.formula(paste("data.y~", paste(names(data.x), collapse = "+")))
fml <- glm(f, data=data.x, family=binomial)

> Try this:
> glm(y ~ ., family = binomial, data = data, ...)

--
View this message in context: http://n4.nabble.com/List-arguments-from-data-frame-columns-in-formula-tp1012146p1012230.html
Sent from the R help mailing list archive at Nabble.com.
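When the formula does need to be assembled from column names, reformulate() from the stats package does the pasting. The sketch below uses simulated stand-ins for data.x and data.y (their contents and the column names a and b are assumptions) and binds the response into one data frame so the formula does not depend on objects in the workspace.

```r
set.seed(2)   # arbitrary seed for the simulated data

# Simulated stand-ins for the poster's objects.
data.x <- data.frame(a = rnorm(50), b = rnorm(50))
data.y <- rbinom(50, 1, 0.5)

# Keep response and predictors together in one data frame.
d <- cbind(y = data.y, data.x)

# Builds y ~ a + b from the predictor names.
f <- reformulate(names(data.x), response = "y")
fml <- glm(f, data = d, family = binomial)
```

This gives the same model as the dot formula above and is mainly useful when only a subset of columns should enter the model.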
[R] optim: abnormal termination in lnsrch
I'm using optim() to minimize a certain function. Often the minimization ends with the message:

ERROR: ABNORMAL_TERMINATION_IN_LNSRCH

What is optim() trying to say? What do I have to change in my function to make the minimization succeed? Do you think using BBoptim() instead of optim() changes anything?

Thanks for your help!
mario

--
Ing. Mario Valle
Data Analysis and Visualization Group | http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS) | Tel: +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82
[R] optim: abnormal termination in lnsrch (resend)
[sorry, forgot some details...]

I'm using optim(param, fun, method='L-BFGS-B', lower=lo, upper=up) to minimize a certain function. Often the minimization ends with the message:

ERROR: ABNORMAL_TERMINATION_IN_LNSRCH

What is optim() trying to say? What do I have to change in my function to make the minimization succeed? Do you think using BBoptim() instead of optim() changes anything?

Thanks for your help!
mario

--
Ing. Mario Valle
Data Analysis and Visualization Group | http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS) | Tel: +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82
Re: [R] optim: abnormal termination in lnsrch (resend)
You forgot a lot of details. Can you send us more information about the `fn' and also some minimal code that can reproduce the problem?

Ravi.

---
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvarad...@jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mario Valle
Sent: Tuesday, January 12, 2010 11:53 AM
To: R-help@r-project.org
Subject: [R] optim: abnormal termination in lnsrch (resend)

> [sorry, forgot some details...]
>
> I'm using optim(param, fun, method='L-BFGS-B', lower=lo, upper=up) to minimize a certain function. Often the minimization ends with the message:
>
> ERROR: ABNORMAL_TERMINATION_IN_LNSRCH
>
> What is optim() trying to say? What have I to change in my function to make the minimization succeed? Do you think using BBoptim() instead of optim() changes anything?
>
> Thanks for your help!
> mario
[R] Placing eps files from R into Adobe InDesign documents: specifying fontfamily
This is a solution I am posting for a problem that others may have.

If you want to:
1. Place lattice graphics from R into an Adobe InDesign document, and
2. Use the export as eps function in R to maximize resolution (it is much better than exporting as a metafile or bitmap), and
3. Use long strings of text in your titles or captions or to label your axes.

Then you will have problems because:
1. Adobe InDesign doesn't recognize R fontfamilies in eps files, and
2. Adobe InDesign replaces the default R font, Helvetica, with a fontfamily that looks like Courier, which significantly changes the physical length of each character string and disrupts the spacing and justification of titles, captions, and axis labels.

The way I solved this problem is:
1. When you execute a graph in R, use the Hershey family of fonts. I particularly like HersheySans. You can specify the fontfamily for axes and strips and titles as shown in the example below.
2. Export the file in eps by right-clicking the on-screen display and selecting 'save as postscript'.
3. Place the file in InDesign using the usual command (ctrl-D) and selecting the file. You will still get the Adobe error about unrecognized fonts, but the automatic replacement that Adobe uses for the Hershey fontfamily is much better than the one they use for Helvetica, your spacing and justification will be almost perfect, and you get the excellent resolution and small size of vector graphics.

In the following example the fontfamily is specified for strips, axes, and titles with these lines:

par.strip.text=list(fontfamily="HersheySans")   #for strips
scales=list(alternating=1, tck=c(1,0), fontfamily="HersheySans")   #for axes
xlab=list("Combined Score", fontfamily="HersheySans")   #for titles such as 'main', 'sub', 'xlab' and 'ylab'

Here is the example:

bwplot(school.name~score|assessment+course_code, data=temp2.stack,
    plot.points=FALSE, drop.unused.levels=TRUE,
    panel=function(..., box.ratio, varwidth) {
        panel.violin(..., col="cornsilk", varwidth=FALSE, box.ratio=box.ratio)
        panel.bwplot(..., box.ratio=0.1)
    },
    layout=c(2,3,1),
    par.strip.text=list(fontfamily="HersheySans"),
    scales=list(alternating=1, tck=c(1,0), fontfamily="HersheySans",
        x=list(relation="same", cex=0.7, rot=90),
        y=list(relation="same", cex=0.7, rot=0)),
    xlab=list("Combined Score", fontfamily="HersheySans"),
    ylab=list("School(State)(students)", fontfamily="HersheySans")
)

--
View this message in context: http://n4.nabble.com/Placing-eps-files-from-R-into-Adobe-InDesign-documents-specifying-fontfamily-tp1012186p1012186.html
Sent from the R help mailing list archive at Nabble.com.
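For anyone who prefers to script the export instead of right-clicking the plot window, the same idea can be sketched with the postscript() device. This is an assumption-laden illustration, not the workflow described above: it uses the built-in iris data in place of temp2.stack, writes to a temporary file (eps_file is a hypothetical path), and has not been checked against InDesign itself.

```r
library(lattice)

eps_file <- tempfile(fileext = ".eps")   # hypothetical output location

# Settings commonly used to get a single-page, EPS-style file.
postscript(eps_file, width = 6, height = 4,
           horizontal = FALSE, onefile = FALSE, paper = "special")

p <- bwplot(Species ~ Sepal.Length, data = iris,
            par.strip.text = list(fontfamily = "HersheySans"),   # strips
            scales = list(fontfamily = "HersheySans"),           # axes
            xlab = list("Sepal length", fontfamily = "HersheySans"),
            ylab = list("Species", fontfamily = "HersheySans"))

print(p)    # lattice objects must be printed explicitly inside scripts
dev.off()
```

Hershey fonts are drawn by R as line strokes rather than referenced as PostScript fonts, which is presumably why they survive the transfer better than Helvetica.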
[R] getting p values
Dear colleagues,

I need to get the p-values for a table with 15000 entries of t values. Does any of you know how to do it? I can, of course, get them one by one, but that is not sensible.

Thanks
Rosario
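Since pt() is vectorized, all p-values can be computed in one call. The sketch below assumes two-sided tests and a common number of degrees of freedom; both are assumptions, as the message does not say, and t_values and df are made-up illustrations. If the degrees of freedom differ per entry, df can be a vector of the same length as t_values.

```r
# Made-up t statistics standing in for the 15000-entry table.
t_values <- c(-2.5, 0.3, 1.96, 4.1)
df <- 20                     # assumed degrees of freedom

# Two-sided p-value: twice the lower-tail probability of -|t|.
p_values <- 2 * pt(-abs(t_values), df = df)
```

For a data frame or matrix of t values, the same expression can be applied to a whole column or to the matrix itself.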
[R] barplot: border color when stacked
Dear R-users,

I am using R version 2.10.1 under Windows.

In a barplot, I want to mark one of the bars with a special border color. For example:

barplot(c(3, 7, 11), border = c(NA, "red", NA))

But how do I do this when the bars are stacked? For example:

barplot(matrix(1:6, ncol=3))
# border of second bar (i.e. the one with total height = 7) should be red again

I tried:

barplot(matrix(1:6, ncol=3), border = c(NA, "red", NA))

Obviously, this doesn't give me what I want. Your advice would be appreciated; kind regards
Heinrich.
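One way to get this effect without changing barplot() itself is to draw the stacked bars without borders and then outline the whole second bar with rect(). This is a sketch under the default settings (bar width 1), so the half-width of 0.5 is an assumption that would need adjusting if width is changed.

```r
m <- matrix(1:6, ncol = 3)

# barplot() returns the x positions of the bar midpoints.
mid <- barplot(m, border = NA)

# Outline the full stack of the second bar (total height = sum of column 2).
rect(mid[2] - 0.5, 0, mid[2] + 0.5, sum(m[, 2]), border = "red", lwd = 2)
```

This outlines the bar as a whole; it does not draw a red line between the two stacked segments.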
Re: [R] barplot: border color when stacked
You can edit the barplot function to do this:

mybarplot <- function (height, width = 1, space = NULL, names.arg = NULL,
    legend.text = NULL, beside = FALSE, horiz = FALSE, density = NULL,
    angle = 45, col = NULL, border = par("fg"), main = NULL, sub = NULL,
    xlab = NULL, ylab = NULL, xlim = NULL, ylim = NULL, xpd = TRUE,
    log = "", axes = TRUE, axisnames = TRUE, cex.axis = par("cex.axis"),
    cex.names = par("cex.axis"), inside = TRUE, plot = TRUE, axis.lty = 0,
    offset = 0, add = FALSE, args.legend = NULL, ...)
{
    if (!missing(inside))
        .NotYetUsed("inside", error = FALSE)
    if (is.null(space))
        space <- if (is.matrix(height) && beside) c(0, 1) else 0.2
    space <- space * mean(width)
    if (plot && axisnames && is.null(names.arg))
        names.arg <- if (is.matrix(height)) colnames(height) else names(height)
    if (is.vector(height) || (is.array(height) && (length(dim(height)) == 1))) {
        height <- cbind(height)
        beside <- TRUE
        if (is.null(col))
            col <- "grey"
    }
    else if (is.matrix(height)) {
        if (is.null(col))
            col <- grey.colors(nrow(height))
    }
    else stop("'height' must be a vector or a matrix")
    if (is.logical(legend.text))
        legend.text <- if (legend.text && is.matrix(height)) rownames(height)
    stopifnot(is.character(log))
    logx <- logy <- FALSE
    if (log != "") {
        logx <- length(grep("x", log)) > 0L
        logy <- length(grep("y", log)) > 0L
    }
    if ((logx || logy) && !is.null(density))
        stop("Cannot use shading lines in bars when log scale is used")
    NR <- nrow(height)
    NC <- ncol(height)
    if (beside) {
        if (length(space) == 2)
            space <- rep.int(c(space[2L], rep.int(space[1L], NR - 1)), NC)
        width <- rep(width, length.out = NR)
    }
    else {
        width <- rep(width, length.out = NC)
    }
    offset <- rep(as.vector(offset), length.out = length(width))
    delta <- width/2
    w.r <- cumsum(space + width)
    w.m <- w.r - delta
    w.l <- w.m - delta
    log.dat <- (logx && horiz) || (logy && !horiz)
    if (log.dat) {
        if (min(height + offset, na.rm = TRUE) <= 0)
            stop("log scale error: at least one 'height + offset' value <= 0")
        if (logx && !is.null(xlim) && min(xlim) <= 0)
            stop("log scale error: 'xlim' <= 0")
        if (logy && !is.null(ylim) && min(ylim) <= 0)
            stop("log scale error: 'ylim' <= 0")
        rectbase <- if (logy && !horiz && !is.null(ylim))
            ylim[1L]
        else if (logx && horiz && !is.null(xlim))
            xlim[1L]
        else 0.9 * min(height, na.rm = TRUE)
    }
    else rectbase <- 0
    if (!beside)
        height <- rbind(rectbase, apply(height, 2L, cumsum))
    rAdj <- offset + (if (log.dat) 0.9 * height else -0.01 * height)
    delta <- width/2
    w.r <- cumsum(space + width)
    w.m <- w.r - delta
    w.l <- w.m - delta
    if (horiz) {
        if (is.null(xlim))
            xlim <- range(rAdj, height + offset, na.rm = TRUE)
        if (is.null(ylim))
            ylim <- c(min(w.l), max(w.r))
    }
    else {
        if (is.null(xlim))
            xlim <- c(min(w.l), max(w.r))
        if (is.null(ylim))
            ylim <- range(rAdj, height + offset, na.rm = TRUE)
    }
    if (beside)
        w.m <- matrix(w.m, ncol = NC)
    if (plot) {
        opar <- if (horiz)
            par(xaxs = "i", xpd = xpd)
        else par(yaxs = "i", xpd = xpd)
        on.exit(par(opar))
        if (!add) {
            plot.new()
            plot.window(xlim, ylim, log = log, ...)
        }
        xyrect <- function(x1, y1, x2, y2, horizontal = TRUE, ...) {
            if (horizontal)
                rect(x1, y1, x2, y2, ...)
            else rect(y1, x1, y2, x2, ...)
        }
        if (beside)
            xyrect(rectbase + offset, w.l, c(height) + offset, w.r,
                horizontal = horiz, angle = angle, density = density,
                col = col, border = border)
        else {
            for (i in 1L:NC) {
                xyrect(height[1L:NR, i] + offset[i], w.l[i],
                    height[-1, i] + offset[i], w.r[i], horizontal = horiz,
                    angle = angle, density = density, col = col,
                    border = border[ifelse(i > length(border), 1, i)])   # Line edited
            }
        }
        if (axisnames && !is.null(names.arg)) {
            at.l <- if (length(names.arg) != length(w.m)) {
                if (length(names.arg) == NC)
                    colMeans(w.m)
                else stop("incorrect number of names")
            }
            else w.m
            axis(if (horiz) 2 else 1, at = at.l, labels = names.arg,
                lty = axis.lty, cex.axis = cex.names, ...)
        }
        if (!is.null(legend.text)) {
            legend.col <- rep(col, length.out = length(legend.text))
            if ((horiz & beside) || (!horiz & !beside)) {
                legend.text <-
Re: [R] apply a function down each column
Laetitia,

I was just responding to your comment that "R complains about a syntax error". But I realize now that '2x' would probably cause an "unexpected symbol" error. Here's what I get when I run your loop; what do you get?

> for (x in 1:(nrow(dat)-1)) {
+ a <- as.character(dat[(2x-1),1])
Error: unexpected symbol in:
"for (x in 1:(nrow(dat)-1)) {
a <- as.character(dat[(2x"
> b <- as.character(dat[(2x),1])
Error: unexpected symbol in "b <- as.character(dat[(2x"
> lettermatch(a,b)
Error in strsplit(a, "") : object 'a' not found
> }
Error: unexpected '}' in "}"

and here's what I get when I fix the obvious syntax error:

> for (x in 1:(nrow(dat)-1)) {
+ a <- as.character(dat[(2*x-1),1])
+ b <- as.character(dat[(2*x),1])
+ lettermatch(a,b)
+ }
Error in fix.by(by.x, x) : 'by' must specify valid column(s)

That leaves two problems:
1) you're looking at the wrong column in dat[,1]; that should be dat[,2], etc.
2) that error message indicates that your index variable (x) gets to invalid values.

Try this:

for (x in 1:(nrow(dat)/2)) {
    a <- dat[(2*x-1),2]   # odd rows
    b <- dat[(2*x),2]     # even rows
    print(lettermatch(a,b))
}

You don't need the as.character() if you have character data. Always do a str(dat) before you do any analysis.

-Peter Ehlers

Laetitia Schmid wrote:
> Dear Peter,
> thank you for the suggestion. Unfortunately the star did not help. Did it work for you? For me it seems incomplete somehow.
> Laetitia
>
> From: Peter Ehlers [ehl...@ucalgary.ca]
> Sent: Tuesday, January 12, 2010 09:54 AM
> To: Laetitia Schmid
> Cc: Steve Lianoglou; r-help@r-project.org
> Subject: Re: [R] apply a function down each column
>
> See inline below.
>
> Laetitia Schmid wrote:
>> Dear Steve,
>> my solution looks like it would work, but it does not. I attached a text file with an extract of my data. Maybe you can try it yourself. I want to compare C1 with M1, C2 with M2, C3 with M3, ... for each column.
>> I do not really know what the problem is. R complains about a syntax error.
>> The function I am applying counts the common strings between the two. Greg Hirson helped me to write it.
>>
>> lettermatch <- function(a, b) {
>>     tb <- merge(as.data.frame(table(strsplit(a, ""))),
>>                 as.data.frame(table(strsplit(b, ""))), by="Var1")
>>     sum(apply(tb[-1], 1, min))
>> }
>>
>> For example for the second column I tried:
>>
>> for (x in 1:(nrow(dat)-1)) {
>> a <- as.character(dat[(2x-1),1])
>
> Shouldn't that be 2*x-1??
>
> -Peter Ehlers
>
>> b <- as.character(dat[(2x),1])
>> lettermatch(a,b)
>> }
>>
>> or
>>
>> a <- as.character(dat[seq(1, nrow(dat), by=2),2])
>> b <- as.character(dat[seq(2, nrow(dat), by=2), 2])
>> all.results <- lettermatch(a,b)
>>
>> With dat <- read.delim("data_lgs.txt", stringsAsFactors=FALSE) I can leave the as.character away in the formula above.
>>
>> Laetitia
>>
>> Individuals Seq1 Seq2 Seq3 Seq4
>> C1 AATTCCGGCTTT
>> M1
>> C2 AATTCCGGCTTT
>> M2 AGGGAACTCCGGCGTT
>> C3 AGGGAACTCCGGCGTT
>> M3 AGGGAACTCCGGCGTT
>> C4 AATTCCGGCCTT
>> M4 AAATCGGGCTTT
>> C5 AGGGACTTCCCGCTTT
>> M5 AGGGCTTTCCTT
>> C6 AGGGCTTTCCTT
>> M6 AAAGCCTTCTTT
>> C7 AAAGACCCCCCGGTTT
>> M7 AAGGAACCCCGG
>> C8 AATTCCGGCCTT
>> M8 AATTCCGGCCTT
>> C9
>> M9
>> C11 AGGGAAACCGGGGGTT
>> M11 AATTCCGGCCTT
>>
>> On 11.01.2010 at 15:18, Steve Lianoglou wrote:
>>> Hi,
>>>
>>> On Mon, Jan 11, 2010 at 8:41 AM, Laetitia Schmid laeti...@gmt.su.se wrote:
>>>> Hello World,
>>>> I have a function that makes pairwise comparisons between two strings. I would like to apply this function to my data (which consists of columns with different strings) in the way that it compares the first with the second entry, and then the third with the fourth, and then the fifth with the sixth, and so on down each column...
>>>> So (2x-1) and (2x) would be the different entries to be compared!
>>>>
>>>> dat = my data:
>>>> for the first column: compare dat[(2x-1),1] with dat[(2x),1] and x would be 1:i, i=length(dat[,1])
>>>>
>>>> I think the best way to do that is a loop:
>>>>
>>>> a <- as.character(dat[(2x-1),1])
>>>> b <- as.character(dat[(2x),1])
>>>> for (i in 1:length(dat[,1]) my_function(a, b))
>>>>
>>>> Can somebody help me to apply a function with a loop in the way I want to a column?
>>>
>>> It seems as if you got it already, don't you?
>>>
>>> for (x in 1:(nrow(dat)-1)) {
>>>     a <- dat[(2x-1),1]
>>>     b <- dat[(2x), 1]
>>>     my_function(a,b)
>>> }
>>>
>>>> Is there a specification of tapply for that?
>>>
>>> I don't think so, but depending on what you want to do, the size of your data, and the amount of RAM you have, it might be faster to compare everything at once (assuming `my_function` can be vectorized), for instance:
>>>
>>> a <- dat[seq(1, nrow(dat), by=2),1]
>>> b <- dat[seq(2, nrow(dat), by=2), 1]
>>> all.results <- my_function(a,b)
>>>
>>> Also, as an aside, I see you keep calling as.character on your data when you extract it from your data.frame.
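Putting the pieces of this thread together, the following is a small self-contained sketch of the pairwise comparison. The data frame here is a made-up stand-in for data_lgs.txt (the sequences and the single Seq1 column are assumptions), and mapply() is used because lettermatch() as written handles one pair of strings at a time rather than whole vectors. Empty or missing entries, such as the M1 and C9 rows in the real data, would need separate handling that is not shown here.

```r
# The function from the thread, with the stripped operators and quotes restored.
lettermatch <- function(a, b) {
  tb <- merge(as.data.frame(table(strsplit(a, ""))),
              as.data.frame(table(strsplit(b, ""))), by = "Var1")
  sum(apply(tb[-1], 1, min))
}

# Made-up stand-in for the real file: C/M pairs in consecutive rows.
dat <- data.frame(Individuals = c("C1", "M1", "C2", "M2"),
                  Seq1 = c("AATT", "AATC", "GGCC", "GGCC"),
                  stringsAsFactors = FALSE)

odd  <- seq(1, nrow(dat), by = 2)   # rows 1, 3, ...  (the C entries)
even <- seq(2, nrow(dat), by = 2)   # rows 2, 4, ...  (the M entries)

# One call to lettermatch() per C/M pair in column 2.
res <- mapply(lettermatch, dat[odd, 2], dat[even, 2], USE.NAMES = FALSE)
```

To cover every sequence column, the mapply() call can be wrapped in sapply(2:ncol(dat), ...), with the column index replacing the hard-coded 2.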
Re: [R] sparseM and kronecker product_R latest version
I see, now I got it. And thanks for the example with Matrix.

best
alessia

2010/1/12 Martin Maechler maech...@stat.math.ethz.ch:
>>>>>> "am" == alessia matano alexis@gmail.com
>>>>>>     on Mon, 11 Jan 2010 16:20:57 +0100 writes:
>
>     am> Many thanks for it.
>     am> However it is strange that when I put the numbers rather than ncol(R)
>     am> (a matrix with ncol=36698) it worked. Look below
>
>     am> > dim(res2)
>     am> [1] 170471  25822
>     am> > D <- as.matrix.csr(0, nrow(tmpb), 25822)
>     am> > D <- as.matrix.csr(0, nrow(tmpb), ncol(res2))
>     am> Error in if (length(x) == nrow * ncol) x <- matrix(x, nrow, ncol) else { :
>     am>   missing value where TRUE/FALSE needed
>     am> In addition: Warning message:
>     am> In nrow * ncol : NAs produced by integer overflow
>
>     am> But probably it is true what you said, anyway.
>
> yes, it is true. The clue is that typeof(25822) is "double" and not "integer".
>
>     am> So, do you suggest me to use directly the simple matrix command? or a
>     am> kind of sparse matrix within your package?!?
>
> the latter, e.g.,
>
>   library(Matrix)
>   D <- Matrix(0, nrow = 113289, ncol = 36698)
>   ## or
>   D. <- sparseMatrix(x = double(0), i = integer(0), j = integer(0),
>                      dims = c(113289, 36698))
>   identical(D, D.)
>   ##--> TRUE
>
>   ## and, e.g.,
>   Dk <- kronecker(D, Diagonal(x = 5:2))
>
>   > identical(Dk, D %x% Diagonal(x = 5:2))
>   [1] TRUE
>   > dim(D)
>   [1] 113289  36698
>   > dim(Dk)
>   [1] 453156 146792
>
> Regards,
> Martin Maechler, ETH Zurich
[R] optim: abnormal termination in lnsrch (resend)
Attached is a script that reproduces the problem. My function is fold.val(), and in the end the curve contained in lnsrch.dat seems to be fitted quite well, but optim generates the error.

Thanks again!
mario

> I'm using optim(param, fun, method='L-BFGS-B', lower=lo, upper=up) to minimize a certain function. Often the minimization ends with the message:
>
> ERROR: ABNORMAL_TERMINATION_IN_LNSRCH
>
> What is optim() trying to say? What have I to change in my function to make the minimization succeed? Do you think using BBoptim() instead of optim() changes anything?
>
> Thanks for your help!
> mario

--
Ing. Mario Valle
Data Analysis and Visualization Group | http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS) | Tel: +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82

x1 <- seq(0, 1, 0.1)

fold.val <- function(x, mu, vr, thresh, smooth) {
    a <- pnorm(x1, mean=thresh, sd=smooth)
    d <- dnorm(x1, mean=mu, sd=sqrt(vr))
    y <- d*(1-a)
    yr <- rev(d*a)
    xr <- rev(2*thresh-x1)
    yr1 <- approx(xr, yr, x1)$y
    yr1[is.na(yr1)] <- 0
    return(approx(x1, y+yr1, x)$y)
}

fold.err <- function(p, xx, yy) {
    r <- fold.val(xx, mu=p[1], vr=p[2], thresh=p[3], smooth=p[4])
    return(sum((r-yy)^2))
}

h <- read.table("lnsrch.dat", col.names=c('x', 'y'))
param <- c(.413, 0.00687, .5228, .01255)
up <- c(.45, 0.0072, .53, .014)
lo <- c(.4, 0.0060, .52, .011)
m <- optim(param, fold.err, xx=h$x, yy=h$y, method='L-BFGS-B', lower=lo, upper=up)
cat(m$message, "\n")
plot(h$x, h$y, type='l', xlab='Distance', ylab='Density')
lines(h$x, fold.val(h$x, mu=m$par[1], vr=m$par[2], thresh=m$par[3], smooth=m$par[4]), lty=3, lwd=2, col='orange')

0 0.000176362361606926 0.000586510263929619 0.000231378280914905 0.00117302052785924 0.000295589496143618 0.00175953079178886 0.000368478469738331 0.00234604105571848 0.000449107043434096 0.00293255131964809 0.00053621721990612 0.00351906158357771 0.000628406534146135 0.00410557184750733 0.000724352089678939 0.00469208211143695 0.000823019379524589 0.00527859237536657
0.000923769963807285 0.00586510263929619 0.0010263098797736 0.0064516129032258 0.00113046157071014 0.00703812316715543 0.00123574038696884 0.00762463343108504 0.00134128566231542 0.00821114369501466 0.00144585907333914 0.00879765395894428 0.00154814766557669 0.0093841642228739 0.00164720012094507 0.00997067448680352 0.00174279078421419 0.0105571847507331 0.00183552090932921 0.0111436950146628 0.00192657682698567 0.0117302052785924 0.00201721475915703 0.012316715542522 0.0021081579678769 0.0129032258064516 0.00219912372548760 0.0134897360703812 0.00228864332012703 0.0140762463343109 0.00237423713314793 0.0146627565982405 0.00245290831576602 0.0152492668621701 0.00252185164895379 0.0158357771260997 0.00257923683549348 0.0164222873900293 0.00262490019714924 0.0170087976539589 0.00266075875058930 0.0175953079178886 0.00269076521650816 0.0181818181818182 0.00272028739787151 0.0187683284457478 0.00275494219114877 0.0193548387096774 0.00279911977041479 0.0199413489736070 0.00285462086919127 0.0205278592375367 0.00291989826644678 0.0211143695014663 0.00299027032424890 0.0217008797653959 0.00305916799687892 0.0222873900293255 0.0031200913977286 0.0228739002932551 0.00316865054306491 0.0234604105571848 0.00320399165284089 0.0240469208211144 0.00322911512411607 0.024633431085044 0.00325009141115046 0.0252199413489736 0.00327448251393837 0.0258064516129032 0.00330863483141881 0.0263929618768328 0.00335664878021895 0.0269794721407625 0.00341998836253680 0.0275659824046921 0.00349792087037627 0.0281524926686217 0.00358846480856140 0.0287390029325513 0.00368937637895126 0.0293255131964809 0.00379878944173669 0.0299120234604106 0.00391534984847043 0.0304985337243402 0.00403793448762949 0.0310850439882698 0.00416520590831426 0.0316715542521994 0.00429527045546386 0.032258064516129 0.00442560376781198 0.0328445747800587 0.00455325817307603 0.0334310850439883 0.00467525233059903 0.0340175953079179 0.00478900552079331 0.0346041055718475 0.00489270623934628 0.0351906158357771 
0.0049809321012 0.0357771260997067 0.00506785770230776 0.0363636363636364 0.00514095495458048 0.036950146627566 0.00520698251869515 0.0375366568914956 0.00526847101068982 0.0381231671554252 0.00532783817372676 0.0387096774193548 0.00538686868362264 0.0392961876832845 0.00544630067045661 0.0398826979472141 0.00550562536463334 0.0404692082111437 0.00556316103457661 0.0410557184750733 0.00561640167932586 0.0416422287390029 0.00566258222546389 0.0422287390029326 0.00569935446480522 0.0428152492668622 0.00572527363548914 0.0434017595307918 0.00574066288496805 0.0439882697947214 0.00574780398040942 0.044574780058651 0.00575013267506571 0.0451612903225806 0.00575179443458933 0.0457478005865103 0.00575684258138564 0.0463343108504399 0.00576851671527049 0.0469208211143695
[R] how to handle missing values . when importing data in R
Hi,
I have a question about importing data in R. I want to import a file that has missing values in it, and the missing values are denoted by ".". I want to first read in the file and then change each "." into the number zero (0). How can I do that?
Thank you,
karena
Re: [R] Functions for QUAIDS and nonlinear SUR?
On Sat, Jan 9, 2010 at 1:21 AM, Werner W. pensterfuz...@yahoo.de wrote:
> I would like to estimate a quadratic almost ideal demand system in R
> which is estimated usually by nonlinear seemingly unrelated regression.
> But there is no such function in R yet

The systemfit package has the function nlsystemfit() for estimating systems of non-linear equations, e.g. by non-linear SUR. However, in contrast to the systemfit() function for estimating systems of linear equations, the function nlsystemfit() is still under development and has convergence problems rather often. So, I cannot recommend using nlsystemfit() for an important analysis. :-(

> but it is readily available in STATA (nlsur), see B. Poi (2008):
> Demand-system estimation: Update, Stata Journal 8(4). Now I am
> thinking, what is quicker learning to program STATA which seems not
> really comfortable for programming or implement the method in R which
> might be above my head in terms of econometrics.

You do not have to start from scratch: you could improve the nlsystemfit() function, e.g. by implementing analytical gradients of the objective function, and I could assist you with this. If you are interested in improving nlsystemfit(), please apply at R-Forge [1] for write access to systemfit's SVN repository.

[1] http://r-forge.r-project.org/projects/systemfit/

/Arne
--
Arne Henningsen
http://www.arne-henningsen.name
Re: [R] how to handle missing values . when importing data in R
?read.table

na.strings='.'

Then change all NAs to zero:

df$col[is.na(df$col)] <- 0

On Tue, Jan 12, 2010 at 12:46 PM, karena dr.jz...@gmail.com wrote:
> hi, I have a question about importing data in R. I want to import a
> file which has missing value in it, and the missing values are denoted
> as ".", I want to first read in the file, and then change the "." into
> the number zero 0. how can I do that?
> thank you,
> karena

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
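A minimal sketch of the two steps suggested above, for readers who want something they can paste and run. The inline text stands in for the real file and the column names are made up for illustration; it assumes every column is numeric once "." is read as NA.

```r
# Hypothetical stand-in for the file on disk: "." marks a missing value.
txt <- "A B C
1 . 3
. 5 6
7 8 ."

# Step 1: tell read.table() that "." means NA.
df <- read.table(text = txt, header = TRUE, na.strings = ".")

# Step 2: replace every NA in the data frame with 0.
df[is.na(df)] <- 0
print(df)
```

With a real file, `text = txt` would be replaced by the file name.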
Re: [R] getting p values
On 12/01/2010 10:47 AM, Rosario Garcia Gil wrote:
> Dear colleagues,
> I need to get the p values for a table with 15000 entries of t values.
> Does any of you know how to do it? I can, of course, get them one by
> one, but that is not sensible.

Put the t values into a vector, then use pt() in an appropriate way to calculate them all at once. (An appropriate way depends on details like whether you want a one- or two-tailed value, the degrees of freedom, etc.)

Duncan Murdoch
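As a sketch of what "use pt() in an appropriate way" might look like: pt() is vectorized, so one call handles all 15000 values. The t values and the degrees of freedom below are invented for illustration; the right `df` (possibly a vector, one per test) depends on how the t values were obtained.

```r
# Hypothetical t values and degrees of freedom (assumptions, not from the post).
t_values <- c(-2.5, 0.3, 1.96, 4.1)
deg_free <- 20

# Two-tailed p values: twice the lower-tail area below -|t|.
p_two_sided <- 2 * pt(-abs(t_values), df = deg_free)

# One-tailed (upper) p values, if the alternative is "greater".
p_upper <- pt(t_values, df = deg_free, lower.tail = FALSE)

print(cbind(t = t_values, p_two_sided, p_upper))
```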
[R] Calculate the percentages of the numbers in every column.
Dear friends,

I have a table like this. I have A B C D ... levels, the first column you see is just the index, and there are different numbers in the table.

   A B C D ...
1  0 2 1 0
2  1 0 2 1
3  2 3 0 0
4  0 0 1 0
5  0 2 3 1
...

I want to calculate the frequencies or the percentages of the numbers in every column. How do I get a table like this, where the first column holds the levels of the numbers and the numbers inside the table are the percentages? All the percentages should add up to 1 in every column.

   A   B   C   D ...
0  0.2 0.3 0.1 0.1
1  0.1 0.1 0.2 0.1
2  0.1 0.2 0.2 0.2
3  0.2 0.1 0.1 0
...

Thanks for your help!

Kelvin
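One way this could be done, sketched on the five example rows from the question (the full data are not shown, so the proportions here will not match the illustrative output table). The idea is to tabulate every column against a common set of levels so that all columns yield the same rows; the object names are placeholders.

```r
# The five example rows from the question, entered by hand.
dat <- data.frame(A = c(0, 1, 2, 0, 0),
                  B = c(2, 0, 3, 0, 2),
                  C = c(1, 2, 0, 1, 3),
                  D = c(0, 1, 0, 0, 1))

# All values that occur anywhere in the table, used as common levels.
lev <- sort(unique(unlist(dat)))

# For each column: count each level, then divide by the column length.
pct <- vapply(dat,
              function(x) as.vector(table(factor(x, levels = lev))) / length(x),
              numeric(length(lev)))
rownames(pct) <- lev

print(pct)
print(colSums(pct))  # each column sums to 1
```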
Re: [R] how to handle missing values . when importing data in R
Hi, tim,
thank you very much for the reply, but I am really a new user. How do I change all NAs to zero?
Thanks again.
karena

jholtman wrote:
> ?read.table
>
> na.strings='.'
>
> Then change all NAs to zero
>
> df$col[is.na(df$col)] <- 0
>
> On Tue, Jan 12, 2010 at 12:46 PM, karena dr.jz...@gmail.com wrote:
>> hi, I have a question about importing data in R. I want to import a
>> file which has missing value in it, and the missing values are denoted
>> as ".", I want to first read in the file, and then change the "." into
>> the number zero 0. how can I do that?
>> thank you,
>> karena
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
[R] optimization challenge
I have a challenge that I want to share with the group. This is not homework (but I may assign it as such if I teach the appropriate class again) and I have found one solution, so I don't need anything urgently. This is more for fun, to see if others can find a better solution than I did.

The challenge:

I want to read a book in a given number of days. I want to read an integer number of chapters each day (there are more chapters than days), with no stopping part way through a chapter, and at least 1 chapter each day. The chapters are very non-uniform in length (some very short, a few very long, many in between), so I would like to come up with a reading schedule that minimizes the variance of the length of each day's reading (multiple short chapters are read on the same day; a long chapter is the only one read that day). I also want to read through the book in order (no skipping ahead to combine short chapters that are not naturally next to each other).

My thought was that the optim function with method="SANN" would be an appropriate approach, but my first couple of tries did not give very good results. I have since come up with an optim with SANN solution that gives what I consider good results (but I accept that better is possible).

Below is a data frame with the lengths of the chapters for the book that originally sparked the challenge for me (but the general idea should work for any book). Each row represents a chapter (in order) with 3 different measures of the length of the chapter. For this challenge I want to read the book in 128 days (there are 239 chapters).

I will post my solutions in a few days, but I want to wait so that my direction does not discourage people from trying other approaches (if there is something better than optim, that is fine).
Good luck for anyone interested in the challenge, The data frame: bom3 - structure(list(Chapter = structure(1:239, .Label = c(1 Nephi 1, 1 Nephi 2, 1 Nephi 3, 1 Nephi 4, 1 Nephi 5, 1 Nephi 6, 1 Nephi 7, 1 Nephi 8, 1 Nephi 9, 1 Nephi 10, 1 Nephi 11, 1 Nephi 12, 1 Nephi 13, 1 Nephi 14, 1 Nephi 15, 1 Nephi 16, 1 Nephi 17, 1 Nephi 18, 1 Nephi 19, 1 Nephi 20, 1 Nephi 21, 1 Nephi 22, 2 Nephi 1, 2 Nephi 2, 2 Nephi 3, 2 Nephi 4, 2 Nephi 5, 2 Nephi 6, 2 Nephi 7, 2 Nephi 8, 2 Nephi 9, 2 Nephi 10, 2 Nephi 11, 2 Nephi 12, 2 Nephi 13, 2 Nephi 14, 2 Nephi 15, 2 Nephi 16, 2 Nephi 17, 2 Nephi 18, 2 Nephi 19, 2 Nephi 20, 2 Nephi 21, 2 Nephi 22, 2 Nephi 23, 2 Nephi 24, 2 Nephi 25, 2 Nephi 26, 2 Nephi 27, 2 Nephi 28, 2 Nephi 29, 2 Nephi 30, 2 Nephi 31, 2 Nephi 32, 2 Nephi 33, Jacob 1, Jacob 2, Jacob 3, Jacob 4, Jacob 5, Jacob 6, Jacob 7, Enos 1, Jarom 1, Omni 1, Words of Mormon 1, Mosiah 1, Mosiah 2, Mosiah 3, Mosiah 4, Mosiah 5, Mosiah 6, Mosiah 7, Mosiah 8, Mosiah 9, Mosiah 10, Mosiah 11, Mosiah 12, Mosiah 13, Mosiah 14, Mosiah 15, Mosiah 16, Mosiah 17, Mosiah 18, Mosiah 19, Mosiah 20, Mosiah 21, Mosiah 22, Mosiah 23, Mosiah 24, Mosiah 25, Mosiah 26, Mosiah 27, Mosiah 28, Mosiah 29, Alma 1, Alma 2, Alma 3, Alma 4, Alma 5, Alma 6, Alma 7, Alma 8, Alma 9, Alma 10, Alma 11, Alma 12, Alma 13, Alma 14, Alma 15, Alma 16, Alma 17, Alma 18, Alma 19, Alma 20, Alma 21, Alma 22, Alma 23, Alma 24, Alma 25, Alma 26, Alma 27, Alma 28, Alma 29, Alma 30, Alma 31, Alma 32, Alma 33, Alma 34, Alma 35, Alma 36, Alma 37, Alma 38, Alma 39, Alma 40, Alma 41, Alma 42, Alma 43, Alma 44, Alma 45, Alma 46, Alma 47, Alma 48, Alma 49, Alma 50, Alma 51, Alma 52, Alma 53, Alma 54, Alma 55, Alma 56, Alma 57, Alma 58, Alma 59, Alma 60, Alma 61, Alma 62, Alma 63, Helaman 1, Helaman 2, Helaman 3, Helaman 4, Helaman 5, Helaman 6, Helaman 7, Helaman 8, Helaman 9, Helaman 10, Helaman 11, Helaman 12, Helaman 13, Helaman 14, Helaman 15, Helaman 16, 3 Nephi 1, 3 Nephi 2, 3 Nephi 3, 3 Nephi 4, 3 Nephi 5, 3 Nephi 6, 3 Nephi 
7, 3 Nephi 8, 3 Nephi 9, 3 Nephi 10, 3 Nephi 11, 3 Nephi 12, 3 Nephi 13, 3 Nephi 14, 3 Nephi 15, 3 Nephi 16, 3 Nephi 17, 3 Nephi 18, 3 Nephi 19, 3 Nephi 20, 3 Nephi 21, 3 Nephi 22, 3 Nephi 23, 3 Nephi 24, 3 Nephi 25, 3 Nephi 26, 3 Nephi 27, 3 Nephi 28, 3 Nephi 29, 3 Nephi 30, 4 Nephi 1, Mormon 1, Mormon 2, Mormon 3, Mormon 4, Mormon 5, Mormon 6, Mormon 7, Mormon 8, Mormon 9, Ether 1, Ether 2, Ether 3, Ether 4, Ether 5, Ether 6, Ether 7, Ether 8, Ether 9, Ether 10, Ether 11, Ether 12, Ether 13, Ether 14, Ether 15, Moroni 1, Moroni 2, Moroni 3, Moroni 4, Moroni 5, Moroni 6, Moroni 7, Moroni 8, Moroni 9, Moroni 10 ), class = factor), Words = c(908L, 879L, 1067L, 1262L, 761L, 202L, 992L, 1221L, 259L, 924L, 1315L, 860L, 1899L, 1284L, 1488L, 1618L, 2523L, 1217L, 1292L, 698L, 945L, 1506L, 1543L, 1460L, 1170L, 1300L, 1169L, 895L, 405L, 812L, 2388L, 966L, 338L, 647L, 587L, 203L, 857L, 370L, 687L, 570L, 587L, 928L, 520L, 134L, 587L, 891L, 1699L, 1483L, 1461L, 1240L, 804L, 708L, 988L, 426L, 647L, 719L, 1365L, 619L, 929L, 3758L, 511L, 1242L, 1160L, 734L, 1398L, 857L, 966L, 2112L, 1117L, 1605L, 740L, 309L, 1555L, 938L, 864L, 957L, 1271L,
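
For readers who want to experiment with the challenge, here is one possible approach that is not the poster's optim/SANN solution: dynamic programming over contiguous splits. Because the number of days and the total length are both fixed, the mean daily length is fixed too, so minimizing the variance is the same as minimizing the sum of squared daily totals. The function name `schedule_chapters` and the short `len` vector are hypothetical; with the real data one would pass a length column of `bom3` (e.g. the Words column) and days = 128.

```r
# Split chapter lengths `len` into `days` contiguous, non-empty groups so
# that the sum of squared daily totals (hence the variance) is minimal.
# Returns the index of the last chapter read on each day.
schedule_chapters <- function(len, days) {
  n <- length(len)
  stopifnot(days >= 1, days <= n)
  cs <- c(0, cumsum(len))                # cs[i + 1]: total length of chapters 1..i
  cost <- matrix(Inf, n + 1, days + 1)   # cost[i + 1, d + 1]: best for chapters 1..i in d days
  prev <- matrix(NA_integer_, n + 1, days + 1)
  cost[1, 1] <- 0                        # zero chapters in zero days
  for (d in 1:days) {
    for (i in d:n) {
      j <- (d - 1):(i - 1)               # possible chapter counts finished before day d
      cand <- cost[j + 1, d] + (cs[i + 1] - cs[j + 1])^2
      best <- which.min(cand)
      cost[i + 1, d + 1] <- cand[best]
      prev[i + 1, d + 1] <- j[best]
    }
  }
  # Walk back through `prev` to recover the last chapter of each day.
  ends <- integer(days)
  i <- n
  for (d in days:1) {
    ends[d] <- i
    i <- prev[i + 1, d + 1]
  }
  ends
}

# Hypothetical toy book: six chapters to be read in three days.
len <- c(5, 1, 1, 4, 2, 3)
ends <- schedule_chapters(len, 3)
daily_totals <- diff(c(0, cumsum(len)[ends]))
print(ends)
print(daily_totals)
```

The search is exact, and its cost grows roughly with days times the square of the number of chapters, which is small for 239 chapters and 128 days.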
[R] Non-metric multidimensional scaling (NMDS) help
Hi,
I am currently working on some data and feel that NMDS would return an excellent result. With my current data set, however, I have been experiencing some problems and cannot carry out metaMDS. I have tried with a few smaller data sets which I created for practice's sake, and this has worked fine. I think it is the setup of my data set that is causing me trouble. I have 18 columns and 18 rows, as needed for the n x n matrix. However, within the data set I have a lot of zeros, i.e. more than just the zeros where column B meets row B. Do I need to get rid of these excess zeros in order for metaMDS to work?
Any help is much appreciated,
Seán Kelly.
Re: [R] svm
Hi Steve,
Thank you so much for your reply. I really needed to know how SVM works without removing the class label while receiving it in the formula parameter. It does not if I remove the class label.
Cheers,
Amy

Date: Sat, 9 Jan 2010 15:48:49 -0500
Subject: Re: [R] svm
From: mailinglist.honey...@gmail.com
To: amy_4_5...@hotmail.com
CC: r-help@r-project.org

Hi,

On Fri, Jan 8, 2010 at 11:57 AM, Amy Hessen amy_4_5...@hotmail.com wrote:
> Hi Steve,
> Thank you very much for your reply. Your code is more readable and
> obvious than mine

No problem.

> Could you please help me in these questions?:
> 1) Formula is an alternative to y parameter in SVM. is it correct?

No, that's not correct. There are two svm functions, one that takes a formula object (svm.formula), and one that takes an x matrix and a y vector (svm.default). The svm.formula function is called when the first argument in your svm(..) call is a formula object. This function simply parses the formula and manipulates your data object into an x matrix and y vector, then calls the svm.default function with those params ... I usually prefer to just skip the formula and provide the x and y objects directly.

Load the e1071 library and look at the source code:

R> library(e1071)
R> e1071:::svm.formula

You'll see what I mean.

> 2) I forgot to remove the class label from the dataset besides I gave
> the program the class label in formula parameter but the program works!
> Could you please clarify this point to me?

The author of the e1071 package did you a favor. The predict.svm function checks to see if your svm object was built using the formula interface .. if so, it looks for your label column in the data you are trying to predict on and ignores it. Look at the function's source code (eg, type e1071:::predict.svm at the R prompt), and look for the call to the delete.response function ... you can also look at the help in ?delete.response.
-steve

Date: Wed, 6 Jan 2010 18:44:13 -0500
Subject: Re: [R] svm
From: mailinglist.honey...@gmail.com
To: amy_4_5...@hotmail.com
CC: r-help@r-project.org

Hi Amy,

On Wed, Jan 6, 2010 at 4:33 PM, Amy Hessen amy_4_5...@hotmail.com wrote:
> Hi Steve,
> Thank you very much for your reply. I'm trying to do something
> systematic/general in the program so that I can try different datasets
> without changing much in the program (without knowing the name of the
> class label that has different name from dataset to another)
> Could you please tell me your opinion about this code:-
>
> library(e1071)
> mydata <- read.delim("the_whole_dataset.txt")
> class_label <- names(mydata)[1] # I'll always put the class label in the first column.
> myformula <- formula(paste(class_label, "~ ."))
> x <- subset(mydata, select = - mydata[, 1])
> mymodel <- (svm(myformula, x, cross=3))
> summary(model)

Since you're not doing anything funky with the formula, a preference of mine is to just skip this way of calling SVM and go straight to the svm(x,y,...) method:

R> mydata <- as.matrix(read.delim("the_whole_dataset.txt"))
R> train.x <- mydata[,-1]
R> train.y <- mydata[,1]
R> mymodel <- svm(train.x, train.y, cross=3, type="C-classification")
## or
R> mymodel <- svm(train.x, train.y, cross=3, type="eps-regression")

As an aside, I also like to be explicit about the type= parameter to tell what I want my SVM to do (regression or classification). If it's not specified, the SVM picks which one to do based on whether or not your y vector is a vector of factors (does classification), or not (does regression).

> Do I have to the same steps with testingset? i.e. the testing set must
> not contain the label too? But contains the same structure as the
> training set? Is it correct?

I guess you'll want to report your accuracy/MSE/something on your model for your testing set? Just load the data in the same way, then use `predict` to calculate the metric you're after.
You'll have to have the labels for your data to do that, though, eg:

testdata <- as.matrix(read.delim('testdata.txt'))
test.x <- testdata[,-1]
test.y <- testdata[,1]
preds <- predict(mymodel, test.x)

Let's assume you're doing classification, so let's report the accuracy:

acc <- sum(preds == test.y) / length(test.y)

Does that help?
-steve
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
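
The role of delete.response() mentioned in this thread can be illustrated with base R alone. The sketch below does not call e1071 and is not its source code; it only shows, on a made-up data frame, how a "label ~ ." formula built from the first column name is turned into a predictors-only model frame, which is the general mechanism that lets a formula-based predict method ignore a label column left in the new data.

```r
# Hypothetical data: the class label sits in the first column, as in the thread.
mydata <- data.frame(label = factor(c("a", "b", "a", "b")),
                     x1 = c(1, 2, 3, 4),
                     x2 = c(0.5, 0.1, 0.9, 0.3))

# Build "label ~ ." without hard-coding the label's name.
class_label <- names(mydata)[1]
myformula <- as.formula(paste(class_label, "~ ."))

# terms() expands the "." against the data; delete.response() drops the
# left-hand side, so the resulting model frame holds predictors only.
tt <- terms(myformula, data = mydata)
predictors_only <- model.frame(delete.response(tt), data = mydata)

print(myformula)
print(names(predictors_only))
```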
Re: [R] how to handle missing values . when importing data in
On 12-Jan-10 17:46:47, karena wrote:
> hi, I have a question about importing data in R. I want to import a
> file which has missing value in it, and the missing values are denoted
> as ".", I want to first read in the file, and then change the "." into
> the number zero 0. how can I do that?
> thank you,
> karena

It may depend on what format the file is in, but if it is a tabular text file or a CSV file then you can use the na.strings parameter. Here is an example of a little CSV file with "." used for missing:

file temp.csv:
--
A,B,C,D
1.1,1.2,1.3,1.4
2.1,2.2,.,2.4
3.1,.,3.3,3.4
4.1,.,.,4.4

D <- read.csv("temp.csv", na.strings=".")
D
#     A   B   C   D
# 1 1.1 1.2 1.3 1.4
# 2 2.1 2.2  NA 2.4
# 3 3.1  NA 3.3 3.4
# 4 4.1  NA  NA 4.4

So the "." have gone in as NA (the right thing to do in the first instance with missing data). Now you can replace these by zeros:

D[is.na(D)] <- 0
D
#     A   B   C   D
# 1 1.1 1.2 1.3 1.4
# 2 2.1 2.2 0.0 2.4
# 3 3.1 0.0 3.3 3.4
# 4 4.1 0.0 0.0 4.4

Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 12-Jan-10 Time: 18:42:40
-- XFMail --
Re: [R] optim: abnormal termination in lnsrch (resend)
Mario,

It seems likely that your function is not smooth in the parameters. This may create problems for some optimizers that require smoothness. However, I was able to get good convergence with the `spg' function in my BB package. Here is how it works:

require(BB)
Loading required package: BB
Loading required package: numDeriv

m <- spg(param, fold.err, xx=h$x, yy=h$y, lower=lo, upper=up)
iter: 0 f-value: 7.597257 pgrad: 0.037
iter: 10 f-value: 7.551395 pgrad: 0.03674868
iter: 20 f-value: 7.5513 pgrad: 0.02421642
iter: 30 f-value: 7.551299 pgrad: 1.865619e-05

m
$par
          [,1]        [,2]     [,3]       [,4]
[1,] 0.4132586 0.006864837 0.522723 0.01279469
$value
[1] 7.551299
$gradient
[1] 3.019807e-08
$fn.reduction
[1] 0.04595781
$iter
[1] 34
$feval
[1] 44
$convergence
[1] 0
$message
[1] "Successful convergence"

It is also interesting that `spg' converges well from a random, infeasible starting point.

set.seed(123)
m <- spg(runif(4), fold.err, xx=h$x, yy=h$y, lower=lo, upper=up)
iter: 0 f-value: 102.6793 pgrad: 0.05
iter: 10 f-value: 7.552826 pgrad: 0.01328252
iter: 20 f-value: 7.551299 pgrad: 0.03674152
iter: 30 f-value: 7.551299 pgrad: 0.003764237

Hope this helps,
Ravi.
---
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvarad...@jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mario Valle
Sent: Tuesday, January 12, 2010 12:45 PM
To: R-help@r-project.org
Subject: [R] optim: abnormal termination in lnsrch (resend)

Attached a script that reproduces the problem. My function is fold.val() and at the end seems the curve contained in lnsrch.dat is fitted quite well, but optim generates the error.
Thanks again!
mario

I'm using optim(param, fun, method='L-BFGS-B', lower=lo, upper=up) to minimize a certain function. Often the minimization ends with the message:
ERROR: ABNORMAL_TERMINATION_IN_LNSRCH
What is optim() trying to say? What have I to change in my function to make the minimization succeed? Do you think using BBoptim() instead of optim() changes anything?
Thanks for your help!
mario
--
Ing. Mario Valle
Data Analysis and Visualization Group | http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS) | Tel: +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82
Re: [R] Conditional Sampling
The last 2 lines of your code can be replaced with:

M <- replicate(1000, sample(Y,5,replace=FALSE) )

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ted Harding
Sent: Tuesday, January 12, 2010 7:34 AM
To: r-help@r-project.org
Cc: ehcpieterse
Subject: Re: [R] Conditional Sampling

> On 12-Jan-10 14:00:24, ehcpieterse wrote:
>> Thanks Ted, your solution does make perfect sense. The only question
>> I still have is that I would like to sample the remaining 5
>> observations after I have randomly selected the first 10. Given the
>> initial 10, I would like to sample the following 5 say 1,000 times to
>> get a simulated conditional sample, if that makes any sense. I want to
>> build this into an iterative process to see how the first sample
>> affects the resulting samples. Even though all the observations have
>> the same probability to get sampled, they each have a different
>> expected value.
>
> OK, if I now understand you, you are interested in the properties of
> the remaining (90) observations, given that they do not include any of
> the (10) cases sampled in the first round. In that case, I think you
> should adopt the sample.int() approach I also suggested:
>
> X <- (1:100) ## (or any other 100 values)
> n <- sample.int(100,10,replace=FALSE) ## returns subset of (1:100)
> x <- X[n]
> Y <- X[-n] ## The set remaining after the first 10 were taken
> ## Now you can sample repeatedly from Y until your eyes fall out.
> ## So build up a matrix of (say) 1000 samples from Y:
> M <- sample(Y,5,replace=FALSE)
> for(i in (2:1000)){ M <- rbind(M,sample(Y,5,replace=FALSE)) }
>
> The repeated samples M of 5 from Y of course imply replacing each
> sample of 5 back in Y, so they are available at each turn. You can
> not, of course, sample 1000*5 from 100 without replacement! (Each
> sample of 5 is obtained without replacement, however).
>
> I hope this is getting close!
> Ted.
>
> E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
> Fax-to-email: +44 (0)870 094 0861
> Date: 12-Jan-10 Time: 14:34:13
> -- XFMail --
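
One practical detail when swapping the rbind() loop for replicate(): replicate() stores each sample as a column, so the result is 5 x 1000 rather than the 1000 x 5 matrix the loop builds. A small sketch (the seed is an arbitrary choice for reproducibility) showing the shapes and the transpose that restores the one-sample-per-row layout:

```r
set.seed(1)                       # arbitrary seed, only for reproducibility

X <- 1:100                        # or any other 100 values
n <- sample.int(100, 10)          # indices of the first 10 drawn
Y <- X[-n]                        # the 90 values that remain

# Each call to sample() becomes one COLUMN of the result.
M_cols <- replicate(1000, sample(Y, 5, replace = FALSE))
print(dim(M_cols))

# Transpose to get one sample per ROW, as in the rbind() loop.
M_rows <- t(M_cols)
print(dim(M_rows))
```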
[R] post-hoc after ancova
I have done an ANCOVA with categorical and continuous predictor variables. The categorical predictor variable shows a significant effect on the dependent variable. I would like to do a post-hoc test to see which groups in the categorical variable differ. I have explored the Tukey test in the multcomp package. My study is similar to the litter data. In the code it's mentioned that the contrast matrix also has some trends like otrend, atrend and ltrend:

otrend = c(-1.5, -0.5, 0.5, 1.5),
atrend = doselev - mean(doselev),
ltrend = log(1:4) - mean(log(1:4)))

Here are my questions:
Are these trends absolutely essential for conducting the Tukey test?
If yes, how can I set these trends?

thanks,
mahua
Re: [R] optim: abnormal termination in lnsrch (resend)
Mario Valle wrote:
> [sorry, forgot some details...]
> I'm using optim(param, fun, method='L-BFGS-B', lower=lo, upper=up) to
> minimize a certain function. Often the minimization ends with the
> message: ERROR: ABNORMAL_TERMINATION_IN_LNSRCH
> What is optim() trying to say? What have I to change in my function to
> make the minimization succeed? Do you think using BBoptim() instead of
> optim() changes anything?

We need more information. You can also try the additional argument to optim for tracing the optimization. Use

optim(param, fun, method='L-BFGS-B', lower=lo, upper=up, control=list(trace=6))

to get more information.

Berend
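
To show the tracing suggestion in a form that runs without the attached data, here is a sketch on a toy objective. The quadratic `toy_fn`, the bounds and the starting point are stand-ins chosen for illustration, not Mario's fold.err; the point is only where the trace setting goes and which fields of the result to inspect afterwards.

```r
# Toy objective (an assumption for illustration): minimum at (0.3, 0.7).
toy_fn <- function(p) sum((p - c(0.3, 0.7))^2)

m <- optim(c(0.5, 0.5), toy_fn,
           method = "L-BFGS-B",
           lower = c(0, 0), upper = c(1, 1),
           control = list(trace = 6))   # most verbose tracing level for L-BFGS-B

# A convergence code of 0 means the optimizer reported success; the
# message field carries the text returned by the L-BFGS-B routine.
print(m$convergence)
print(m$message)
print(m$par)
```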
[R] Making routine faster by using apply instead of for-loop
Hey everybody,

I have a small problem with a routine which prepares some data for plotting. I've made a small example:

c=10
mat=data.frame(matrix(1:(c*c),c,c))
row.names(mat)=seq(c,1,length=c)
names(mat)=c(seq(2,c,length=c/2),seq(c,2,length=c/2))
v=as.numeric(row.names(mat))
w=as.numeric(names(mat))
for(i in 1:c) { for(j in 1:c) { if(v[j]+w[i]=c)(mat[i,j]=NA) }}

This produces exactly the data I need to go on, but if I increase the constant c to, for instance, 500, it takes a very long time to set the NAs. I've heard there is a much faster way to set the NAs using the command apply(), but I don't know how. I'm looking forward to any ideas or hints that might help me.

Best regards
Etienne
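
One vectorized alternative to the double loop, sketched with outer() rather than apply(). Two assumptions are made here: the comparison operator in the posted loop did not survive in the archive, so `<=` is used purely for illustration, and a plain matrix replaces the data frame to keep the example short. The constant is called `n` to avoid masking the function c().

```r
n <- 10
mat <- matrix(1:(n * n), n, n)
v <- seq(n, 1, length.out = n)                       # the row labels
w <- c(seq(2, n, length.out = n / 2),
       seq(n, 2, length.out = n / 2))                # the column labels

# Reference: the double loop from the post (comparison `<=` is assumed).
mat_loop <- mat
for (i in 1:n) {
  for (j in 1:n) {
    if (v[j] + w[i] <= n) mat_loop[i, j] <- NA
  }
}

# Vectorized: outer(w, v, "+")[i, j] equals w[i] + v[j], so a single
# logical matrix selects every cell that the loop would set to NA.
mat_vec <- mat
mat_vec[outer(w, v, "+") <= n] <- NA

print(identical(mat_loop, mat_vec))
```

The same logical-matrix assignment should also work on a data frame, and it avoids the per-cell overhead that makes the loop slow for n = 500.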
Re: [R] how to handle missing values . when importing data in R
What is the structure of the data that you are reading in? Are you using 'read.table', 'scan', etc.? Are all the columns numeric, or do you just want to change some of them? If you have used 'na.strings' to cause the values of the missing data to be set to NA, then you can iterate through the appropriate columns to change NAs to zero, but this depends on your structure. For example if you know the names of the columns, you could do: for (i in c('col1', 'col3', 'col8')) df[[i]][is.na(df[[i]])] - 0 On Tue, Jan 12, 2010 at 1:06 PM, karena dr.jz...@gmail.com wrote: Hi, tim, thank you very much for the reply, but I am really a new user. How to change all NAs to zero? thanks again. karena jholtman wrote: ?read.table na.strings='.' Then change all NAs to zero df$col[is.na(df$col)] - 0 On Tue, Jan 12, 2010 at 12:46 PM, karena dr.jz...@gmail.com wrote: hi, I have a question about importing data in R. I want to import a file which has missing value in it, and the missing values are denoted as ., I want to first read in the file, and then change the . into the number zero 0. how can I do that? thank you, karena -- View this message in context: http://n4.nabble.com/how-to-handle-missing-values-when-importing-data-in-R-tp1012298p1012298.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html http://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? 
Re: [R] expand.grid game
This also has a closed form solution:

    choose(16+8-1,7) - choose(7+8-1, 7) - 7*choose(6+8-1,7)
    [1] 229713

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Brian Diggs
Sent: Thursday, December 31, 2009 3:08 PM
To: baptiste auguie; David Winsemius
Cc: r-help
Subject: Re: [R] expand.grid game

baptiste auguie wrote:
> 2009/12/19 David Winsemius dwinsem...@comcast.net:
>> On Dec 19, 2009, at 9:06 AM, baptiste auguie wrote:
>>> Dear list, In a little numbers game, I've hit a performance snag and I'm not sure how to code this in C. The game is the following: how many 8-digit numbers have the sum of their digits equal to 17?
>> And are you considering the number 0089 to be in the acceptable set? Or is the range of possible numbers in 1079:9800 ?
> The latter, the first digit should not be 0. But if you have an interesting solution for the other case, let me know anyway. I should also stress that this is only for entertainment and curiosity's sake.
> baptiste

I realize I'm late coming to this, but I was reading it in my post-vacation catch-up and it sounded interesting so I thought I'd give it a shot. After coding a couple of solutions that were exponential in time (for the number of digits), I rearranged things and came up with something that is linear in time (for the number of digits) and gives the count of numbers for all sums at once:

    library(plyr)
    nsum3 <- function(digits) {
      digits <- as.integer(digits)[[1L]]
      if (digits==1) {
        rep(1,9)
      } else {
        dm1 <- nsum3(digits-1)
        Reduce("+", llply(0:9, function(x) {c(rep(0,x),dm1,rep(0,9-x))}))
      }
    }
    nsums <- llply(1:8, nsum3)
    nsums[[5]][17] # [1] 3675
    nsums[[8]][17] # [1] 229713

The whole thing runs in well under a second on my machine (a several years old dual core Windows machine). In the results of nsum3, the i-th element is the number of numbers whose digits sum to i.

The basic idea is recursion on the number of digits; if n_{t,d} is the number of d-digit numbers that sum to t, then n_{t,d} = \sum_{i=0}^{9} n_{t-i,d-1}. (Adding the digit i to each of those numbers makes their sum t and increases the digits to d.) When digits==1, then 0 isn't a valid choice and that also implies the sum of digits can't be 0, which fits well with the 1 indexing of arrays.

-- Brian Diggs, Ph.D. Senior Research Associate, Department of Surgery, Oregon Health Science University
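The recursion described above can also be written as a plain loop in base R, without plyr. This is a sketch rather than the poster's code; the function name digit_sum_counts is made up for illustration.

```r
# Counts of d-digit numbers (leading digit 1-9) by digit sum.
# Element s of the result is the number of d-digit numbers whose digits sum to s.
digit_sum_counts <- function(d) {
  counts <- rep(1, 9)                  # one-digit numbers 1..9 have sums 1..9
  for (step in seq_len(d - 1)) {
    new_counts <- numeric(length(counts) + 9)
    for (x in 0:9) {                   # appending digit x shifts each sum up by x
      idx <- seq_along(counts) + x
      new_counts[idx] <- new_counts[idx] + counts
    }
    counts <- new_counts
  }
  counts
}

digit_sum_counts(8)[17]   # 229713, the value reported in the thread
```

Each pass of the outer loop adds one digit, so the work grows linearly with the number of digits, which is the property the message above points out.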
Re: [R] The TeX-source for the package manual.
On 13/01/2010, at 3:40 AM, BXC (Bendix Carstensen) wrote:
> I have noted that the later versions of Rcmd check clean out the directory pkg.Rcheck so that only package-manual.log and package-manual.pdf are left. Formerly the package-manual.tex was around too --- very handy for various purposes. Is there a way to generate the .tex version of the manual for a package?

On unix-alike systems one can do

    R CMD Rd2dvi --no-clean <package name>

and then look in a (hidden) directory ".Rd2dvinnn" where ``nnn'' represents a 3 digit number. You get a message saying

    You may want to clean up by 'rm -rf .Rd2dvinnn'

which tells you the value of ``nnn''. The tex file you want is called Rd2.tex.

There is probably a similar incantation that works under Windoze.

cheers,
Rolf Turner
Re: [R] expand.grid game
Nice --- am I missing something or was this closed form solution not entirely trivial to find? I ought to compile the various clever solutions given in this thread someday, it's fascinating!

Thanks,
baptiste

2010/1/12 Greg Snow greg.s...@imail.org:
> This also has a closed form solution:
>
>     choose(16+8-1,7) - choose(7+8-1, 7) - 7*choose(6+8-1,7)
>     [1] 229713
[R] Drop last numeral
Hello all,

Frustrated, and I know you can help. I need to drop the last numeral of each of the values in my data set. I have tried ?substring, but since I have to specify the length and my data are of varying lengths, it doesn't work so well:

    Data <- c(1131, 1132, 1731, 1732, 1821, 1822, 2221, 2222, 2241, 2242, 414342, 414371, 414372)
    Bldgid <- substring(as.character(Data), 1, 3)

returns:

    113 113 173 173 182 182 222 222 224 224 414 414 414

but I want

    113 113 173 173 182 182 222 222 224 224 41434 41437 41437

The values that have more than 4 numerals are what's messing things up. I tried ?formatC as well but couldn't get it to coerce things correctly.

Thanks for the help
JR

-- View this message in context: http://n4.nabble.com/Drop-last-numeral-tp1012347p1012347.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Drop last numeral
    Data <- c(1131, 1132, 1731, 1732, 1821, 1822, 2221, 2222, 2241, 2242, 414342, 414371, 414372)
    Bldgid <- substring(as.character(Data), 1, nchar(Data) - 1)

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of LCOG1
Sent: Tuesday, January 12, 2010 1:37 PM
To: r-help@r-project.org
Subject: [R] Drop last numeral

> I need to drop the last numeral of each of my values in my data set. [...]
Re: [R] Non-metric multidimensional scaling (NMDS) help
On Tue, 2010-01-12 at 10:28 -0800, kellys17 wrote:
> Hi, I am currently working on some data and feel that NMDS would return an excellent result. With my current data set however I have been experiencing some problems and cannot carry out metaMDS. I have tried with a few smaller data sets which I created for practice sake and this has worked fine.

What were the errors/warnings you received that led you to this conclusion? Please read the posting guide: http://www.R-project.org/posting-guide.html before replying to the list, as from the above there is almost no way that we can help you.

> I think it is the set up of my data set that is causing me trouble. I have 18 columns and 18 rows, as needed for the n x n matrix. However, within the data set I have a lot of zeros, i.e. more than just the zeros where column B meets row B. Do I need to get rid of these excess zeros in order for metaMDS to work?

You can provide a dissimilarity matrix *or* a community matrix to metaMDS. If the latter, it will compute the dissimilarity for you (via metaMDSdist()) and you can avail yourself of the argument 'zerodist' to that function --- see ?metaMDS for the arguments.

From the above, I deduce that the error from metaMDS is along the lines of:

    ...zero or negative distance between objects X and Y...

which is because isoMDS can't work with samples that have zero dissimilarity to one another. Indeed, by definition they should be placed in the same location, so why ordinate them at all? Just use one of the samples.

Is your matrix your own dissimilarity matrix? If so, is there some reason you can't use the ones provided in vegan or metaMDS? If there is a good reason, and you want to include all samples, then you'll need to come up with a means for handling them. metaMDSdist allows you to add a small value to the zero dissimilarities. The details are in the code, but effectively all zero distances are replaced by half the smallest non-zero distance.

You could do a similar replacement yourself if you feel this is warranted and/or justified.

    minDij <- min(Dij[Dij > 0]) / 2
    Dij[Dij <= 0] <- minDij

will do this replacement if Dij is your matrix (replace Dij with whatever the name of your matrix is). Then supply the new matrix to metaMDS.

For most applications I have needed nMDS for, I would delete the samples with duplicated species composition rather than add ad hoc amounts to samples just to get the software to produce a result.

HTH
G

> Any help is much appreciated,
> Seán Kelly.

--
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
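To make the replacement concrete, here is a small sketch on a made-up 3 x 3 dissimilarity matrix. The matrix values and the name D are assumptions for illustration. Unlike the two-line version in the message, this variant deliberately leaves the diagonal (each sample's distance to itself) at zero, which is a design choice of the sketch and not something the thread prescribes.

```r
# Hypothetical symmetric dissimilarity matrix; samples 1 and 3 are identical,
# so their off-diagonal dissimilarity is 0.
D <- matrix(c(0,   0.4, 0,
              0.4, 0,   0.2,
              0,   0.2, 0), nrow = 3, byrow = TRUE)

# Only off-diagonal cells are candidates for replacement.
off_diag <- row(D) != col(D)

# Half of the smallest non-zero dissimilarity.
min_half <- min(D[off_diag & D > 0]) / 2

# Replace zero (or negative) off-diagonal dissimilarities.
D[off_diag & D <= 0] <- min_half
D
```

The adjusted matrix would then typically be converted with as.dist(D) before being handed to an ordination function.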
Re: [R] Drop last numeral
Also try

    sub('[0-9]$', '', Data)
     [1] "113"   "113"   "173"   "173"   "182"   "182"   "222"   "222"   "224"
    [10] "224"   "41434" "41437" "41437"

HTH,
Dennis

On Tue, Jan 12, 2010 at 10:36 AM, LCOG1 jr...@lcog.org wrote:
> I need to drop the last numeral of each of my values in my data set. [...]
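A short sketch comparing the regular-expression and substring answers on a subset of the posted values. Both coerce the numbers to character first, so the result is a character vector; the explicit as.character() and as.numeric() calls are additions for clarity, not part of the original replies.

```r
Data <- c(1131, 1132, 1731, 414342, 414371)

chars <- as.character(Data)

# Regular expression: remove one digit anchored at the end of the string.
by_regex <- sub("[0-9]$", "", chars)

# Substring: keep everything except the last character.
by_substr <- substr(chars, 1, nchar(chars) - 1)

identical(by_regex, by_substr)   # both approaches agree on these values

# Convert back if numbers are needed downstream.
Bldgid <- as.numeric(by_regex)
Bldgid
```

One caveat to keep in mind: very large or very round numbers may be converted to text in scientific notation, in which case neither string-based approach would behave as intended.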
Re: [R] how to handle missing values . when importing data in R
Thank you guys. All the columns of my data are numeric. I tried both methods, and they both work. I appreciate your help.
-k

-- View this message in context: http://n4.nabble.com/how-to-handle-missing-values-when-importing-data-in-R-tp1012298p1012397.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] R for windows 64 bit
Hi Alessia,

Note that, while your physical limit might be 6 GB, Windows memory management allows more memory than that to be allocated (aka Virtual Memory, or at least that's what they called it in XP). Windows swaps out memory from RAM to the hard disk and back when necessary (please excuse the explanation if you already know all this). For processing large vectors, this swapping might bring your system to a standstill. Regardless, the maximum memory for a Windows process is larger than the physical RAM you have available.

allie

On 1/12/2010 6:27 AM, alessia matano wrote:
> Fine, it worked. I will try it this way. Just one last question and I won't bother you further today. My machine right now has just 6 GB of RAM (it will be increased to 16 in a few days), and I see that with this experimental version memory.limit is 6135. What is the command to increase the memory usage up to the maximum I can (5 GB?). If I write memory.limit(5000) it still gives me the error:
>
>     don't be silly! Your machine has a 4Gb address limit
>
> which is quite odd. Many thanks
> Best
> A.

2010/1/12 alessia matano alexis@gmail.com:
> ok, perfect! I will try with it... many many thanks. Do you also have the quantreg package there, which actually has the same problem as SparseM (32-bit version)?
> best
> alessia

2010/1/12 Uwe Ligges lig...@statistik.tu-dortmund.de:
> On 12.01.2010 12:09, alessia matano wrote:
>> I am sorry, I know it is an experimental version, and I was misleading in calling it a new version. Therefore, I will wait until they are officially available, since it is just a few days.
> Or just use today my private repository I indicated in the other mail.
> Uwe Ligges
>> However, I also tried to go to the CRAN pages, download them and insert them into the library. For quantreg it worked; for SparseM it did not, probably because it's a win32 version, as you said.

2010/1/12 Prof Brian Ripley rip...@stats.ox.ac.uk:
> On Tue, 12 Jan 2010, alessia matano wrote:
>> Dear all, I just downloaded and set up this new version of R. I am now trying to download the packages I need, which are SparseM and quantreg. I downloaded the quantreg package and inserted it into the library folder, and it seems to work. However, when I try to do the same with SparseM I get the following error message:
>>
>>     Loading required package: SparseM
>>     Error in inDL(x, as.logical(local), as.logical(now), ...) :
>>       unable to load shared library 'C:/PROGRA~1/R/R-211~1.0DE/library/SparseM/libs/SparseM.dll':
>>       LoadLibrary failure: %1 non è un'applicazione di Win32 valida.
>>
>> [Italian for: %1 is not a valid Win32 application.] Any help for it?
> Please do refer to the posting referred to in that thread (and Henrique, please do not post just the URL without the explanations).
> https://stat.ethz.ch/pipermail/r-devel/2010-January/056301.html
> You cannot mix 32-bit Windows binary packages with this experimental port (it is not a 'new version'): you need to install from the package sources. If that is too difficult for you, please do not try to use unsupported experimental builds (and Uwe Ligges may have some binary packages available for test in a few days).
>> Thanks a lot
>> alessia
>
> -- Brian D. Ripley, rip...@stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595

2010/1/11 Henrique Dallazuanna www...@gmail.com:
> Try this version (beta of development version): http://www.stats.ox.ac.uk/pub/RWin/Win64/R-2.11.0dev-win64.exe
>
> -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O

On Mon, Jan 11, 2010 at 2:29 PM, alessia matano alexis@gmail.com wrote:
> Dear all, do you know if there is any particular version of R to use with 64-bit Windows, so as to increase the amount of memory it can use? How should I increase the memory, and more importantly set a higher maximum vector size? It still stops me, saying
>
>     Could not allocate vector of size 145
>
> thanks to all
> alessia
Re: [R] Drop last numeral
In addition to the substring and regular expression solutions, if you are certain that everything will be numeric (and integer as in your examples), then you could just convert to numeric, divide by 10, and then drop the decimal (floor or as.integer).

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of LCOG1
Sent: Tuesday, January 12, 2010 11:37 AM
To: r-help@r-project.org
Subject: [R] Drop last numeral

> I need to drop the last numeral of each of my values in my data set. [...]
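The arithmetic route suggested above can be sketched in one line with R's integer-division operator. This assumes the values are non-negative whole numbers, as in the posted data; the subset of values used here is only for illustration.

```r
Data <- c(1131, 1132, 1731, 414342, 414371)

# Integer division by 10 discards the last digit.
Bldgid <- Data %/% 10

# Equivalent formulation: divide, then drop the fractional part.
Bldgid_floor <- floor(Data / 10)

Bldgid
```

The result stays numeric, so no conversion to and from character is needed.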
Re: [R] Drop last numeral
Try this:

    substr(Data,1,nchar(Data)-1)

Steve

From: LCOG1 jr...@lcog.org
To: r-help@r-project.org
Date: 13/Jan/2010 9:15 a.m.
Subject: [R] Drop last numeral

> I need to drop the last numeral of each of my values in my data set. [...]
Re: [R] Making routine faster by using apply instead of for-loop
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Etienne Stockhausen
Sent: Tuesday, January 12, 2010 10:59 AM
To: r-help@r-project.org
Subject: [R] Making routine faster by using apply instead of for-loop

> [...] This produces exactly the data I need to go on, but if I increase the constant c, to for instance 500, it takes a very long time to set the NA's.

The first problem is that random (element-by-element) access to a data.frame is much slower than the equivalent access to a matrix. Rewriting your code a bit to use a matrix speeds up the c=500 case by a factor of 750.

    f0 <- function (c = 10) {
        mat = matrix(1:(c * c), c, c)
        rownames(mat) = seq(c, 1, length = c)
        colnames(mat) = c(seq(2, c, length = c/2), seq(c, 2, length = c/2))
        v = as.numeric(rownames(mat))
        w = as.numeric(colnames(mat))
        for (i in 1:c) {
            for (j in 1:c) {
                if (v[j] + w[i] <= c) {
                    mat[i, j] = NA
                }
            }
        }
        mat
    }

Rewriting that to insert the NA's in one operation speeds it up by another factor of 10 (in the c=500 case):

    f1 <- function (c = 10) {
        v <- seq(c, 1, length = c)
        w <- c(seq(2, c, length = c/2), seq(c, 2, length = c/2))
        mat <- matrix(1:(c * c), nrow = c, ncol = c, dimnames = list(v, w))
        mat[outer(w, v, `+`) <= c] <- NA
        mat
    }

If you really want a data.frame, pass the output of these functions into data.frame (with check.names=FALSE, since the column names are not considered legal on a data.frame: they contain duplicates and look numeric).

By the way, it is generally a bad idea to use apply() on a data.frame. It is meant for matrices.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
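As a quick check that the one-shot outer() assignment marks the same cells as the double loop, here is a small sketch for n = 10. It drops the row and column names to keep the comparison simple; the variable names mat_loop and mat_vec are made up for illustration.

```r
n <- 10
v <- n:1                                        # row labels in the original example
w <- c(seq(2, n, by = 2), seq(n, 2, by = -2))   # column labels in the original example

# Element-by-element version.
mat_loop <- matrix(1:(n * n), n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (v[j] + w[i] <= n) mat_loop[i, j] <- NA
  }
}

# Vectorized version: outer(w, v, "+")[i, j] equals w[i] + v[j],
# so the logical matrix selects exactly the same cells.
mat_vec <- matrix(1:(n * n), n, n)
mat_vec[outer(w, v, "+") <= n] <- NA

identical(mat_loop, mat_vec)
```

The vectorized form builds one logical matrix and performs a single assignment, which is where the additional speed-up reported above comes from.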
Re: [R] Making routine faster by using apply instead of for-loop
Your code is doing too many needless things. The following takes about one second on my slow Vista laptop.

    n <- 500
    mat <- matrix(1:(n*n), n)
    v <- n:1
    z <- 2*1:(n/2)
    w <- c(z, rev(z))
    for(i in seq_len(n)){
      for(j in seq_len(n)){
        if(v[j] + w[i] <= n)(mat[i,j] <- NA)
      }
    }
    rownames(mat) <- v
    colnames(mat) <- w
    str(mat)

You end up with a matrix, but if you really want a data.frame with duplicate names, that's easy to get. Do you actually want those row/col names or are they just used to identify the cells that get NA? Depending on what you really need, the following may be good enough; it takes about 0.1 seconds.

    n <- 500
    mat <- matrix(1:(n*n), n)
    for(i in 1:(n/2)){mat[i, -(1:(2*i))] <- mat[n+1-i, -(1:(2*i))] <- NA}

-Peter Ehlers

Etienne Stockhausen wrote:
> [...] This produces exactly the data I need to go on, but if I increase the constant c, to for instance 500, it takes a very long time to set the NA's. I've heard there is a much faster way to set the NA's using the command apply(), but I don't know how.

-- Peter Ehlers University of Calgary 403.202.3921
Re: [R] expand.grid game
How trivial it is is probably subjective; I don't think it is much above trivial. I would not have been surprised to see this question on an exam in my undergraduate (300 or junior level) probability course (the hard part was remembering the details from that class from over 20 years ago). My favorite test question of all time came from that course: You have a deck of poker cards with the 3's (and the jokers) removed, you deal yourself 5 cards at random, what is the probability of getting a straight (not including straight flushes)?

This problem is simpler. Just think of the 8 places in the number as urns, and the 17 1's as balls to be put into the urns. One ball has to go in the first urn, so you have 16 left; there are choose(16+8-1, 8-1) ways to distribute 16 indistinguishable balls among 8 distinguishable urns. But that includes some solutions with more than 9 balls in an urn, which violates the digits restriction, so subtract off the illegal counts. If we place 10 balls in the first urn, then we have 7 remaining balls to distribute among the 8 urns, or choose(7+8-1, 7) ways. If we place 1 ball in the first urn and 10 balls in one of the 7 other urns (hence the 7*), then there are choose(6+8-1, 7) ways to distribute the remaining 6 balls among the 8 urns.

Not too complicated once you remember (or look up) the formula for urns and balls.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: baptiste auguie [mailto:baptiste.aug...@googlemail.com]
Sent: Tuesday, January 12, 2010 12:20 PM
To: Greg Snow
Cc: r-help
Subject: Re: [R] expand.grid game

Nice --- am I missing something or was this closed form solution not entirely trivial to find? I ought to compile the various clever solutions given in this thread someday, it's fascinating!

Thanks,
baptiste

2010/1/12 Greg Snow greg.s...@imail.org:

This also has a closed form solution:

    choose(16+8-1, 7) - choose(7+8-1, 7) - 7*choose(6+8-1, 7)
    [1] 229713

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Brian Diggs
Sent: Thursday, December 31, 2009 3:08 PM
To: baptiste auguie; David Winsemius
Cc: r-help
Subject: Re: [R] expand.grid game

baptiste auguie wrote:
2009/12/19 David Winsemius dwinsem...@comcast.net:
On Dec 19, 2009, at 9:06 AM, baptiste auguie wrote:

Dear list, In a little numbers game, I've hit a performance snag and I'm not sure how to code this in C. The game is the following: how many 8-digit numbers have the sum of their digits equal to 17?

And are you considering the number 0089 to be in the acceptable set? Or is the range of possible numbers in 1079:9800 ?

The latter, the first digit should not be 0. But if you have an interesting solution for the other case, let me know anyway. I should also stress that this is only for entertainment and curiosity's sake.

baptiste

I realize I'm late coming to this, but I was reading it in my post-vacation catch-up and it sounded interesting so I thought I'd give it a shot. After coding a couple of solutions that were exponential in time (for the number of digits), I rearranged things and came up with something that is linear in time (for the number of digits) and gives the count of numbers for all sums at once:

    library(plyr)
    nsum3 <- function(digits) {
      digits <- as.integer(digits)[[1L]]
      if (digits == 1) {
        rep(1, 9)
      } else {
        dm1 <- nsum3(digits - 1)
        Reduce("+", llply(0:9, function(x) { c(rep(0, x), dm1, rep(0, 9 - x)) }))
      }
    }
    nsums <- llply(1:8, nsum3)
    nsums[[5]][17] # [1] 3675
    nsums[[8]][17] # [1] 229713

The whole thing runs in well under a second on my machine (a several years old dual core Windows machine). In the results of nsum3, the i-th element is the number of numbers whose digits sum to i. The basic idea is recursion on the number of digits; if n_{t,d} is the number of d-digit numbers that sum to t, then n_{t,d} = \sum_{i=0}^{9} n_{t-i,d-1}. (Adding the digit i to each of those numbers makes their sum t and increases the digits to d.) When digits==1, then 0 isn't a valid choice and that also implies the sum of digits can't be 0, which fits well with the 1-based indexing of arrays.

--
Brian Diggs, Ph.D.
Senior Research Associate, Department of Surgery, Oregon Health Science University
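As a cross-check on the two approaches in this message, here is a minimal base-R sketch (no plyr) that runs the same digit-by-digit recursion iteratively and compares it with the closed form. The function name `count_digit_sums` is made up for this illustration; it is not from the thread.

```r
# Count d-digit numbers (no leading zero) by digit sum, using the recursion
# n_{t,d} = sum_{i=0}^{9} n_{t-i,d-1}. Names here are illustrative only.
count_digit_sums <- function(digits) {
  counts <- rep(1, 9)                      # 1-digit numbers: sums 1..9
  for (d in seq_len(digits - 1)) {
    # Appending the digit i shifts every existing sum up by i.
    shifted <- lapply(0:9, function(i) c(rep(0, i), counts, rep(0, 9 - i)))
    counts <- Reduce(`+`, shifted)
  }
  counts                                   # counts[t] = numbers with digit sum t
}

dp_count    <- count_digit_sums(8)[17]
closed_form <- choose(16 + 8 - 1, 7) - choose(7 + 8 - 1, 7) - 7 * choose(6 + 8 - 1, 7)
c(dp_count, closed_form)                   # both 229713
```

The loop replaces the recursion only to keep the sketch short; the amount of work is still linear in the number of digits.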
Re: [R] LD50 and SE in GLMM (lmer)
Thank you very much for the code Bill! I only had to change a very few things to make it work for probit and lmer (instead of glmmPQL) and it works perfectly! Here's my code (I had some trouble with the output style glm.dose, so I just have it come out as an ugly list now, which isn't a problem since I only have to put it into a table).

    model4 <- lmer(y ~ time + (1|blc/instar), REML=FALSE, family=binomial(link="probit"))
    summary(model4)

    dose.p.glmm <- function(model4, cf = 1:2, p = 0.5) {
      eta <- probit(p)   # probit() is not in base R; qnorm(p) gives the same value
      b <- fixef(model4)[cf]
      k <- (eta - b[1])/b[2]
      names(k) <- paste("p = ", format(p), ":", sep = "")
      pd <- -cbind(1, k)/b[2]
      SE <- sqrt(((pd %*% vcov(model4)[cf, cf]) * pd) %*% c(1, 1))
      list(k, SE)
    }
    dose.p.glmm(model4, cf=1:2, p=0.5)

With this, I don't even need the pnorm-probit transformation. Thank you very much!

Linda

Date: Mon, 11 Jan 2010 10:21:50 -0500
Subject: Re: [R] LD50 and SE in GLMM (lmer)
From: billpikou...@gmail.com
To: patili_bue...@hotmail.com
CC: r-help@r-project.org

Sorry for the delay in response. I had a somewhat similar need recently, with the difference that I used a logit link for a bioassay. The design had different dose-response replicates that I modeled as blocks. It looks like you are concentrating on estimation of fixed effects and thus the population / marginal LD50 estimate. If so, then there is a function called dose.p in the MASS package, courtesy of Venables and Ripley, which is used in the context of an example on pages 190-194 of the 4th edition of their book (2002), that I think would be very helpful to study. The example code can also be found in the ch07.R file in the scripts sub-directory/folder of the MASS package directory/folder. The example illustrates the use of GLM with a logit link.
To adapt it for use with a GLMM, I came up with the following, which is nearly identical to how dose.p is defined in R 2.10.0:

    dose.p.glmm <- function(obj, cf = 1:2, p = 0.5) {
      eta <- obj$family$linkfun(p)
      b <- fixef(obj)[cf]
      x.p <- (eta - b[1L])/b[2L]
      names(x.p) <- paste("p = ", format(p), ":", sep = "")
      pd <- -cbind(1, x.p)/b[2L]
      SE <- sqrt(((pd %*% vcov(obj)[cf, cf]) * pd) %*% c(1, 1))
      res <- structure(x.p, SE = SE, p = p)
      class(res) <- "glm.dose"
      res
    }

Essentially only the fixef() call in the 2nd line of the body was needed to replace the coef() call. Please also note that I used this for a glmmPQL() call from the MASS package, not lmer().

And one more question: is it correct to use pnorm (where John Maindonald used exp(hat)/(1+exp(hat)))?

Unfortunately I don't know offhand, and do not have a reference handy to check to be sure, so perhaps you can find a local statistician to help? I myself always have a preference to use the logit / logistic over probit, as they are both symmetric around 0.5 and are often reported to provide similar results.

Hope that helps,
Bill

###
Bill Pikounis
Statistician

2010/1/7 Linda Bürgi patili_bue...@hotmail.com:

Hi All! I desperately need some help figuring out how to calculate the LD50 with a GLMM (probit link) or, more importantly, the standard error of the LD50. I conducted a cold temperature experiment and am trying to assess after how long 50% of the insects had died. I had 3 different instars (non-significant fixed effect) and several different blocks (I did 4 replicates at a time), which are the random effect.
Since there is no predict function for lmer, I used the following to get predicted values (thanks to a post by John Maindonald; I'll attach his post below):

    model4 <- lmer(y ~ time + (1|blc/instar), family=binomial(link="probit"))
    summary(model4)
    b <- fixef(model4)
    X <- (model.matrix(terms(model4), zerotest))
    hat <- X %*% b
    pval <- pnorm(hat)  # probit link; for logit it would be: pval <- exp(hat)/(1+exp(hat))
    pval

Once I get the pval, I see where the 0.5 predicted value lies and I adjust the x's in zerotest to be more detailed in that range, e.g. x: 1-420 hours, I see that 0.5 is in the 320 hours area, so I adjust x to be 320.1, 320.2, 320.3, etc. to get the precise 0.500. Very clumsy, but I guess it's correct?

Now my biggest problem: how do I get the SE? John Maindonald goes on to do this:

    U <- chol(as.matrix(summary(model4)@vcov))
    se <- sqrt(apply(X %*% t(U), 1, function(x) sum(x^2)))
    list(hat=hat, se=se, x=X[,xcol])

Unfortunately, I could not figure out what the chol(as.matrix...) part is about (chol does what?) and therefore I have no idea how to use this code to get my LD50 SE (I would need the SE to be expressed in terms of x). Could anybody help me with this?

And one more question: is it correct to use pnorm (where John Maindonald used exp(hat)/(1+exp(hat)))?

Thanks so much in advance!
Linda

Previous post by John Maindonald:

    ciplot <- function(obj=model4,
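On the two computational questions in this thread (what `chol()` contributes to the standard errors, and how the LD50 standard error in `dose.p.glmm` arises), the following base-R sketch may help. The coefficient vector `b`, covariance matrix `V` and design matrix `X` are made-up numbers chosen for illustration; they are not estimates from Linda's model.

```r
# Made-up fixed-effect estimates and covariance matrix, for illustration only.
b <- c(-3.2, 0.01)                                   # intercept, slope for time
V <- matrix(c(0.25, -0.0007, -0.0007, 0.000004), 2, 2)
X <- cbind(1, c(100, 200, 320, 400))                 # design matrix at chosen times

# chol() returns an upper-triangular U with t(U) %*% U == V, so the row sums
# of squares of X %*% t(U) equal diag(X %*% V %*% t(X)): the variances of the
# linear predictor at each row of X.
U <- chol(V)
se_chol   <- sqrt(rowSums((X %*% t(U))^2))
se_direct <- sqrt(diag(X %*% V %*% t(X)))

# LD50 on the probit scale and its delta-method SE (same algebra as dose.p).
p    <- 0.5
ld50 <- (qnorm(p) - b[1]) / b[2]
pd   <- -cbind(1, ld50) / b[2]                       # gradient of ld50 w.r.t. b
se_ld50 <- sqrt(((pd %*% V) * pd) %*% c(1, 1))       # sqrt(pd %*% V %*% t(pd))
```

In this sketch `se_chol` is on the scale of the linear predictor, while `se_ld50` is on the scale of x (time), which is the quantity asked for above.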
[R] [Solved][Code Snippets] Dropping Empty Regressors
To make a long story short, I was doing some in-sample testing in which some dynamically created regressors would end up either all true or all false over the validation portion; in my case, a new mainframe configuration (this is a crappy way to handle a level shift, but I do what I can). So here is the code snippet that finally let me pre-check my regressors and drop any of them that were all true or all false.

First, the automagic STL outlier grabber that caused part of the problem:

    # tsSource being my time series source.
    # sh2 is a table of all my regressors that have been previously pulled in;
    # this has historic and future values in it also, it gets sliced later.
    # EOM is the regressor holding weeks that contain an 'End of Month'.
    #
    # This appends the found IOs to the regressor table. Stepwise tends to
    # remove them later on. I needed a programmatic way of removing useless
    # regressors for model verification since I would not know their names
    # if any are found.
    tsSourceDiag <- stl(tsSource, s.window="per", robust=TRUE)
    #
    tsSourceIO <- which(tsSourceDiag$weights < 1e-8)
    #
    # This is how to append run-time regressors
    for(z in tsSourceIO) {
      tmpname <- paste("PreIO", z, sep="")
      # COPY EOM AS A TEMPLATE
      sh2[[tmpname]] <- sh2[["EOM"]]
      # SET IT ALL TO 0
      sh2[[tmpname]][] <- FALSE
      # SET the proper index to TRUE
      sh2[[tmpname]][z] <- TRUE
    }

So to get rid of them (those empty useless regressors) I cooked up this:

    ###
    # Prune Empty Regressors (all false or all true)
    # the newmcReg you see is a copy of the sh2 from earlier
    # newmcReg = New Model Current Regressors
    # sh2 later became cReg.
    #
    # Yes it makes my eyes bleed. In short we count all the trues
    # and all the falses, and if they happen to be the same number
    # as the length we know they are all true or all false.
    #
    # The trick I finally found was that you could in fact -c()
    # a list (e.g. ask for everything but the following) but you
    # can't apparently do that inline, so we just make a list of
    # regressors that get shown the door, then after hunting
    # them down we give 'em the boot. This mess is solely
    # so my in-sample Arima doesn't choke on xreg=newmcReg
    # in which one of the newmcReg happens to be all true or false.
    #
    # God I wish I had taken more than a Trig course. Where was I?
    #
    # Yes, that phantom 'i' you see is because this is all in a big loop
    # for 6 possible models
    # lm1 = all regressors w/ intercept
    # lm2 = lm1 stepwise removal
    # lm3 = all regressors wo/ intercept
    # lm4 = lm3 stepwise removal
    # lm5 = Hand Tuned
    # lm6 = lm5 stepwise removal
    ###
    toPurge=c()
    for(k in names(newmcReg[[i]])) {
      print(paste("check to see if", k, "is a useless regressor for model", i))
      if(sum(newmcReg[[i]][k][,1])==length(newmcReg[[i]][k][,1])) {
        print(paste("All of", k, "are TRUE"))
        getLost=which(names(newmcReg[[i]])==k)
        toPurge=c(toPurge,getLost)
        print(paste(k, "has been added to the purge list for model", i, "!"))
      }
      if(sum(newmcReg[[i]][k][,1]==FALSE)==length(newmcReg[[i]][k][,1])) {
        print(paste("All of", k, "are FALSE"))
        getLost=which(names(newmcReg[[i]])==k)
        toPurge=c(toPurge,getLost)
        print(paste(k, "has been added to the purge list for model", i, "!"))
      }
    }
    toPurge
    # Do this only if there are any or R will beat you senseless and
    # steal all your M&Ms!
    if(length(toPurge)!=0) {
      names(newmcReg[[i]])
      names(newmcReg[[i]][-c(toPurge)])
      newmcReg[[i]] <- newmcReg[[i]][-c(toPurge)]
      newmfReg[[i]] <- newmfReg[[i]][-c(toPurge)]
      names(newmcReg[[i]])
    }
    ##
    # End Regressor Pruning
    ##

Big thanks for the help so far. Now about those darn transfer functions... hmm, and pulse detection...
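For comparison, the same pruning can be sketched more compactly with logical indexing. This is only an illustration under assumed data: the data frame `regs` and its column names are invented here, standing in for one element of `newmcReg`.

```r
# Hypothetical regressor table; column names are invented for this sketch.
regs <- data.frame(
  EOM     = c(TRUE, FALSE, FALSE, TRUE),
  PreIO3  = c(FALSE, FALSE, FALSE, FALSE),   # all FALSE: should be dropped
  NewConf = c(TRUE, TRUE, TRUE, TRUE),       # all TRUE: should be dropped
  Holiday = c(FALSE, TRUE, FALSE, FALSE)
)

# A column is uninformative when it takes a single distinct value.
is_constant <- vapply(regs, function(col) length(unique(col)) <= 1, logical(1))

# Logical indexing stays safe when nothing is constant, whereas negative
# indexing with an empty vector, x[-c()], selects zero columns.
pruned <- regs[, !is_constant, drop = FALSE]
names(pruned)   # "EOM" "Holiday"
```

Because the selection is a logical vector, no `if(length(toPurge) != 0)` guard is needed in this variant.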
Re: [R] expand.grid game
On 13/01/2010, at 9:19 AM, Greg Snow wrote:

How trivial is probably subjective, I don't think it is much above trivial. I would not have been surprised to see this question on an exam in my undergraduate (300 or junior level) probability course (the hard part was remembering the details from that class from over 20 years ago). My favorite test question of all time came from that course: You have a deck of poker cards with the 3's removed (and jokers), you deal yourself 5 cards at random, what is the probability of getting a straight (not including straight flushes)?

This problem is simpler. Just think of the 8 places in the number as urns, and the 17 1's as balls to be put into the urns. One ball has to go in the first urn, so you have 16 left, there are choose(16+8-1, 8-1) ways to distribute 16 undistinguishable balls among 8 distinguishable urns. But that includes some solutions with more than 9 balls in an urn which violates the digits restriction, so subtract off the illegal counts. If we place 10 balls in the first urn, then we have 7 remaining balls to distribute between the 8 urns or choose(7+8-1, 7). If we place 1 ball in the first urn and 10 balls in one of the 7 other urns (7*), then there are choose(6+8-1, 7) ways to distribute the remaining 6 balls in the 8 urns.

Not too complicated once you remember (or look up) the formula for urns and balls.

Sorry to be a thicko --- but doesn't the foregoing solution *leave in* the possibility of putting all 17 balls in the first urn? Or 3 balls in the first urn, 12 in the second, and the remaining 2 in any of the other six urns? Etc. I.e. don't more terms have to be subtracted?

cheers,
Rolf Turner
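One way to look at the question of extra terms is to write the inclusion-exclusion sum out in full and see which terms survive. With a digit sum of 17, two digits cannot both be 10 or more (that would need at least 20), so every term with two or more violations vanishes; and choose(7+8-1, 7) already counts every arrangement with at least 10 balls in the first urn, including all 17 there. The sketch below is illustrative only; the function name and arguments are not from the thread.

```r
# Inclusion-exclusion count of `digits`-digit numbers (first digit >= 1) with a
# given digit sum. Function and argument names are illustrative only.
count_by_inclusion_exclusion <- function(digits, total) {
  free <- total - 1                   # balls left after forcing one into the first urn
  result <- 0
  for (j in 0:1) {                    # j = 1: first digit forced to be >= 10
    for (k in 0:(digits - 1)) {       # k of the other digits forced to be >= 10
      remaining <- free - 9 * j - 10 * k
      if (remaining < 0) next         # impossible combination contributes nothing
      term <- choose(digits - 1, k) * choose(remaining + digits - 1, digits - 1)
      result <- result + (-1)^(j + k) * term
    }
  }
  result
}

count_by_inclusion_exclusion(8, 17)   # 229713: only three terms are non-zero
```

For larger digit sums the higher-order terms no longer vanish, which is why the loop keeps them in.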
Re: [R] Drop last numeral
The below worked best for my purposes. Thanks everyone.

    Data <- c(1131, 1132, 1731 ,1732 ,1821 ,1822, 2221 ,, 2241 ,2242,414342 ,414371 ,414372)
    substr(Data, 1, nchar(Data)-1)

LCOG1 wrote:

Hello all, Frustrated, and I know you can help. I need to drop the last numeral of each of the values in my data set. For the following I have tried ?substring, but since I have to specify the length and my data are of varying lengths, it doesn't work so well.

    Data <- c(1131, 1132, 1731 ,1732 ,1821 ,1822, 2221 ,, 2241 ,2242,414342 ,414371 ,414372)
    Bldgid <- substring(as.character(Data), 1, 3)

returns:

    113 113 173 173 182 182 222 222 224 224 414 414 414

but I want

    113, 113, 173, 173, 182, 182, 222, 222, 224, 224, 41434, 41437, 41437

The values that have more than 4 numerals are what's messing things up. I tried ?formatC as well but couldn't get it to coerce things correctly.

Thanks for the help
JR

--
View this message in context: http://n4.nabble.com/Drop-last-numeral-tp1012347p1012492.html
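A small sketch of the accepted `substr()` approach next to a purely numeric alternative, integer division by 10. The vector `ids` is a shortened, made-up subset of the values in the post, and the numeric variant assumes non-negative whole numbers.

```r
ids <- c(1131, 1132, 1731, 1732, 414342, 414371)

# Character approach from the thread: keep everything but the last character.
as_text <- substr(ids, 1, nchar(ids) - 1)

# Numeric alternative (assumes non-negative whole numbers): integer division.
as_number <- ids %/% 10

as_text     # "113" "113" "173" "173" "41434" "41437"
as_number   # 113 113 173 173 41434 41437
```

The numeric version returns numbers rather than strings and does not depend on how a value is converted to text; a value such as 100000, for example, converts to "1e+05" with as.character().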
Re: [R] some help regarding combining columns from different files
Hi Jim,

I want to merge two files into one file. Here is my code. The problem is that the second file ends up appended to the first when I write temp3 to the text file. I am not sure what mistake I am making. The test file to run the code is included below. Please help me with this!

    temp1 <- NULL
    temp2 <- NULL
    x.col.names <- c("genesymbol", "geneDescription", "orgSymbol", "orgName")
    y.col.names <- c("genesymbol", "geneDescription", "orgSymbol", "orgName")
    for (i in 1:length(list1.bp.files.names)) {
      temp1 <- read.table(list1.bp.files.names[i], sep="\t", header=T, stringsAsFactors=F, quote="\"")
      for (j in 1:length(list2.bp.files.names)) {
        temp2 <- read.table(list2.bp.files.names[j], sep="\t", header=T, stringsAsFactors=F, quote="\"")
        temp3 <- merge(temp1, temp2, by.x = x.col.names, by.y = y.col.names, all=T)
        myfile <- gsub("( )", "", paste("1_", merge.bp.files.names[i], ".txt"))
        write.table(temp3, file=myfile, sep="\t", quote=FALSE, row.names=F)
      }
    }

Thanks
--Hari--

genesymbol          geneDescription    orgSymbol    orgName
E2f5                e2f transcription factor 5    RG    Rattus norvegicus
Msh2                muts homolog 2 (e. coli)    RG    Rattus norvegicus
Kpna2               karyopherin (importin) alpha 2    RG    Rattus norvegicus
Gtpbp4              gtp binding protein 4    RG    Rattus norvegicus
Dtymk_predicted     deoxythymidylate kinase (predicted)    RG    Rattus norvegicus
Ruvbl1              ruvb-like protein 1    RG    Rattus norvegicus
Cetn2               centrin 2    RG    Rattus norvegicus
Foxm1               forkhead box m1    RG    Rattus norvegicus
Abtb1               ankyrin repeat and btb (poz) domain containing 1    RG    Rattus norvegicus
Myc                 myelocytomatosis viral oncogene homolog (avian)    RG    Rattus norvegicus
Il1b                interleukin 1 beta    RG    Rattus norvegicus
Cdc20               cell division cycle 20 homolog (s. cerevisiae)    RG    Rattus norvegicus
Cdc25a              cell division cycle 25 homolog a (s. cerevisiae)    RG    Rattus norvegicus
Kifc1               kinesin family member c1    RG    Rattus norvegicus
Fancd2              fanconi anemia d2 protein    RG    Rattus norvegicus
Rhob                rhob gene    RG    Rattus norvegicus
Clp1                cardiac lineage protein 1    RG    Rattus norvegicus
Psmd1               proteasome (prosome, macropain) 26s subunit, non-atpase, 1    RG    Rattus norvegicus
Mad2l1_predicted    mad2 (mitotic arrest deficient, homolog)-like 1 (yeast) (predicted)    RG    Rattus norvegicus
Dhcr24              24-dehydrocholesterol reductase    RG    Rattus norvegicus
Ahr                 aryl hydrocarbon receptor    RG    Rattus norvegicus
Rnd3                ras homolog gene family, member e    RG    Rattus norvegicus
Acvr1b              activin a receptor, type 1b    RG    Rattus norvegicus
Mcm2_predicted      minichromosome maintenance deficient 2 mitotin (s. cerevisiae) (predicted)    RG    Rattus norvegicus
Mapre3              microtubule-associated protein, rp/eb family, member 3    RG    Rattus norvegicus
Mapre1              microtubule-associated protein, rp/eb family, member 1    RG    Rattus norvegicus
Tardbp              tar dna binding protein    RG    Rattus norvegicus
Cdca3               cell division cycle associated 3    RG    Rattus norvegicus
Ccnb1               cyclin b1    RG    Rattus norvegicus
Npm1                nucleophosmin 1    RG    Rattus norvegicus
Pcaf                p300/cbp-associated factor    RG    Rattus norvegicus
Cdc2a               cell division cycle 2 homolog a (s. pombe)    RG    Rattus norvegicus
Dnajc2              dnaj (hsp40) homolog, subfamily c, member 2    RG    Rattus norvegicus
Dab2ip              disabled homolog 2 (drosophila) interacting protein    RG    Rattus norvegicus
Id2                 inhibitor of dna binding 2, dominant negative helix-loop-helix protein    RG    Rattus norvegicus
Kif23_predicted     kinesin family member 23 (predicted)    RG    Rattus norvegicus
Nek6                nima (never in mitosis gene a)-related expressed kinase 6    RG    Rattus norvegicus
Pola1               polymerase (dna directed), alpha 1    RG    Rattus norvegicus
Il1a                interleukin 1 alpha    RG    Rattus norvegicus
Ccnc                cyclin c    RG    Rattus norvegicus
Ccnb2               cyclin b2    RG    Rattus norvegicus
Pbef1               pre-b-cell colony enhancing factor 1    RG    Rattus norvegicus
Rad17               rad17 homolog (s. pombe)    RG    Rattus norvegicus
Racgap1_predicted   rac gtpase-activating protein 1 (predicted)    RG    Rattus norvegicus
Ccna2               cyclin a2    RG    Rattus norvegicus
Cdca8               cell division cycle associated 8    RG    Rattus norvegicus
Sesn1_predicted     sestrin 1 (predicted)    RG    Rattus norvegicus
Tpx2_predicted      tpx2, microtubule-associated protein homolog (xenopus laevis) (predicted)    RG    Rattus norvegicus
Dmtf1               cyclin d binding myb-like transcription factor 1    RG    Rattus norvegicus
Chek1               checkpoint kinase 1 homolog (s. pombe)    RG    Rattus norvegicus
Mlh1                mutl homolog 1 (e. coli)    RG    Rattus norvegicus
Cgref1              cell growth regulator with ef hand domain 1    RG    Rattus norvegicus
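A possible reading of the "appended" result, offered as a sketch rather than a diagnosis: when every column is a merge key and `all=TRUE` is set, `merge()` performs a full outer join, so the output is the union of the rows of both files, which looks like one file stacked on the other. Note also that `myfile` in the posted loop depends only on `i`, so each pass of the inner loop overwrites the previous output. The two small data frames below are invented stand-ins; only the gene rows themselves come from the listing.

```r
# Two tiny stand-in tables sharing the thread's four key columns; the split of
# genes between the two "files" is invented for this sketch.
keys <- c("genesymbol", "geneDescription", "orgSymbol", "orgName")
file1 <- data.frame(
  genesymbol      = c("E2f5", "Myc", "Ccnb1"),
  geneDescription = c("e2f transcription factor 5",
                      "myelocytomatosis viral oncogene homolog (avian)",
                      "cyclin b1"),
  orgSymbol = "RG", orgName = "Rattus norvegicus",
  stringsAsFactors = FALSE
)
file2 <- data.frame(
  genesymbol      = c("Myc", "Ccnb1", "Ccnb2"),
  geneDescription = c("myelocytomatosis viral oncogene homolog (avian)",
                      "cyclin b1",
                      "cyclin b2"),
  orgSymbol = "RG", orgName = "Rattus norvegicus",
  stringsAsFactors = FALSE
)

union_rows  <- merge(file1, file2, by = keys, all = TRUE)  # full outer join: 4 rows
common_rows <- merge(file1, file2, by = keys)              # inner join: 2 rows
```

If the goal is the genes common to both files, dropping `all=TRUE` gives the inner join; if it is a side-by-side table, each file needs at least one non-key column to carry across.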
[R] Strange behavior when trying to piggyback off of fitdistr
Hello. I am not certain even how to search the archives for this particular question, so if there is an obvious answer, please smack me with a large halibut and send me to the URLs.

I have been experimenting with fitting curves by using both maximum likelihood and maximum spacing estimation techniques. Originally, I wrote distribution-specific functions in R, which work rather well. As the procedure is identical for all distributions, other than the actual distribution function itself, I thought I would try to build a single function that accepts the distribution as an input and returns the results. Never having played with calls, formals, and arguments before, I figured I would rip off, cough, cough, be inspired by the venerable fitdistr in MASS, which accepts distribution inputs. After a few hours, I actually got it working decently (although unfinished). However, I am finding something very weird.

At its core, the technique requires the difference between the values of the cumulative distribution function at neighboring evaluations. I implement this by running p($DIST) on the vector of sorted losses (call the result SP), creating two new vectors, one c(0, SP) and one c(SP, 1), and then taking the latter minus the former. If there happen to be two instances of the same value, unless it is known rounding error, one substitutes the density at that point for the difference in the cumulative distribution (which would be 0, as the CDF of two identical values is the same). So, I run d($DIST), add a 0 in front to make it the same length, and return a new vector equal to pmax.int(DIFFERENCE, DENSITY), with the idea that the density is always non-negative and always less than the difference in cumulative distributions, so it will only be the max in the case of DIFFERENCE being truly 0. I then take the negative of the sum of the logs of the differences, and that is the function passed to optim.
What is weird is that when I leave out the density correction (which is safe 99.% of the time, as the chance of two identical losses is almost 0, assuming no clustering/capping), I get a very similar result to my distribution-customized function which calls the proper plnorm or pgenpareto directly. When I add in the correction, the value is orders of magnitude higher, which not only affects the fit (slightly) but also affects the goodness of fit statistics. I have no idea why this happens, although in theory, if the function is pulling too many density values, it would return a higher value, as the densities are much closer to 0 so the neg-log is a larger number.

In the code pasted below, if spacing returns -sum(log(SP2)), it works fine. If it returns -sum(log(SP3)), it gives strange results.

I do not have the S programming language book (perhaps I should invest in it) and the online help wasn't that helpful to me, so I would very much appreciate any responses y'all may have.

Thank you very much,
--Avi

    # Code (Unfinished)
    MSEFit <- function (x, distfun, start, ...)
    {
        require(MASS); require(actuar)
        Call <- match.call(expand.dots = TRUE)
        if (missing(start)) start <- NULL
        dots <- names(list(...))
        if (missing(x) || length(x) == 0L || mode(x) != "numeric")
            stop("'x' must be a non-empty numeric vector")
        if (any(!is.finite(x)))
            stop("'x' contains missing or infinite values")
        if (missing(distfun) || !(is.function(distfun) || is.character(distfun)))
            stop("'density' must be supplied as a function or name")
        n <- length(x)
        if (is.character(distfun)) {
            distname <- tolower(distfun)
            densfun <- switch(distname,
                exp = dexp, exponential = dexp, gamma = dgamma,
                `log-normal` = dlnorm, lnorm = dlnorm, lognormal = dlnorm,
                weibull = dweibull, pareto = dpareto, loglogistic = dllogis,
                transbeta = dtrbeta, `transformed beta` = dtrbeta,
                burr = dburr, paralogistic = dparalogis,
                genpareto = dgenpareto, generalizedpareto = dgenpareto,
                `generalized pareto` = dgenpareto,
                invburr = dinvburr, `inverse burr` = dinvburr,
                invpareto = dinvpareto, `inverse pareto` = dinvpareto,
                invparalogistic = dinvparalogis, `inverse paralogistic` = dinvparalogis,
                transgamma = dtrgamma, `transformed gamma` = dtrgamma,
                invexp = dinvexp, `inverse exponential` = dinvexp,
                invtransgamma = dinvtrgamma, `inverse transformed gamma` = dinvtrgamma,
                invgamma = dinvgamma, `inverse gamma` = dinvgamma,
                invweibull = dinvweibull, `inverse weibull` = dinvweibull,
                loggamma = dlgamma, genbeta = dgenbeta, `generalized beta` = dgenbeta,
                NULL)
            if (is.null(densfun)) stop("unsupported distribution")
            distfun <- switch(distname,
                exp = pexp, exponential = pexp, gamma = pgamma,
                `log-normal` = plnorm, lnorm = plnorm, lognormal = plnorm,
                weibull =
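The posted function is cut off before the spacing calculation, so the following is only a sketch of the tie handling described in the message, not a reconstruction of MSEFit. It uses a log-normal from base R, and the names `neg_log_spacing` and `tie_rule` are invented. One general point it illustrates: a density value is not bounded above by the neighbouring CDF difference, so `pmax()` can swap in densities even where there is no tie, whereas replacing only the exactly-zero spacings leaves untied data untouched.

```r
# Negative log of the spacings for a log-normal fit. Names are illustrative;
# this is a sketch of the tie handling, not the poster's MSEFit.
neg_log_spacing <- function(par, x, tie_rule = c("ties_only", "pmax", "none")) {
  tie_rule <- match.arg(tie_rule)
  x <- sort(x)
  cdf <- plnorm(x, meanlog = par[1], sdlog = par[2])
  spacing <- diff(c(0, cdf, 1))                     # n + 1 spacings
  dens <- c(dlnorm(x, meanlog = par[1], sdlog = par[2]), 0)
  if (tie_rule == "ties_only") {
    tied <- spacing == 0
    spacing[tied] <- dens[tied]                     # substitute only where x repeats
  } else if (tie_rule == "pmax") {
    spacing <- pmax(spacing, dens)                  # may replace untied spacings too
  }
  -sum(log(spacing))
}

x_no_ties <- c(0.4, 0.9, 1.3, 2.1, 3.8)             # made-up sample
par0 <- c(0, 1)
plain     <- neg_log_spacing(par0, x_no_ties, "none")
ties_only <- neg_log_spacing(par0, x_no_ties, "ties_only")
with_pmax <- neg_log_spacing(par0, x_no_ties, "pmax")
```

On this made-up sample `ties_only` equals `plain`, while `with_pmax` differs even though there are no ties. Whether that is what happens with the original data cannot be told from the post.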
Re: [R] trouble with installing SJava
Jiiindo wrote: Colleagues, How i can solve this error when i install SJava package A more recent version of SJava is available with source('http://bioconductor.org/biocLite.R') biocLite('SJava') rJava is an alternative. Martin Thanks R CMD INSTALL -c /usr/local/lib/R/SJava_0.69-0.tar.gz * installing to library ‘/usr/local/lib/R/site-library’ * installing *source* package ‘SJava’ ... checking for java... /usr/lib/jvm/java-6-sun/bin/java Java VM /usr/lib/jvm/java-6-sun/bin/java checking for javah... /usr/lib/jvm/java-6-sun/bin/javah Looking in /usr/lib/jvm/java-6-sun/include Looking in /usr/lib/jvm/java-6-sun/include/linux checking for g++... g++ checking for C++ compiler default output... a.out checking whether the C++ compiler works... yes checking whether we are cross compiling... no checking for suffix of executables... checking for suffix of object files... o checking whether we are using the GNU C++ compiler... yes checking whether g++ accepts -g... yes checking for gcc... gcc checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ANSI C... none needed checking for Rf_initEmbeddedR in -lR... no No R shared library found configure: creating ./config.status config.status: creating Makevars config.status: creating src/Makevars config.status: creating src/RSJava/Makefile config.status: creating Makefile_rules config.status: creating inst/scripts/RJava.bsh config.status: creating inst/scripts/RJava.csh config.status: creating R/zzz.R config.status: creating cleanup config.status: creating inst/scripts/RJava Copying the cleanup script to the scripts/ directory Building libRSNativeJava.so in /tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava if test ! 
-d /usr/local/lib/R/site-library/SJava/libs ; then \ mkdir /usr/local/lib/R/site-library/SJava/libs ; \ fi gcc -std=gnu99 -g -O2 -D_R_ -I/usr/local/lib/R/include -I/usr/local/lib/R/include/R_ext -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava -I. -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include -I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux -c CtoJava.c CtoJava.cweb:148: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'vm1_args' CtoJava.cweb:215: error: static declaration of 'std_env' follows non-static declaration CtoJava.cweb:195: error: previous declaration of 'std_env' was here CtoJava.cweb: In function 'create_Java_vm': CtoJava.cweb:256: error: 'vm1_args' undeclared (first use in this function) CtoJava.cweb:256: error: (Each undeclared identifier is reported only once CtoJava.cweb:256: error: for each function it appears in.) make: *** [CtoJava.o] Error 1 Generating JNI header files from Java classes. RForeignReference, RManualFunctionActionListener, ROmegahatInterpreter REvaluator * Warning: At present, to use the library you must set the LD_LIBRARY_PATH environment variable to /usr/local/lib/R/site-library/SJava/libs:/usr/lib/jvm/java-6-sun-1.6.0.14/jre/lib/i386/server:/usr/lib/jvm/java-6-sun-1.6.0.14/jre/lib/i386:/usr/lib/jvm/java-6-sun-1.6.0.14/jre/../lib/i386::/usr/java/packages/lib/i386:/lib:/usr/lib or use one of the RJava.bsh or RJava.csh scripts * ** libs gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include -I/usr/local/lib/R/include/R_ext -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava -I. 
-I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include -IRSJava -I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux -I/usr/local/include-fpic -g -O2 -c ConverterExamples.c -o ConverterExamples.o ConverterExamples.cweb: In function ‘RS_JAVA_setFunctionConverter’: ConverterExamples.cweb:213: warning: assignment discards qualifiers from pointer target type ConverterExamples.cweb: In function ‘RS_JAVA_toJavaFunctionConverter’: ConverterExamples.cweb:312: warning: passing argument 1 of ‘getOmegahatReferenceValue’ discards qualifiers from pointer target type gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include -I/usr/local/lib/R/include/R_ext -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava -I. -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include -IRSJava -I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux -I/usr/local/include-fpic -g -O2 -c Converters.c -o Converters.o Converters.cweb: In function ‘RS_JAVA_removeConverter’: Converters.cweb:399: warning: assignment discards qualifiers from pointer target type gcc -std=gnu99 -I/usr/local/lib/R/include -D_R_ -I/usr/local/lib/R/include -I/usr/local/lib/R/include/R_ext -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/src/RSJava -I. -I/tmp/RtmpdcTvTv/R.INSTALL327b23c6/SJava/inst/include -IRSJava -I/usr/lib/jvm/java-6-sun/include -I/usr/lib/jvm/java-6-sun/include/linux -I/usr/local/include
Re: [R] expand.grid game --- never mind, I figured it out!
I re-read the solution that you posted and realized where my thinking was going wrong. Sorry (again!) for being a thicko.

cheers,
Rolf Turner

On 13/01/2010, at 9:19 AM, Greg Snow wrote:

How trivial is probably subjective, I don't think it is much above trivial. I would not have been surprised to see this question on an exam in my undergraduate (300 or junior level) probability course (the hard part was remembering the details from that class from over 20 years ago). My favorite test question of all time came from that course: You have a deck of poker cards with the 3's removed (and jokers), you deal yourself 5 cards at random, what is the probability of getting a straight (not including straight flushes)?

This problem is simpler. Just think of the 8 places in the number as urns, and the 17 1's as balls to be put into the urns. One ball has to go in the first urn, so you have 16 left, there are choose(16+8-1, 8-1) ways to distribute 16 undistinguishable balls among 8 distinguishable urns. But that includes some solutions with more than 9 balls in an urn which violates the digits restriction, so subtract off the illegal counts. If we place 10 balls in the first urn, then we have 7 remaining balls to distribute between the 8 urns or choose(7+8-1, 7). If we place 1 ball in the first urn and 10 balls in one of the 7 other urns (7*), then there are choose(6+8-1, 7) ways to distribute the remaining 6 balls in the 8 urns.

Not too complicated once you remember (or look up) the formula for urns and balls.
[R] parsing protocol of states
Dear R-users,

I am currently trying to parse some state protocols for my work. In an easy setting the code below works fine, provided each state is reached only once. In harder settings it is possible that one state gets visited several times. In that case it is interesting for me to see how much waiting time lies between two states in total. By the way, I haven't used R as a parsing tool so far, so any advice on doing this more effectively is quite welcome.

    str01 <- "2007-10-12 11:50:05 state B.
    ,2007-10-12 11:50:05 state C.
    ,2007-10-12 13:23:24 state D.
    ,2007-10-12 13:23:43 state E.
    ,2007-10-14 15:43:19 state F.
    ,2007-10-14 15:43:20 state E.
    ,2007-10-14 15:43:25 state G.
    ,2007-10-14 15:43:32 state H.
    ,2007-10-14 15:43:41 state I.
    ,2007-10-14 15:43:47 state F.
    ,2007-10-14 15:43:47 state G.
    ,2007-10-14 15:48:08 state H.
    ,2007-10-16 10:10:20 state J.
    ,2007-10-19 11:12:54 state K
    ,2007-10-19 11:17:37 state D.
    ,2007-10-19 11:17:42 state E.
    ,2007-10-19 11:17:49 state F.
    ,2007-10-19 11:17:51 state E.
    ,2007-10-19 11:17:58 state H.
    ,2007-10-19 11:18:05 state J.
    ,2007-10-19 11:21:45 state L."

    # split the protocol into single entries
    str02 <- unlist(strsplit(str01, "\\,"))

    # positions of the entries for each state
    x1 <- grep("state B", str02)
    x2 <- grep("state C", str02)
    x3 <- grep("state D", str02)
    x4 <- grep("state E", str02)
    x5 <- grep("state F", str02)
    x6 <- grep("state G", str02)
    x7 <- grep("state H", str02)
    x8 <- grep("state I", str02)
    x9 <- grep("state J", str02)
    x10 <- grep("state K", str02)
    x11 <- grep("state L", str02)

    # time stamps (first 19 characters) for each state
    t1 <- substr(str02[x1], 1, 19)
    t1 <- as.POSIXct(strptime(t1, "%Y-%m-%d %H:%M:%S"))
    t2 <- substr(str02[x2], 1, 19)
    t2 <- as.POSIXct(strptime(t2, "%Y-%m-%d %H:%M:%S"))
    t3 <- substr(str02[x3], 1, 19)
    t3 <- as.POSIXct(strptime(t3, "%Y-%m-%d %H:%M:%S"))
    t4 <- substr(str02[x4], 1, 19)
    t4 <- as.POSIXct(strptime(t4, "%Y-%m-%d %H:%M:%S"))
    t5 <- substr(str02[x5], 1, 19)
    t5 <- as.POSIXct(strptime(t5, "%Y-%m-%d %H:%M:%S"))
    t6 <- substr(str02[x6], 1, 19)
    t6 <- as.POSIXct(strptime(t6, "%Y-%m-%d %H:%M:%S"))
    t7 <- substr(str02[x7], 1, 19)
    t7 <- as.POSIXct(strptime(t7, "%Y-%m-%d %H:%M:%S"))
    t8 <- substr(str02[x8], 1, 19)
    t8 <- as.POSIXct(strptime(t8, "%Y-%m-%d %H:%M:%S"))
    t9 <- substr(str02[x9], 1, 19)
    t9 <- as.POSIXct(strptime(t9, "%Y-%m-%d %H:%M:%S"))
    t10 <- substr(str02[x10], 1, 19)
    t10 <- as.POSIXct(strptime(t10, "%Y-%m-%d %H:%M:%S"))
    t11 <- substr(str02[x11], 1, 19)
    t11 <- as.POSIXct(strptime(t11, "%Y-%m-%d %H:%M:%S"))

    # total time from state B to state L
    as.numeric(difftime(t11, t1, units="days"))

    ## waiting times between state E and F
    sum(as.numeric(difftime(t5, t4, units="days")))

best regards
Andreas
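One possible way to handle states that are visited more than once is to parse the protocol into a data frame and, for each visit to the starting state, look up the next later visit to the target state. The sketch below assumes that this "next occurrence" pairing is what is meant by the waiting time between two states, and that the entries are in chronological order. The names log_lines, state_log and wait_between are invented for this illustration, and only an excerpt of the protocol is used.

```r
# Excerpt of the protocol, one entry per element.
log_lines <- c(
  "2007-10-14 15:43:20 state E.",
  "2007-10-14 15:43:25 state G.",
  "2007-10-14 15:43:32 state H.",
  "2007-10-14 15:43:41 state I.",
  "2007-10-14 15:43:47 state F.",
  "2007-10-14 15:43:47 state G.",
  "2007-10-14 15:48:08 state H.",
  "2007-10-16 10:10:20 state J.",
  "2007-10-19 11:12:54 state K",
  "2007-10-19 11:17:37 state D.",
  "2007-10-19 11:17:42 state E.",
  "2007-10-19 11:17:49 state F.",
  "2007-10-19 11:17:51 state E."
)

# One row per entry: time stamp (first 19 characters) and state letter.
# tz = "UTC" is an assumption made here to avoid daylight-saving effects.
state_log <- data.frame(
  stamp = as.POSIXct(substr(log_lines, 1, 19),
                     format = "%Y-%m-%d %H:%M:%S", tz = "UTC"),
  state = sub("^.*state ([A-Z]).*$", "\\1", log_lines),
  stringsAsFactors = FALSE
)

# For every visit to `from`, seconds until the next later visit to `to`
# (NA if `to` is never reached afterwards).
wait_between <- function(protocol, from, to) {
  from_idx <- which(protocol$state == from)
  to_idx <- which(protocol$state == to)
  vapply(from_idx, function(i) {
    later <- to_idx[to_idx > i]
    if (length(later) == 0) return(NA_real_)
    as.numeric(difftime(protocol$stamp[later[1]], protocol$stamp[i],
                        units = "secs"))
  }, numeric(1))
}

waits_E_to_F <- wait_between(state_log, "E", "F")
waits_E_to_F                      # 27, 7 and NA seconds for this excerpt
sum(waits_E_to_F, na.rm = TRUE)   # total waiting time in seconds
```

Keeping all states in one data frame also removes the need for a separate pair of variables per state, so the same function works for any pair of states.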
Re: [R] R for windows 64 bit
On 12.01.2010 21:07, Alexander Shenkin wrote:

Hi Alessia,

Note that, while your physical limit might be 6 GB, Windows memory management allows more memory than that to be allocated (aka Virtual Memory, or at least that's what they called it in XP). Windows swaps out memory from RAM to the hard disk and back when necessary (please excuse the explanation if you already know all this). For processing large vectors, this swapping might bring your system to a standstill. Regardless, the maximum memory for a Windows process is larger than the physical RAM you have available.

allie

In this case 6 GB was the default (the physical maximum of that particular machine), and there was a bug in the *experimental* version of R that did not allow the memory size to be increased from within R using memory.limit(). That bug has already been fixed, thanks to Brian Ripley.

Uwe Ligges

On 1/12/2010 6:27 AM, alessia matano wrote:

Fine, it worked. I will try it this way. Just one last question and I won't bother you further today. My machine right now has just 6 GB of RAM (it will be increased to 16 in a few days), and I see that with this experimental version memory.limit is 6135. What is the command to increase the memory usage up to the maximum I can (5 GB?). If I write memory.limit(5000) it still gives me the error: don't be silly! Your machine has a 4Gb address limit, which is quite odd.

Many thanks
Best
A.

2010/1/12 alessia matano alexis@gmail.com:

OK, perfect! I will try with it... many, many thanks. Do you also have the quantreg package there, which actually has the same problem as SparseM (32-bit version)?

best
alessia

2010/1/12 Uwe Ligges lig...@statistik.tu-dortmund.de:

On 12.01.2010 12:09, alessia matano wrote:

I am sorry, I know it is an experimental version, and I was misleading in calling it a new version. Therefore, I will wait until they are officially available, since it is just a few days.

Or just use, today, my private repository that I indicated in the other mail.

Uwe Ligges

However, I also tried to go to the CRAN pages, download them and put them into the library. For quantreg it worked; for SparseM it did not, probably because it's a win32 version, as you said.

2010/1/12 Prof Brian Ripley rip...@stats.ox.ac.uk:

On Tue, 12 Jan 2010, alessia matano wrote:

Dear all,

I just downloaded and set up this new version of R. I am now trying to download the packages I need, which are SparseM and quantreg. I downloaded the quantreg package and put it into the library folder, and it seems to work. However, when I try to do the same with SparseM I get the following error message:

    Loading required package: SparseM
    Error in inDL(x, as.logical(local), as.logical(now), ...) :
      unable to load shared library 'C:/PROGRA~1/R/R-211~1.0DE/library/SparseM/libs/SparseM.dll':
      LoadLibrary failure: %1 non è un'applicazione di Win32 valida.

[The last line is Italian for "%1 is not a valid Win32 application."]

Any help for it?

Please do refer to the posting referred to in that thread (and Henrique, please do not post just the URL without the explanations).

https://stat.ethz.ch/pipermail/r-devel/2010-January/056301.html

You cannot mix 32-bit Windows binary packages with this experimental port (it is not a 'new version'): you need to install from the package sources. If that is too difficult for you, please do not try to use unsupported experimental builds (and Uwe Ligges may have some binary packages available for test in a few days).

Thanks a lot
alessia

2010/1/11 Henrique Dallazuanna www...@gmail.com:

Try this version (beta of development version):

http://www.stats.ox.ac.uk/pub/RWin/Win64/R-2.11.0dev-win64.exe

On Mon, Jan 11, 2010 at 2:29 PM, alessia matano alexis@gmail.com wrote:

Dear all,

do you know if there is any particular version of R to use with 64-bit Windows, so as to increase the amount of memory it can use? How should I increase the memory and, more importantly, set a higher maximum vector size? It still stops me, saying: Could not allocate vector of size 145

thanks to all
alessia

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] Problems connecting with MySQL using odbcDriverConnect (RODBC package) on Linux
I am sure I'm doing something wrong here, but I'm not sure what. Our system administrator recently installed UnixODBC and the MyODBC driver on a Linux box running Linux version 2.6 x86_64. I have an .odbc.ini file in my home directory with the following lines:

    [mydb]
    Description = MySQL server on my-server
    Driver=/usr/lib64/libmyodbc3.so
    SERVER=my-server

I can successfully do the following:

    library(RODBC)
    channel <- odbcConnect("mydb")
    sqlQuery(channel, "show databases")

In general, I have no problems using odbcConnect to connect to the mydb DSN. However, for various reasons I want to make a DSN-less connection using odbcDriverConnect, and everything I've tried has generated a "data source not found" message (see below for details).

After reading through various documents, I tried the following.

(1) Put an odbcinst.ini file in my home directory with the following lines:

    [MySQL]
    Description = ODBC for MySQL
    Driver=/usr/lib64/libmyodbc3.so
    Setup = /usr/lib/libodbcmyS.so
    FileUsage = 1

(2) Install it with odbcinst -i -f. This seems to work, as when I type odbcinst -j I get

    DRIVERS: /home/jmarcus/odbcinst.ini
    SYSTEM DATA SOURCES: /home/jmarcus/odbc.ini
    USER DATA SOURCES..: /home/jmarcus/.odbc.ini

(3) Set the environment variable to point to this file:

    bash-3.2$ ODBCSYSINI=/home/jmarcus
    bash-3.2$ export ODBCSYSINI

(4) Start R. Note that R has inherited the environment variable:

    Sys.getenv("ODBCSYSINI")
    ODBCSYSINI
    /home/jmarcus

(5) Try to connect to the MySQL server:

    conn <- odbcDriverConnect(connection="Driver={MySQL};Server=my-server;Database=my_database;Uid=my_username;Pwd=my_password")

This generates the following:

    Warning messages:
    1: In odbcDriverConnect(connection = "Driver={MySQL};Server=my-server;Database=my_database;Uid=my_username;Pwd=my_password") :
      [RODBC] ERROR: state IM002, code 0, message [unixODBC][Driver Manager]Data source name not found, and no default driver specified
    2: In odbcDriverConnect(connection = "Driver={MySQL};Server=my-server;Database=my_database;Uid=my_username;Pwd=my_password") :
      ODBC connection failed

Can anyone see what I'm doing wrong?

Thanks.
Jeff
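Since a DSN-less connection string is just a semicolon-separated list of key=value pairs, it can be easier to assemble it from named parts than to type it as one long literal; that avoids stray spaces or line breaks ending up inside a keyword or value. The following is only a minimal sketch: build_conn_string is a hypothetical helper, not part of RODBC, and the values are the placeholders from the post. It does not by itself address the IM002 error.

```r
# Hypothetical helper: turn a named list into "key=value;key=value;...".
build_conn_string <- function(parts) {
  stopifnot(!is.null(names(parts)), all(nzchar(names(parts))))
  paste(paste0(names(parts), "=", unlist(parts)), collapse = ";")
}

conn_string <- build_conn_string(list(
  Driver   = "{MySQL}",        # must match the section name in odbcinst.ini
  Server   = "my-server",
  Database = "my_database",
  Uid      = "my_username",
  Pwd      = "my_password"
))
conn_string

# With RODBC installed and the driver registered, the string would then be
# passed on like this (left commented out, as it needs a live server):
# library(RODBC)
# conn <- odbcDriverConnect(connection = conn_string)
```

Printing conn_string before connecting makes it easy to compare the exact text that reaches the driver manager with the driver name registered in odbcinst.ini.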
Re: [R] Forming Portfolios for Fama / French Regression
Kai,

Your question is best addressed to r-sig-fina...@stat.math.ethz.ch, as it is a finance-related question.

Jude

Jude Ryan
Director, Client Analytical Services
Strategy Business Development
UBS Financial Services Inc.
1200 Harbor Boulevard, 4th Floor
Weehawken, NJ 07086-6791
Tel. 201-352-1935
Fax 201-272-2914
Email: jude.r...@ubs.com