Re: [R] quick question about lme()
Hi Thierry,

Thank you very much.

Best regards,
YA

On 2012-4-15 1:06, ONKELINX, Thierry wrote:

Both specifications are the same model. An intercept is added by default unless you suppress it with +0 or -1, as in ~ 0 + LEAD | GRP or ~ -1 + LEAD | GRP.

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of YA [xinxi...@163.com]
Sent: Saturday, 14 April 2012 18:33
To: r-help@r-project.org
Subject: [R] quick question about lme()

Hi everyone, I have a quick question about the random part in lme(). Here is the code:

lme(WBEING ~ HRS + LEAD + G.HRS, random = ~ LEAD | GRP)

I want to specify both a random intercept and a random slope for LEAD. Is the random intercept already in it, or should I specify it like:

lme(WBEING ~ HRS + LEAD + G.HRS, random = ~ 1 + LEAD | GRP)

Thank you very much.

Best regards,
YA

______
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
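Thierry's point that the two specifications fit the same model can be checked with a small simulation. This is only a sketch: the data and variable names (WBEING, LEAD, GRP) are invented here to match the thread, not the poster's actual data.

```r
# Sketch: ~ LEAD | GRP and ~ 1 + LEAD | GRP specify the same random effects,
# because the intercept is implicit unless removed with +0 or -1.
library(nlme)  # ships with R as a recommended package

set.seed(1)
GRP    <- factor(rep(1:12, each = 10))
LEAD   <- rnorm(120)
b0     <- rep(rnorm(12, sd = 1.0), each = 10)   # group-level intercepts
b1     <- rep(rnorm(12, sd = 0.5), each = 10)   # group-level slopes
WBEING <- 2 + (0.5 + b1) * LEAD + b0 + rnorm(120, sd = 0.3)
d      <- data.frame(WBEING, LEAD, GRP)

m1 <- lme(WBEING ~ LEAD, random = ~ LEAD | GRP,     data = d)
m2 <- lme(WBEING ~ LEAD, random = ~ 1 + LEAD | GRP, data = d)

# Identical fixed effects and log-likelihoods: the two fits coincide.
all.equal(fixef(m1), fixef(m2))
all.equal(logLik(m1), logLik(m2))
```

Both comparisons return TRUE, confirming that the intercept was already present in the first specification.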
Re: [R] Sending lists from R to C using .C
On 15/04/2012 05:22, chris90nz wrote:

Hi, I am currently stuck on the following problem: I have a matrix where each element is a list of numbers of differing lengths. I am wondering what would be the best way to send this to my C function using .C?

Follow the documentation. This is not the list for discussing compiled code (see the posting guide), nor have you told us what you tried. But in fact there is only one way to do this, and ?.C and 'Writing R Extensions' explain it to you.

I have no idea why you would have 'a list of numbers of differing lengths'. First, what is the length of a number? Second, why not store the numbers in numeric vectors? An example (as required by the posting guide) would have helped compensate for an imprecise description.

Cheers,
Chris

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax: +44 1865 272595
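Since .C() only accepts atomic vectors, a ragged collection of numbers has to be flattened on the R side before the call. A minimal sketch of that preparation (the C routine name is hypothetical and not part of the thread):

```r
# .C() passes only atomic vectors, so send a ragged structure as one
# flattened numeric vector plus an integer vector of per-element lengths.
ragged <- list(c(1.5, 2.5), c(3, 4, 5), 6)
values <- as.double(unlist(ragged))     # 1.5 2.5 3 4 5 6
lens   <- as.integer(lengths(ragged))   # 2 3 1

# The C side would then reconstruct the pieces from the lengths, e.g.:
# .C("my_c_routine", values, lens, as.integer(length(lens)))  # hypothetical
```

The C function walks `values` using the offsets implied by `lens`, which is the standard way to recover a ragged layout from flat .C arguments.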
Re: [R] basic question predict GLM offset
On Apr 15, 2012, at 01:37, David Winsemius wrote:

On Apr 14, 2012, at 6:47 PM, smfa wrote:

Hi, I know this is probably a basic question, but I don't seem to find the answer. I'm fitting a GLM with a Poisson family, and then tried to look at the predictions; however, the offset does not seem to be taken into consideration:

model_glm = glm(cases ~ rhs(data$year, 2003) + lhs(data$year, 2003),
                offset = log(population), data = data, subset = 28:36,
                family = poisson())
predict(model_glm, type = "response")

I get cases, not rates. I've also tried

model_glm = glm(cases ~ rhs(data$year, 2003) + lhs(data$year, 2003) +
                offset(log(population)), data = data, subset = 28:36,
                family = poisson())

with the same results. However, when I predict from a GAM using mgcv, the predictions consider the offset (I get rates). The beta coefficients are the log-rate estimates when you use log(population) as the offset. But they are not the log predicted rates if you are describing many rates using a few parameters. Am I missing something?

You are most definitely missing the part where you include 'data'.

True. (cases ~ rhs(year, 2003) + lhs(year, 2003) is right; the other way only works if you predict on the same data set.)

More to the point: does the OP realize how easy it is to go from fitted cases to rates by dividing by the population size? A logical way to get predicted rates would be to make predictions for a new data set where the population size is set to 1 (or 10, maybe), but it seems easier to divide and conquer (pardon the pun).

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
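Both routes Peter mentions give the same rates. A minimal sketch with made-up data (the rhs/lhs spline helpers from the thread are omitted; a plain year term stands in):

```r
# With a log(population) offset, predict(type = "response") returns
# expected counts. Divide by population to get rates, or equivalently
# predict on a copy of the data with population set to 1.
d <- data.frame(cases = c(3, 5, 9), year = 1:3,
                population = c(100, 200, 300))
fit <- glm(cases ~ year + offset(log(population)),
           family = poisson(), data = d)

rate1 <- predict(fit, type = "response") / d$population
rate2 <- predict(fit, newdata = transform(d, population = 1),
                 type = "response")
all.equal(unname(rate1), unname(rate2))  # TRUE: the two routes agree
```

Setting the population to 1 makes the offset log(1) = 0, so the prediction is exp(X %*% beta), which is exactly the fitted count divided by population.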
[R] match values from data.frame and vector
Dear R helpers!

I have a vector 'x1' and a data.frame 'df1'. Do you have any suggestion how to get a vector x2 that results from matching the values in 'x1' against 'df1'? Please see the example:

x1 <- c(rep(1, 3), rep(NA, 2), rep(2, 4))
df1 <- data.frame(c1 = c(1, 2), c2 = c(5, 6))

I would like to get vector x2:

x2
[1]  5  5  5 NA NA  6  6  6  6

Thanks a lot,
OV
Re: [R] How to rank vectors based on their elements?
Thanks for your reply. But the classes have weight vectors, so we already know that each element (here 5 of them) has its own significance, shown by the negative and positive values. I need to use this data to rank them.

Regards

--
View this message in context: http://r.789695.n4.nabble.com/How-to-rank-vectors-based-on-their-elements-tp4558409p4558638.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] match values from data.frame and vector
On Apr 15, 2012, at 09:36, Omphalodes Verna wrote:

Dear R helpers! I have a vector 'x1' and a data.frame 'df1'. Do you have any suggestion how to get vector x2, which results from matching the values in 'x1' against 'df1'?

x1 <- c(rep(1, 3), rep(NA, 2), rep(2, 4))
df1 <- data.frame(c1 = c(1, 2), c2 = c(5, 6))

I would like to get vector x2: [1] 5 5 5 NA NA 6 6 6 6. Thanks a lot, OV

You mean like this?

df1$c2[match(x1, df1$c1)]
[1]  5  5  5 NA NA  6  6  6  6

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] simple read in with zoo using POSIXlt
On Sat, 14 Apr 2012, knavero wrote:

Achim Zeileis-4 wrote: You just need to declare that the index is in two columns (1 and 2) and then provide a function that extracts a suitable object from it:

read.zoo("test.txt", header = FALSE, index = 1:2,
  FUN = function(x, y) strptime(paste(x, y), "%d/%m/%Y %H:%M"))

Use an additional as.POSIXct(...) around the strptime() call if you want to use POSIXct instead of POSIXlt, which is typically recommended. See vignette("zoo-read", package = "zoo") for more examples. Z

Unfortunately, it's not working as I hoped for. Let me elaborate.

I don't know why you make this so complicated. Either use

read.zoo("test.txt", header = FALSE, sep = "\t",
  format = "%d/%m/%Y %H:%M", tz = "")

which yields a POSIXct time index. Alternatively, you can produce POSIXlt via strptime:

read.zoo("test.txt", header = FALSE, sep = "\t",
  FUN = function(x) strptime(x, "%d/%m/%Y %H:%M"))

The former is recommended for use in zoo.

new code: http://pastebin.com/axpPB6M8

So for this, I understand that the read-in works very well with POSIXct, but I want to utilize the vectors contained in the POSIXlt class (wday, yday, mon, etc.). Here's what the POSIXct read.zoo looks like in the shell when copy-pasted:

> test = read.zoo("http://dl.dropbox.com/u/41922443/test.txt",
+   header = FALSE, sep = "\t",
+   FUN = function(idx) as.POSIXct(strptime(idx, format = fmt, tz = "PDT"),
+                                  format = fmt, tz = "PDT"),
+   colClasses = rep(c(NA, "numeric", "NULL"), c(1, 1, 0)),
+   aggregate = tail1)
> test
2010-01-07 00:15:00 2010-01-07 00:30:00 2010-01-07 00:45:00 2010-01-07 01:00:00
           1333.620            1333.388            1335.343            1334.251
2010-01-07 01:15:00 2010-01-07 01:30:00 2010-01-07 01:45:00 2010-01-07 02:00:00
           1331.589            1328.695            1329.151            1329.077
2010-01-07 02:15:00 2010-01-07 02:30:00
           1327.649            1326.789

This is good when you just eyeball it. HOWEVER, when the date/time is looked at by the machine, it doesn't see vectors that can be accessed, but the lame numerical/double that is the time in seconds since 1970. Proof of this is the following:

> unclass(index(test))
 [1] 1262823300 1262824200 1262825100 1262826000 1262826900 1262827800
 [7] 1262828700 1262829600 1262830500 1262831400
attr(,"tzone")
[1] "PDT"

Now with this code: http://pastebin.com/pr2X78sX

This just pisses me off... let me elaborate... or well, eff it. I'll just copy-paste and you'll get my point:

> test = read.zoo("http://dl.dropbox.com/u/41922443/test.txt",
+   header = FALSE, sep = "\t",
+   FUN = function(idx) as.POSIXlt(strptime(idx, format = fmt, tz = "PDT"),
+                                  format = fmt, tz = "PDT"),
+   colClasses = rep(c(NA, "numeric", "NULL"), c(1, 1, 0)),
+   aggregate = tail1)
> test
       0
1326.789

Basically, what the hell are 0 and 1326.789 doing there... right?

--
View this message in context: http://r.789695.n4.nabble.com/simple-read-in-with-zoo-using-POSIXlt-tp4557138p4558038.html
Sent from the R help mailing list archive at Nabble.com.
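The recommended compromise is to keep the POSIXct index (as Achim advises) and convert it to POSIXlt only when the component fields are needed. A minimal sketch without the poster's data file (the timestamps are borrowed from the transcript above, in UTC for reproducibility):

```r
# POSIXct stores seconds since the 1970-01-01 epoch; the POSIXlt fields
# (wday, yday, mon, ...) can be recovered on demand by converting.
idx <- as.POSIXct(c("2010-01-07 00:15", "2010-01-07 00:30"), tz = "UTC")
unclass(idx)          # the underlying seconds-since-epoch doubles

lt <- as.POSIXlt(idx)
lt$wday               # 4 4 (both timestamps fall on a Thursday)
lt$mon                # 0 0 (January; the mon field is 0-based)
```

So nothing is lost by holding a POSIXct index in the zoo object: `as.POSIXlt(index(z))` yields all the list components whenever they are wanted.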
[R] Sweave compiling problem
Hi, I'm a newbie in Debian (I came from OS X) and I noticed this error compiling a Sweave document after I installed the latest R release (2.15.0):

Error in driver$finish(drobj) :
  the output file 'MyDocument.tex' has disappeared
Calls: <Anonymous> -> do.call -> <Anonymous> -> <Anonymous>
Execution halted

Do you have any suggestions?

Best
Riccardo

Sent from iPod
[R] Specifying splits - in read.csv.ffdf
Hi All,

I am currently trying to familiarize myself with the ff package, which allows me to store R objects on the hard drive. One thing I notice when reading in a text file with the read.csv.ffdf function is that 1000+ small files get created in the R temp folder, each with a name like ffdf1bcd8b4aa0.ff and each about 5 KB in size. My understanding is that the whole file has been split into small pieces and stored on the hard drive. What I am trying to find out is: is there a way to reduce the number of splits by increasing the file size?

Thanks in advance,
Regards,
Indrajit
Re: [R] How to rank vectors based on their elements?
On Sat, Apr 14, 2012 at 07:10:48PM -0700, Manish Gupta wrote:

Hi, I am working on a ranking algorithm. I have data in the form of a feature vector for each class, and I need to rank the classes based on their feature vectors.

class1 <- c(1,3,4,-2,0)
class2 <- c(2,0,0,-3,0)
class3 <- c(2,3,1,4,5)
class4 <- c(-4,-5,1,0,0)

I need to rank class1, class2, class3, class4, class5. How can I implement it?

Hi. I am not sure whether I understand what you mean. How many examples of each class do you have? Do you have one example for each class consisting of 5 features, or do you have 5 examples for each class, where each example is represented by a single number? The usual format for a classification task is a matrix or data frame with one column for each feature and one column for the class. The first interpretation would be represented by the matrix

  1  3  4 -2  0  class1
  2  0  0 -3  0  class2
  2  3  1  4  5  class3
 -4 -5  1  0  0  class4

and the other interpretation would be represented by

  1  class1
  3  class1
  4  class1
 -2  class1
  0  class1
  2  class2
  0  class2
  0  class2
 -3  class2
  0  class2
  2  class3
  3  class3
  1  class3
  4  class3
  5  class3
 -4  class4
 -5  class4
  1  class4
  0  class4
  0  class4

Is some of this meaningful in your application? What do you mean by ranking? Do you mean to estimate to which class a new example not contained in the data above belongs? Another understanding may be that you want to order the four classes by some type of average of the numbers assigned to each of them.

Petr Savicky.
Re: [R] Sweave UFT8 problem
Hello,

Have you tried to put that command in a comment:

%\usepackage[utf8]{inputenc}

I haven't tested it in this particular case, but it works in some other situations.

Best,
Philippe
Prof. Philippe Grosjean
Numerical Ecology of Aquatic Systems
Mons University, Belgium

On 14/04/12 22:37, Mark Heckmann wrote:

Hi, I work on Mac OS, trying to Sweave a UTF-8 document. As far as I remember, R 2.14 used to give a warning when the encoding was not declared when using Sweave. With R 2.15 it seems to give an error:

Sweave("sim_pi.Rnw")
Error: 'sim_pi.Rnw' is not ASCII and does not declare an encoding

Declaring an encoding by adding a line like

\usepackage[utf8]{inputenc}

in the preamble does the job. In my case, though, the .Rnw document does not have a preamble, as it is just one chapter. All chapters are Sweaved separately (due to computation time). Hence I cannot inject the above line, as LaTeX will cause an error afterwards (\usepackage{} is only allowed in the preamble, which appears only once in the main document, not in each chapter). How can I get around this without using the terminal for Sweaving, like e.g.

R CMD Sweave --encoding=utf-8 sim_pi.Rnw

?

Thanks
Mark
Re: [R] Sweave UFT8 problem
try:

Sweave("sim_pi.Rnw", encoding = "utf8")

Best,
Matthias

On 15.04.2012 11:41, Philippe Grosjean wrote:

Hello, Have you tried to put that command in a comment: %\usepackage[utf8]{inputenc} I haven't tested it in this particular case, but it works in some other situations. Best, Philippe

--
Prof. Dr. Matthias Kohl
www.stamats.de
[R] xyplot type=l
Probably a stupidly simple question, but I wouldn't know how to google it:

xyplot(neuro ~ time | UserID, data = data_sub)

creates a proper plot. However, if I add type = "l", the lines do not go first through time 1, then time 2, then time 3, etc.; in about 50% of all subjects the lines go through the points in seemingly random order (e.g. from 1 to 4 to 2 to 5 to 3). The lines always start at time point 1, though. Defining time as factor or ordered doesn't change this. neuro is a numerical variable. It's probably some beginner's mistake, but I don't seem to be able to solve it.

Thanks
E.
[R] GLM other machine learning packages for ffdf formats
Hi All,

I have a 1 GB dataset in ffdf format. Are there any packages / machine learning algorithms available that I can apply to these ffdf-format datasets?

Regards,
Indrajit
Re: [R] How to rank vectors based on their elements?
Hi, In my case, your first guess is right. I need to rank the classes based on their feature vectors.

  1  3  4 -2  0  class1
  2  0  0 -3  0  class2
  2  3  1  4  5  class3
 -4 -5  1  0  0  class4

Like: class1 class3 class4 class2. How can I implement it?

Regards

--
Sent from the R help mailing list archive at Nabble.com.
Re: [R] How to rank vectors based on their elements?
On Sun, Apr 15, 2012 at 02:52:11AM -0700, Manish Gupta wrote:

Hi, In my case, your first guess is right. I need to rank classes based on their feature vectors, like: class1 class3 class4 class2. How can I implement it?

Hi. The ordering may be defined in many ways depending on the purpose of the ordering. Since I do not know this purpose, I can only guess what might be meaningful. Try ordering by the mean value. This can be done as follows.

class1 <- c(1,3,4,-2,0)
class2 <- c(2,0,0,-3,0)
class3 <- c(2,3,1,4,5)
class4 <- c(-4,-5,1,0,0)
mat <- rbind(class1, class2, class3, class4)
mat[order(rowMeans(mat), decreasing = TRUE), ]
       [,1] [,2] [,3] [,4] [,5]
class3    2    3    1    4    5
class1    1    3    4   -2    0
class2    2    0    0   -3    0
class4   -4   -5    1    0    0

If the importance of the features is not equal, one can use a weighted mean. For example, as follows.

w <- c(1, 1, 4, 1, 1)
weightedMean <- (mat %*% w)/sum(w)
mat[order(weightedMean, decreasing = TRUE), ]
       [,1] [,2] [,3] [,4] [,5]
class1    1    3    4   -2    0
class3    2    3    1    4    5
class2    2    0    0   -3    0
class4   -4   -5    1    0    0

Hope this helps.

Petr Savicky.
Re: [R] master thesis
Try the pairs() function to explore your raw data. This webpage may even give you a nice way to visualize your data: http://r-epid.blogspot.com/2008/11/correlation-pairs-plot.html

-Steve

-----Original Message-----
From: Francesca Sorbie [mailto:fsor...@hotmail.com]
Sent: Saturday, April 14, 2012 6:44 AM
To: r-help@r-project.org
Subject: [R] master thesis

Hi, For my master thesis I have 24 micro-plots on which I did measurements over 3 months. The measurements were:

- Rainfall and runoff events throughout the 3 months (runoff being dependent on the rainfall; a coefficient (%) has been computed per rainfall event and per 3 months)
- Soil texture (3 different textures were differentiated)
- Slope (3 classes of slopes)
- Stoniness (one-time measurement)
- Random roughness (throughout 3 months)
- Land use (crop land or grazing land)
- Vegetation cover (throughout 3 months)
- Vegetation height (throughout 3 months, only measured on cropland)
- Antecedent moisture content (throughout 3 months)

Now I would like to investigate the effect of all these variables on the rainfall/runoff. For example, does a steeper slope have a larger effect on the runoff than the soil texture? What are the possibilities in R?

Thank you for any feedback,
Francesca
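A minimal sketch of the pairs() suggestion, with simulated data standing in for the thesis measurements (the column names here are invented placeholders, not Francesca's actual variables):

```r
# Scatterplot matrix plus correlation matrix as a first look at how
# each plot-level variable relates to runoff.
set.seed(1)
d <- data.frame(runoff    = runif(24),   # 24 micro-plots, as in the thread
                slope     = runif(24),
                stoniness = runif(24))

pairs(d)   # all pairwise scatterplots in one display
cor(d)     # the matching correlation matrix
```

For the substantive question (which variable matters most), a multiple regression such as `lm(runoff ~ slope + stoniness, data = d)` would be the natural next step after this exploration.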
Re: [R] xyplot type=l
On Apr 15, 2012, at 6:19 AM, Eiko Fried wrote:

Probably a stupidly simple question: xyplot(neuro ~ time | UserID, data = data_sub) creates a proper plot, but if I add type = "l" the lines go through the points in seemingly random order in about 50% of all subjects. It's probably some beginner's mistake, but I don't seem to be able to solve it.

You should post the results of either str(data_sub) or (preferably) dput(data_sub). I'm guessing you don't really understand how your data is stored and that the problem is at the input and preprocessing stage.

--
David Winsemius, MD
West Hartford, CT
[R] [R-pkgs] version 3.0-0 of the sem now on CRAN
Dear R users,

Version 3.0-0 of the sem package is now on CRAN. From the package NEWS file:

o Compiled code for optimization.
o Added multi-group models.
o Modification indices for equality-constrained parameters.
o weights argument added to tsls().
o raw argument added to cfa().

Of these changes, the first two are the most significant, and the first -- the use of compiled code to improve performance substantially -- was implemented by Zhengua Nie, who has joined me and Jarrett Byrnes as a co-developer of the package.

Best,
John

John Fox
Senator William McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages
Re: [R] Question with R CMD SHLIB in 64 bit R
On 14.04.2012 21:53, Katharine Miller wrote:

OK. So, I have 64-bit Windows 7 and I have installed R 2.15.0.

Yes, and have you also installed version 2.15 of the Rtools?

Uwe Ligges

Thanks

2012/4/14 Uwe Ligges lig...@statistik.tu-dortmund.de

On 14.04.2012 19:01, Katharine Miller wrote:

Hi, Sorry - I should have said that I was using Windows 7 on a 64-bit computer in my earlier post.

Which Windows 7? 64-bit or 32-bit?

I used the installer to install R; I did not build from source. I have read the installation instructions. Both R and Rtools are first in my path, but by "toolchain" do you mean that I need to install the mingw-w64 toolchain? Since the installation instructions indicate that Windows 64-bit is integrated into the R package and build systems, I assumed this wasn't necessary - but that would be an easy fix!

You need Rtools 2.15 (and not earlier); gcc 4.6.3 is part of it (and it builds both 32- and 64-bit binaries).

Uwe

Thank you. - Katharine

2012/4/14 Uwe Ligges lig...@statistik.tu-dortmund.de

On 14.04.2012 00:24, Katharine Miller wrote:

Hi, I have some C++ code that I compiled into a DLL for use in 32-bit R and would like to recompile for use in 64-bit R. I thought it would be as easy as going to R-2.15.0\bin\x64 and running

R CMD SHLIB mfregRF.c

Is this Windows?
1. If so, is your OS 64-bit? Otherwise that R won't run for you.
2. Do you have the latest toolchain first in your PATH?
3. Do you follow the details of the most recent version of the R Installation and Administration manual?

Uwe Ligges

but that doesn't do anything. It doesn't give me any error messages, but it also doesn't create a shared (so) file. I just get the command prompt back. I also tried \bin\x64 R CMD SHLIB --help, and again got nothing but the command prompt back. Just to check, I ran R CMD SHLIB from \bin\i386 on the same C++ files and it produced a working dll just like usual.

Is there something that I need to do to the C++ files to make them compatible with 64-bit? They are not files that I made, so I do not know how they were originally compiled. Thanks for any help. - Katharine
Re: [R] xyplot type=l
On 2012-04-15 03:19, Eiko Fried wrote:

Probably a stupidly simple question: with xyplot(neuro ~ time | UserID, data = data_sub) and type = "l", the lines go through the points in seemingly random order in about 50% of all subjects.

See if this gives a clue:

library(lattice)
xyplot(Sepal.Length ~ Sepal.Width | Species, type = 'l', data = iris)
ord <- with(iris, order(Sepal.Width))
iris2 <- iris[ord, ]
xyplot(Sepal.Length ~ Sepal.Width | Species, type = 'l', data = iris2)

Peter Ehlers
Re: [R] xyplot type=l
On Apr 15, 2012, at 11:54 AM, Peter Ehlers wrote:

See if this gives a clue: [iris example, sorting on Sepal.Width before plotting with type = 'l']

I hope Peter doesn't take offense when I opine that the improved version remains an ugly plot. Better would be:

xyplot(Sepal.Length ~ Sepal.Width | Species,
       panel = function(x, y, ...) {
         panel.xyplot(x, y, ...)
         panel.loess(x, y)
       }, data = iris)

OR:

xyplot(Sepal.Length ~ Sepal.Width | Species,
       panel = function(x, y, ...) {
         panel.xyplot(x, y, ...)
         panel.lmline(x, y)
       }, data = iris)

--
David Winsemius, MD
West Hartford, CT
Re: [R] xyplot type=l
On 2012-04-15 09:31, David Winsemius wrote:

I hope Peter doesn't take offense when I opine that the improved version remains an ugly plot. Better would be [panel.loess or panel.lmline panel functions].

No offense taken, David. This may well have been (and almost surely *should* have been) the OP's unstated ultimate objective. I merely interpreted the question quite literally as a desire to pass sequentially through all points, which of course the OP's code would indeed do, but clearly not in the order desired.

Peter Ehlers
Re: [R] Question with R CMD SHLIB in 64 bit R
Yes. I have version 2.15.0 of Rtools as well. I went ahead and re-installed both R and Rtools just to make sure everything was OK, but that did not fix the problem. - Katharine

2012/4/15 Uwe Ligges:

On 14.04.2012 21:53, Katharine Miller wrote: OK. So, I have 64-bit Windows 7 and I have installed R 2.15.0

Yes, and have you also installed version 2.15 of the Rtools? Uwe Ligges

Thanks

2012/4/14 Uwe Ligges:

On 14.04.2012 19:01, Katharine Miller wrote: Hi, Sorry - I should have said that I was using Windows 7 on a 64-bit computer in my earlier post.

Which Windows 7? 64-bit or 32-bit?

I used the installer to install R; I did not do a build from source. I have read the installation instructions. Both R and Rtools are first in my path, but by toolchain do you mean that I need to install the mingw-w64 toolchain? Since the installation instructions indicate that Windows 64-bit is integrated into the R package and build systems, I assumed this wasn't necessary - but that would be an easy fix!

You need Rtools 2.15 (and not earlier); gcc 4.6.3 is part of it (and it builds both 32- and 64-bit binaries). Uwe

Thank you. - Katharine

2012/4/14 Uwe Ligges:

On 14.04.2012 00:24, Katharine Miller wrote: Hi, I have some C++ code that I compiled into a dll for use in 32-bit R and would like to recompile for use in 64-bit R. I thought it would be as easy as going to R-2.15.0\bin\x64 and running R CMD SHLIB mfregRF.c

Is this Windows? 1. If so, is your OS 64-bit? Otherwise that R won't run for you. 2. Do you have the latest toolchain first in your PATH? 3. Do you follow the details of the most recent version of the R Installation and Administration manual? Uwe Ligges

but that doesn't do anything. It doesn't give me any error messages, but it also doesn't create a shared (so) file.
I just get the command prompt back. I also tried \bin\x64 R CMD SHLIB --help, and again got nothing but the command prompt back. Just to check, I ran R CMD SHLIB from \bin\i386 on the same C++ files and it produced a working dll just like usual. Is there something that I need to do to the C++ files to make them compatible with 64-bit? They are not files that I made, so I do not know how they were originally compiled. Thanks for any help. - Katharine
Re: [R] Choose between duplicated rows
Thank you very much for both your replies. Trinker's solution works great for a small dataset, but the 'split' function just hangs when I try to apply it to all my data (around 9,000 rows)… Has anyone encountered this problem before, and do you know what I could try? Thanks again.

-- View this message in context: http://r.789695.n4.nabble.com/Choose-between-duplicated-rows-tp4557833p4559319.html Sent from the R help mailing list archive at Nabble.com.
[R] correct implementation of a mixed-model ANOVA in R
Dear R-experts! I'm having trouble with the correct implementation of a mixed-model ANOVA in R. In my dataset, subjects were tested in a cognitive performance test (numerical outcome variable 'score'). This cognitive performance test is divided into five blocks (categorical factor 'block'). All subjects were tested two times (in random order, once following placebo treatment and once following drug treatment: categorical factor 'treatment'). In order to account for re-test effects I coded the testing session (categorical factor 'session', with 1 coding for the first and 2 coding for the second session). Also, all subjects once underwent a blood screening before the study, in which a stable blood marker was measured which is hypothesized to mediate treatment effects (covariate 'blood'). I would like to look for treatment effects on my outcome variable 'score' and whether the treatment effect is mediated by the covariate 'blood'. Also, I would like to use 'subject' as a random factor, include 'block' as a within-subjects factor, and control for re-test effects by including 'session'. Could you advise me regarding the proper implementation of my model? Also, I am happy about any pointers to R-related literature. Please find below some dummy data and an initial model formulation.
Many thanks, best wishes, Jokel

# some dummy data
data <- structure(list(
  subject = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
                        3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L,
                        5L, 5L, 5L, 5L, 5L, 5L),
                      .Label = c("1", "2", "3", "4", "5"), class = "factor"),
  block = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
                      1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
                      1L, 2L, 3L, 1L, 2L, 3L),
                    .Label = c("1", "2", "3"), class = "factor"),
  treatment = structure(c(2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L,
                          2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
                          1L, 1L, 1L, 2L, 2L, 2L),
                        .Label = c("drug", "placebo"), class = "factor"),
  session = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L,
                        1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L,
                        1L, 1L, 1L, 2L, 2L, 2L),
                      .Label = c("1st", "2nd"), class = "factor"),
  blood = c(3.04, 3.04, 3.04, 3.04, 3.04, 3.04, 4.93, 4.93, 4.93, 4.93,
            4.93, 4.93, 5.65, 5.65, 5.65, 5.65, 5.65, 5.65, 5.7, 5.7,
            5.7, 5.7, 5.7, 5.7, 5.77, 5.77, 5.77, 5.77, 5.77, 5.77),
  score = c(10.98, 11.66, 9.99, 9.15, 9.9, 10.3, 10.78, 9.43, 8.6, 9.67,
            10.75, 9.09, 11.18, 10.33, 8.61, 10.04, 9.13, 8.71, 10.03,
            9.38, 11.17, 9.82, 10.07, 8.34, 9.94, 11.38, 9.19, 8.35,
            10.17, 9.04)),
  .Names = c("subject", "block", "treatment", "session", "blood", "score"),
  row.names = c(NA, -30L), class = "data.frame")

# a first try, which however throws a warning
aov1 <- aov(score ~ blood * treatment * session * block +
              Error(subject / (treatment * session * block)),
            data = data)
summary(aov1)
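One commonly suggested alternative to the aov()/Error() formulation above is a mixed model via nlme::lme, with a random intercept per subject. The sketch below is a hedged illustration, not the definitive model for Jokel's question: it generates its own small stand-in data frame (names d and fit are mine), uses a random intercept only, and represents the mediation hypothesis crudely as a blood-by-treatment interaction.

```r
# Hedged sketch of a random-intercept mixed model with nlme (a recommended
# package shipped with R). The data here are made up, mirroring the dummy
# data's shape: 5 subjects x 2 sessions x 3 blocks, crossover treatment,
# and a blood covariate that is constant within subject.
library(nlme)

set.seed(1)
d <- expand.grid(block = factor(1:3), session = factor(1:2),
                 subject = factor(1:5))
# Crossover assignment: each subject gets both treatments, one per session.
d$treatment <- factor(ifelse((as.integer(d$subject) +
                              as.integer(d$session)) %% 2 == 0,
                             "drug", "placebo"))
d$blood <- ave(rnorm(nrow(d), 5), d$subject)  # constant within subject
d$score <- rnorm(nrow(d), 10)

# Fixed effects for treatment, session, block, and a blood x treatment
# term; random intercept for subject.
fit <- lme(score ~ treatment * blood + session + block,
           random = ~ 1 | subject, data = d)
summary(fit)
```

Whether to add random slopes (e.g. random = ~ treatment | subject) depends on the design and sample size; Pinheiro & Bates's nlme documentation discusses those choices.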
Re: [R] simple read in with zoo using POSIXlt
Thanks Gabor. Much appreciated.
Re: [R] simple read in with zoo using POSIXlt
Achim Zeileis-4 wrote:

I don't know why you make this so complicated. Either use

read.zoo("test.txt", header = FALSE, sep = "\t", format = "%d/%m/%Y %H:%M", tz = "")

which yields a POSIXct time index. Alternatively, you can produce POSIXlt via strptime:

read.zoo("test.txt", header = FALSE, sep = "\t", FUN = function(x) strptime(x, "%d/%m/%Y %H:%M"))

The former is recommended for use in zoo.

Sorry, it's not that I'm trying to make it complicated, but rather specific. As Gabor said in the earlier post, it seems POSIXlt is not a suitable argument for read.zoo, which explains the problem that I have been having. Like I said, I was able to produce the solution that I wanted; however, it was not as efficient and elegant as I was hoping it to be. When going through two-year data sets, there is a noticeable difference in speed when you use POSIXlt separately outside the read.zoo function. Anyway, it's fine. I appreciate the feedback and suggestions. It definitely helps to bounce ideas around, and possibly offers ways to expand the R source code in the future. Thanks!
Re: [R] (no subject)
Thanks David, for helping out. Works great. A question about your added lines of code: with

opar <- par(mar = c(8, 3, 2, 2))

you are setting the margins, and you store them in opar. Next, par(opar) does nothing, I think, because it just sets the same margins again. Shouldn't this be something like the following:

opar <- par('all graphics parameters')  # store all current graphics parameters (in correct R language of course)
par(mar = c(8, 3, 2, 2))                # set new margins
barplot(...)
par(opar)                               # restore former graphics parameters

Regards, Fino

- Original Message - From: David Winsemius dwinsem...@comcast.net To: David Winsemius dwinsem...@comcast.net Cc: Fino Fontana finofont...@yahoo.com; r-help@r-project.org Sent: Sunday, April 15, 2012 3:23 AM Subject: Re: [R] (no subject)

On Apr 14, 2012, at 9:21 PM, David Winsemius wrote:

On Apr 14, 2012, at 8:11 PM, Fino Fontana wrote:

I am wrestling with the following in creating a barplot in R: I have a data set with 18 entries. It is plotted in a bargraph. The x-axis should have 18 tick marks, each with its own label. What happens is, only a few labels are shown; there is not enough space for all labels. The tick marks are concentrated on the left side of the axis.

The mar parameter to par() will give you control of the lower margin and the 'las' parameter will give you control of the axis label orientation:

barplot(times$duur, names.arg = times$taak, las = 3, mar = c(8, 3, 2, 2))

Make that:

opar <- par(mar = c(8, 3, 2, 2))
barplot(times$duur, names.arg = times$taak, las = 3)
par(opar)

I'd like to have all labels shown, in vertical direction. This is part of the data:

times
       taak  duur
1  algemeen 48.75
2  swimming 14.25
3  football 24.25
4    tennis 36.75
5 bicycling  1.50

Under 'taak' are the labels. This is the code that should do the job:

barplot(width = bar_width, times$duur, names.arg = times$taak,
        col = fill_colors, border = NA, space = 0.3,
        xlab = "taak", ylab = "uren", main = "Uren per taak")
axis(1, at = 1:length(times$taak), lab = times$taak)

Could anyone give me advice on how to get rid of this horrible x axis? Thanks

David Winsemius, MD West Hartford, CT
Re: [R] Choose between duplicated rows
I also tried using Jim's code, but it doesn't work as expected with my real dataset. This is what I did:

Best.na <- do.call(rbind, lapply(split(x, x$A), function(.grp) {
  best <- which.min(apply(.grp, 1, function(a) sum(is.na(a))))
  .grp[best, ]
}))

df.split <- split(Best.na, Best.na$id)
Best.date <- lapply(df.split, function(x) {
  # Select by given criterion
  y <- x[which(max(x$A) == x$A), ]
  y
})
Best.date <- do.call(rbind, Best.date)

Thank you again for your help.
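When split() over thousands of groups is the bottleneck (as in the 9,000-row hang reported earlier in this thread), a vectorized base-R pass can do the "row with fewest NAs per group" step without splitting. This is a hedged sketch, not the poster's code: dat and its columns are made-up stand-ins for the real data.

```r
# Pick, for each id, the row with the fewest NA values, using only
# vectorized base R (rowSums + ave) instead of split()/lapply().
dat <- data.frame(id = c("a", "a", "b", "b", "c"),
                  x  = c(1, NA, NA, 4, 5),
                  y  = c(NA, NA, 3, 4, NA))

na_count <- rowSums(is.na(dat[c("x", "y")]))

# For each id, flag the first row attaining the minimal NA count.
keep <- ave(na_count, dat$id,
            FUN = function(v) seq_along(v) == which.min(v))
best <- dat[keep == 1, ]
best
```

On ties this keeps the first qualifying row per id, matching which.min's behavior in the split()-based version.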
Re: [R] (no subject)
On Apr 15, 2012, at 4:18 PM, Fino Fontana wrote:

Thanks David, for helping out. Works great. A question about your added lines of code: with [opar <- par(mar=c(8,3,2,2))] you are setting the margins, and you store them in opar. Next, par(opar) does nothing, I

I disagree (and you offer no counter-examples), as would most experienced users of R, I believe. Look at the first two examples on help(par). The warning applies to situations where you are building functions. I was just offering a solution that would work at the console. Here's my counter-example to your claim:

> par("mar")
[1] 5.1 4.1 4.1 2.1
> opar <- par(mar = c(8, 3, 2, 2))
> par("mar")
[1] 8 3 2 2
> str(opar)
List of 1
 $ mar: num [1:4] 5.1 4.1 4.1 2.1
> par(opar)
> par("mar")
[1] 5.1 4.1 4.1 2.1

par(), like many graphics functions, works via side-effects.

-- David

[remainder of the earlier exchange quoted in the original message trimmed]

David Winsemius, MD West Hartford, CT
Re: [R] caret package: custom summary function in trainControl doesn't work with oob?
Hi Max, Thanks for your help. In the case of randomForest, the keyword keep.inbag=TRUE in the train function provokes the return of information about which data rows were in-bag in each tree. That should provide the required info to re-compute the OOB error for any given alternative error definition. It is somewhat convoluted and likely to differ between bagging algorithm implementations, so it doesn't surprise me that this isn't supported; I just wanted to check I hadn't missed anything before re-inventing the wheel. Thanks again, Matt

On Sat, Apr 14, 2012 at 2:53 AM, Max Kuhn mxk...@gmail.com wrote:

Matt,

I've been using a custom summary function to optimise regression model methods using the caret package. This has worked smoothly. I've been using the default bootstrapping resampling method. For bagging models (specifically randomForest in this case) caret can, in theory, use the out-of-bag (oob) error estimate from the model instead of resampling, which (in theory) is largely redundant for such models. Since they take a while to build in the first place, it really slows things down when estimating performance using bootstrap. I can successfully run either using the oob 'resampling method' with the default RMSE optimisation, or run using bootstrap and my custom summaryFunction as the thing to optimise, but they don't work together. If I try and use oob and supply a summaryFunction, caret throws an error saying it can't find the relevant metric. Now, if caret is simply polling the randomForest object for the stored oob error I can understand this limitation

That is exactly what it does. See caret:::rfStats (not a public function). train() was written to be fairly general, and this level of control would be very difficult to implement, especially since each model that does some type of bagging uses different internal structures etc.

but in the case of randomForest (and probably other bagging methods?) the training function can be asked to return information about the individual tree predictions and whether data points were oob in each case. With this information you can reconstruct an oob 'error' using whatever function you choose to target for optimisation. As far as I can tell, caret is not doing this and I can't see anywhere that it can be coerced to do so.

It will not be able to do this. I'm not sure that you can either. randomForest() will return the individual forests and predict.randomForest() can return the per-tree results, but I don't know if it saves the indices that tell you which bootstrap samples contained which training set points. Perhaps Andy would know.

Have I missed something? Can anyone suggest how this could be achieved? It wouldn't be *that* hard to code up something that essentially operates in the same way as caret.train but can handle this feature for bagging models, but if it is already there and I've missed something please let me know.

Well, everything is easy for the person not doing it =] If you save the proximity measures, you might gain the sampling indices. With these, you would use predict.randomForest(..., predict.all=TRUE) to get the individual predictions.

Max
[R] Approximately how big is an installation of all packages.
Folks, If you know the answer, great. If you can tell me which command to use to find out that information, please let me know. If this is the wrong forum, my apologies. Thanks, KW
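There is no single built-in command that reports the size of "all CRAN packages installed" without installing them; what can be measured directly is the on-disk size of whatever is already installed. A hedged base-R sketch (variable names are mine):

```r
# Sum the sizes of all files under every library path currently in use.
# This measures what is already installed, not all of CRAN; installing
# every CRAN package first would be required to answer that question.
lib_files <- list.files(.libPaths(), recursive = TRUE, full.names = TRUE)
total_bytes <- sum(file.info(lib_files)$size, na.rm = TRUE)
round(total_bytes / 1024^2)  # approximate size in megabytes
```

Even a bare R installation (base and recommended packages only) yields a nonzero total here; per-package sizes can be had by iterating over rownames(installed.packages()) similarly.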
[R] CRAN (and crantastic) updates this week
CRAN (and crantastic) updates this week

New packages

* disclapmix (0.1) Maintainer: Mikkel Meyer Andersen Author(s): Mikkel Meyer Andersen and Poul Svante Eriksen License: GPL-2 http://crantastic.org/packages/disclapmix disclapmix makes inference in a mixture of Discrete Laplace distributions using the EM algorithm.

* EstSimPDMP (1.1) Maintainer: Unknown Author(s): Romain Azais License: GPL (>= 2) http://crantastic.org/packages/EstSimPDMP This package deals with the estimation of the jump rate for piecewise-deterministic Markov processes (PDMPs), from only one observation of the process within a long time. The main functions provide an estimate of this function. The state space may be discrete or continuous. The associated paper is given in References. Other functions provide a method to simulate random variables from their (conditional) hazard rate, and then to simulate PDMPs.

* faoutlier (0.2.2) Maintainer: Phil Chalmers Author(s): Phil Chalmers rphilip.chalm...@gmail.com License: GPL (>= 2) http://crantastic.org/packages/faoutlier Tools for detecting and summarizing influential cases that can affect exploratory and confirmatory factor analysis models, as well as structural equation models more generally.

* FastImputation (1.0) Maintainer: Stephen R. Haptonstahl Author(s): Stephen R. Haptonstahl License: GPL (>= 2) http://crantastic.org/packages/FastImputation TrainFastImputation uses training data to describe a multivariate normal distribution that the data approximates or can be transformed into approximating, and stores this information as an object of class FastImputationPatterns. The FastImputation function uses this FastImputationPatterns object to impute (make a good guess at) missing data in a single line or a whole dataframe of data. This approximates the process used by Amelia (http://gking.harvard.edu/amelia/) but is much faster when filling in values for a single line of data.
* hydroPSO (0.1-54-1) Maintainer: Mauricio Zambrano-Bigiarini Author(s): Mauricio Zambrano-Bigiarini [aut, cre] and Rodrigo Rojas [ctb] License: GPL (>= 2) http://crantastic.org/packages/hydroPSO This package implements a state-of-the-art version of the Particle Swarm Optimisation (PSO) algorithm, with a special focus on the calibration of environmental models. hydroPSO is model-independent, allowing the user to easily interface any model code with the calibration engine (PSO). hydroPSO includes a series of controlling options and PSO variants to fine-tune the performance of the calibration engine. An advanced sensitivity analysis function together with user-friendly plotting summaries facilitate the interpretation and assessment of the calibration results. Bug reports/comments/questions are very welcome.

* OOmisc (1.0) Maintainer: Ozgur Asar Author(s): Ozgur Asar, Ozlem Ilk License: GPL (>= 2) http://crantastic.org/packages/OOmisc Includes miscellaneous functions.

* robustHD (0.1.0) Maintainer: Andreas Alfons Author(s): Andreas Alfons License: GPL (>= 2) http://crantastic.org/packages/robustHD Robust methods for high-dimensional data, in particular linear model selection techniques based on least angle regression and sparse regression.

* sparseLTSEigen (0.1.0) Maintainer: Andreas Alfons Author(s): Andreas Alfons License: GPL (>= 2) http://crantastic.org/packages/sparseLTSEigen Use RcppEigen to fit least trimmed squares regression models with an L1 penalty in order to obtain sparse models.

* sROC (0.1-2) Maintainer: Xiao-Feng Wang Author(s): Xiao-Feng Wang License: GPL (>= 3) http://crantastic.org/packages/sROC This package contains a collection of functions to perform nonparametric estimation of receiver operating characteristic (ROC) curves for continuous data.
* sybilDynFBA (0.0.1) Maintainer: Abdelmoneim Amer Desouki Author(s): Abdelmoneim Amer Desouki License: GPL-3 http://crantastic.org/packages/sybilDynFBA This package implements the dynamic FBA technique proposed by Varma.

* tfplot (2012.4-1) Maintainer: Paul Gilbert and Erik Meijer Author(s): Paul Gilbert pgilbert.tt...@ncf.ca License: GPL-2 http://crantastic.org/packages/tfplot Utilities in this package are for simple manipulation and quick plotting of time series data. These utilities use the tframe package, which provides a programming kernel for time series. Extensions to tframe provided in tframePlus can also be used. See the Guide vignette for examples.

* varbvs (1.0) Maintainer: Peter Carbonetto Author(s): Peter Carbonetto License: GPL (>= 3) http://crantastic.org/packages/varbvs Implements the variational inference procedure for Bayesian variable selection, as described in "Scalable variational inference for Bayesian variable selection in regression, and its accuracy in genetic association studies".
[R] Problems loading siar package
Hello, I recently bought a new Macbook Pro and installed R via a disk image from an older computer - it seems to work fine. I then installed the package siar, tried to load it via the library command and received the following messages (note the mvtnorm package did not appear to load correctly): trying URL 'http://cran.stat.ucla.edu/bin/macosx/leopard/contrib/2.15/siar_4.1.3.tgz' Content type 'application/x-tar' length 181788 bytes (177 Kb) opened URL downloaded 177 Kb The downloaded binary packages are in /var/folders/_b/pxh1w6_n5wlf9f1pscyhsjx4gn/T//RtmpgZD0hI/downloaded_packages library(siar) Loading required package: hdrcde Loading required package: locfit locfit 1.5-7 2012-03-22 Loading required package: ash Loading required package: ks Loading required package: KernSmooth KernSmooth 2.23 loaded Copyright M. P. Wand 1997-2009 Loading required package: mvtnorm Error in library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc = lib.loc) : ‘mvtnorm’ is not a valid installed package I originally tried the Berkeley CRAN, removed the package, and then tried the UCLA CRAN and had the same problem. I then went back to my old computer and loaded the library and did not have any problems, the mvtnorm package loaded fine, followed by two others: coda and MASS. Any help would be most appreciated. I did not have any problems loading siar on my old Macbook Pro and the library should be compatible with both operating systems: old = 10.5.8, new = 10.7.3. Thank you, Gary
[R] correct standard errors (heteroskedasticity) using survey design
Hello all, I'm hoping someone can help clarify how the survey design method works in R. I currently have a data set that utilized a complex survey design. The only thing is that only the weight is provided. Thus, I constructed my survey design as:

svdes <- svydesign(id = ~1, weights = ~weightvar, data = dataset)

Then, I want to run an OLS model, so:

fitsurv <- svyglm(y ~ x1 + x2 + x3 ... xk, design = svdes, data = dataset)

But, I want to check if there is evidence of heteroskedasticity. If so, how would I correct the standard errors? Can the sandwich library do this? Are the standard errors already adjusted? How else can I verify if heteroskedasticity is still present? Can I still use bptest()? I read an earlier post where someone used a dataset example entitled banco. But, her dataset included strata and cluster variables. Someone responded that the sandwich library already adjusted for clustering. In my situation, however, I only have a weight variable. I hope someone can clarify this problem for me. Thanks, Carlos
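For intuition about what a heteroskedasticity-robust ("sandwich") correction does, here is a hedged base-R sketch that computes HC0 robust standard errors by hand for a weighted least-squares fit and compares them with the model-based ones. All names and the simulated data are mine; this illustrates the sandwich computation itself, not the survey package's design-based machinery (which, as I understand it, already uses linearization-based variance estimates in svyglm).

```r
# HC0 sandwich standard errors for a weighted lm(), by hand.
# b = (X'WX)^{-1} X'Wy; Var_HC0(b) = (X'WX)^{-1} [sum w_i^2 u_i^2 x_i x_i'] (X'WX)^{-1}
set.seed(2)
n <- 200
x <- rnorm(n)
w <- runif(n, 0.5, 2)                        # survey-type weights
y <- 1 + 2 * x + rnorm(n, sd = 1 + abs(x))   # deliberately heteroskedastic

fit <- lm(y ~ x, weights = w)
X <- model.matrix(fit)
u <- residuals(fit)

bread <- solve(crossprod(X * w, X))   # (X'WX)^{-1}
meat  <- crossprod(X * (w * u))       # sum of w_i^2 u_i^2 x_i x_i'
vc    <- bread %*% meat %*% bread
robust_se <- sqrt(diag(vc))
cbind(model_se = sqrt(diag(vcov(fit))), robust_se)
```

The sandwich package's vcovHC() implements this (and refinements HC1-HC3) for unweighted and weighted fits; comparing its output against a hand computation like the above is a reasonable sanity check.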
[R] Issue with xxx-package.Rd
Dear list, I have an issue with placing the information stored in the file mypackage-package.Rd (as created by promptPackage("mypackage")) in the package manual pdf file. The information in that .Rd file is not placed at the head of the manual as I would expect. I think a possible reason is that the package name is also the name of one of the functions within the package. But that is the case for many packages (if not most), so I'm not sure. Any hint what might be going on? Thanks, Lars.
Re: [R] Problems loading siar package
PS The version of siar on my old computer is 4.0, the new version is 4.1.3 - could the latter be corrupt? Thanks, Gary
Re: [R] Issue with xxx-package.Rd
On 12-04-15 7:39 PM, Lars Bishop wrote:

[Lars's question about the mypackage-package.Rd file quoted; trimmed]

You need to show us the \alias lines from the two help files.

Duncan Murdoch
Re: [R] simple read in with zoo using POSIXlt
On Sun, Apr 15, 2012 at 3:45 PM, knavero knav...@gmail.com wrote:

[Achim's read.zoo suggestions quoted; trimmed] ...it seems POSIXlt is not a suitable argument for read.zoo, and therefore explains the problem that I have been

It's not just zoo -- it's not suitable for use as a column in a data frame, or for time series in general.

-- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
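The reason POSIXlt is awkward as an index or data-frame column, illustrated in base R: a POSIXlt value is really a *list* of broken-down components (sec, min, hour, mday, ...), whereas POSIXct is a single numeric vector of seconds since the epoch, which is what zoo and data frames handle cleanly.

```r
# POSIXlt vs POSIXct under the hood.
lt <- as.POSIXlt("2012-04-15 18:33:00", tz = "UTC")
ct <- as.POSIXct(lt)

is.list(unclass(lt))      # TRUE: a list of time components
length(unclass(lt))       # 9 or more components (sec, min, hour, ...)
is.numeric(unclass(ct))   # TRUE: one number per timestamp
```

This is why converting to POSIXct (e.g. via the format=/tz= route to read.zoo) is the recommended path, and strptime() output is best wrapped in as.POSIXct() before indexing.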
[R] Gompertz-Makeham hazard models---test for significant difference
Hi, all. I'm working with published paleodemographic data (counts of skeletons that have been assigned to an age-range category, based upon observed morphological characteristics). For example, the following is the age distribution from a prehistoric cemetery in Egypt:

naga <- structure(c(15, 20, 35, 50, 20, 35, 50, Inf, 46, 43, 26, 14),
                  .Dim = as.integer(c(4, 3)),
                  .Dimnames = list(NULL, c("col1", "col2", "col3")))
naga
     col1 col2 col3
[1,]   15   20   46
[2,]   20   35   43
[3,]   35   50   26
[4,]   50  Inf   14

So above, the first two columns are the lower and upper limits of the age-range categories (typically used by paleodemographers), and the third column is the number of skeletons assigned to each group. I've co-opted some R script for fitting a Gompertz-Makeham hazard model to this data...

GM.naga <- function(x, deaths = naga) {
  a2 <- x[1]
  a3 <- x[2]
  b3 <- x[3]
  shift <- 15
  nrow <- NROW(deaths)
  S.t <- function(t) {
    return(exp(-a2 * (t - shift) + a3 / b3 * (1 - exp(b3 * (t - shift)))))
  }
  d <- S.t(deaths[1:nrow, 1]) - S.t(deaths[1:nrow, 2])
  obs <- deaths[, 3]
  lnlk <- as.numeric(crossprod(obs, log(d)))
  return(lnlk)
}
optim(c(0.001, 0.01, 0.1), GM.naga, control = list(fnscale = -1))

And the output:

$par
[1]  0.05486053  0.31290971 -1.87744033

$value
[1] -168.3748

$counts
function gradient
     174       NA

$convergence
[1] 0

$message
NULL

Let's say that I have data from another cemetery, for which I would estimate another set of hazard model parameters. How would I go about comparing the two to see if the resulting parameters for each (and their resulting survival, hazard, and density curves) are significantly different? Also, what if I wanted to include data from even more cemeteries and compare many sets of estimated hazard parameters? Below, I've included the data/results for the other cemetery that I'd like to compare to the first. Any suggestions are welcome. Thanks so much.
--Trey

# naqada
naqada <- structure(c(15,20,25,50,19,24,49,Inf,26,45,219,30),
                    .Dim = as.integer(c(4,3)),
                    .Dimnames = list(NULL, c("col1","col2","col3")))

GM.naqada <- function(x, deaths = naqada) {
  a2 <- x[1]
  a3 <- x[2]
  b3 <- x[3]
  shift <- 15
  nrow <- NROW(deaths)
  S.t <- function(t) {
    return(exp(-a2*(t - shift) + a3/b3*(1 - exp(b3*(t - shift)))))
  }
  d <- S.t(deaths[1:nrow, 1]) - S.t(deaths[1:nrow, 2])
  obs <- deaths[, 3]
  lnlk <- as.numeric(crossprod(obs, log(d)))
  return(lnlk)
}
optim(c(0.001, 0.01, 0.1), GM.naqada, control = list(fnscale = -1))

And the output:

$par
[1] 1.544739e-08 1.862670e-02 6.372117e-02

$value
[1] -331.1865

$counts
function gradient
     276       NA

$convergence
[1] 0

$message
NULL

Trey Batey---Anthropology Instructor
Division of Social Sciences
Mt. Hood Community College
Gresham, OR 97030
trey.batey[at]mhcc[dot]edu
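One standard way to answer the comparison question is a likelihood-ratio test: a pooled model (one shared parameter vector for both cemeteries) is nested within the separate model (one vector per cemetery), so under the null hypothesis of no difference, twice the gain in log-likelihood is approximately chi-square distributed with 3 degrees of freedom (the three extra parameters). The sketch below rewrites the poster's likelihood as a single function to make this concrete; it is an illustration of the approach under these assumptions, not a definitive analysis:

```r
# Gompertz-Makeham log-likelihood for grouped (interval) death counts,
# generalised from the poster's GM.naga/GM.naqada functions.
gm.lnlk <- function(x, deaths, shift = 15) {
  a2 <- x[1]; a3 <- x[2]; b3 <- x[3]
  S.t <- function(t) exp(-a2*(t - shift) + a3/b3*(1 - exp(b3*(t - shift))))
  d <- S.t(deaths[, 1]) - S.t(deaths[, 2])   # prob. of death in each interval
  as.numeric(crossprod(deaths[, 3], log(d)))
}

naga   <- cbind(c(15,20,35,50), c(20,35,50,Inf), c(46,43,26,14))
naqada <- cbind(c(15,20,25,50), c(19,24,49,Inf), c(26,45,219,30))

# Separate fits (6 free parameters in total).
f1 <- optim(c(0.001,0.01,0.1), gm.lnlk, deaths = naga,
            control = list(fnscale = -1))
f2 <- optim(c(0.001,0.01,0.1), gm.lnlk, deaths = naqada,
            control = list(fnscale = -1))

# Pooled fit: one parameter vector maximising the summed log-likelihood.
fp <- optim(c(0.001,0.01,0.1),
            function(x) gm.lnlk(x, naga) + gm.lnlk(x, naqada),
            control = list(fnscale = -1))

# Likelihood-ratio statistic and p-value (df = 3 extra parameters).
LR <- 2 * ((f1$value + f2$value) - fp$value)
p.value <- pchisq(LR, df = 3, lower.tail = FALSE)
```

A small p-value would indicate that the two cemeteries are better described by separate hazard parameters; with more cemeteries the same idea extends (pooled versus separate, df = 3 per additional site).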
[R] Compiling graphics into the same PDF file
I am new to R and RStudio. I am using a Sweave template to compile a PDF file when I finish the R code. However, it generates a new PDF called Rplot.pdf that saves all graphics separately from the original file. How could I combine the two PDFs into one? Any help will be appreciated.

My work environment is Windows 7 / R 2.14 / RStudio 0.94 / ProTeXt 3.0. Also, I can access a web-based RStudio set up by our department.
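A common fix (a sketch, assuming a standard Sweave setup rather than the poster's exact template): plots produced outside a figure chunk go to R's default graphics device, which is where the separate PDF of plots comes from. Wrapping the plotting code in a chunk with fig=TRUE makes Sweave capture the graphic and include it in the main document, so everything ends up in one PDF:

```latex
\begin{figure}[htbp]
\centering
% fig=TRUE tells Sweave to capture this chunk's plot and include it here
<<myplot, fig=TRUE, echo=FALSE>>=
plot(rnorm(100))  # hypothetical plotting code -- replace with your own
@
\caption{A sample figure produced inside the Sweave document.}
\end{figure}
```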
[R] R: Help; error in optim
Hello,
When I run the code below (Weibull distribution with 30% censoring) using optim, I get an error from R, which states:

Error in optim(start, fn = z, data = q, hessian = T) :
  objective function in optim evaluates to length 25 not 1

Can somebody help me remove this error? Is my censoring approach correct?

n <- 25; rr <- 1000
p <- 1.5; b <- 1.2
for (i in 1:rr){
  q <- c(t, cen)
  t <- rweibull(25, shape = p, scale = b)
  meantrue <- gamma(1 + (1/p))*b
  meantrue
  d <- meantrue/0.30
  cen <- runif(25, min = 0, max = d)
  cen
  s <- ifelse(t <= cen, 1, 0)
  z <- function(data, p){
    beta <- p[1]
    eta <- p[2]
    log1 <- (n*cen*log(p[1]) - n*cen*(p[1])*log(p[2]) + cen*(p[1]-1)*sum(log(t)) - n*sum((t/(p[2]))^(p[1])))
    return(-log1)
  }
  start <- c(0.5, 0.5)
  zz <- optim(start, fn = z, data = q, hessian = T)
  m1 <- zz$par[2]
  p <- zz$par[1]
}
m1
p

Thank you

Chris Guure
Researcher
Institute for Mathematical Research
UPM
Re: [R] getting the value from previous row
Thanks, it helped a lot.

-
Thanks in Advance
Arun
Re: [R] R: Help; error in optim
On 16-04-2012, at 07:04, Christopher Kelvin wrote:

Hello,
When I run the code below (Weibull distribution with 30% censoring) using optim, I get an error from R, which states:

Error in optim(start, fn = z, data = q, hessian = T) :
  objective function in optim evaluates to length 25 not 1

Can somebody help me remove this error? Is my censoring approach correct?

n <- 25; rr <- 1000
p <- 1.5; b <- 1.2
for (i in 1:rr){
  q <- c(t, cen)
  t <- rweibull(25, shape = p, scale = b)
  meantrue <- gamma(1 + (1/p))*b
  d <- meantrue/0.30
  cen <- runif(25, min = 0, max = d)
  s <- ifelse(t <= cen, 1, 0)
  z <- function(data, p){
    beta <- p[1]
    eta <- p[2]
    log1 <- (n*cen*log(p[1]) - n*cen*(p[1])*log(p[2]) + cen*(p[1]-1)*sum(log(t)) - n*sum((t/(p[2]))^(p[1])))
    return(-log1)
  }
  start <- c(0.5, 0.5)
  zz <- optim(start, fn = z, data = q, hessian = T)
  m1 <- zz$par[2]
  p <- zz$par[1]
}
m1
p

The example as given doesn't run. The first assignment after the start of the for loop (q <- ...) gives an error message: object 'cen' not found. That assignment needs to be moved to after the line with ifelse.

In function z, object 'cen' (with length 25) is used in the calculation of log1, which therefore becomes a vector of length 25. You need to review the definition of log1 in function z.

Finally: why are you assigning p[1] and p[2] to beta and eta and not using these variables?

Berend
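Putting Berend's points together: the objective passed to optim must evaluate to a single number, and the censoring should enter through the indicator s rather than the vector of censoring times cen. Below is a hedged sketch of a scalar negative log-likelihood for right-censored Weibull data in its standard textbook form -- not necessarily the likelihood Chris intended, but one that optim will accept:

```r
# Simulate right-censored Weibull data, as in the original post.
set.seed(1)
n <- 25
t   <- rweibull(n, shape = 1.5, scale = 1.2)           # event times
cen <- runif(n, min = 0, max = gamma(1 + 1/1.5) * 1.2 / 0.30)  # censoring times
y <- pmin(t, cen)          # observed time: event or censoring, whichever first
s <- as.numeric(t <= cen)  # 1 = event observed, 0 = censored

# Standard negative log-likelihood for right-censored Weibull data:
# observed events contribute log f(y), censored ones contribute log S(y).
negll <- function(par, y, s) {
  beta <- par[1]; eta <- par[2]
  if (beta <= 0 || eta <= 0) return(Inf)  # keep optim in the valid region
  ll <- sum(s * (log(beta) - beta * log(eta) + (beta - 1) * log(y))) -
        sum((y / eta)^beta)
  -ll                                     # a single number, as optim requires
}

fit <- optim(c(0.5, 0.5), negll, y = y, s = s, hessian = TRUE)
fit$par  # estimates of (shape, scale)
```

The key difference from the posted code is that every term is reduced with sum(), so negll() returns length 1 and the "evaluates to length 25 not 1" error disappears.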