Re: [R] publishing random effects from lme
Hi Dieter,

Yes, I've tried both options. anova(lme(...)) gives me good results for the fixed-effects part, but what I'm specifically interested in is what to do with the random effects. I have tried glmmPQL (generalized linear mixed-effects models), which did in fact greatly help account for heteroscedasticity, but I can't do model simplification with these models (and they're still heavily debated, as I read from previous postings to R-help).

How would you deal with the random-effects part of the models when publishing results from lme?

Thanks for your help!
Christoph

### Here are my original questions once again (with an example below):

(1) What is the total variance of the random effects at each level?
(2) How can I test the significance of the variance components?
(3) Is there something like an R-squared for the whole model which I can state? ## It seems there isn't (as I learned from a previous posting).

The data come from an experiment on plant performance with and without insecticide, with and without grasses present, and across different levels of plant diversity (div).
    lme(asin(sqrt(response)) ~ treatment + logb(div + 1, 2) + grass,
        random = ~ 1 | plotcode/treatment, na.action = na.exclude, method = "ML")

    Linear mixed-effects model fit by maximum likelihood
     Data: NULL
            AIC      BIC  logLik
      -290.4181 -268.719 152.209

    Random effects:
     Formula: ~ 1 | plotcode
            (Intercept)
    StdDev:  0.04176364

     Formula: ~ 1 | treatment %in% plotcode
            (Intercept)   Residual
    StdDev:  0.08660458 0.00833387

    Fixed effects: asin(sqrt(response)) ~ treatment + logb(div + 1, 2) + grass
                          Value  Std.Error DF   t-value p-value
    (Intercept)       0.1858065 0.01858581 81  9.997225  <.0001
    treatment         0.0201384 0.00687832 81  2.927803  0.0044
    logb(div + 1, 2) -0.0203301 0.00690074 79 -2.946073  0.0042
    grass             0.0428934 0.01802506 79  2.379656  0.0197

    Standardized Within-Group Residuals:
           Min          Q1         Med         Q3       Max
    -0.2033155 -0.05739679 -0.00943737 0.04045958 0.3637217

    Number of Observations: 164
    Number of Groups:
    plotcode  ansatz %in% plotcode
          82                   164

Dieter Menne wrote:

>> Suppose I have a linear mixed-effects model (from the package nlme) with nested random effects (see below); how would I present the results from the random effects part in a publication?
>
> Have you tried anova(lme())?
>
> Your asin(sqrt()) looks a bit like these are percentages of counts. The method is still quoted in old books, but has fallen a bit out of favor. Have you thought of some glm model instead (http://www.stats.ox.ac.uk/pub/MASS4/)?
>
> Dieter Menne

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Handling large data sets via scan()
Does it partly solve your problem if you use read.table() instead of scan(), since it imports the data directly into a data.frame? Let me know if it helps.

Nawaaz Ahmed wrote:

> I'm trying to read in datasets with roughly 150,000 rows and 600 features. I wrote a function using scan() to read it in (I have a 4GB linux machine) and it works like a charm. Unfortunately, converting the scanned list into a data frame using as.data.frame() causes the memory usage to explode (it can go from 300MB for the scanned list to 1.4GB for a data.frame of 3 rows) and it fails claiming it cannot allocate memory (though it is still not close to the 3GB limit per process on my linux box - the message is unable to allocate vector of size 522K).
>
> So I have three questions --
>
> 1) Why is it failing even though there seems to be enough memory available?
> 2) Why is converting it into a data.frame causing the memory usage to explode? Am I using as.data.frame() wrongly? Should I be using some other command?
> 3) All the model fitting packages seem to want to use data.frames as their input. If I cannot convert my list into a data.frame what can I do? Is there any way of getting around this?
>
> Much thanks!
> Nawaaz
Re: [R] publishing random effects from lme
If you have heteroscedasticity problems, the nlme package has many variance functions (e.g., varPower, varIdent, etc.) that could help you model it. GLMMs are mainly for discrete and count data that you cannot fit with lme. Testing between competing lme models should be done via likelihood ratio tests (LRTs) and the anova.lme() function. However, take care with the fitting procedure (REML vs. ML), especially if you also change the fixed effects. The latter has been discussed recently on the list.

I hope it helps.

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat
     http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm

----- Original Message -----
From: Christoph Scherber [EMAIL PROTECTED]
To: Dieter Menne [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Friday, February 04, 2005 10:09 AM
Subject: Re: [R] publishing random effects from lme

> How would you deal with the random effects part of the models when publishing results from lme?
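The advice above (variance functions, VarCorr-style variance components, and likelihood ratio tests via anova) can be sketched as follows. This is only an illustration under stated assumptions: it uses the Orthodont data shipped with nlme rather than the plant data from this thread, and the model formula is chosen for the example, not taken from the original analysis.

```r
library(nlme)

# Baseline model: random intercept per subject, fitted by ML so that
# models can also be compared when the fixed effects differ.
m0 <- lme(distance ~ age + Sex, random = ~ 1 | Subject,
          data = Orthodont, method = "ML")

# Same model, but allowing a separate residual variance for each Sex
# (varIdent is one of the nlme variance functions).
m1 <- update(m0, weights = varIdent(form = ~ 1 | Sex))

# Variance components: variance and standard deviation per level.
vc <- VarCorr(m0)
print(vc)

# Likelihood ratio test between the two nested models.
lrt <- anova(m0, m1)
print(lrt)
```

The VarCorr() table is one candidate for what to report about the random-effects part; the anova() table gives the likelihood ratio statistic and p-value for the added variance structure.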
[R] Output from function to a tcltk window
I would like to display output in a tcltk window from, e.g., a call to summary(). I tried to get something other than one-liners into a text window of the kind found at:

http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/TextWindows.html

but without success.

Henrik

-
Henrik Andersson
Netherlands Institute of Ecology - Centre for Estuarine and Marine Ecology
P.O. Box 140
4400 AC Yerseke
Phone: +31 113 577473
[EMAIL PROTECTED]
http://www.nioo.knaw.nl/ppages/handersson
[R] Is anyone using the MiniR distribution?
I currently build two different versions of each Windows binary: the rw.exe full installation program (with the next release looking to be around 25 Megabytes), and a series of 8 diskette-sized files named miniR*.

The miniR files only include a minimal installation of R, and are rarely tested. Rather than building something that may not even work, I'd like to stop building them.

Would this be a problem for anyone?

Duncan Murdoch
RE: [R] Surprising Behavior of 'tapply'
Dear helpers,

thank you very much for your advice. After starting a new R session this morning, I was also unable to replicate the problem, although the old session still showed the same problem. One suggestion was that I might have redefined some functions, but this was not the case. I had only loaded one additional package (Hmisc), but I did so again now and it did not cause any problems.

Another suggestion was to use 'xtabs' as an alternative. This also works nicely, but I ran some timings with my dataset (moderate size of 6MB), and I assume that for really large datasets 'tapply' is probably faster than 'xtabs':

    system.time(tapply(austria$COUNT, list(austria$sescat, austria$STATUS, austria$SEX), sum))
    [1] 0.05 0.00 0.04   NA   NA
    system.time(xtabs(austria$COUNT ~ ., data.frame(ses = austria$sescat, status = austria$STATUS, sex = austria$SEX)))
    [1] 0.86 0.00 0.86   NA   NA

(I ran the timings several times and also used gc().)

Thanks again (in chronological order) to Bert Gunter, Carlos Ortega, James Holtman, and Gabor Grothendieck.

Best,
Roland

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Gabor Grothendieck
Sent: Thursday, February 03, 2005 9:08 PM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] Surprising Behavior of 'tapply'

I tried it on Windows XP with R 2.1.0 and could not replicate it either. I suggest you start up a fresh session and try it again. By the way, you could consider this:

    xtabs(count ~ ., data.frame(sex = sex, income = income))

Carlos Ortega carlos_ortegafernandez at yahoo.es writes:

: Hi,
:
: That is something strange, I could not replicate it...
:
: Regards,
: Carlos.
:
: +++
: version
:          _
: platform i386-pc-mingw32
: arch     i386
: os       mingw32
: system   i386, mingw32
: status
: major    2
: minor    0.1
: year     2004
: month    11
: day      15
: language R
:
: sex <- rep(c("F", "M"), 5)
: income <- c(rep("low", 5), rep("high", 5))
: count <- 1:10
: mydf <- as.data.frame(cbind(sex, income, count))
: mydf$count = as.numeric(as.character(mydf$count))
: tapply(mydf$count, list(mydf$sex, mydf$income), FUN=sum)
:   high low
: F   16   9
: M   24   6
:
: --- Rau, Roland Rau at demogr.mpg.de wrote:
: > Dear all,
: >
: > I wanted to make a two-way table of two variables with a counting
: > variable stored in another column of a data frame. In version 1.9.1,
: > the behavior is as expected, as shown in the simplified example code.
: >
: > sex <- rep(c("F", "M"), 5)
: > income <- c(rep("low", 5), rep("high", 5))
: > count <- 1:10
: > mydf <- as.data.frame(cbind(sex, income, count))
: > mydf$count = as.numeric(as.character(mydf$count))
: > tapply(mydf$count, list(mydf$sex, mydf$income), FUN=sum)
: >   high low
: > F   16   9
: > M   24   6
: > version
: >          _
: > platform i386-pc-mingw32
: > arch     i386
: > os       mingw32
: > system   i386, mingw32
: > status
: > major    1
: > minor    9.1
: > year     2004
: > month    06
: > day      21
: > language R
: >
: > In version 2.0.1, however, I get the following output:
: >
: > sex <- rep(c("F", "M"), 5)
: > income <- c(rep("low", 5), rep("high", 5))
: > count <- 1:10
: > mydf <- as.data.frame(cbind(sex, income, count))
: > mydf$count = as.numeric(as.character(mydf$count))
: > tapply(mydf$count, list(mydf$sex, mydf$income), FUN=sum)
: > Error in get(x, envir, mode, inherits) : variable "FUN" was not found
: > version
: >          _
: > platform i386-pc-mingw32
: > arch     i386
: > os       mingw32
: > system   i386, mingw32
: > status
: > major    2
: > minor    0.1
: > year     2004
: > month    11
: > day      15
: > language R
: >
: > Was this change in behavior intended with the changes in tapply from
: > R 1.9.1 to R 2.0.1?
: > Is the R-help list appropriate, or rather R-devel?
: >
: > Thanks,
: > Roland
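For reference, the example from this thread can be run in a fresh session roughly as follows. This is a sketch rather than the posters' exact code: it builds the data frame with data.frame() directly (which keeps count numeric, so the cbind/as.numeric round trip is not needed) and writes the xtabs formula out explicitly.

```r
# Toy data from the thread: sex alternates, income is low for the first
# five rows and high for the last five, count runs from 1 to 10.
sex    <- rep(c("F", "M"), 5)
income <- c(rep("low", 5), rep("high", 5))
count  <- 1:10
mydf   <- data.frame(sex, income, count)

# Two-way table of summed counts via tapply ...
tab_tapply <- tapply(mydf$count, list(mydf$sex, mydf$income), FUN = sum)

# ... and the same table via xtabs.
tab_xtabs <- xtabs(count ~ sex + income, data = mydf)

print(tab_tapply)
print(tab_xtabs)
```

Both calls should agree with the table shown in the quoted messages (F: 16 high, 9 low; M: 24 high, 6 low).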
Re: [R] Output from function to a tcltk window
Henrik Andersson [EMAIL PROTECTED] writes:

> I would like to display output to a tcltk window from e.g. a call to summary(). I tried to get something else than oneliners into a text window of the kind found at:
>
> http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/TextWindows.html
>
> But without success.

(Rcmdr must be doing this sort of thing already?)

I'd try this:

1) str <- paste(capture.output(summary(myfit)), collapse="\n")

2) clone the tkfaq demo (or one of James W.'s examples), but replace the line

       tkinsert(txt, "end", tkcmd("read", chn))

   with

       tkinsert(txt, "end", str)

--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                     FAX: (+45) 35327907
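A minimal sketch of the two steps suggested above, assuming a model fitted to the built-in cars data stands in for myfit. The helper show_text_window() is a hypothetical name introduced here for illustration; it is not part of tcltk or of the tkfaq demo, and it is only called in an interactive session where a display is available.

```r
# Step 1: capture the printed summary as one string, lines joined by "\n".
fit <- lm(dist ~ speed, data = cars)   # stand-in for 'myfit'
txt <- paste(capture.output(summary(fit)), collapse = "\n")

# Step 2 (hypothetical helper): put the string into a Tk text widget.
show_text_window <- function(text, title = "Output") {
  if (!requireNamespace("tcltk", quietly = TRUE)) stop("tcltk is not available")
  top <- tcltk::tktoplevel()
  tcltk::tkwm.title(top, title)
  widget <- tcltk::tktext(top, width = 80, height = 30, font = "courier")
  tcltk::tkpack(widget)
  tcltk::tkinsert(widget, "end", text)
  tcltk::tkconfigure(widget, state = "disabled")  # make the text read-only
  invisible(top)
}

# Only open a window when running interactively.
if (interactive()) show_text_window(txt, "summary(fit)")
```

A fixed-width font is used so that the columns of the summary table stay aligned.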
[R] Installing R packages in windows
I need to install a selected set of packages on a number of machines (in a computer lab). Some of these machines are not connected to the internet. Is it possible to download all the packages, make a kind of repository on a CD, and then run install.packages from the CD?

Vikas
[R] Rare Cases and SOM
I am trying to understand how the SOM algorithm works, using the SOM function from library(class). I have a 1000*10 matrix and I want to be able to summarize the different types of 10-element vectors. In my real-world case it is likely that most of the 1000 vectors are of one kind and the rest of another (this is an oversimplification). Say, for example:

    InputA <- matrix(cos(1:10), nrow=900, ncol=10, byrow=TRUE)
    InputB <- matrix(sin(5:14), nrow=100, ncol=10, byrow=TRUE)
    Input <- rbind(InputA, InputB)

I thought that a small 3*3 grid would be enough to extract the patterns in such a simple matrix:

    GridWidth <- 3
    GridLength <- 3
    gr <- somgrid(xdim=GridWidth, ydim=GridLength, topo="hexagonal")
    test.som <- SOM(Input, gr)
    par(mfrow=c(GridLength, GridWidth))
    for(i in 1:(GridWidth*GridLength)) plot(test.som$codes[i,], type="l")

Only when I use a larger grid (say, 7*3) do I get some representatives of the sin pattern. This must have something to do with the initialization of the grid: as the sin pattern is so rare, it is unlikely that I get it as a reference vector. Afterwards, because the selection for the training is also random, it is also unlikely that these vectors are picked.

I've been trying to modify some of the other SOM parameters as well, but I would appreciate some input to keep me going until I receive the reference books from my bookstore. Are my suspicions right? Should I be using the SOM for my study, or should I look somewhere else?

NOTE: I have no prior knowledge of whether the datasets I want to analyse will have rare cases or not, or where they will be located.

Thanks,
Manuel
[R] integration function
Dear R users,

I have tried to write a function which gives the step-wise integral for an exponential function (moving from -3 to 3 in steps of 0.1), where the output for every step shall be the integral under the curve of y against x.

However, something seems to be wrong with this function; can anyone please help me?

    x <- seq(-3, 3, 0.1)
    y <- exp(x)
    integral <- function(z, a, b, step){
        for(i in (1:((b-a)/step))){
            c <- 0
            c[i] <- integrate(z, lower=a+(i-1)*step, upper=a+i*step)
            print(c$integral)
        }}
    integral(y, -3, 3, 0.1)

Best regards
Christoph
RE: [R] Output from function to a tcltk window
Dear Peter and Henrik,

What the Rcmdr does may be overkill for Henrik's application, since it also intercepts error and warning messages, and tries to imitate the behaviour of the R console. The relevant functions are in the file Commander.R in the source package; the principal one is:

    doItAndPrint <- function(command, log=TRUE) {
        messages.connection <- textConnection(".messages", open="w")
        sink(messages.connection, type="message")
        output.connection <- textConnection(".Output", open="w")
        sink(output.connection, type="output")
        on.exit({
            sink(type="message")
            if (!.console.output) sink(type="output") # if .console.output, output connection already closed
            close(messages.connection)
            close(output.connection)
            })
        if (log) logger(command)
        result <- try(eval(parse(text=command), envir=.GlobalEnv), silent=TRUE)
        if (class(result)[1] == "try-error"){
            tkmessageBox(message=paste("Error:", strsplit(result, ":")[[1]][2]), icon="error")
            if (.console.output) sink(type="output")
            tkfocus(.commander)
            return()
            }
        if (isS4object(result)) show(result) else print(result)
        if (.Output[length(.Output)] == "NULL") .Output <- .Output[-length(.Output)] # suppress "NULL" line at end of output
        if (length(.Output) != 0) {  # is there output to print?
            if (.console.output) {
                out <- .Output
                sink(type="output")
                for (line in out) cat(paste(line, "\n", sep=""))
                }
            else{
                for (line in .Output) tkinsert(.output, "end", paste(line, "\n", sep=""))
                tkyview.moveto(.output, 1)
                }
            }
        else if (.console.output) sink(type="output")
        checkWarnings(.messages) # errors already intercepted, display any warnings
        result
        }

Regards,
John

John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard
Sent: Friday, February 04, 2005 5:21 AM
To: Henrik Andersson
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Output from function to a tcltk window

> (Rcmdr must be doing this sort of thing already?)
Re: [R] Installing R packages in windows
Vikas Rawal wrote:

> I need to install a selected set of packages on a number of machines (in a computer lab). Some of these machines are not connected to internet. Is it possible to download all the packages and make a kind of repository on a CD, and then install.packages from the CD?

Yes, just download the packages and use install.packages with CRAN=NULL ...

Alternatively, you might want to mount the installed packages from a network volume, adding a second library path for R. That way you only need to install everything once.

Uwe Ligges
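A sketch of the local-install idea above. Note that in current versions of R the corresponding argument of install.packages() is repos = NULL (the CRAN= argument belongs to the R versions of the time). The two helper functions and the drive paths in the comments are hypothetical names chosen for this illustration.

```r
# Hypothetical helper: list package archives found in a directory,
# e.g. the root of a CD holding downloaded .zip or .tar.gz files.
find_package_files <- function(dir) {
  list.files(dir, pattern = "\\.(zip|tar\\.gz|tgz)$", full.names = TRUE)
}

# Hypothetical helper: install every archive found in 'dir' from the
# local files, without contacting any repository.
install_from_dir <- function(dir, lib = .libPaths()[1]) {
  files <- find_package_files(dir)
  if (length(files) > 0) {
    install.packages(files, repos = NULL, lib = lib)
  }
  invisible(files)
}

# Usage on a lab machine (paths are assumptions):
#   install_from_dir("D:/Rpackages")
# Alternative from the reply: add a shared, already-installed library
# on a network volume as a second library path:
#   .libPaths(c(.libPaths(), "N:/R/library"))
```

With the shared-library variant, packages are installed once on the network volume and each machine only needs the extra library path.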
Re: [R] integration function
Christoph Scherber wrote:

> Dear R users,
>
> I have tried to write a function which gives the step-wise integral for an exponential function (moving from -3 to 3 in steps of 0.1), where the output for every step shall be the integral under the curve of y against x.
>
> However, something seems to be wrong with this function; can anyone please help me?
>
> x <- seq(-3, 3, 0.1)
> y <- exp(x)
> integral <- function(z, a, b, step){
>     for(i in (1:((b-a)/step))){
>         c <- 0
>         c[i] <- integrate(z, lower=a+(i-1)*step, upper=a+i*step)

integrate() expects a function, but you specify a vector of values.

Uwe Ligges

>         print(c$integral)
>     }}
> integral(y, -3, 3, 0.1)
>
> Best regards
> Christoph
Re: [R] integration function
The syntax you have used is not correct. integrate() needs a function as its first argument! See ?integrate for more info. A possible solution could be:

    x <- seq(-3, 3, 0.1)
    y <- exp(x)
    ##
    integral <- function(z, a, b, step.){
        cc <- numeric(n <- (b-a)/step.)
        f <- function(x) exp(x)
        for(i in 1:n) cc[i] <- integrate(f, lower=a+(i-1)*step., upper=a+i*step.)$val
        cc
    }
    integral(y, -3, 3, 0.1)

However, since exp has a known integral, you do not need to integrate numerically:

    exp(x[-1]) - exp(x[seq(1, length(x)-1)])

I hope it helps.

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat/
     http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm

----- Original Message -----
From: Christoph Scherber [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Friday, February 04, 2005 1:28 PM
Subject: [R] integration function

> However, something seems to be wrong with this function; can anyone please help me?
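As a cross-check of the two approaches in this reply, here is a small self-contained variant that passes the function itself to integrate() and compares the result with the closed form. The name stepwise_integral() is introduced here for illustration only; round() is an added safeguard against floating-point error in (b - a) / step.

```r
# Integrate f over consecutive intervals of width 'step' between a and b.
stepwise_integral <- function(f, a, b, step) {
  n <- round((b - a) / step)                  # number of intervals
  lower <- a + (seq_len(n) - 1) * step        # left end of each interval
  vapply(lower,
         function(lo) integrate(f, lower = lo, upper = lo + step)$value,
         numeric(1))
}

areas <- stepwise_integral(exp, -3, 3, 0.1)

# Closed form: the integral of exp over [x_i, x_{i+1}] is exp(x_{i+1}) - exp(x_i).
x <- seq(-3, 3, by = 0.1)
closed_form <- diff(exp(x))

print(head(cbind(areas, closed_form)))
```

The two columns should agree to within the numerical tolerance of integrate().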
Re: [R] Handling large data sets via scan()
I can usually read in large tables by very careful use of read.table(), without having to resort to scan(). In particular, using the `colClasses', `nrows', and `comment.char' arguments correctly can greatly reduce memory usage (and increase speed) when reading in data.

Converting from a list to a data frame likely requires at least two copies of the data to be held in memory.

Also, are you using a 64-bit operating system?

-roger

Nawaaz Ahmed wrote:

> I'm trying to read in datasets with roughly 150,000 rows and 600 features. I wrote a function using scan() to read it in (I have a 4GB linux machine) and it works like a charm.

--
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/
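A small sketch of the read.table() arguments mentioned above. The file here is a tiny temporary table written just for the example; for a real 150,000 x 600 file the column classes and row count would have to be supplied from what is known about the data.

```r
# Write a small example table to a temporary file.
tmp <- tempfile(fileext = ".txt")
example_data <- data.frame(id = 1:5, x = c(0.5, 1.5, 2.5, 3.5, 4.5),
                           y = c(10, 20, 30, 40, 50))
write.table(example_data, tmp, row.names = FALSE, quote = FALSE)

# Read it back, telling read.table the type of every column, the number
# of rows to expect, and that no comment character needs to be scanned for.
dat <- read.table(tmp, header = TRUE,
                  colClasses = c("integer", "numeric", "numeric"),
                  nrows = 5,
                  comment.char = "")

str(dat)
```

Supplying colClasses spares read.table from guessing the column types, and nrows lets it allocate storage up front.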
Re: [R] Is anyone using the MiniR distribution?
>>>>> "Duncan" == Duncan Murdoch [EMAIL PROTECTED]
>>>>>     on Fri, 04 Feb 2005 09:50:22 + writes:

    Duncan> I currently build two different versions of each
    Duncan> Windows binary: the rw.exe full installation
    Duncan> program (with the next release looking to be around
    Duncan> 25 Megabytes), and a series of 8 diskette-sized
    Duncan> files named miniR*.

    Duncan> The miniR files only include a minimal installation
    Duncan> of R, and are rarely tested. Rather than building
    Duncan> something that may not even work, I'd like to stop
    Duncan> building them.

    Duncan> Would this be a problem for anyone?

People for whom this is a problem will probably not be subscribed to R-help, because they'll be living in places or situations with bad or expensive internet connections.

(Sorry for not being really constructive here.)

Martin
[R] How to read in .jpeg files
In case others are looking for a simple way to read in .jpeg files as ordinary matrices, here is my solution. I am only interested in greyscale images, so you will have to alter the following if you want colour.

Most .jpegs are colour, so the first step is to open the file with ImageMagick display and save it as greyscale. Then convert (using ImageMagick convert) the .jpg to .pgm (greyscale):

    convert -compress none groundhogbw.jpg groundhogbw.pgm

The "-compress none" is needed so that the file is stored as plain, not raw, .pgm. Using an ordinary editor you will see that the top of the file looks like this:

    P2        # magic number identifies file as plain .pgm
    350 383   # width, height
    255       # max grey level

The rest of the numbers are the grey levels, left to right, top to bottom.

Use scan() to read the file in as a vector, then use matrix() to convert it to a matrix:

    dims <- scan("groundhogbw.pgm", skip=1, nlines=1)   # skip magic number and read dims
    x <- matrix(scan("groundhogbw.pgm", skip=3), ncol=dims[2], nrow=dims[1])
    for(i in 1:dims[1]) x[i,] <- rev(x[i,])             # flip the image vertically
    image(x, col=gray(0:255/255), axes=F)

Bill
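To try the reading steps above without ImageMagick, the following sketch writes a tiny 3 x 2 plain .pgm file by hand and reads it back the same way. The pixel values are made up for the example, and the header is assumed to have exactly three lines with no comment lines, as in the file described above.

```r
# A tiny plain PGM: magic number, "width height", max grey level, pixels.
pgm <- tempfile(fileext = ".pgm")
writeLines(c("P2", "3 2", "255", "0 128 255", "64 32 16"), pgm)

# Read the dimensions (second line) and then the grey levels.
dims <- scan(pgm, skip = 1, nlines = 1, quiet = TRUE)   # width, height
grey <- scan(pgm, skip = 3, quiet = TRUE)

# matrix() fills column by column, so with nrow = width each column of x
# holds one row of the picture.
x <- matrix(grey, nrow = dims[1], ncol = dims[2])

# Flip vertically with the loop used in the message ...
x_loop <- x
for (i in 1:dims[1]) x_loop[i, ] <- rev(x_loop[i, ])

# ... and with a single indexing step that reverses the column order.
x_index <- x[, ncol(x):1]

print(x)
print(x_index)
```

Both ways of flipping reverse the order of the picture's rows, so they give the same matrix.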
Re: [R] Opening for a Statistics Practitioner in San Francisco
On Thu, 3 Feb 2005 16:34:17 -0800, you wrote:

|=[:o)  You will have a degree in statistics, be fluent in R and the Microsoft
|=[:o)  Office suite (especially Excel, Word and PowerPoint). SQL query skills are
|=[:o)  very helpful as is real-world business and marketing experience.
|=[:o)

    fortune(59)

    Let's not kid ourselves: the most widely used piece of software for statistics is Excel.
       -- Brian D. Ripley (`Statistical Methods Need Software: A View of Statistical Computing')
          Opening lecture RSS 2002, Plymouth (September 2002)
[R] sink to file
Hello,

I would like to use source() and write the output to a file. I am using

    outputfile <- file("output.txt", open="wt")
    sink(outputfile, type="output")
    source("input.R", echo=TRUE)

Unfortunately the result includes the echoed commands at the prompt. How can I avoid the echoed commands data(iris), ...?

Thanks

    > data(iris)
    > dataset = iris
    > options(width = 50)
    > summary(dataset)
      Sepal.Length    Sepal.Width     Petal.Length
     Min.   :4.300   Min.   :2.000   Min.   :1.000
     1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600
     Median :5.800   Median :3.000   Median :4.350
     Mean   :5.843   Mean   :3.057   Mean   :3.758
     3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100
     Max.   :7.900   Max.   :4.400   Max.   :6.900
      Petal.Width          Species
     Min.   :0.100   setosa    :50
     1st Qu.:0.300   versicolor:50
     Median :1.300   virginica :50
     Mean   :1.199
     3rd Qu.:1.800
     Max.   :2.500
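One option that may help here, offered as a sketch rather than as a reply from the thread: source() has a print.eval argument, so calling it with echo = FALSE and print.eval = TRUE prints the results without echoing the commands. The script and output files below are temporary files created just for the example.

```r
# A small stand-in for input.R.
script <- tempfile(fileext = ".R")
writeLines(c("data(iris)",
             "dataset <- iris",
             "summary(dataset$Sepal.Length)"), script)

# Send printed output to a file, run the script without echo, then
# restore normal output and close the file.
out_path <- tempfile(fileext = ".txt")
out_con <- file(out_path, open = "wt")
sink(out_con, type = "output")
source(script, echo = FALSE, print.eval = TRUE)
sink(type = "output")
close(out_con)

result_lines <- readLines(out_path)
print(result_lines)
```

The output file should then contain the summary table only, with no lines starting with the "> " prompt.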
[R] how to generate a function from a linear model
Hi All,

I am trying to generate a function from a linear model. I think there should be a built-in function that performs this action, but I've had no luck finding it.

For example, I have a model created using lm():

    model <- lm(sat.d ~ 1 + sat.n + I(sat.n^2))

What I would like to have is a function (similar to the one generated by splinefun()) so that I can use it on different data sets.

Thanks in advance for the help.

Tony Han Bao
[EMAIL PROTECTED]
[R] Bayesian Network
Hello,

I would like to use Bayesian networks with R. I have already installed the package called deal, which unpacked successfully (package 'deal' successfully unpacked and MD5 sums checked). But when I try to call network(df), I get the error message below:

    rats <- network(rats.df)
    Error: couldn't find function "network"

Thank you for your help,
Alice
[R] QCC and PlotMath question
For some reason, using the qcc package, I'm unable to use the plotmath notation in the title. Can anyone see what I'm doing wrong?

    library(qcc)
    a <- rnorm(100)
    qcc(a, type="xbar.one", title=expression(bar(X)), ylab=expression(CFU/ft^3))

This seems not to let the expression be evaluated, so I tried:

    qcc(a, type="xbar.one", title=eval(expression(bar(X))), ylab=expression(CFU/ft^3))

and got the following error:

    Error in eval(expr, envir, enclos) : couldn't find function "bar"

Any thoughts?

-
Policies are many, Principles are few, Policies will change, Principles never do. -John C. Maxwell

Shawn Way, PE
Re: [R] how to generate a function from a linear model
I don't think there's an automatic way to do this, but you might try something like:
    model <- lm(sat.d ~ 1 + sat.n + I(sat.n^2))
    f <- function(x) { predict(model, data.frame(sat.n = x)) }
-roger
Tony Han Bao wrote:
> Hi All, I am trying to generate a function from a linear model. I think there should be a built-in function that performs this action, but I've had no luck finding it. For example, I have a model created using lm():
>     model <- lm(sat.d ~ 1 + sat.n + I(sat.n^2))
> What I would like to have is a function (similar to the one generated by splinefun()) so that I can use it on different data sets. Thanks in advance for the help.
> Tony Han Bao [EMAIL PROTECTED]
--
Roger D. Peng http://www.biostat.jhsph.edu/~rpeng/
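A minimal sketch of the predict()-based wrapper suggested above, turned into a small function factory so it can be reused on different data sets. The data are simulated purely for illustration, and `make_predictor` is a hypothetical helper name, not part of base R.

```r
# Simulated stand-in data (assumption: the real sat.n / sat.d are not shown in the thread).
set.seed(1)
dat <- data.frame(sat.n = seq(0, 10, length.out = 25))
dat$sat.d <- 2 + 0.5 * dat$sat.n - 0.03 * dat$sat.n^2 + rnorm(25, sd = 0.1)

model <- lm(sat.d ~ 1 + sat.n + I(sat.n^2), data = dat)

# Hypothetical helper: returns a function of x that calls predict() on the fit.
make_predictor <- function(fit, var) {
  force(fit)
  force(var)
  function(x) {
    newdata <- data.frame(x)
    names(newdata) <- var          # predict() matches columns by name
    unname(predict(fit, newdata = newdata))
  }
}

f <- make_predictor(model, "sat.n")
f(c(1, 2.5, 7))                    # predictions at new sat.n values
```

Because the returned function captures the fitted model, it keeps working even if `model` is later reassigned in the workspace.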
Re: [R] How to read in .jpeg files
> for(i in 1:dims[1]) x[i,] <- rev(x[i,]) #flip the image vertically
Courtesy of Rolf Turner, here is a much better way to flip vertically:
    x <- x[,ncol(x):1]
Bill
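A small check, on a made-up matrix, that the indexing one-liner gives the same result as the row-by-row loop. Both reverse the column order of the matrix; whether that shows up as a vertical flip on screen depends on how the image matrix is drawn, which the thread does not spell out.

```r
x <- matrix(1:12, nrow = 3)        # toy "image"; values are arbitrary

# Loop version: reverse each row in turn.
flipped_loop <- x
for (i in seq_len(nrow(x))) flipped_loop[i, ] <- rev(x[i, ])

# Vectorised version: reverse the column index once.
flipped_index <- x[, ncol(x):1]

same <- identical(flipped_loop, flipped_index)
same
```

For a matrix with a single row, adding `drop = FALSE` to the indexing keeps the result a matrix rather than a plain vector.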
Re: [R] Bayesian Network
NDIKUMAGENGE Alice wrote:
> Hello, I would like to use Bayesian networks with R. I have already installed the package called deal, which unpacked successfully (package 'deal' successfully unpacked and MD5 sums checked). But when I try to call network(df), I get the error message below:
Quite certainly you forgot to load the package:
    library(deal)
Uwe Ligges
>     > rats <- network(rats.df)
>     Error: couldn't find function "network"
> Thank you for your help. Alice
RE: [R] Bayesian Network
I guess you are using R on Windows and installed the binary version of the package. (Please tell us, as the Posting Guide asks, rather than leave us guessing.) Did you load the package with library(deal) before using the functions?
Andy
From: NDIKUMAGENGE Alice
> Hello, I would like to use Bayesian networks with R. I have already installed the package called deal, which unpacked successfully (package 'deal' successfully unpacked and MD5 sums checked). But when I try to call network(df), I get the error message below:
>     > rats <- network(rats.df)
>     Error: couldn't find function "network"
> Thank you for your help. Alice
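To illustrate the difference between installing and loading a package, here is a hedged sketch. It uses the recommended package splines as a stand-in, on the assumption that deal may not be installed on the reader's machine; the same pattern applies to library(deal) and network().

```r
# 'splines' stands in for 'deal' here (assumption, see note above).
pkg <- "splines"

# Installing a package only puts it on disk; its functions are not
# visible until the package is attached in the current session.
attached_before <- paste0("package:", pkg) %in% search()

library(pkg, character.only = TRUE)

attached_after <- paste0("package:", pkg) %in% search()
found <- exists("bs")              # bs() is exported by splines
```

Installation is done once per library; library() has to be called in every new R session before the package's functions can be used by name.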
RE: [R] how to generate a function from a linear model
predict() can do that for you without giving you the explicit form of the prediction function. I believe Prof. Harrell has facilities in his Design/Hmisc packages for producing functions from fitted models.
Andy
From: Tony Han Bao
> Hi All, I am trying to generate a function from a linear model. I think there should be a built-in function that performs this action, but I've had no luck finding it. For example, I have a model created using lm():
>     model <- lm(sat.d ~ 1 + sat.n + I(sat.n^2))
> What I would like to have is a function (similar to the one generated by splinefun()) so that I can use it on different data sets. Thanks in advance for the help.
> Tony Han Bao [EMAIL PROTECTED]
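If the explicit form is wanted for this particular quadratic model, one option is to build it by hand from coef(). This is only a sketch on simulated data; it is not the Design/Hmisc facility mentioned above, and `f_explicit` is a name chosen for illustration.

```r
# Simulated stand-in data (assumption: the poster's data are not available).
set.seed(2)
dat <- data.frame(sat.n = seq(0, 10, length.out = 30))
dat$sat.d <- 1 + 0.8 * dat$sat.n - 0.05 * dat$sat.n^2 + rnorm(30, sd = 0.1)

model <- lm(sat.d ~ 1 + sat.n + I(sat.n^2), data = dat)

# Coefficients come back in formula order: intercept, sat.n, I(sat.n^2).
b <- unname(coef(model))

# Explicit prediction function; only valid for this model formula.
f_explicit <- function(x) b[1] + b[2] * x + b[3] * x^2

f_explicit(c(1, 4))
```

The hand-written form has to be updated whenever the model formula changes, which is the main argument for wrapping predict() instead.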
Re: [R] Is anyone using the MiniR distribution?
On Fri, 4 Feb 2005 14:46:03 +0100, Martin Maechler [EMAIL PROTECTED] wrote:
> >>>>> "Duncan" == Duncan Murdoch [EMAIL PROTECTED]
> >>>>>     on Fri, 04 Feb 2005 09:50:22 + writes:
>     Duncan> The miniR files only include a minimal installation of R, and are rarely tested. Rather than building something that may not even work, I'd like to stop building them.
>     Duncan> Would this be a problem for anyone?
> People for whom this is a problem will probably not be subscribed to R-help, because they'll be living in places / situations with bad / expensive internet connections. (Sorry to be not really constructive here.)
That's a good point; I'll put a copy of my question in the miniR directory on CRAN. It's not a big problem to build it, but I don't want to test it if it's not being used.
Duncan Murdoch
Re: [R] QCC and PlotMath question
Shawn Way wrote:
> For some reason, using the qcc package, I'm unable to use the plotmath notation in the title. Can anyone see what I'm doing wrong?
>     library(qcc)
>     a <- rnorm(100)
>     qcc(a, type="xbar.one", title=expression(bar(X)), ylab=expression(CFU/ft^3))
> This seems to not let the expression be evaluated, so I tried:
>     qcc(a, type="xbar.one", title=eval(expression(bar(X))), ylab=expression(CFU/ft^3))
> and got the following error:
>     Error in eval(expr, envir, enclos) : couldn't find function "bar"
> Any thoughts?
Add the title by a separate call:
    qcc(a, type="xbar.one", title="", ylab=expression(CFU/ft^3))
    title(expression(bar(X)))
Uwe Ligges
> -
> "Policies are many, Principles are few, Policies will change, Principles never do." -John C. Maxwell
> Shawn Way, PE
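The same "blank title first, then title()" pattern can be tried without qcc. The sketch below uses base graphics only, as an assumption-free stand-in for the qcc() call, and draws to a null PDF device so no file is written.

```r
pdf(NULL)                          # null device: nothing is written to disk
a <- rnorm(100)
main_label <- expression(bar(X))   # plotmath: X with a bar on top

# Draw the plot with an empty main title, then add the plotmath title separately.
plot(a, type = "b", main = "", ylab = expression(CFU/ft^3))
title(main = main_label)

invisible(dev.off())
```

title() passes the expression straight to the graphics engine, so it is rendered as plotmath rather than being evaluated as an R call.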
Re: [R] sink to file
Urs Wagner wrote:
> Hello, I would like to use the source() command and write the output into a file. I am using
>     outputfile=file("output.txt", open="wt")
>     sink(outputfile, type="output")
>     source("input.R", echo=TRUE)
> Unfortunately the result has prompted commands. How can I avoid the prompted commands data(iris), ...?
By *not* specifying echo=TRUE in source, but print()-ing the summary below.
Uwe Ligges
> Thanks
> > data(iris)
> > dataset = iris
> > options(width = 50)
> > summary(dataset)
>   Sepal.Length    Sepal.Width     Petal.Length
>  Min.   :4.300   Min.   :2.000   Min.   :1.000
>  1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600
>  Median :5.800   Median :3.000   Median :4.350
>  Mean   :5.843   Mean   :3.057   Mean   :3.758
>  3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100
>  Max.   :7.900   Max.   :4.400   Max.   :6.900
>   Petal.Width          Species
>  Min.   :0.100   setosa    :50
>  1st Qu.:0.300   versicolor:50
>  Median :1.300   virginica :50
>  Mean   :1.199
>  3rd Qu.:1.800
>  Max.   :2.500
RE: [R] graphics examples
Very nice! I can't wait to buy the book. I have some plots I am working on that are surprisingly difficult to do: http://www.oplnk.net/~ajackson/weather/Temperature_2000.png and others in that directory, for example. The challenge was coloring in the polygons, which were in some cases defined by the intersection of four curves, and which also required interpolating the bounding curves to those intersection points. I'll post the code on the website tonight.
Alan Jackson
Staff Geophysicist
Shell International Exploration and Production Inc.
3737 Bellaire Blvd, P O Box 481, Houston, Texas 77001-0481, USA
Tel: +0117132457355
Other Tel: +011-713-245-7355
Email: [EMAIL PROTECTED]
Internet: http://www.shell.com/eandp-en
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Thursday, February 03, 2005 8:35 PM
To: r-help@stat.math.ethz.ch
Subject: [R] graphics examples
Hi
I have put up some web pages containing a number of plots (and diagrams) produced using R (they correspond to the figures for a book that I am working on about R graphics), with the relevant R code provided for each plot (or diagram), at http://www.stat.auckland.ac.nz/~paul/RGraphics/rgraphics.html
Hope these are of some help/use; comments/suggestions welcome.
Paul
--
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/
[R] 2 small problems: integer division and the nature of NA
Hi,
I'm wondering why 48 %/% 2 gives 24 but 4.8 %/% 0.2 gives 23... I'm not trying to round up here, but to find out how many times something fits into something else, and the answer should have been the same for both examples, no?
On a different topic, I like the behavior of NAs better in R than in SAS (at least they are not considered the smallest value for a variable), but at the same time I am surprised that the sum of NAs is 0 instead of NA. The sum of a vector having at least one NA but also valid data gives NA if we do not specify na.rm=T. But with na.rm=T, we are telling sum to give the sum of valid data, ignoring NAs that do not tell us anything about the value of a variable. I found out while getting the sum of small subsets of my data (such as when subsetting by several variables) that sometimes a cell only contained NAs for my response variable. I would have expected the sum to be NA in such cases, as I do not have a single data point telling me the value of my response here. But R tells me the sum was zero in that cell! Was this behavior considered desirable when sum was built? If not, any hope it will be fixed?
Sincerely,
Denis Chabot
[R] Compilation of R (linux) package on windows
Hello,
I am developing an R package with C subroutines on a Linux machine. The C programs compile fine on Linux, so I have some .so files. Now I want to do the same work on Windows, so I installed R (the latest version) on Windows, with ActivePerl and djgpp, which is, as far as I know, the gcc version for Windows (to compile C programs). Unfortunately, when I run R CMD SHLIB inv.c, I get an error. I think the problem is my choice of C compiler; could somebody give me the name of a good compiler for this?
--
Alexandre DEPIRE
INRETS / GARIG
[R] Keeping the data of C structure in R variables?..
Dear all,
does anybody know if there is a way to implement the following idea? Suppose I have a C/C++ structure of the form:
    typedef struct { int size; char *data; } SData;
In C code I could write an implementation that creates this structure through a pointer and fills in the data, so I would have a variable such as
    SData *myData;
Now what I need is to pass this data to a certain SEXP structure and keep it completely in R, thus setting myData = NULL and _unloading the C library_; then later, in another C call, I want to create another variable, SData *myOldData, and reload it with the values from R. Is there a way to do that, keeping in mind that char *data is generally binary data?
I would be grateful for any suggestions.
Regards
Oleg
Re: [R] Compilation of R (linux) package on windows
Depire Alexandre wrote:
> Hello, I am developing an R package with C subroutines on a Linux machine. The C programs compile fine on Linux, so I have some .so files. Now I want to do the same work on Windows, so I installed R (the latest version) on Windows, with ActivePerl and djgpp, which is, as far as I know, the gcc version for Windows (to compile C programs). Unfortunately, when I run R CMD SHLIB inv.c, I get an error. I think the problem is my choice of C compiler; could somebody give me the name of a good compiler for this?
Please read the R for Windows FAQ 3.1 "Can I install packages into libraries in this version?". It points you to README.packages, http://www.murdoch-sutherland.com/Rtools/ , and tells you "Note that this is rather tricky; please do ensure that you have followed the instructions exactly."
Uwe Ligges
Re: [R] 2 small problems: integer division and the nature of NA
Denis Chabot wrote:
> Hi, I'm wondering why 48 %/% 2 gives 24 but 4.8 %/% 0.2 gives 23... I'm not trying to round up here, but to find out how many times something fits into something else, and the answer should have been the same for both examples, no?
No. Not from the perspective of a digital computer, which cannot represent all real numbers exactly (only a very small subset, since we are using floating-point arithmetic) ...
> On a different topic, I like the behavior of NAs better in R than in SAS (at least they are not considered the smallest value for a variable), but at the same time I am surprised that the sum of NAs is 0 instead of NA.
It *is* NA:
    sum(c(NA, NA))
    # [1] NA
    sum(c(NA, 1))
    # [1] NA
> The sum of a vector having at least one NA but also valid data gives NA if we do not specify na.rm=T. But with na.rm=T, we are telling sum to give the sum of valid data, ignoring NAs that do not tell us anything about the value of a variable. I found out while getting the sum of small subsets of my data (such as when subsetting by several variables) that sometimes a cell only contained NAs for my response variable. I would have expected the sum to be NA in such cases, as I do not have a single data point telling me the value of my response here. But R tells me the sum was zero in that cell! Was this behavior considered desirable when sum was built? If not, any hope it will be fixed?
I don't get your point! If you *remove* NAs, as in
    sum(c(NA, NA), na.rm=TRUE)
    # [1] 0
    sum(c(NA, 1), na.rm=TRUE)
    # [1] 1
there is not much left to sum up, so what do you expect in the cases above? Please read the docs on NA handling.
Uwe Ligges
> Sincerely, Denis Chabot
Re: [R] 2 small problems: integer division and the nature of NA
It's the difference between integers and reals: 48 and 24 are integers; 4.8 and 0.2 are floating point numbers. Consider:
    > (4.8+.Machine$double.eps) %/% (0.2-.Machine$double.eps)
    [1] 24
    > (4.8-.Machine$double.eps) %/% (0.2+.Machine$double.eps)
    [1] 23
Does this help?
spencer graves
Denis Chabot wrote:
> Hi, I'm wondering why 48 %/% 2 gives 24 but 4.8 %/% 0.2 gives 23... I'm not trying to round up here, but to find out how many times something fits into something else, and the answer should have been the same for both examples, no?
> On a different topic, I like the behavior of NAs better in R than in SAS (at least they are not considered the smallest value for a variable), but at the same time I am surprised that the sum of NAs is 0 instead of NA. The sum of a vector having at least one NA but also valid data gives NA if we do not specify na.rm=T. But with na.rm=T, we are telling sum to give the sum of valid data, ignoring NAs that do not tell us anything about the value of a variable. I found out while getting the sum of small subsets of my data (such as when subsetting by several variables) that sometimes a cell only contained NAs for my response variable. I would have expected the sum to be NA in such cases, as I do not have a single data point telling me the value of my response here. But R tells me the sum was zero in that cell! Was this behavior considered desirable when sum was built? If not, any hope it will be fixed?
> Sincerely, Denis Chabot
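For the "how many times does y fit into x" use case, one possible workaround is to allow a small tolerance before taking the floor. This is a sketch only: `fits` is a hypothetical helper, and the tolerance of 1e-9 is an assumption that suits values of moderate size, not a general rule.

```r
# Hypothetical helper: integer quotient that forgives tiny representation errors.
# Assumes x and y are non-negative and the true quotient is not within
# 'tol' below an integer for legitimate reasons.
fits <- function(x, y, tol = 1e-9) {
  floor(x / y + tol)
}

fits(48, 2)      # exactly representable inputs
fits(4.8, 0.2)   # tenths are not exact in binary, the tolerance absorbs that
fits(5, 2)       # an ordinary non-integer quotient is still floored

# Alternative: rescale to whole numbers first and use integer arithmetic.
48L %/% 2L
```

Rescaling to integers is the more robust route when the data really are multiples of a known unit such as 0.1.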
Re: [R] 2 small problems: integer division and the nature of NA
Denis Chabot [EMAIL PROTECTED] writes:
> Hi, I'm wondering why 48 %/% 2 gives 24 but 4.8 %/% 0.2 gives 23... I'm not trying to round up here, but to find out how many times something fits into something else, and the answer should have been the same for both examples, no?
Well, you can't trust floating point numbers to give you an exact result:
    > 4.8 / 0.2 - 24
    [1] -3.552714e-15
and even
    > (48/10) / (2/10) - 24
    [1] -3.552714e-15
the basic issue being that tenths are not exactly representable in binary floating point. I think very few people even expected you to use integer division on non-integers, but I note that the claim on the help page actually holds:
    > 0.2 * 4.8 %/% 0.2 + 4.8 %% 0.2 == 4.8
    [1] TRUE
> On a different topic, I like the behavior of NAs better in R than in SAS (at least they are not considered the smallest value for a variable), but at the same time I am surprised that the sum of NAs is 0 instead of NA. The sum of a vector having at least one NA but also valid data gives NA if we do not specify na.rm=T. But with na.rm=T, we are telling sum to give the sum of valid data, ignoring NAs that do not tell us anything about the value of a variable. I found out while getting the sum of small subsets of my data (such as when subsetting by several variables) that sometimes a cell only contained NAs for my response variable. I would have expected the sum to be NA in such cases, as I do not have a single data point telling me the value of my response here. But R tells me the sum was zero in that cell! Was this behavior considered desirable when sum was built? If not, any hope it will be fixed?
Yes it was, and no there isn't. In math, the sum over an empty index set is zero, which has some nice consistency properties (the sum over a disjoint union of sets is the sum of the sums over each set, for instance).
--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                  FAX: (+45) 35327907
RE: [R] Keeping the data of C structure in R variables?..
I think you should have a look at external pointers (type EXTPTRSXP). They are used in the R source; see, for example, memory.c. Also see the developer page notes on weak references, finalizers, etc., which you'll need to be familiar with.
This is really an R-devel question!
Reid Huntsinger
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Oleg Sklyar
Sent: Friday, February 04, 2005 11:11 AM
To: R-help@stat.math.ethz.ch
Subject: [R] Keeping the data of C structure in R variables?..
Dear all, does anybody know if there is a way to implement the following idea: if for example I have a C/C++ structure of form: struct { int size; char * data; } SData; in C code I could create some implementation that would create this structure by pointer and fill in the data, so I would have a variable something like SData* myData; Now what I need is to pass this data to a certain SEXP structure and keep it completely in R, thus setting myData = NULL and _unloading the C library_; then later I want to create another variable, in another C call, SData* myOldData and reload it with values from R. Is there a way to do that, keeping also in mind that char* data is generally binary data. Would be grateful for any suggestions.
Regards
Oleg
RE: [R] 2 small problems: integer division and the nature of NA
It's convention in mathematics that the empty sum is 0. You can think of this as a generalization of 0*x = 0.
Reid Huntsinger
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Denis Chabot
Sent: Friday, February 04, 2005 11:01 AM
To: r-help@stat.math.ethz.ch
Subject: [R] 2 small problems: integer division and the nature of NA
Hi, I'm wondering why 48 %/% 2 gives 24 but 4.8 %/% 0.2 gives 23... I'm not trying to round up here, but to find out how many times something fits into something else, and the answer should have been the same for both examples, no?
On a different topic, I like the behavior of NAs better in R than in SAS (at least they are not considered the smallest value for a variable), but at the same time I am surprised that the sum of NAs is 0 instead of NA. The sum of a vector having at least one NA but also valid data gives NA if we do not specify na.rm=T. But with na.rm=T, we are telling sum to give the sum of valid data, ignoring NAs that do not tell us anything about the value of a variable. I found out while getting the sum of small subsets of my data (such as when subsetting by several variables) that sometimes a cell only contained NAs for my response variable. I would have expected the sum to be NA in such cases, as I do not have a single data point telling me the value of my response here. But R tells me the sum was zero in that cell! Was this behavior considered desirable when sum was built? If not, any hope it will be fixed?
Sincerely, Denis Chabot
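If a cell with no valid data should report NA rather than the empty sum, that rule has to be stated explicitly. A minimal sketch follows; `sum_or_na` is a hypothetical helper, not a base R function.

```r
# Hypothetical helper: like sum(x, na.rm = TRUE), but returns NA when there is
# not a single non-missing value to add up. Note that a zero-length vector
# also counts as "no data" here and yields NA.
sum_or_na <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

empty_sum  <- sum(c(NA, NA), na.rm = TRUE)   # base R: the empty sum
all_na     <- sum_or_na(c(NA, NA))
some_valid <- sum_or_na(c(NA, 1, 2))
```

The helper can be passed to tapply() or aggregate() in place of sum when summarising by several grouping variables.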
[R] genetic algorithm
Hi,
I am doing some research on feature selection for classification problems using a genetic algorithm in a wrapper approach. I am wondering if there is a package already built for this purpose. I was previously pointed to the dprep package, but I don't think it uses a GA (if I am wrong, please correct me!).
Thanks,
Ed
Re: [R] Compilation of R (linux) package on windows
On Fri, 4 Feb 2005, Uwe Ligges wrote:
> Depire Alexandre wrote:
>> Hello, I am developing an R package with C subroutines on a Linux machine. The C programs compile fine on Linux, so I have some .so files. Now I want to do the same work on Windows, so I installed R (the latest version) on Windows, with ActivePerl and djgpp, which is, as far as I know, the gcc version for Windows (to compile C programs). Unfortunately, when I run R CMD SHLIB inv.c, I get an error. I think the problem is my choice of C compiler; could somebody give me the name of a good compiler for this?
> Please read the R for Windows FAQ 3.1 "Can I install packages into libraries in this version?". It points you to README.packages, http://www.murdoch-sutherland.com/Rtools/ , and tells you "Note that this is rather tricky; please do ensure that you have followed the instructions exactly."
To reinforce that, djgpp is a DOS (extender) and not a Windows compiler. You need a native Windows compiler, from www.mingw.org, and currently we suggest the release candidate of MinGW-3.2.0 (which postdates the details in the last release of R, 2.0.1).
--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Re: [R] Compilation of R (linux) package on windows
On Windows, I installed the latest version of MinGW and changed the PATH environment variable, but when I try to run R CMD SHLIB inv.c in a command window, I get the following error: 'make' is unknown. I have mingw32-make.exe, but R doesn't use it. I don't think that is normal, but I don't know how to change the name of the make program that R uses.
On Friday 4 February 2005 18:37, Prof Brian Ripley wrote:
> On Fri, 4 Feb 2005, Uwe Ligges wrote:
>> Depire Alexandre wrote:
>>> Hello, I am developing an R package with C subroutines on a Linux machine. The C programs compile fine on Linux, so I have some .so files. Now I want to do the same work on Windows, so I installed R (the latest version) on Windows, with ActivePerl and djgpp, which is, as far as I know, the gcc version for Windows (to compile C programs). Unfortunately, when I run R CMD SHLIB inv.c, I get an error. I think the problem is my choice of C compiler; could somebody give me the name of a good compiler for this?
>> Please read the R for Windows FAQ 3.1 "Can I install packages into libraries in this version?". It points you to README.packages, http://www.murdoch-sutherland.com/Rtools/ , and tells you "Note that this is rather tricky; please do ensure that you have followed the instructions exactly."
> To reinforce that, djgpp is a DOS (extender) and not a Windows compiler. You need a native Windows compiler, from www.mingw.org, and currently we suggest the release candidate of MinGW-3.2.0 (which postdates the details in the last release of R, 2.0.1).
--
Alexandre DEPIRE
INRETS / GARIG
Re: [R] Compilation of R (linux) package on windows
Is it easier to build the .dll on Linux, via a cross-compiler?
On Friday 4 February 2005 18:37, Prof Brian Ripley wrote:
> On Fri, 4 Feb 2005, Uwe Ligges wrote:
>> Depire Alexandre wrote:
>>> Hello, I am developing an R package with C subroutines on a Linux machine. The C programs compile fine on Linux, so I have some .so files. Now I want to do the same work on Windows, so I installed R (the latest version) on Windows, with ActivePerl and djgpp, which is, as far as I know, the gcc version for Windows (to compile C programs). Unfortunately, when I run R CMD SHLIB inv.c, I get an error. I think the problem is my choice of C compiler; could somebody give me the name of a good compiler for this?
>> Please read the R for Windows FAQ 3.1 "Can I install packages into libraries in this version?". It points you to README.packages, http://www.murdoch-sutherland.com/Rtools/ , and tells you "Note that this is rather tricky; please do ensure that you have followed the instructions exactly."
> To reinforce that, djgpp is a DOS (extender) and not a Windows compiler. You need a native Windows compiler, from www.mingw.org, and currently we suggest the release candidate of MinGW-3.2.0 (which postdates the details in the last release of R, 2.0.1).
--
Alexandre DEPIRE
INRETS / GARIG
RE: [R] Compilation of R (linux) package on windows
From: Depire Alexandre
> On Windows, I installed the latest version of MinGW and changed the PATH environment variable, but when I try to run R CMD SHLIB inv.c in a command window, I get the following error: 'make' is unknown. I have mingw32-make.exe, but R doesn't use it. I don't think that is normal, but I don't know how to change the name of the make program that R uses.
Confusion is the price you pay for not following the directions given. You need to download and install the tools in Rtools.zip as mentioned in README.packages, and have them in the appropriate position in the PATH, as also mentioned in README.packages.
Andy
> On Friday 4 February 2005 18:37, Prof Brian Ripley wrote:
>> On Fri, 4 Feb 2005, Uwe Ligges wrote:
>>> Depire Alexandre wrote:
>>>> Hello, I am developing an R package with C subroutines on a Linux machine. The C programs compile fine on Linux, so I have some .so files. Now I want to do the same work on Windows, so I installed R (the latest version) on Windows, with ActivePerl and djgpp, which is, as far as I know, the gcc version for Windows (to compile C programs). Unfortunately, when I run R CMD SHLIB inv.c, I get an error. I think the problem is my choice of C compiler; could somebody give me the name of a good compiler for this?
>>> Please read the R for Windows FAQ 3.1 "Can I install packages into libraries in this version?". It points you to README.packages, http://www.murdoch-sutherland.com/Rtools/ , and tells you "Note that this is rather tricky; please do ensure that you have followed the instructions exactly."
>> To reinforce that, djgpp is a DOS (extender) and not a Windows compiler. You need a native Windows compiler, from www.mingw.org, and currently we suggest the release candidate of MinGW-3.2.0 (which postdates the details in the last release of R, 2.0.1).
> --
> Alexandre DEPIRE
> INRETS / GARIG
Re: [R] sink to file
On Fri, 4 Feb 2005, Uwe Ligges wrote:
> Urs Wagner wrote:
>> Hello, I would like to use the source() command and write the output into a file. I am using
>>     outputfile=file("output.txt", open="wt")
>>     sink(outputfile, type="output")
>>     source("input.R", echo=TRUE)
>> Unfortunately the result has prompted commands. How can I avoid the prompted commands data(iris), ...?
> By *not* specifying echo=TRUE in source, but print()-ing the summary below.
There is also a print.eval= argument to source(), so that printing of the output can be controlled independently of echoing the input.
-thomas
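A minimal sketch of the print.eval= suggestion above: results are written to the sink file while the commands themselves are not echoed. The script contents and the temporary file names are made up for the example.

```r
# Write a small stand-in for input.R to a temporary file.
script <- tempfile(fileext = ".R")
writeLines(c("data(iris)",
             "summary(iris$Sepal.Length)"), script)

out <- tempfile(fileext = ".txt")
con <- file(out, open = "wt")
sink(con, type = "output")

# echo = FALSE suppresses the prompted commands; print.eval = TRUE still
# prints the value of each visible top-level expression.
source(script, echo = FALSE, print.eval = TRUE)

sink()                             # restore normal output
close(con)

captured <- readLines(out)         # contains the summary, not the commands
```

Calling sink() with no arguments before closing the connection matters; otherwise later output keeps going to a closed file.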
RE: [R] genetic algorithm
To my knowledge, two packages have an implementation of an evolutionary, or genetic, algorithm: gafit is a curve-fitting package, and rgenoud is for function minimisation (combined with, iirc, a derivative-based quasi-Newton approach for unconstrained problems). One other thing: in the S-Plus robust library, the robust regression function lmRob has an option to use a genetic algorithm in the resampling scheme to obtain initial S-estimates.
Regards
Mike
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of WeiWei Shi
Sent: 04 February 2005 17:01
To: R-help@stat.math.ethz.ch
Subject: [R] genetic algorithm
Hi, I am doing some research on feature selection for classification problems using a genetic algorithm in a wrapper approach. I am wondering if there is a package already built for this purpose. I was previously pointed to the dprep package, but I don't think it uses a GA (if I am wrong, please correct me!).
Thanks, Ed
Re: [R] Compilation of R (linux) package on windows
Depire Alexandre wrote:

> Is it easier to build the .dll on Linux, via a cross-compiler?

The Windows way described in README.packages is not hard; you just have to follow the advice. Once your system has been set up, it's the same as native compiling on Linux.

Uwe Ligges

> On Friday 4 February 2005 at 18:37, Prof Brian Ripley wrote:
>
>> On Fri, 4 Feb 2005, Uwe Ligges wrote:
>>
>>> Depire Alexandre wrote:
>>>
>>>> Hello, I am developing an R package with C subroutines on a Linux machine. The C programs compile well on the Linux machine, so I have some .so files. Now I want to do the same work on Windows, so I installed R (the latest version) on Windows, with Active Perl and djgpp, which is, as far as I know, the gcc version for Windows (to compile C programs). Unfortunately, when I run R CMD SHLIB inv.c, I get an error. I think it's a problem with my choice of C compiler. Could somebody give me the name of a good compiler for this?
>>>
>>> Please read the R for Windows FAQ 3.1 "Can I install packages into libraries in this version?". It points you to README.packages, http://www.murdoch-sutherland.com/Rtools/ , and tells you "Note that this is rather tricky; please do ensure that you have followed the instructions exactly."
>>
>> To reinforce that, djgpp is a DOS (extender) and not a Windows compiler. You need a native Windows compiler, from www.mingw.org, and currently we suggest the release candidate of MinGW-3.2.0 (which postdates the details in the last release of R, 2.0.1).
Re: [R] Compilation of R (linux) package on windows
Depire Alexandre wrote:

> On Windows, I installed the latest version of MinGW and changed the PATH environment variable, but when I try to run R CMD SHLIB inv.c in a command window I get the following error: 'make' is unknown. I have mingw32-make.exe, but R doesn't use it. I don't think that is normal, but I don't know how to change its name in R.

I already quoted: "please do ensure that you have followed the instructions exactly." But you haven't. At the least, you have either not downloaded the tools from http://www.murdoch-sutherland.com/Rtools/ or not put them in your path.

Uwe Ligges
[R] Building a Matrix
Dear List:

I am having some difficulty constructing a matrix that must take a specific form. The matrix must have a lower block of non-zero values and the rest must all be zero. For example, if I am building an n x n matrix, then the first n/2 rows need to be zero and the first n/2 columns must remain as zero, with all other elements having a non-zero value that I specify.

For example, assume I start with the following 4 x 4 matrix:

    vl.mat <- matrix(0,4,4)

         [,1] [,2] [,3] [,4]
    [1,]    0    0    0    0
    [2,]    0    0    0    0
    [3,]    0    0    0    0
    [4,]    0    0    0    0

I need for the first two columns (4/2) to remain as zero and the first two rows (4/2) to remain as zeros. But I need for the bottom block to include some values that I specify.

         [,1] [,2] [,3] [,4]
    [1,]    0    0    0    0
    [2,]    0    0    0    0
    [3,]    0    0  100  100
    [4,]    0    0  100  100

I know that if I use the following I can build a matrix with values along the diagonals where I need them, but I need for the lower block of off-diagonals to also be the same value.

    vl.mat <- matrix(0,4,4)
    vl.mat[(col(vl.mat)%%4 == row(vl.mat)%%4) & col(vl.mat)%%4 != 1 & col(vl.mat)%%4 != 2] <- 100

Can anyone offer a suggestion? Thank you.

-Harold
RE: [R] Building a Matrix
Is this what you want?

    mat <- matrix(0, 4, 4)
    mat[col(mat) > 2 & row(mat) > 2] <- 100
    mat

         [,1] [,2] [,3] [,4]
    [1,]    0    0    0    0
    [2,]    0    0    0    0
    [3,]    0    0  100  100
    [4,]    0    0  100  100

Andy

> From: Doran, Harold
>
> Dear List: I am having some difficulty constructing a matrix that must take a specific form. The matrix must have a lower block of non-zero values and the rest must all be zero. [...]
[R] (no subject)
Hi.

I have a problem that I can't seem to find an optimal way of solving other than by doing things manually. I'm trying to subset a data frame by the number of observations that occurred at a given row, but want to take into account the number of observations of preceding rows. Here's an example.

I'm looking at intervals of data [10,20), [10,30), ..., [10,120) which contain a certain number of observations for treatment A and treatment B. An example is given by the following code.

    int <- as.factor(paste("[", rep(10, 11), ",", seq(20,120, by=10), ")"))
    nsamA <- c(62, 83, 118, 151, 180, 201, 212, 215, 216, 217, 218)
    nsamB <- c(65, 90, 128, 163, 190, 199, 209, 214, 215, 216, 218)
    df0 <- data.frame(int, nsamA, nsamB)
    df0

Since the interval [10, s) with n_s samples is nested in [10, t) with n_t samples for s < t, we know n_t - n_s samples exist in the interval [s, t). If this difference in sample size is small I want to exclude the interval [10,s). This can be done by comparing adjacent preceding rows using the following.

    df0$itagA <- ifelse(c(10, diff(nsamA)) <= 4, 1, 0)
    df0$itagB <- ifelse(c(10, diff(nsamB)) <= 4, 1, 0)
    df0
    # Subset df0 on the tag results
    df1 <- df0[df0$itagA != 1 & df0$itagB != 1,]
    df1

This works fine, but here is my problem. This looks only at the immediately preceding row and not at rows further down the line. What I would like to do is include the next interval that includes 5 or more samples from each group, since earlier intervals are nested in the later intervals. In the example given this would include the final interval [10, 120), as this contains more than 4 samples for each treatment. I can do this by hand using something like

    df0[c(1:7,11),]

But this is not an attractive solution as it requires me to actually look at the data set each time and determine the row numbers. This works for this case, but I have many intervals (rows of data) to look at and this would be cumbersome.

I've considered using diff with different lag arguments, but this still doesn't seem to work. I also want to note that I need to keep the int factor (as used in the example above) as this is used throughout my analysis (i.e. this is a true factor variable and not simply denoting an interval).

I'd be grateful for any possible suggestions as I'm stumped at this moment.

Thanks,
Mat

R v. 2.0.1 on Windows XP

Disclaimer: The views and opinions expressed in this email are of the author and not of the Food and Drug Administration.

Mat Soukup, Ph.D.
Mathematical Statistician, Biometrics III
Center for Drug Evaluation and Research
9201 Corporate Blvd. Rm. N250
Phone: 301.827.2081
Re: [R] Building a Matrix
Does the following do what you want:

    vl.mat <- matrix(0,4,4)
    i34 <- 3:4
    vl.mat[i34,i34] <- 100
    vl.mat

         [,1] [,2] [,3] [,4]
    [1,]    0    0    0    0
    [2,]    0    0    0    0
    [3,]    0    0  100  100
    [4,]    0    0  100  100

spencer graves

> Doran, Harold wrote:
>
> Dear List: I am having some difficulty constructing a matrix that must take a specific form. The matrix must have a lower block of non-zero values and the rest must all be zero. [...]
Re: [R] Building a Matrix
Unless I'm missing something, all you need to do is create the large matrix and then replace the submatrix via subscripting the rows and columns:

    ans <- matrix(0, n, n)
    sub.seq <- floor(n/2 + 1):n
    ans[sub.seq, sub.seq] <- submatrix

Patrick Burns

Burns Statistics
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")

> Doran, Harold wrote:
>
> Dear List: I am having some difficulty constructing a matrix that must take a specific form. The matrix must have a lower block of non-zero values and the rest must all be zero. [...]
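The subscripting approach above can be wrapped into a small function so that it works for any n. This is a sketch only: lower_block is a made-up helper name, and the second call uses an arbitrary 3 x 3 block purely for illustration.

```r
# Build an n x n zero matrix whose lower-right block holds `block`.
# `lower_block` is a hypothetical helper name, not taken from any package.
lower_block <- function(n, block) {
  ans <- matrix(0, n, n)
  sub.seq <- floor(n / 2 + 1):n      # rows and columns of the lower-right block
  ans[sub.seq, sub.seq] <- block     # a scalar is recycled over the whole block
  ans
}

m4 <- lower_block(4, 100)                  # the 4 x 4 case from the question
m6 <- lower_block(6, matrix(1:9, 3, 3))    # a 6 x 6 case with a 3 x 3 block
```

Passing a single number fills the block with that value; passing a matrix requires it to have the same dimensions as the block.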
[R] simple example of C interface to R
i'd like to use the C interface to R in a program i'm writing. as a starting point, i'm trying to create a very simple C program that uses R. i've read the R documentation on this, but i'm having trouble figuring out where SEXP is defined and how to use it. i noticed someone else on this list also tried to use the C interface, but they ran into similar problems: http://maths.newcastle.edu.au/~rking/R/help/03b/1942.html could someone show me a simple example of how to use the R interface to C? thank you, jason dunsmore
[R] proportional chance criteria
Is there an R function that I can use to calculate the p-value from the Z statistic computed for the relationship between chance and observed proportions in predictions? More specifically, I am referring to the proportional chance criterion (Cpro). Details are in Huberty's book on applied discriminant analysis; unfortunately our library has misplaced the book. I got some details from the following page, http://marketing.byu.edu/htmlpages/tutorials/discriminant.htm , but I still need to compute the p-value from their Z statistic.

Thanks
../Murli
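For the last step of the question, converting a Z statistic into a p-value, base R's pnorm() is enough. The sketch below assumes the Z statistic has already been computed by whatever formula the tutorial gives and that it is approximately standard normal under the null hypothesis; z_to_p is a made-up helper name and the value 2.5 is purely illustrative.

```r
# Convert a Z statistic into a p-value using the standard normal distribution.
# `z_to_p` is a hypothetical helper, not part of any package.
z_to_p <- function(z, two.sided = FALSE) {
  if (two.sided) {
    2 * pnorm(abs(z), lower.tail = FALSE)   # both tails
  } else {
    pnorm(z, lower.tail = FALSE)            # upper tail: observed better than chance
  }
}

z <- 2.5                                    # illustrative value only
p_one <- z_to_p(z)
p_two <- z_to_p(z, two.sided = TRUE)
```

Whether a one-sided or two-sided p-value is appropriate depends on the hypothesis being tested; the one-sided version matches the question "is classification better than chance?".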
Re: [R] 2 small problems: integer division and the nature of NA
Denis Chabot <chabotd at globetrotter.net> writes:

: The sum of a vector having at least one NA but also valid data gives NA
: if we do not specify na.rm=T. But with na.rm=T, we are telling sum to
: give the sum of valid data, ignoring NAs that do not tell us anything
: about the value of a variable. I found out while getting the sum of
: small subsets of my data (such as when subsetting by several
: variables), sometimes a cell only contained NAs for my response
: variable. I would have expected the sum to be NA in such cases, as I do
: not have a single data point telling me the value of my response here.
: But R tells me the sum was zero in that cell! Was this behavior
: considered desirable when sum was built? If not, any hope it will be
: fixed?

Think of it this way: if u and v are index vectors then it's desirable that

    sum(x[u]) + sum(x[v]) == sum(x[c(u,v)])

hold for zero-length index vectors too, in which case sum(numeric()) should be zero, not NA.

If you want a short expression that gives NA for zero-length x try this:

    sum(x) + if (length(x)) 0 else NA

or define your own function, sum0, say.
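One possible way to write the sum0 function mentioned above is sketched here. This is an assumption about what such a function might look like, not code from the thread: it returns NA both for a zero-length vector and for a vector that contains only NAs, which is the case described in the question.

```r
# sum0: like sum(x, na.rm = TRUE), except that it returns NA when there is
# not a single non-missing value to add up (all NA, or length zero).
sum0 <- function(x) {
  if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
}

some.valid <- sum0(c(1, NA, 3))        # NAs ignored, valid values summed
all.missing <- sum0(c(NA, NA))         # no valid data at all
empty <- sum0(numeric(0))              # zero-length input

# Usage per cell, e.g. with tapply on made-up data
resp  <- c(1, NA, NA, NA, 4)
group <- c("a", "a", "b", "b", "c")
by.cell <- tapply(resp, group, sum0)
```

Note that all(is.na(x)) is TRUE for a zero-length x, so the one test covers both situations.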
RE: [R] simple example of C interface to R
See if this helps: http://www.ci.tuwien.ac.at/Conferences/useR-2004/Keynotes/Dalgaard.pdf

Andy

> From: [EMAIL PROTECTED]
>
> i'd like to use the C interface to R in a program i'm writing. as a starting point, i'm trying to create a very simple C program that uses R. [...]
Re: [R] no. at risk in survfit()
array chip wrote:

> Hi, when I generate a survfit() object, I can get the number of patients at risk at various time points by using summary():
>
>     fit <- survfit(Surv(time,status)~class,data=mtdata)
>     summary(fit)
>
>     class=1
>      time n.risk n.event survival std.err lower 95% CI upper 95% CI
>       9.9     78       1    0.987  0.0127        0.963            1
>      41.5     77       1    0.974  0.0179        0.940            1
>      54.0     76       1    0.962  0.0218        0.920            1
>      99.1     38       1    0.936  0.0328        0.874            1
>
>     class=2
>      time n.risk n.event survival std.err lower 95% CI upper 95% CI
>       6.9    102       1    0.990 0.00976        0.971        1.000
>       8.0    101       1    0.980 0.01373        0.954        1.000
>      14.4    100       1    0.971 0.01673        0.938        1.000
>      16.1     99       1    0.961 0.01922        0.924        0.999
>      16.6     98       1    0.951 0.02138        0.910        0.994
>      18.7     97       1    0.941 0.02330        0.897        0.988
>         :      :       :
>
> I have many censored observations in the dataset, and I would like to know the number of patients at risk (n.risk in the above output) at certain time points, for example at 60, 72, etc., which is not available from the above printout for class=1. Is there any way I can get them? Thanks

The Design package's survplot function can print n.risk over equally spaced time points. You might see an easy way to print this by looking at the code.

-Frank

--
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University
RE: [R] no. at risk in survfit()
Have you looked at the times argument to the summary method?

--Matt

Matt Austin
Statistician
Amgen
One Amgen Center Drive M/S 24-2-C
Thousand Oaks CA 93021
(805) 447 - 7431

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of array chip
Sent: Friday, February 04, 2005 12:2 PM
To: r-help@stat.math.ethz.ch
Subject: [R] no. at risk in survfit()

Hi, when I generate a survfit() object, I can get the number of patients at risk at various time points by using summary(): [...] I have many censored observations in the dataset, and I would like to know the number of patients at risk (n.risk in the above output) at certain time points, for example at 60, 72, etc., which is not available from the above printout for class=1. Is there any way I can get them? Thanks
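The times argument mentioned above can be tried on a toy data set. This is a sketch under stated assumptions: the data below are made up (they are not the poster's mtdata), and the behaviour shown is that of the summary method for survfit objects in current versions of the survival package.

```r
library(survival)  # recommended package, normally installed with R

# Small made-up data set: status 1 = event, 0 = censored
d <- data.frame(time   = c(5, 10, 15, 20, 25, 30),
                status = c(1,  0,  1,  0,  1,  1))
fit <- survfit(Surv(time, status) ~ 1, data = d)

# Request the summary at chosen time points instead of at the event times
s <- summary(fit, times = c(12, 22))

# Collect the requested times and the numbers still at risk
at.risk <- data.frame(time = s$time, n.risk = s$n.risk)
```

For a stratified fit such as Surv(time, status) ~ class, the same call returns the rows per stratum, with the stratum label in s$strata. Requested times beyond the last observed time in a stratum are dropped unless extend = TRUE is given.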
[R] R package with C code on Windows
Dear R helpers, MyPkg passes R CMD check on Linux machines. However, when I 'R CMD check myPkg' on Windows, the libs subdirectory is not being created. If I install the package and then create the libs subdirectory manually and copy the dll files to it, the package seems to work fine (but that's not good enough for submitting it to CRAN). Any advice will be appreciated, Thanks, Sigal
[R] arrow head styles?
dear R wizards: is it possible to specify different arrow head styles? E.g., a solid arrow head? Or a bent arrow head? Or a longer or shorter arrow head? (perhaps through an add in?) I guess I could write this myself, but since arrows is built-in, I was hoping it had some flexibility hidden in it that I did not see in ?arrows. sincerely, /iaw --- ivo welch
Re: [R] simple example of C interface to R
On Fri, Feb 04, 2005 at 09:09:37PM +0100, Roger Bivand wrote:

> Well, it is documented in the Writing R Extensions manual: http://cran.r-project.org/doc/manuals/R-exts.html#System-and-foreign-language-interfaces

thanks. reading through that a second time made all the difference. i still think a complete example (a very simple C program that uses R and compiles without errors) would be good to have in the documentation.

jason
[R] How to access results of survival analysis
Hello,

it seems that the main results of survival analysis with package survival are shown only as side effects of the print method. If I compute e.g. a Kaplan-Meier estimate by

    km.survdur <- survfit(s.survdur)

then I can simply print the results by

    km.survdur

    Call: survfit(formula = s.survdur)

          n  events  median 0.95LCL 0.95UCL
      100.0    58.0    46.8    41.0    79.3

Is there a simple method to access these results, e.g. if I want to print only the median with the confidence limits?

Regarding the results of a Cox PH model I face the same situation. The printed results are:

    cx.survdur.ipss_mds.sex

    Call:
    coxph(formula = s.survdur ~ x1 + x2, method = "efron")

           coef exp(coef) se(coef)     z      p
    x1   0.6424      1.90    0.206 3.123 0.0018
    x2.L 0.0616      1.06    0.263 0.234 0.8100

    Likelihood ratio test=9.56 on 2 df, p=0.0084  n=58 (42 observations deleted due to missing)

Is there a simple method to copy e.g. the coefficients and p-values into a new object?

I am working with:
R: Copyright 2004, The R Foundation for Statistical Computing, Version 2.0.1 (2004-11-15), ISBN 3-900051-07-0
Survival package version: survival_2.16
Operating System: Windows 98SE

Thanks,
Heinz Tüchler
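One way to get at these numbers is through the objects returned by summary(). The sketch below uses simulated data (not the poster's s.survdur) and relies on component names as they exist in current versions of the survival package; version 2.16 from 2005 may have organised the summary objects differently, so treat the names as assumptions to check with str().

```r
library(survival)

set.seed(1)
# Simulated data for illustration only
n      <- 60
x1     <- rbinom(n, 1, 0.5)
time   <- rexp(n, rate = 0.1 * exp(0.7 * x1))
status <- rbinom(n, 1, 0.8)
s      <- Surv(time, status)

# Kaplan-Meier: the printed median and confidence limits live in $table
km        <- survfit(s ~ 1)
km.tab    <- summary(km)$table
km.median <- km.tab[c("median", "0.95LCL", "0.95UCL")]

# Cox model: the printed coefficient table is a matrix in $coefficients
cx      <- coxph(s ~ x1)
cx.coef <- summary(cx)$coefficients
res     <- cx.coef[, c("coef", "Pr(>|z|)"), drop = FALSE]
```

Running str(summary(km)) or str(summary(cx)) shows everything else that is stored, which is usually the quickest way to find a value that only appears in printed output.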
Re: Re: [R] Installing R packages in windows
Thank you. That is useful. But is it possible to download all the packages in one go, or would one have to download each one by one?

Vikas

-----Original Message-----
From: Uwe Ligges [EMAIL PROTECTED]
To: Vikas Rawal [EMAIL PROTECTED]
Date: Fri, 04 Feb 2005 13:55:42 +0100
Subject: Re: [R] Installing R packages in windows

Vikas Rawal wrote:

> I need to install a selected set of packages on a number of machines (in a computer lab). Some of these machines are not connected to the internet. Is it possible to download all the packages and make a kind of repository on a CD, and then install.packages from the CD?

Yes, just download the packages and use install.packages with CRAN=NULL ...

Alternatively, you might want to mount the installed packages from a network volume, adding a second library path for R. Then you only need to install stuff once.

Uwe Ligges
RE: Re: [R] Installing R packages in windows
A half-way decent ftp client would allow you to get all files in a directory, so that ought to be quite easy.

Andy

> From: Vikas Rawal
>
> Thank you. That is useful. But is it possible to download all the packages in one go, or would one have to download each one by one? [...]
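The download-then-install workflow discussed in this thread might look roughly like the sketch below in a current version of R. Two points are assumptions rather than something stated in the thread: the thread's CRAN=NULL argument is spelled repos = NULL in current R, and the helper names, folder layout and network path are made up for illustration. The functions are only defined here, not run.

```r
# Step 1, on a machine with internet access: fetch package files into a folder.
# download.packages() does not resolve dependencies, so list them explicitly.
# Assumes getOption("repos") already points at a CRAN mirror.
fetch_packages <- function(pkgs, destdir) {
  dir.create(destdir, showWarnings = FALSE, recursive = TRUE)
  download.packages(pkgs, destdir = destdir)
}

# Step 2, on an offline machine: install every package file found in the folder
# (for example a folder copied from the CD). repos = NULL means "install from
# these local files" rather than from a repository.
install_from_dir <- function(dir, lib = .libPaths()[1]) {
  files <- list.files(dir, pattern = "\\.(zip|tar\\.gz|tgz)$", full.names = TRUE)
  install.packages(files, repos = NULL, lib = lib)
  invisible(files)
}

# Alternative from the thread: use a shared library on a network volume
# instead of installing on every machine (the path is a made-up example).
# .libPaths(c("//server/share/R/library", .libPaths()))
```

A typical use would be fetch_packages(c("pkgA", "pkgB"), "D:/Rpkgs") on the connected machine, then install_from_dir("D:/Rpkgs") on each lab machine, with the package names replaced by the real ones.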
Re: [R] Building a Matrix
Doran, Harold <HDoran at air.org> writes:

: Dear List:
:
: I am having some difficulty constructing a matrix that must take a
: specific form. The matrix must have a lower block of non-zero values
: and the rest must all be zero. [...]
:
: vl.mat <- matrix(0,4,4)
: vl.mat[(col(vl.mat)%%4 == row(vl.mat)%%4) & col(vl.mat)%%4 != 1 &
: col(vl.mat)%%4 != 2] <- 100

If A is the 2x2 submatrix that is to go in the lower right hand corner then:

    kronecker(diag(0:1), A)
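The kronecker() one-liner works because diag(0:1) is the 2 x 2 matrix with a single 1 in its lower-right corner, so the Kronecker product places one copy of A there and zero blocks everywhere else. A short check, using the value 100 from the original question as an illustrative block:

```r
A <- matrix(100, 2, 2)               # block for the lower-right corner
vl.mat <- kronecker(diag(0:1), A)    # 4 x 4 result: zeros except the A block

# The same matrix built by direct subscripting, for comparison
expected <- matrix(0, 4, 4)
expected[3:4, 3:4] <- A
same <- all(vl.mat == expected)
```

This form only covers the case where the non-zero block is exactly half the size of the full matrix, since a k x k block always yields a 2k x 2k result.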
Re: [R] interval partition problem [was: (no subject)]
Soukup, Matt <SoukupM at cder.fda.gov> writes:

: I have a problem that I can't seem to find an optimal way of solving other
: than by doing things manually. I'm trying to subset a data frame by the
: number of observations that occurred at a given row but want to take into
: account the number of observations of preceding rows. [...]
:
: df0$itagA <- ifelse(c(10, diff(nsamA)) <= 4, 1, 0)
: df0$itagB <- ifelse(c(10, diff(nsamB)) <= 4, 1, 0)
: df0
: # Subset df0 on the tag results
: df1 <- df0[df0$itagA != 1 & df0$itagB != 1,]
: df1
:
: This works fine, but here is my problem. This simply looks at only the
: immediate preceding row and not at rows further down the line. [...]
: I can do this by hand using something like
:
: df0[c(1:7,11),]
:
: But this is not an attractive solution as it requires me to actually look at
: the data set each time and determine the row numbers. [...]

Delete the rows one by one and then recalculate diff after each deletion (rather than diff'ing all at once and then deleting all at once). Also, assuming you want every interval to be covered, force the last interval to end at the last row.

Assume too.few(df0, i) is a function, not shown here, which returns TRUE if there are too few As or Bs in row i minus row i-1 of df0 and otherwise FALSE. Then:

    last.row <- df0[nrow(df0),]
    i <- 1
    while(i < nrow(df0)) if (too.few(df0, i)) df0 <- df0[-i,] else i <- i + 1
    df0[nrow(df0),] <- last.row

P.S. Please start a new thread rather than replying to an existing thread and please use a meaningful subject.
Re: [R] interval partition problem [was: (no subject)]
Gabor Grothendieck <ggrothendieck at myway.com> writes:

: Soukup, Matt <SoukupM at cder.fda.gov> writes:
:
: : Hi.
: :
: : I have a problem that I can't seem to find an optimal way of solving other
: : than by doing things manually. I'm trying to subset a data frame by the
: : number of observations that occurred at a given row but want to take into
: : account the number of observations of preceding rows. Here's an example.
: :
: : I'm looking at intervals of data [10,20), [10,30), ..., [10,120) which
: : contain a certain number of observations for treatment A and treatment B.
: : An example is given by the following code.
: :
: : int <- as.factor(paste("[", rep(10, 11), ",", seq(20,120, by=10), ")"))
: : nsamA <- c(62, 83, 118, 151, 180, 201, 212, 215, 216, 217, 218)
: : nsamB <- c(65, 90, 128, 163, 190, 199, 209, 214, 215, 216, 218)
: :
: : df0 <- data.frame(int, nsamA, nsamB)
: : df0
: :
: : Since the interval [10, s) with n_s samples is nested in [10, t) with n_t
: : samples for s < t, we know n_t - n_s samples exist in the interval [s, t).
: : If this sample size of the difference is small I want to exclude the
: : interval [10,s). This can be done comparing adjacent preceding rows using
: : the following.
: :
: : df0$itagA <- ifelse(c(10, diff(nsamA)) <= 4, 1, 0)
: : df0$itagB <- ifelse(c(10, diff(nsamB)) <= 4, 1, 0)
: : df0
: : # Subset df0 on the tag results
: : df1 <- df0[df0$itagA != 1 & df0$itagB != 1,]
: : df1
: :
: : This works fine, but here is my problem. This simply looks at only the
: : immediate preceding row and not at rows further down the line. What I
: : would like to do is include the next interval that includes 5 or more
: : samples from each group since earlier intervals are nested in the latter
: : intervals. In the example given this would include the final interval
: : [10,120) as this contains more than 4 samples for each treatment. I can do
: : this by hand using something like
: :
: : df0[c(1:7,11),]
: :
: : But this is not an attractive solution as it requires me to actually look
: : at the data set each time and determine the row numbers. This works for
: : this case, but I have many intervals (rows of data) to look at and this
: : would be cumbersome. I've considered using diff with different lag
: : arguments, but this still doesn't seem to work. I also want to note that I
: : need to keep the int factor (as used in the example above) as this is used
: : throughout my analysis (i.e. this is a true factor variable and not simply
: : denoting an interval). I'd be grateful for any possible suggestions as I'm
: : stumped at this moment.
:
: Delete the rows one by one and then recalculate diff
: after each deletion (rather than diff'ing all at once
: and then deleting all at once). Also, assuming you want
: every interval to be covered, force the last interval to
: end at the last row.
:
: Assume too.few(df0, i) is a function, not shown here, which
: returns TRUE if there are too few As or Bs in row i minus row
: i-1 of df0 and otherwise FALSE. Then:
:
: last.row <- df0[nrow(df0),]
: i <- 1
: while(i < nrow(df0)) if (too.few(df0, i)) df0 <- df0[-i,] else i <- i + 1

That should be i <= nrow(df0)

: df0[nrow(df0),] <- last.row
:
: P.S.
:
: Please start a new thread rather than replying to an existing thread
: and please use a meaningful subject.
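
For readers who want to try the corrected loop, here is a minimal sketch, not code from the thread. The helper too.few was deliberately left unspecified above, so the version below is only one plausible reading: it treats an increase of 4 or fewer samples in either group as "too few", following the <= 4 test in the original post. The wrapper name thin.intervals and the min.n argument are hypothetical names introduced for illustration.

```r
# Example data from the original post.
int   <- as.factor(paste("[", rep(10, 11), ",", seq(20, 120, by = 10), ")"))
nsamA <- c(62, 83, 118, 151, 180, 201, 212, 215, 216, 217, 218)
nsamB <- c(65, 90, 128, 163, 190, 199, 209, 214, 215, 216, 218)
df0   <- data.frame(int, nsamA, nsamB)

# Hypothetical helper (not given in the thread): TRUE if row i adds
# min.n or fewer samples over row i-1 in either group. Row 1 has no
# predecessor, so it is never flagged.
too.few <- function(df, i, min.n = 4) {
  if (i == 1) return(FALSE)
  (df$nsamA[i] - df$nsamA[i - 1] <= min.n) ||
    (df$nsamB[i] - df$nsamB[i - 1] <= min.n)
}

# Hypothetical wrapper around the loop from the reply, using the
# corrected condition i <= nrow(df).
thin.intervals <- function(df, min.n = 4) {
  last.row <- df[nrow(df), ]
  i <- 1
  while (i <= nrow(df)) {
    # Deleting row i moves the next row into position i, so i only
    # advances when the current row is kept.
    if (too.few(df, i, min.n)) df <- df[-i, ] else i <- i + 1
  }
  # Force the last retained interval to end at the last original row.
  df[nrow(df), ] <- last.row
  df
}

df1 <- thin.intervals(df0)
df1
```

Under these assumptions the example data keeps the same eight intervals as df0[c(1:7,11),] in the original post, although the row names differ because the final row is overwritten in place. The int column stays a factor throughout, since rows are only dropped or overwritten with an existing level.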
[R] question about ldahist function
Hi,

When I am using the ldahist function I would like to specify different colors for each of the groups in the data I am using. Is it possible? If not, does anybody know of another function to plot multiple histograms on one plot for different groups?

Thanks
Fairouz Makhlouf