[R] Error in La.svd(X) : error code 1 from Lapack routine 'dgesdd'
Dear R helpers, I am working with R 2.4.1 GUI 1.18 (4038) for MacOSX. I have a matrix of 10,000 genes and try to run the following commands:

model.mix <- makeModel(data = data, formula = ~Dye+Array+Sample+Time, random = ~Array+Sample)
anova.mix <- fitmaanova(data, model.mix)
test.mix <- matest(data, model = model.mix, term = "Time", n.perm = 100, test.method = c(1,0,1,1))

I get the following error message:

Doing F-test on observed data ...
Doing permutation. This may take a long time ...
Error in La.svd(X) : error code 1 from Lapack routine 'dgesdd'

What does this mean? Is my matrix too big? What can I do? Thanks a lot in advance, Sophie

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] about find the solution
If I want to find the solution of X1, X2 that minimizes 3*X1 + 2*X2 + X1*X2 subject to 20 <= X1 + 3*X2 <= 50 and 10 <= X1, which function or package can I use? Thanks.
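One candidate is base R's constrOptim(), which minimizes a function subject to linear inequality constraints written as ui %*% theta - ci >= 0. A hedged sketch (the starting point c(11, 4) is an arbitrary feasible interior point; note that a local optimizer only finds a local minimum, and this particular objective is actually unbounded below on the feasible region, so the result should be treated with care):

```r
# Objective: 3*X1 + 2*X2 + X1*X2
f <- function(x) 3 * x[1] + 2 * x[2] + x[1] * x[2]

# Constraints rewritten as ui %*% x - ci >= 0:
#   X1 + 3*X2 >= 20;  -(X1 + 3*X2) >= -50;  X1 >= 10
ui <- rbind(c(1, 3), c(-1, -3), c(1, 0))
ci <- c(20, -50, 10)

fit <- constrOptim(c(11, 4), f, grad = NULL, ui = ui, ci = ci)
fit$par   # candidate (local) minimizer
```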
Re: [R] plot(): I want to display dates on X-axis.
Hi

you probably know that the second column contains dates, but your poor PC does not, so you need to tell it. You have several options: change the column to a suitable date format - see the chron package or the help pages for the date functions, e.g. strptime, as.Date, ... - and then plot; or change your dat column to a character vector and use it as labels for the x axis - see the help pages for plot, axis and title.

On 5 Mar 2007 at 13:26, d. sarthi maheshwari wrote:
Date sent: Mon, 5 Mar 2007 13:26:28 +0530
From: d. sarthi maheshwari [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject: [R] plot(): I want to display dates on X-axis.

Hi, I want to display dates on the x-axis of my plot. I was trying to use the plot() command for this, passing the values in the following manner. The variable dat is a data frame; the first column has numeric values and the second column has dates, e.g.

     dat[,1]   dat[,2]
[1,]    300   20060101
[2,]    257   20060102
[3,]    320   20060103
[4,]    311   20060104
[5,]    297   20060105
[6,]    454   20060106
[7,]    360   20060107
[8,]    307   20060108

However, what did you suppose this command would do? Did you even try to read the plot help page?

The command I am performing is: plot(x = dat[1], y = as.character(dat[2]))

If you want to plot dates on the x axis and values on the y axis, why did you do it the opposite way? plot(dat[,2], dat[,1]) after transformation to date format shall do what you want.

Regards Petr

Kindly suggest some method by which I can perform my task of displaying the first column values on the y-axis against dates on the x-axis. -- Thanks & Regards, Sarthi M.
Petr Pikal [EMAIL PROTECTED]
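To make the suggested approach concrete, here is a small self-contained sketch (data values taken from the quoted message): convert the YYYYMMDD column to Date class first, and plot() will then lay out a proper date axis.

```r
dat <- data.frame(value = c(300, 257, 320, 311, 297, 454, 360, 307),
                  date  = c(20060101, 20060102, 20060103, 20060104,
                            20060105, 20060106, 20060107, 20060108))
# Parse YYYYMMDD integers into Date objects
dates <- as.Date(as.character(dat$date), format = "%Y%m%d")
plot(dates, dat$value, xlab = "Date", ylab = "Value")
```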
Re: [R] lattice histogram
See the argument drop.unused.levels in xyplot. You will also need to handle the case n == 0 for displaying the mean and sd. Best, Renaud

histogram(~ resp | group, drop.unused.levels = FALSE,
          panel = function(x, ...){
            std <- if (length(x) > 0) format(round(sd(x), 2), nsmall = 2) else NA
            n <- length(x)
            m <- if (length(x) > 0) format(round(mean(x), 2), nsmall = 2) else NA
            panel.histogram(x, ...)
            x1 <- unit(1, "npc") - unit(2, "mm")
            y1 <- unit(1, "npc") - unit(2, "mm")
            grid.text(label = bquote(n == .(n)), x = x1, y = y1, just = "right")
            grid.text(label = bquote(hat(m) == .(m)), x = x1,
                      y = y1 - unit(1, "lines"), just = "right")
            grid.text(label = bquote(hat(s) == .(std)), x = x1,
                      y = y1 - unit(2, "lines"), just = "right")
          })

2007/3/5, Aimin Yan [EMAIL PROTECTED]: Thank you very much. Your code almost solves my problem, but I have a further question. In my data there is no observation in some groups, and I want to label such a panel with n=0, hat(m)=NA, hat(s)=NA. I tried to modify your panel function, but it didn't work out. Do you know how to extend your panel function so that it can deal with a group that has 0 observations? Aimin

At 02:54 AM 3/4/2007, Renaud Lancelot wrote: Here is an example using the grid package to annotate the graphs:

library(lattice)
library(grid)
resp <- rnorm(200)
group <- sample(c("G1", "G2", "G3"), replace = TRUE, size = 200)
histogram(~ resp | group,
          panel = function(x, ...){
            std <- round(sd(x), 2)
            n <- length(x)
            m <- round(mean(x), 2)
            panel.histogram(x, ...)
            x1 <- unit(1, "npc") - unit(2, "mm")
            y1 <- unit(1, "npc") - unit(2, "mm")
            grid.text(label = bquote(n == .(n)), x = x1, y = y1, just = "right")
            grid.text(label = bquote(hat(m) == .(m)), x = x1,
                      y = y1 - unit(1, "lines"), just = "right")
            grid.text(label = bquote(hat(s) == .(std)), x = x1,
                      y = y1 - unit(2, "lines"), just = "right")
          })

Best, Renaud

2007/3/4, Aimin Yan [EMAIL PROTECTED]: How do I add the mean, sd and number of observations to each panel of a lattice histogram?
Aimin

--
Renaud LANCELOT
Département Systèmes Biologiques du CIRAD
CIRAD, Biological Systems Department
Campus International de Baillarguet TA 30 / B, F34398 Montpellier
Tel +33 (0)4 67 59 37 17  Secr. +33 (0)4 67 59 37 37  Fax +33 (0)4 67 59 37 95
[R] Non : Confidence intervals for p**2 ??
Dear List, I was asked to calculate a confidence interval for p*p. Are there any standard techniques for calculating such an interval? The delta method? Thank you in advance! Cheers, Patrik
Re: [R] Scoping issue?
It is not really an argument to mmatplot; it is an argument to mapply, and I am not certain what might be happening in there with respect to lazy evaluation. Yes, you can set it up with lapply, but I don't think speed is a concern, since most of the time is being spent in the matplot routine. If you are really concerned, use system.time to check out the difference. The lapply version would probably look like:

lapply(1:ncol(A), function(x) mmatplot(x, 1:nrow(A), A, main = paste("Array input, column", x)))

but I would guess you would not see any difference unless you were plotting 10,000 columns, and even then the difference would be small.

On 3/4/07, Thaden, John J [EMAIL PROTECTED] wrote: Apparently you're right that colnum doesn't exist when it needs to be evaluated, but why? Why is 'paste' being evaluated so early? It is, after all, the value of an argument ('main') of my mmatplot function, with colnum being another argument. I thought arguments were lazily evaluated. Does using mapply change the rules? Is there a way (like mapply) to loop at some lower level rather than explicitly in the R script, as in your suggestion? For speed's sake? Thanks. -John

On Sunday Mar 4 2007, jim holtman [EMAIL PROTECTED] replied: First of all, 'colnum' does not exist when the 'paste' is called. This probably does what you want:

for (colnum in 1:ncol(A)) {
  mmatplot(colnum, 1:nrow(A), A, main = paste("Array input, column", colnum))
}

On 3/4/07, John Thaden [EMAIL PROTECTED] wrote: Hello, the code below is supposed to be a wrapper for matplot to do columnwise visible comparison of several matrices, but I'm doing something wrong because I can't access an argument called 'colnum'. I'd be most grateful for some insight. Thanks, John, Little Rock, AR

# mmatplot is a matplot wrapper to compare the same column of
# several matrices. Arg y is either a list of matrices with
# equal number of rows, or an array. The scalar colnum gives the
# column of each matrix or array slab to plot.
# par values and matplot args are accepted, e.g., ylog. mmatplot
# is intended to be mapply-compatible, to test multiple columns.
mmatplot <- function(colnum, x, y, ...){
  switch(class(y),
         array = y <- y[, colnum, ],
         list  = y <- sapply(X = y, FUN = subset, select = colnum))
  stopifnot(is.matrix(y))
  matplot(x, y, ...)
}

# This is just a tester function
mmatplotTest <- function(){
  oldmf <- par("mfrow")
  par(mfrow = c(2, 3))
  A <- array(data = rnorm(90), dim = c(10, 3, 3))
  L <- list(A[, , 1], A[, , 2], A[, , 3])
  # The 'main' argument below throws the error, but if
  # commented out, another error crops up due to 'colnum'.
  # Test with class(y) == "array"
  mapply(X = 1:ncol(A), FUN = mmatplot, x = 1:nrow(A), y = A,
         main = paste("Array input, column", colnum))
  # Test with class(y) == "list"
  mapply(1:ncol(L[[1]]), mmatplot, x = 1:nrow(L[[1]]), y = L,
         main = paste("List input, column", colnum))
  par(mfrow = oldmf)
}
# Run the test
mmatplotTest()

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
[R] Heteroskedastic Time Series
Hi R-helpers, I'm new to time series modelling, but my requirement seems to fall just outside the capabilities of the arima function in R. I'd like to fit an ARMA model where the variance of the disturbances is a function of some exogenous variable. So something like: Y_t = a_0 + a_1 * Y_(t-1) + ... + a_p * Y_(t-p) + b_1 * e_(t-1) + ... + b_q * e_(t-q) + e_t, where e_t ~ N(0, sigma^2_t), and with the variance specified by something like sigma^2_t = exp(beta * X_t), where X_t is my exogenous variable. I would be very grateful if somebody could point me in the direction of a library that could fit this (or a similar) model. Thanks, James Kirkby, Actuarial Maths and Stats, Heriot Watt University
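Base arima does not handle an exogenous variance model directly, but for low orders the Gaussian likelihood can be written down and maximized with optim(). A hedged sketch, for AR(1) only, with sigma^2_t = exp(beta * x_t) as in the post (all names and the simulated data are illustrative, not from any package):

```r
set.seed(1)
n <- 500
x <- rnorm(n)                              # exogenous variable
e <- rnorm(n, sd = sqrt(exp(0.5 * x)))     # heteroskedastic innovations
y <- numeric(n)
y[1] <- e[1]
for (t in 2:n) y[t] <- 0.6 * y[t - 1] + e[t]

# Negative log-likelihood of the conditional Gaussian AR(1) model
negll <- function(par) {
  a <- par[1]; beta <- par[2]
  r  <- y[-1] - a * y[-n]                  # one-step residuals
  s2 <- exp(beta * x[-1])                  # conditional variances
  0.5 * sum(log(2 * pi * s2) + r^2 / s2)
}
fit <- optim(c(0, 0), negll, method = "BFGS")
fit$par                                    # estimates of (a, beta)
```

For GARCH-type conditional variance (driven by past shocks rather than an exogenous regressor), packages such as tseries (garch) and fGarch exist; they do not cover this exact specification.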
Re: [R] Non : Confidence intervals for p**2 ??
On 05-Mar-07 Öhagen Patrik wrote: Dear List, I was asked to calculate a confidence interval for p*p. Are there any standard techniques for calculating such an interval? The delta method? Thank you in advance! Cheers, Patrik

If p is meant to denote a probability between 0 and 1, then pL^2 < p^2 < pU^2 is exactly equivalent to pL < p < pU, where pL and pU are the lower and upper limits for p. Indeed, this will be so if p is any quantity which is necessarily non-negative. Hence, if this is your situation, simply square the confidence limits for p. If, however, this is not your situation, then please explain what p represents. Best wishes, Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 05-Mar-07 Time: 10:05:50 -- XFMail --
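Ted's point in code form: because squaring is monotone on [0, 1], the limits for p^2 are just the squared limits for p. A small sketch (the 40-out-of-100 data are invented for illustration):

```r
# 95% Clopper-Pearson CI for p from invented binomial data
ci <- binom.test(40, 100)$conf.int
ci     # limits for p
ci^2   # corresponding limits for p^2
```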
Re: [R] How to read in this data format?
Hi, although the solution worked, I've got some trouble with some data files. These data files are very large (600-700 MB), so my computer starts swapping. If I use the code written below, I get:

Error in .Call(R_lazyLoadDBfetch, key, file, compressed, hook, PACKAGE = base) : recursive default argument reference

after about 15 minutes of loading the data with the Lines. <- readLines("myfile.dat") command. When I looked in the help for readLines, I saw that there is an n argument to set a maximum number of lines, but is there a way to set a starting row number? If I could split my data files into 4-8 small data sets, that would be fine for me, but I couldn't figure it out. Thanks, Bart

From: Gabor Grothendieck [EMAIL PROTECTED] To: Bart Joosen [EMAIL PROTECTED] CC: r-help@stat.math.ethz.ch Subject: Re: [R] How to read in this data format? Date: Thu, 1 Mar 2007 16:46:21 -0500

On 3/1/07, Bart Joosen [EMAIL PROTECTED] wrote: Dear All, thanks for the replies. Jim Holtman has given a solution which fits my needs, but Gabor Grothendieck did the same thing, and it looks like his coding will allow faster processing (I will check this out tomorrow on a big data file). @Gabor: I don't understand the use of the grep command:

grep("^[1-9][0-9. ]*$|Time", Lines., value = TRUE)

What is this expression ("^[1-9][0-9. ]*$|Time") actually doing? I looked in the help page, but couldn't find a suitable answer.

I briefly discussed it in the first paragraph of my response. It matches and returns only those lines that start (^ matches start of line) with a digit, i.e. [1-9], and contain only digits, dots and spaces, i.e. [0-9. ]*, to end of line (i.e. $ matches end of line), or (| means or) contain the word Time. If you don't have lines like "..." (which you did in your example), then the regexp could be simplified to "^[0-9. ]+$|Time". You may need to match tabs too if your input contains those.
Thanks to all, Bart

- Original Message - From: Gabor Grothendieck [EMAIL PROTECTED] To: Bart Joosen [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Sent: Thursday, March 01, 2007 6:35 PM Subject: Re: [R] How to read in this data format?

Read in the data using readLines, extract all desired lines (namely those containing only numbers, dots and spaces, or those with the word Time) and remove Retention from all lines, so that all remaining lines have two fields. Now that we have the desired lines, and all lines have two fields, read them in using read.table. Finally, split them into groups and restructure them using by, and in the last line convert the by output to a data frame. At the end we display an alternate function f for use with by, should we wish to generate long rather than wide output (using the terminology of the reshape command).

Lines <- "$$ Experiment Number:
$$ Associated Data:
FUNCTION 1
Scan 1
Retention Time 0.017
399.8112 184
399.8742 0
399.9372 152
Scan 2
Retention Time 0.021
399.8112 181
399.8742 1
399.9372 153"

# replace the next line with: Lines. <- readLines("myfile.dat")
Lines. <- readLines(textConnection(Lines))
Lines. <- grep("^[1-9][0-9. ]*$|Time", Lines., value = TRUE)
Lines. <- gsub("Retention", "", Lines.)
DF <- read.table(textConnection(Lines.), as.is = TRUE)
closeAllConnections()
f <- function(x) c(id = x[1, 2], structure(x[-1, 2], .Names = x[-1, 1]))
out.by <- by(DF, cumsum(DF[, 1] == "Time"), f)
as.data.frame(do.call(rbind, out.by))

We could alternately produce long format by replacing the function f with:

f <- function(x) data.frame(x[-1, ], id = x[1, 2])

On 3/1/07, Bart Joosen [EMAIL PROTECTED] wrote: Hi, I received an ascii file containing the following information:

$$ Experiment Number:
$$ Associated Data:
FUNCTION 1
Scan 1
Retention Time 0.017
399.8112 184
399.8742 0
399.9372 152
Scan 2
Retention Time 0.021
399.8112 181
399.8742 1
399.9372 153
...
I would like to import this data into a data frame in R, with a column Time, the first numbers as column names, and the second numbers as data:

Time    399.8112  399.8742  399.9372
0.017        184         0       152
0.021        181         1       153

I did take a look at read.table, read.delim, scan, ... but I've no idea how to solve this problem. Anyone? Thanks, Bart
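As to Bart's follow-up question about reading a huge file in pieces: readLines has no "start at row N" argument, but successive calls on an open connection continue where the previous call stopped, so the file can be processed in chunks without ever holding it all in memory. A minimal self-contained sketch (the file name and the tiny 25-line file are illustrative; in practice the chunk size would be something like 1e5):

```r
# Create a small illustrative file (stands in for the 600 MB data file)
writeLines(as.character(1:25), "myfile.dat")

con <- file("myfile.dat", open = "r")
total <- 0
repeat {
  chunk <- readLines(con, n = 10)   # next 10 lines (or fewer at end of file)
  if (length(chunk) == 0) break
  # ... process 'chunk' here, e.g. with the grep/read.table steps above ...
  total <- total + length(chunk)
}
close(con)
total   # 25: every line is seen exactly once
```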
[R] Difference between two time series
Hi R users! Thanks in advance. I am using R 2.4.1 on Windows XP. I have a series of (x[i,t], y[i,t]), say i = 1, 2, ..., n and t = t1, t2, ..., tT, where n is large. The series y[i,t] is constructed from x[i,t] using a moving average of order o (say o = 30 days). I am trying to extract the m (m < n) series where the difference between x[i,t] and y[i,t] (i.e., x[i,t] - y[i,t]) is increasing over the last d days (say d = 30 days). Is there any package in R that can do this? Any further advice is highly appreciated. Once again, thank you very much for the time you have given. Regards, Deb
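I'm not aware of a packaged function for exactly this selection rule, but it is a few lines of base R. A hedged sketch with invented data, assuming the raw series sit in an n-by-T matrix x (rows = series, columns = days) and y holds the order-o moving averages:

```r
set.seed(1)
n <- 5; T <- 100; o <- 30; d <- 30
x <- matrix(rnorm(n * T), nrow = n)                    # illustrative raw series
# o-day trailing moving average of each row (NA for the first o-1 days)
y <- t(apply(x, 1, function(s) stats::filter(s, rep(1 / o, o), sides = 1)))

gap <- x - y                                           # x[i,t] - y[i,t]
last <- gap[, (T - d + 1):T, drop = FALSE]             # last d days
increasing <- apply(last, 1, function(g) all(diff(g) > 0))
selected <- which(increasing)                          # the m qualifying series
```

(rollmean in the zoo package is a convenient alternative for building y from x.)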
Re: [R] barplot with different color combination for each bar
Hi, I'd suggest you use ?rect for this. Here's an example (I did not check whether it's correct...). I also improved (but did not check :) your definition of cols. Jonne.

X <- 1:6
Q <- matrix(sample(X, 60, replace = TRUE), nrow = 6, byrow = TRUE)
H <- matrix(rep(1, 60), nrow = 6, byrow = TRUE)
color <- c("blue", "orange", "gold", "indianred", "skyblue4", "lightblue")
cols <- matrix(data = color[Q], ncol = 10)
# Old: barplot(H, col = cols, width = c(0.1), xlim = c(0, 3), beside = FALSE)
# New:
x11()
plot(0, 0, type = "n", ylim = c(0, nrow(Q)), xlim = c(0, ncol(Q)),
     xlab = "xlabel", ylab = "")
xleft   <- rep(1:ncol(Q), each = nrow(Q))
ybottom <- rep(1:nrow(Q), times = ncol(Q))
rect(xleft - 1, ybottom - 1, xleft, ybottom, col = cols)

On Fri, 2007-03-02 at 09:48 -0600, Kim Milferstedt wrote: Hi, I'd like to construct a somewhat unusual barplot. In barplot I use beside=FALSE, as I'd like to have stacked bars. The height of each bar is always the same; the information in my plot is coded in the color of the bar. I therefore need to be able to assign a different combination (or order) of colors to each individual stacked bar. In the example below, the combination of colors for my plot is generated by X, Q, color and cols. These colors are supposed to fill the stacked bars with the heights in H. However, only the first column of cols is used for all columns of H, as barplot only allows me to assign one vector for the color scheme of the entire barplot. Does anybody know a way I can assign each bar a potentially unique color combination? Thanks for your help!
Kim

X <- 1:6
Q <- matrix(sample(X, 60, replace = TRUE), nrow = 6, byrow = TRUE)
H <- matrix(rep(1, 60), nrow = 6, byrow = TRUE)
color <- c("blue", "orange", "gold", "indianred", "skyblue4", "lightblue")
cols <- ifelse(Q == 1, color[1],
        ifelse(Q == 2, color[2],
        ifelse(Q == 3, color[3],
        ifelse(Q == 4, color[4],
        ifelse(Q == 5, color[5], color[6])))))
barplot(H, col = cols, width = c(0.1), xlim = c(0, 3), beside = FALSE)

__
Kim Milferstedt
University of Illinois at Urbana-Champaign
Department of Civil and Environmental Engineering
4125 Newmark Civil Engineering Laboratory
205 North Mathews Avenue MC-250, Urbana, IL 61801 USA
phone: (001) 217 333-9663  fax: (001) 217 333-6968
email: [EMAIL PROTECTED]
http://cee.uiuc.edu/research/morgenroth
[R] Error loading a dependency in a package: missing namespace?
Dear r-helpers, I am building a package that depends on some others. I recently added a new dependency, the package outliers, and now it no longer works. Let me show some information below:

[EMAIL PROTECTED]:pcrAnalysis$ cat DESCRIPTION
Package: pcrAnalysis
Type: Package
Title: pcrAnalysis
Version: 0.7.2
Date: 2007-02-27
Depends: Biobase, methods, outliers
Author: Carlos J. Gil Bellosta [EMAIL PROTECTED]
Maintainer: Carlos J. Gil Bellosta [EMAIL PROTECTED]
Description: Package for the analysis of Taqman experiments
License: TBA

[EMAIL PROTECTED]:pcrAnalysis$ cat NAMESPACE
import(methods, Biobase, outliers)
exportPattern("^tqmn")
exportClasses(pcrExprSet)
exportMethods(task, "task<-", phenoData.sort)

But now the load of the package fails. If I run

[EMAIL PROTECTED]:tmp$ R CMD check pcrAnalysis

I get the following log:

* checking for working latex ... OK
* using log directory '/tmp/pcrAnalysis.Rcheck'
* using R version 2.4.1 (2006-12-18)
* checking for file 'pcrAnalysis/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'pcrAnalysis' version '0.7.2'
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking whether package 'pcrAnalysis' can be installed ... OK
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking whether the package can be loaded ... ERROR
Loading required package: Biobase
Loading required package: tools
Welcome to Bioconductor
Vignettes contain introductory material. To view, type 'openVignette()' or start with 'help(Biobase)'. For details on reading vignettes, see the openVignette help page.
Loading required package: outliers
Error in loadNamespace(package, c(which.lib.loc, lib.loc), keep.source = keep.source) :
  in 'pcrAnalysis' classes for export not defined: pcrExprSet
In addition: Warning message:
package 'pcrAnalysis' contains no R code in: loadNamespace(package, c(which.lib.loc, lib.loc), keep.source = keep.source)
Error: package/namespace load failed for 'pcrAnalysis'
Execution halted

It seems that the error is related to namespaces. The thing is that the package outliers does not have a NAMESPACE file. Could this be the issue? I have contacted the author of the package, and he says that outliers has been used in another package, quantchem (also on CRAN). However, quantchem does not have a NAMESPACE file either. I have been looking for information on how the loadNamespace function works and have even looked at its code, but can anybody give me a clue? Would the outliers package require a NAMESPACE file? By the way, the author of the package has been quite helpful, but he feels that the lack of this file may not be causing the problem. I am using R version 2.4.1 (2006-12-18) on an Ubuntu Edgy (6.10) box. Regards, Carlos J. Gil Bellosta http://www.datanalytics.com
Re: [R] plot(): I want to display dates on X-axis.
Sarthi M. wrote: I want to display dates on my x-axis of the plot.

Dates are a problem. There's a standard for dates, but it seems that most users and most software didn't catch up :-/

The variable dat is a data frame. The first column has numeric values and the second column has dates, e.g.

     dat[,1]   dat[,2]
[1,]    300   20060101
[2,]    257   20060102
[3,]    320   20060103
[4,]    311   20060104
[5,]    297   20060105
[6,]    454   20060106
[7,]    360   20060107
[8,]    307   20060108

The command I am performing is: plot(x = dat[1], y = as.character(dat[2]))

Hmmm... When I needed something similar, I did this:

y <- dat[,1]  # because in an (x,y) plot, x is the scale and y the data
years  <- floor(dat[,2] / 10000)
months <- floor(dat[,2] / 100) %% 100
days   <- dat[,2] %% 100
x <- ISOdate(years, months, days)
plot(x, y)

Kindly suggest some method by which I can perform my task of displaying the first column values on the y-axis against dates on the x-axis.

Of course, you can combine all that into one line, but readability suffers:

plot(ISOdate(floor(dat[,2] / 10000), floor(dat[,2] / 100) %% 100, dat[,2] %% 100), dat[,1])

Alberto Monteiro
Re: [R] function with Multiple Output
With the help of list(), a function can return all of its results:

my.fun <- function(vector, index) {
  a <- fun.a(vector, index)
  b <- fun.b(vector, index)
  return(list(a, b))
}

Example session (R 2.2.1):

> multiresult(rnorm(10, 0, 1))
[[1]]
[1] -0.1240271

[[2]]
[1] 1.037070
Re: [R] Scoping issue?
In your test function there is no lexically visible definition for the `colnum` variable used in defining main, so that error is what you expect from lexical scoping. Lazy evaluation dictates when (and if) the `main` argument is evaluated, but the environment in which it is evaluated is determined by the context where the expression is written in the code, i.e. within the function mmatplotTest. The error you get when you take out the `main` argument comes from `subset` and is due to the fact that subset is one of those functions that uses non-standard evaluation for some of its arguments, in this case `select`. This makes it (slightly) easier to use at the interactive top level but much more complicated to use within a function. The key in reading the help page for `subset` is that the argument `select` should be an expression (actually the literal argument expression is used, which in this case is the expression consisting of the single variable `colnum`, and that is not useful here). You need to use another function in your sapply call; something like function(d) d[, colnum] may do. Best, luke

On Sun, 4 Mar 2007, Thaden, John J wrote: Apparently you're right that colnum doesn't exist when it needs to be evaluated, but why? Why is 'paste' being evaluated so early? It is, after all, the value of an argument ('main') of my mmatplot function, with colnum being another argument. I thought arguments were lazily evaluated. Does using mapply change the rules? Is there a way (like mapply) to loop at some lower level rather than explicitly in the R script, as in your suggestion? For speed's sake? Thanks. -John

On Sunday Mar 4 2007, jim holtman [EMAIL PROTECTED] replied: First of all, 'colnum' does not exist when the 'paste' is called.
This probably does what you want:

for (colnum in 1:ncol(A)) {
  mmatplot(colnum, 1:nrow(A), A, main = paste("Array input, column", colnum))
}

On 3/4/07, John Thaden [EMAIL PROTECTED] wrote: Hello, the code below is supposed to be a wrapper for matplot to do columnwise visible comparison of several matrices, but I'm doing something wrong because I can't access an argument called 'colnum'. I'd be most grateful for some insight. Thanks, John, Little Rock, AR

# mmatplot is a matplot wrapper to compare the same column of
# several matrices. Arg y is either a list of matrices with
# equal number of rows, or an array. The scalar colnum gives the
# column of each matrix or array slab to plot. par values and
# matplot args are accepted, e.g., ylog. mmatplot is intended
# to be mapply-compatible, to test multiple columns.
mmatplot <- function(colnum, x, y, ...){
  switch(class(y),
         array = y <- y[, colnum, ],
         list  = y <- sapply(X = y, FUN = subset, select = colnum))
  stopifnot(is.matrix(y))
  matplot(x, y, ...)
}

# This is just a tester function
mmatplotTest <- function(){
  oldmf <- par("mfrow")
  par(mfrow = c(2, 3))
  A <- array(data = rnorm(90), dim = c(10, 3, 3))
  L <- list(A[, , 1], A[, , 2], A[, , 3])
  # The 'main' argument below throws the error, but if
  # commented out, another error crops up due to 'colnum'.
  # Test with class(y) == "array"
  mapply(X = 1:ncol(A), FUN = mmatplot, x = 1:nrow(A), y = A,
         main = paste("Array input, column", colnum))
  # Test with class(y) == "list"
  mapply(1:ncol(L[[1]]), mmatplot, x = 1:nrow(L[[1]]), y = L,
         main = paste("List input, column", colnum))
  par(mfrow = oldmf)
}
# Run the test
mmatplotTest()

-- Luke Tierney, Chair, Statistics and Actuarial Science, Ralph E.
Wareham Professor of Mathematical Sciences, University of Iowa. Phone: 319-335-3386, Fax: 319-335-3017. Department of Statistics and Actuarial Science, 241 Schaeffer Hall, Iowa City, IA 52242. email: [EMAIL PROTECTED]  WWW: http://www.stat.uiowa.edu
[R] RBloomberg
Hi R, below are the commands used in extracting Bloomberg data. Let T1, T2, ..., T5 be a set of actual tickers:

Ticker_list <- c("T1", "T2", "T3", "T4", "T5")
con <- blpConnect(show.days = show_day, na.action = na_action, periodicity = periodicity)
cdaily <- blpGetData(con, Ticker_list, "EQY_SH_OUT",
                     start = as.chron(as.Date("1/1/1996", "%m/%d/%Y")),
                     end   = as.chron(as.Date("2/12/2007", "%m/%d/%Y")))
blpDisconnect(con)

If the data is not present for this combination of fields, ticker list and date range, RBloomberg returns a zoo object with 0 rows. Can this be modified so that we get the complete rows (say 3000 rows) filled with blanks? I think the RBloomberg developers can help me best in this case... others' ideas are also welcome... Thanks in advance, Shubha
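One way to get a fixed-length result regardless of what Bloomberg returns is to merge the returned series with an empty zoo series indexed by the full date range; dates absent from the download then appear as NA rows (NA, rather than blanks, being R's natural representation of missing values). A hedged sketch using the zoo package, with an invented 3-observation stand-in for the blpGetData result:

```r
library(zoo)

all.dates <- seq(as.Date("1996-01-01"), as.Date("2007-02-12"), by = "day")
# 'cdaily' stands in for the (possibly short or empty) zoo object
# returned by blpGetData; here an illustrative 3-row example:
cdaily <- zoo(c(100, 101, 102),
              as.Date(c("1996-01-01", "1996-01-02", "1996-01-03")))

template <- zoo(, all.dates)        # empty series carrying the full index
padded <- merge(cdaily, template)   # union of indexes; absent dates become NA
```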
[R] background color behind symbols in legend()
Hello, I am trying to display coloured rectangles behind the symbols in a legend (as a background):

plot(10, 10)
legend("top", c("text", "text2"), pch = c(21, 22), fill = c("red", "green"), pt.bg = "black")

On the resulting graph, the symbols are not centered on the coloured rectangles. Is there a way to adjust their relative position so that they are centered? Looking through ?legend has not helped me (but I might have missed the line where it is explained)... [R version 2.4.0 (2006-10-03) on linux] Thanks for any help. Best regards, -- Nicolas Mazziotta
[R] Identifying points in a plot that have duplicate values
I have code like this:

#-----------------------------------------------------------
x = scan()
0 0 0 0 0 1 2 3 4

y = scan()
1 1 1 2 2 1 3 4 5

plot(x, y)
identify(0, 1, 3) # lets me manually mark co-ordinate (0,1) as being duplicated 3 times
identify(0, 2, 2) # lets me manually mark co-ordinate (0,2) as being duplicated 2 times
#-----------------------------------------------------------

Is there not a way I can automatically display whether points are duplicated, and by how many times? I thought if I 'jittered' the points ever so slightly I could get an idea of how many duplicates there are, but with 100 points the graph looks very messy. Regards, DaveL
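An automatic alternative (my suggestion, not from the thread): graphics::xyTable() tallies coincident points, so the multiplicities can be drawn directly instead of clicked on with identify().

```r
# the example data from the post
x <- c(0, 0, 0, 0, 0, 1, 2, 3, 4)
y <- c(1, 1, 1, 2, 2, 1, 3, 4, 5)

# xyTable() returns the unique coordinates and how often each occurs
xy <- xyTable(x, y)
plot(xy$x, xy$y, cex = sqrt(xy$number), pch = 16)  # symbol size ~ multiplicity
text(xy$x, xy$y, labels = xy$number, pos = 4)      # print the counts beside each point
```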
[R] Identifying last record in individual growth data over different time intervalls
Hi I have a plist t which contains size measurements of individual plants, identified by the field plate. It contains, among other, a field year indicating the year in which the individual was measured and the height. The number of measurements range from 1 to 4 measurements in different years. My problem is that I would need the LAST measurement. I only came up with the solution below which is probably way to complicated, but I can't think of another solution. Does anybody has an idea how to do this more effectively? Finally I would like to have a data.frame t2 which only contains the entries of the last measurements. Thanks in advance, Rainer unlist( sapply( split(t, t$plate), function(i) { i[i$year==max(i$year),]$id } ) ) 15 20 33 43 44 47 64 D72S200 S201 2006001 2006003 2006005 2006007 2006008 2006009 2006014 2006015 2006016 2006017 S202S203S204S205S206S207S208S209S210 S211 2004095 2006019 2006020 2006021 2006022 2006023 2006024 2006025 2006026 2006027 S212S213S214S215S216S217S218S219S220 S222 2006028 2006029 2006030 2006031 2006032 2006033 2006034 2006035 2006036 2006037 S223S224S225S226S227S228S229S230S231 S232 2006038 2006039 2006040 2006041 2006042 2006043 2006044 2006045 2006046 2006047 t id plate year height 2004007 200400715 2004 0.40 2005024 200502415 2005 0.43 2006001 200600115 2006 0.44 2004012 200401220 2004 0.90 2005026 200502620 2005 0.94 2006003 200600320 2006 0.98 2004025 200402533 2004 0.15 2005027 200502733 2005 0.15 2006005 200600533 2006 0.16 2004035 200403543 2004 0.26 2005038 200503843 2005 0.30 2006007 200600743 2006 0.38 2004036 200403644 2004 0.32 2005030 200503044 2005 0.39 2006008 200600844 2006 0.46 2004039 200403947 2004 0.50 2005025 200502547 2005 0.55 2006009 200600947 2006 0.63 2004055 200405564 2004 0.45 2005029 200502964 2005 0.58 2006014 200601464 2006 0.67 2006015 2006015 D72 2006 0.30 2004093 2004093 S200 2004 0.68 2005040 2005040 S200 2005 0.74 2006016 2006016 S200 2006 0.84 2004094 2004094 S201 2004 0.46 2005041 2005041 
S201 2005 0.49 2006017 2006017 S201 2006 0.53 2004095 2004095 S202 2004 0.17 2004096 2004096 S203 2004 0.23 2005032 2005032 S203 2005 0.23 2006019 2006019 S203 2006 0.23 2004097 2004097 S204 2004 0.25 2005031 2005031 S204 2005 0.29 2006020 2006020 S204 2006 0.41 2004098 2004098 S205 2004 0.22 2005039 2005039 S205 2005 0.26 2006021 2006021 S205 2006 0.37 2004099 2004099 S206 2004 0.19 2005035 2005035 S206 2005 0.25 2006022 2006022 S206 2006 0.37 2004100 2004100 S207 2004 0.29 2005003 2005003 S207 2005 0.36 2006023 2006023 S207 2006 0.41 2004101 2004101 S208 2004 0.17 2005005 2005005 S208 2005 0.20 2006024 2006024 S208 2006 0.16 2004102 2004102 S209 2004 0.16 2005008 2005008 S209 2005 0.19 2006025 2006025 S209 2006 0.24 2004103 2004103 S210 2004 0.09 2005007 2005007 S210 2005 0.14 2006026 2006026 S210 2006 0.15 2004104 2004104 S211 2004 0.12 2005006 2005006 S211 2005 0.12 2006027 2006027 S211 2006 0.22 2004105 2004105 S212 2004 0.61 2005011 2005011 S212 2005 0.71 2006028 2006028 S212 2006 0.81 2004106 2004106 S213 2004 0.28 2005010 2005010 S213 2005 0.37 2006029 2006029 S213 2006 0.44 2004107 2004107 S214 2004 0.47 2005009 2005009 S214 2005 0.59 2006030 2006030 S214 2006 0.67 2004108 2004108 S215 2004 0.43 2005004 2005004 S215 2005 0.53 2006031 2006031 S215 2006 0.66 2004109 2004109 S216 2004 0.35 2005019 2005019 S216 2005 0.38 2006032 2006032 S216 2006 0.41 2004110 2004110 S217 2004 0.20 2005018 2005018 S217 2005 0.21 2006033 2006033 S217 2006 0.32 2004111 2004111 S218 2004 0.19 2005014 2005014 S218 2005 0.21 2006034 2006034 S218 2006 0.27 2004112 2004112 S219 2004 0.21 2005034 2005034 S219 2005 0.24 2006035 2006035 S219 2006 0.24 2004113 2004113 S220 2004 0.19 2005021 2005021 S220 2005 0.19 2006036 2006036 S220 2006 0.25 2004114 2004114 S222 2004 0.34 2005020 2005020 S222 2005 0.35 2006037 2006037 S222 2006 0.46 2005013 2005013 S223 2005 0.04 2006038 2006038 S223 2006 0.04 2005012 2005012 S224 2005 0.13 2006039 2006039 S224 2006 0.14 -- NEW EMAIL ADDRESS AND 
ADDRESS: [EMAIL PROTECTED] [EMAIL PROTECTED] WILL BE DISCONTINUED END OF MARCH Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation Biology (UCT) Leslie Hill Institute for Plant Conservation University of Cape Town Rondebosch 7701 South Africa Fax:+27 - (0)86 516 2782 Fax:+27 - (0)21 650 2440 (w) Cell:
Re: [R] Error in La.svd(X) : error code 1 from Lapack routine 'dgesdd'
On Mon, 05 Mar 2007 09:14:17 +0100 Sophie Richier [EMAIL PROTECTED] wrote:

Dear R helpers, I am working with R 2.4.1 GUI 1.18 (4038) for MacOSX. I have a matrix of 10 000 genes and try to run the following commands:

model.mix <- makeModel(data=data, formula=~Dye+Array+Sample+Time, random=~Array+Sample)
anova.mix <- fitmaanova(data, model.mix)
test.mix <- matest(data, model=model.mix, term="Time", n.perm=100, test.method=c(1,0,1,1))

I get the following error message: Doing F-test on observed data ... Doing permutation. This may take a long time ... Error in La.svd(X) : error code 1 from Lapack routine 'dgesdd' What does this mean? Is my matrix too big? What can I do? Thanks a lot in advance, Sophie

From the help file: "Unsuccessful results from the underlying LAPACK code will result in an error giving a positive error code: these can only be interpreted by detailed study of the FORTRAN code."

From the manpages (man dgesdd):

INFO (output) INTEGER
  = 0: successful exit.
  < 0: if INFO = -i, the i-th argument had an illegal value.
  > 0: DBDSDC did not converge, updating process failed.

I don't know what DBDSDC is, but it appears that there may be some convergence issue for you. Unless someone else has better ideas, look up www.netlib.org/lapack and the routines in there to investigate further. HTH! Best, Ranjan
[R] 0 * NA = NA
Is there any way to force 0 * NA to be 0 instead of NA? For example, suppose I have a vector with some valid values, while other values are NA. If I matrix-pre-multiply this by a weight row vector, whose weights corresponding to the NAs are zero, the outcome will still be NA:

x <- c(1, NA, 1)
wt <- c(2, 0, 1)
wt %*% x # NA

Alberto Monteiro
[R] R CMD CHECK question
hi, second try... I ran into problems when checking one of my packages with R CMD CHECK: I have two packages, the first (named 'pkc') depending on the second one (named 'roiutils'). The source code and DESCRIPTION files describe the dependency as they should, I think ('Imports', 'require'), but if I run R CMD CHECK pkc I get significant warnings related to missing links (referring to functions from the second package) in the manpages of the first package, as can be seen below. Despite the warnings, after installing the two packages the help system works just fine, including the cross-references. My question: why is R CMD CHECK complaining? Can one selectively switch off this warning? Or how do I have to specify the links in the manpages to tell CHECK that everything is basically OK?

CUT
* checking for working latex ... OK
* using log directory '/Users/vdh/rfiles/Rlibrary/.check/pkc.Rcheck'
* using R version 2.4.0 (2006-10-03)
* checking for file 'pkc/DESCRIPTION' ... OK
* this is package 'pkc' version '1.1'
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking whether package 'pkc' can be installed ... WARNING
Found the following significant warnings:
  missing link(s): readroi readroi readroi figure readroi conv3exmodel readroi
  missing link(s): figure readroi
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for syntax errors ... OK
* checking R files for non-ASCII characters ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the name space can be loaded with stated dependencies ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... WARNING
Rd files with unknown sections:
  /Users/vdh/rfiles/Rlibrary/pkc/man/fitdemo.Rd: example
See the chapter 'Writing R documentation files' in manual 'Writing R Extensions'.
* checking Rd cross-references ... WARNING
Missing link(s) in documentation object 'compfit.Rd': readroi readroi readroi figure readroi conv3exmodel readroi
Missing link(s) in documentation object 'exp3fit.Rd': figure readroi
CUT

any hints appreciated, joerg
[R] [ANN] Static and dynamic graphics course, July 2007, Salt Lake City
We're pleased to announce a one day course covering static and dynamic graphics using R, ggplot and GGobi. The course will be held just before the JSM, on Saturday, 28 July 2007, in Salt Lake City. The course will be presented by Dianne Cook and Hadley Wickham. In the course you will learn: * How to build presentation quality static graphics using the R package, ggplot. We will cover plot creation and modification, and discuss the grammar which underlies the package. * How to explore your data with direct manipulation/dynamic graphics using GGobi and rggobi. You'll learn the general toolbox, as well as specific approaches for dealing with missing data, supervised classification, cluster analysis and multivariate longitudinal data analysis. Dianne Cook is a full professor at Iowa State University. She has been an active researcher in the field of interactive and dynamic graphics for 16 years, and regularly teaches information visualization, multivariate analysis and data mining. Hadley Wickham is a PhD student at Iowa State University. He won the John Chambers Award for statistical computing in 2006 for his work on ggplot. For more details, or to book your place, please see http://lookingatdata.com
[R] enumerating non-overlapping pairs of elements from a vector
Hi All, I'm trying to come up with a clear and concise (and fast?) solution to the following problem. I would like to take a vector 'v' and enumerate all of the ways in which it can be broken into n sets of length 2 (if the length of the vector is odd, an additional set of length 1 is used). An element of 'v' can only appear in one set. Order within sets is not important. Vector 'v' can be of lengths 2-12. 'n' is determined by length(v)%/%2; if length(v)%%2 is non-zero, the additional set of length 1 is used.

For example, for vector v = (1,2,3,4) the solution would be (rows are combinations of sets chosen, where each element only appears once):

1 2, 3 4
1 3, 2 4
1 4, 2 3

In the case where length(v) is odd, v = (1,2,3,4,5):

1 2, 3 4, 5
1 3, 2 4, 5
1 4, 2 3, 5
5 2, 3 4, 1
5 3, 2 4, 1
5 4, 2 3, 1
5 1, 3 4, 2
5 3, 1 4, 2
5 4, 1 3, 2

and so on... Certainly pulling all combinations of two or one elements is not a big deal; for example, combinations(5,2,c(1,2,3,4,5),repeats.allowed=T) from the 'gtools' package does something like this. I'm stuck on a clean solution for enumerating all the non-overlapping sets without some elaborate looping and checking scheme. No doubt this is a lapse in my understanding of combinatorics. Any help would be greatly appreciated. cheers, a.
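A recursive sketch of my own (not from the thread) that enumerates exactly these partitions: for even length, pair the first element with each remaining element and recurse; for odd length, let each element in turn be the leftover singleton.

```r
# Enumerate all ways to split v into unordered pairs
# (plus one singleton when length(v) is odd).
pair_partitions <- function(v) {
  n <- length(v)
  if (n == 0) return(list(list()))      # one empty partition
  if (n == 1) return(list(list(v)))     # single element = the singleton set
  res <- list()
  if (n %% 2 == 1) {
    # odd length: each element takes a turn as the leftover singleton
    for (i in seq_len(n)) {
      for (p in pair_partitions(v[-i])) {
        res[[length(res) + 1]] <- c(p, list(v[i]))
      }
    }
  } else {
    # even length: pair the first element with each of the others,
    # which avoids generating the same partition twice
    for (j in 2:n) {
      for (p in pair_partitions(v[-c(1, j)])) {
        res[[length(res) + 1]] <- c(list(c(v[1], v[j])), p)
      }
    }
  }
  res
}

length(pair_partitions(1:4))  # 3, matching the example above
length(pair_partitions(1:5))  # 15: 5 singleton choices x 3 pairings
```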
Re: [R] 0 * NA = NA
From: Alberto Monteiro

Is there any way to force 0 * NA to be 0 instead of NA? For example, suppose I have a vector with some valid values, while other values are NA. If I matrix-pre-multiply this by a weight row vector, whose weights corresponding to the NAs are zero, the outcome will still be NA:

x <- c(1, NA, 1)
wt <- c(2, 0, 1)
wt %*% x # NA

I don't think it's prudent to bend arithmetic rules of a system, especially when there are good reasons for them. Here's one:

R> 0 * Inf
[1] NaN

If you are absolutely sure that the NAs in x cannot be Inf (or -Inf), you might try to force the result to 0, but the only way I can think of is to do something like:

R> wt %*% ifelse(wt, x, 0)
     [,1]
[1,]    3

Andy
Re: [R] 0 * NA = NA
Alberto Monteiro wrote: Is there any way to force 0 * NA to be 0 instead of NA?

No (AFAIK), and it is pretty reasonable to define it this way. If you want to treat the NAs as zeros, use

x[is.na(x)] <- 0

Petr

For example, suppose I have a vector with some valid values, while other values are NA. If I matrix-pre-multiply this by a weight row vector, whose weights corresponding to the NAs are zero, the outcome will still be NA:

x <- c(1, NA, 1)
wt <- c(2, 0, 1)
wt %*% x # NA

Alberto Monteiro

-- Petr Klasterecky Dept. of Probability and Statistics Charles University in Prague Czech Republic
[R] RJDBC
I need help. I'm trying to connect to an Oracle DBMS and a MySQL DBMS using the RJDBC package. My code is the following:

library(rJava)
library(DBI)
library(RJDBC)

# MySQL
drv <- JDBC("com.mysql.jdbc.Driver", "C:\\Temporal\\mysql-connector-java-3.0.9-stable-bin.jar", "'")
conn <- dbConnect(drv, "jdbc:mysql://localhost:3306/bd", "user", "password")

# Oracle
drv <- JDBC("oracle.jdbc.driver.OracleDriver", "C:\\Temporal\\classes12.jar", "'")
conn <- dbConnect(drv, "jdbc:oracle:thin:@192.168.1.70:1521:SDS22", "user", "password")

R always returns, for Oracle: Error en .local(drv, ...) : Unable to connect JDBC to jdbc:oracle:thin:@192.168.1.70:1521:SDS22, and for MySQL: Error en .local(drv, ...) : Unable to connect JDBC to jdbc:mysql://localhost:3306/bd. And the function summary(drv) returns:

JDBCDriver
name = JDBC
driver.version = 0.1-1
DBI.version = 0.1-1
client.version = NA
max.connections = NA

R version 2.4.1 (2006-12-18), i386-pc-mingw32
locale: LC_COLLATE=Spanish_Spain.1252;LC_CTYPE=Spanish_Spain.1252;LC_MONETARY=Spanish_Spain.1252;LC_NUMERIC=C;LC_TIME=Spanish_Spain.1252
attached base packages: stats, graphics, grDevices, utils, datasets, methods, base
other attached packages: RJDBC 0.1-2, DBI 0.1-12, rJava 0.4-14

Can you help me, please? Another question: I am trying to compile ROracle and RMySQL for Windows, but I need Rdll.lib, which in turn needs R.exp. Can you give me one of these files? Regards, Jose Sierra
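Not a fix, but a hedged way to narrow the cause down (my own suggestion, using plain rJava calls rather than RJDBC): one common reason for "Unable to connect JDBC" is that the driver class cannot be found on the classpath, so it is worth checking that the jar and the class load at all, independently of the URL and credentials.

```r
library(rJava)

.jinit()
.jaddClassPath("C:\\Temporal\\classes12.jar")  # jar path from the post
print(.jclassPath())                           # the jar should appear in this list

# instantiating the driver fails fast if the class is not on the classpath
# (JNI notation with slashes is used for the class name)
drv_obj <- .jnew("oracle/jdbc/driver/OracleDriver")
```

If this step succeeds, the classpath is fine and the URL, listener address, or credentials are the more likely culprits.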
Re: [R] 0 * NA = NA
On 05-Mar-07 Alberto Monteiro wrote: Is there any way to force 0 * NA to be 0 instead of NA? For example, suppose I have a vector with some valid values, while other values are NA. If I matrix-pre-multiply this by a weight row vector, whose weights corresponding to the NAs are zero, the outcome will still be NA:

x <- c(1, NA, 1)
wt <- c(2, 0, 1)
wt %*% x # NA

Alberto Monteiro

This is a bit of a tricky one, especially in a more general context. I think it involves defining new operators. In the case of the particular operation in your example, you could do

"%*NA%" <- function(x, y) {
  X <- x; X[is.na(x) & (y == 0)] <- 0
  Y <- y; Y[is.na(y) & (x == 0)] <- 0
  return(X %*% Y)
}

Then:

x <- c(1, NA, 1)
wt <- c(2, 0, 1)
x %*NA% wt
     [,1]
[1,]    3

Hmmm! Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 05-Mar-07 Time: 15:07:19 -- XFMail --
Re: [R] enumerating non-overlapping pairs of elements from a vector
Allan the general problem you refer to is set partitions, although I'm not clear whether the order of the sets themselves makes a difference (we in the enumerative combinatorics world refer to indistinguishable boxes). Your application would be set partitions with a specific shape, in this case 2,2,2,...,2,2,1 or 2,2,2,2. I am working on a generalization of your problem Right Now, and hope to have a complete solution ready within a couple of months (but then again I've been saying this for a long time now ;-) What's your application? best wishes Robin On 5 Mar 2007, at 14:56, Allan Strand wrote: Hi All, I'm trying to come up with a clear and concise (and fast?) solution to the following problem. I would like to take a vector 'v' and enumerate all of the ways in which it can be broken into n sets of length 2 (if the length of the vector is odd, and an additional set of length 1). An element of 'v' can only appear in one set. Order within sets is not important. Vector 'v' can be of lengths 2-12 'n' is determined by length(v)%/%2 if length(v)%%2 is non-zero, the additional set of length 1 is used For example vector 'v': v = (1,2,3,4) The solution would be (rows are combinations of sets chosen, where each element only appears once) 1 2, 3 4 1 3, 2 4 1 4, 2 3 In the case where length(v) is odd v = (1,2,3,4,5) 1 2, 3 4, 5 1 3, 2 4, 5 1 4, 2 3, 5 5 2, 3 4, 1 5 3, 2 4, 1 5 4, 2 3, 1 5 1, 3 4, 2 5 3, 1 4, 2 5 4, 1 3, 2 and so on... Certainly pulling all combinations of two or one elements is not a big deal, for example combinations(5,2,c(1,2,3,4,5),repeats.allowed=T) from the 'gtools' package would do something like this. I'm stuck on a clean solution for enumerating all the non-overlapping sets without some elaborate looping and checking scheme. No doubt this is a lapse in my understanding of combinatorics. Any help would be greatly appreciated cheers, a. 
-- Robin Hankin, Uncertainty Analyst, National Oceanography Centre, Southampton, European Way, Southampton SO14 3ZH, UK. tel 023-8059-7743
Re: [R] Identifying points in a plot that have duplicate values
David Lloyd wrote: I have code like this:

x = scan()
0 0 0 0 0 1 2 3 4

y = scan()
1 1 1 2 2 1 3 4 5

plot(x, y)
identify(0, 1, 3) # lets me manually mark co-ordinate (0,1) as being duplicated 3 times
identify(0, 2, 2) # lets me manually mark co-ordinate (0,2) as being duplicated 2 times

Is there not a way I can automatically display whether points are duplicated, and by how many times? I thought if I 'jittered' the points ever so slightly I could get an idea of how many duplicates there are, but with 100 points the graph looks very messy.

You might consider using alpha transparency - the more times a point is duplicated, the darker it will be. For example:

df <- data.frame(x=c(0, 0, 0, 0, 0, 1, 2, 3, 4),
                 y=c(1, 1, 1, 2, 2, 1, 3, 4, 5))
pdf("alphaExample.pdf", version = "1.4", width = 6, height = 6)
with(df, plot(x, y, col=rgb(1, 0, 0, .3), pch=16))
dev.off()

RSiteSearch("alpha transparency")

-- Chuck Cleland, Ph.D. NDRI, Inc. 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894
Re: [R] Identifying points in a plot that have duplicate values
Have a look at ?sunflowerplot, which not only produces a scatterplot showing multiple points with the same coordinates using special symbols, but will also produce a list showing the number of points at each coordinate.

On 05/03/07, David Lloyd [EMAIL PROTECTED] wrote: I have code like this:

x = scan()
0 0 0 0 0 1 2 3 4

y = scan()
1 1 1 2 2 1 3 4 5

plot(x, y)
identify(0, 1, 3) # lets me manually mark co-ordinate (0,1) as being duplicated 3 times
identify(0, 2, 2) # lets me manually mark co-ordinate (0,2) as being duplicated 2 times

Is there not a way I can automatically display whether points are duplicated, and by how many times? I thought if I 'jittered' the points ever so slightly I could get an idea of how many duplicates there are, but with 100 points the graph looks very messy. Regards, DaveL

-- David Barron, Said Business School, University of Oxford, Park End Street, Oxford OX1 1HP
Re: [R] 0 * NA = NA
On 05-Mar-07 Petr Klasterecky wrote: Alberto Monteiro wrote: Is there any way to force 0 * NA to be 0 instead of NA? No (AFAIK), and it is pretty reasonable to define it this way. If you want to treat the NAs as zeros, use x[is.na(x)] <- 0

Doing it in precisely that way would have the problem that it would not give you NA when it should. For example:

x <- c(1, NA, 1)
wt <- c(2, 1, 1)

Then, after x[is.na(x)] <- 0, the result of x %*% wt should be NA, but your method would give 3. This is why I suggested a method which tests for corresponding elements of x = NA and y = 0, since what Alberto Monteiro wanted was 0*NA = 0 when that combination occurs. I.e.

"%*NA%" <- function(x, y) {
  X <- x; X[is.na(x) & (y == 0)] <- 0
  Y <- y; Y[is.na(y) & (x == 0)] <- 0
  return(X %*% Y)
}

Ted.

Petr: For example, suppose I have a vector with some valid values, while other values are NA. If I matrix-pre-multiply this by a weight row vector, whose weights corresponding to the NAs are zero, the outcome will still be NA:

x <- c(1, NA, 1)
wt <- c(2, 0, 1)
wt %*% x # NA

Alberto Monteiro

-- Petr Klasterecky Dept. of Probability and Statistics Charles University in Prague Czech Republic

E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 05-Mar-07 Time: 15:53:53 -- XFMail --
Re: [R] logistic regression on contingency table
Bingshan Li bli1 at bcm.tmc.edu writes: I am wondering if there is a way in R to fit logistic regression on a contingency table. If I have the original data, I can transform the data into a design matrix and then call glm to fit the regression. But now I have a 2x3 contingency table with the first row for response 0 and the second row for response 1, and the columns are 3 levels of the predictor variable. The 3 levels are not ordinal, though, and indicator variables would be more appropriate.

From the documentation of glm: "For binomial and quasibinomial families the response can also be specified as a factor (when the first level denotes failure and all others success) or as a two-column matrix with the columns giving the numbers of successes and failures."

Dieter Menne
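A minimal sketch of the two-column-matrix form the glm documentation describes, applied to a 2x3 table like the one in the question (the counts here are made up for illustration):

```r
# one row per predictor level; counts invented for the example
succ <- c(10, 20, 30)   # response 1 (second row of the 2x3 table)
fail <- c(40, 25, 15)   # response 0 (first row of the 2x3 table)
level <- factor(c("a", "b", "c"))  # 3 non-ordinal predictor levels

# cbind(successes, failures) as the response; the factor gives
# indicator (treatment) contrasts automatically
fit <- glm(cbind(succ, fail) ~ level, family = binomial)
summary(fit)
```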
Re: [R] Identifying last record in individual growth data over different time intervalls
If you were worried about efficiency and the structure/size of the dataframe was complex/big, then you could work with the indices only which would be more efficient: sapply(split(seq(nrow(t)), t$plate), function(x) t$id[x][which.max (t$year[x])]) 15 20 33 43 44 47 64 D72S200 S201S202S203S204 2006001 2006003 2006005 2006007 2006008 2006009 2006014 2006015 2006016 2006017 2004095 2006019 2006020 S205S206S207S208S209S210S211S212S213 S214S215S216S217 2006021 2006022 2006023 2006024 2006025 2006026 2006027 2006028 2006029 2006030 2006031 2006032 2006033 S218S219S220S222S223S224 2006034 2006035 2006036 2006037 2006038 2006039 On 3/5/07, Rainer M. Krug [EMAIL PROTECTED] wrote: Hi I have a plist t which contains size measurements of individual plants, identified by the field plate. It contains, among other, a field year indicating the year in which the individual was measured and the height. The number of measurements range from 1 to 4 measurements in different years. My problem is that I would need the LAST measurement. I only came up with the solution below which is probably way to complicated, but I can't think of another solution. Does anybody has an idea how to do this more effectively? Finally I would like to have a data.frame t2 which only contains the entries of the last measurements. 
Re: [R] Mitools and lmer
Doug, it's mitools, not mltools. I wrote it. I think the problem is just that coef() is not the right function for extracting the fixed effects. Beth wants

betas <- MIextract(model0, fun=fixef)

-thomas

On Sat, 3 Mar 2007, Douglas Bates wrote: On 3/2/07, Beth Gifford [EMAIL PROTECTED] wrote: Hey there, I am estimating a multilevel model using lmer. I have 5 imputed datasets, so I am using mitools to pool the estimates from the 5 datasets. Everything seems to work until I try to use MIcombine to produce pooled estimates. Does anyone have any suggestions? The betas and the standard errors were extracted with no problem, so everything seems to work smoothly up until that point.

I'm not familiar with the mltools package and I didn't see it listed in the CRAN packages. Can you provide a reference or a link to the package?

Program:

# Read data
data.dir <- system.file("dta", package = "mitools")
files.imp <- imputationList(lapply(list.files(data.dir,
    pattern = "imp.\\.dta", full = TRUE), read.dta))
# Estimate the model over each imputed dataset
model0 <- with(files.imp, lmer(erq2tnc ~ 1 + trt2 + nash + wash + male +
    coh2 + coh3 + (1 | sitebeth)))
# Extract betas and standard errors
betas <- MIextract(model0, fun = coef)
vars <- MIextract(model0, fun = vcov)
# Combine the results
summary(MIcombine(betas, vars))
Error in cbar + results[[i]] : non-numeric argument to binary operator
Error in summary(MIcombine(betas, vars)) : error in evaluating the argument 'object' in selecting a method for function 'summary'

First use traceback() to discover where the (first) error occurred. My guess is that MIcombine expects a particular type of object for the vars argument and it is not getting that type (and not checking for the correct type). Thanks, Beth
Thomas Lumley, Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
Re: [R] Identifying last record in individual growth data over different time intervalls
What is wrong with the method that you have? It looks reasonably efficient. As with other languages, there are always other ways of doing it. Here is another to consider, but it is basically the same:

sapply(split(t, t$plate), function(x) x$id[which.max(x$year)])

     15      20      33      43      44      47      64     D72    S200    S201    S202    S203    S204
2006001 2006003 2006005 2006007 2006008 2006009 2006014 2006015 2006016 2006017 2004095 2006019 2006020
   S205    S206    S207    S208    S209    S210    S211    S212    S213    S214    S215    S216    S217
2006021 2006022 2006023 2006024 2006025 2006026 2006027 2006028 2006029 2006030 2006031 2006032 2006033
   S218    S219    S220    S222    S223    S224
2006034 2006035 2006036 2006037 2006038 2006039

On 3/5/07, Rainer M. Krug [EMAIL PROTECTED] wrote: Hi, I have a list t which contains size measurements of individual plants, identified by the field plate. It contains, among others, a field year indicating the year in which the individual was measured, and the height. The number of measurements ranges from 1 to 4 measurements in different years. My problem is that I need the LAST measurement. I only came up with the solution below, which is probably way too complicated, but I can't think of another solution. Does anybody have an idea how to do this more effectively? Finally I would like to have a data.frame t2 which only contains the entries of the last measurements.
Thanks in advance, Rainer unlist( sapply( split(t, t$plate), function(i) { i[i$year==max(i$year),]$id } ) ) 15 20 33 43 44 47 64 D72S200 S201 2006001 2006003 2006005 2006007 2006008 2006009 2006014 2006015 2006016 2006017 S202S203S204S205S206S207S208S209S210 S211 2004095 2006019 2006020 2006021 2006022 2006023 2006024 2006025 2006026 2006027 S212S213S214S215S216S217S218S219S220 S222 2006028 2006029 2006030 2006031 2006032 2006033 2006034 2006035 2006036 2006037 S223S224S225S226S227S228S229S230S231 S232 2006038 2006039 2006040 2006041 2006042 2006043 2006044 2006045 2006046 2006047 t id plate year height 2004007 200400715 2004 0.40 2005024 200502415 2005 0.43 2006001 200600115 2006 0.44 2004012 200401220 2004 0.90 2005026 200502620 2005 0.94 2006003 200600320 2006 0.98 2004025 200402533 2004 0.15 2005027 200502733 2005 0.15 2006005 200600533 2006 0.16 2004035 200403543 2004 0.26 2005038 200503843 2005 0.30 2006007 200600743 2006 0.38 2004036 200403644 2004 0.32 2005030 200503044 2005 0.39 2006008 200600844 2006 0.46 2004039 200403947 2004 0.50 2005025 200502547 2005 0.55 2006009 200600947 2006 0.63 2004055 200405564 2004 0.45 2005029 200502964 2005 0.58 2006014 200601464 2006 0.67 2006015 2006015 D72 2006 0.30 2004093 2004093 S200 2004 0.68 2005040 2005040 S200 2005 0.74 2006016 2006016 S200 2006 0.84 2004094 2004094 S201 2004 0.46 2005041 2005041 S201 2005 0.49 2006017 2006017 S201 2006 0.53 2004095 2004095 S202 2004 0.17 2004096 2004096 S203 2004 0.23 2005032 2005032 S203 2005 0.23 2006019 2006019 S203 2006 0.23 2004097 2004097 S204 2004 0.25 2005031 2005031 S204 2005 0.29 2006020 2006020 S204 2006 0.41 2004098 2004098 S205 2004 0.22 2005039 2005039 S205 2005 0.26 2006021 2006021 S205 2006 0.37 2004099 2004099 S206 2004 0.19 2005035 2005035 S206 2005 0.25 2006022 2006022 S206 2006 0.37 2004100 2004100 S207 2004 0.29 2005003 2005003 S207 2005 0.36 2006023 2006023 S207 2006 0.41 2004101 2004101 S208 2004 0.17 2005005 2005005 S208 2005 0.20 2006024 2006024 S208 2006 
0.16 2004102 2004102 S209 2004 0.16 2005008 2005008 S209 2005 0.19 2006025 2006025 S209 2006 0.24 2004103 2004103 S210 2004 0.09 2005007 2005007 S210 2005 0.14 2006026 2006026 S210 2006 0.15 2004104 2004104 S211 2004 0.12 2005006 2005006 S211 2005 0.12 2006027 2006027 S211 2006 0.22 2004105 2004105 S212 2004 0.61 2005011 2005011 S212 2005 0.71 2006028 2006028 S212 2006 0.81 2004106 2004106 S213 2004 0.28 2005010 2005010 S213 2005 0.37 2006029 2006029 S213 2006 0.44 2004107 2004107 S214 2004 0.47 2005009 2005009 S214 2005 0.59 2006030 2006030 S214 2006 0.67 2004108 2004108 S215 2004 0.43 2005004 2005004 S215 2005 0.53 2006031 2006031 S215 2006 0.66 2004109 2004109 S216 2004 0.35 2005019 2005019 S216 2005 0.38 2006032 2006032 S216 2006 0.41 2004110 2004110 S217 2004 0.20 2005018 2005018 S217 2005 0.21 2006033 2006033 S217 2006 0.32
Re: [R] How to read in this data format?
If you want to process 'n' lines from the file, then just set up the file as a connection and read the desired number of lines in a loop like below:

f.1 <- file('/tempxx.txt', 'r')
nlines <- 0
# read 1000 lines at a time
while (TRUE){
    lines <- readLines(f.1, n = 1000)
    if (length(lines) == 0) break   # quit when no lines are read
    # processing
    nlines <- nlines + length(lines)
}
cat(nlines, "lines read\n")

On 3/5/07, Bart Joosen [EMAIL PROTECTED] wrote: Hi, although the solution worked, I've got some trouble with some data files. These data files are very large (600-700 MB), so my computer starts swapping. If I use the code written below, I get: Error in .Call(R_lazyLoadDBfetch, key, file, compressed, hook, PACKAGE = base) : recursive default argument reference after about 15 minutes of loading the data with the Lines. <- readLines("myfile.dat") command. When I look in the help for readLines, I see that there is an n argument to set a maximum number of lines, but is there a way to set a starting row number? If I can split up my data files into 4-8 small datasets, that's OK for me, but I couldn't figure it out. Thanks, Bart

From: Gabor Grothendieck [EMAIL PROTECTED] To: Bart Joosen [EMAIL PROTECTED] CC: r-help@stat.math.ethz.ch Subject: Re: [R] How to read in this data format? Date: Thu, 1 Mar 2007 16:46:21 -0500 On 3/1/07, Bart Joosen [EMAIL PROTECTED] wrote: Dear All, thanks for the replies. Jim Holtman has given a solution which fits my needs, but Gabor Grothendieck did the same thing, and it looks like his coding will allow faster processing (should check this out tomorrow on a big data file). @Gabor: I don't understand the use of the grep command: grep("^[1-9][0-9. ]*$|Time", Lines., value = TRUE) What is this expression (^[1-9][0-9. ]*$|Time) actually doing? I looked in the help page but couldn't find a suitable answer. I briefly discussed it in the first paragraph of my response. It matches and returns only those lines that start (^ matches start of line) with a digit, i.e.
[1-9], and contain only digits, dots and spaces, i.e. [0-9. ]*, to the end of line ($ matches end of line), or (| means or) contain the word Time. If you don't have lines like ... (which you did in your example) then the regexp could be simplified to ^[0-9. ]+$|Time. You may need to match tabs too if your input contains those. Thanks to all, Bart

----- Original Message ----- From: Gabor Grothendieck [EMAIL PROTECTED] To: Bart Joosen [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Sent: Thursday, March 01, 2007 6:35 PM Subject: Re: [R] How to read in this data format?

Read in the data using readLines, extract all desired lines (namely those containing only numbers, dots and spaces, or those with the word Time) and remove "Retention" from all lines so that all remaining lines have two fields. Now that we have the desired lines and all lines have two fields, read them in using read.table. Finally, split them into groups and restructure them using by; in the last line we convert the by output to a data frame. At the end we display an alternative function f for use with by, should we wish to generate long rather than wide output (using the terminology of the reshape command).

Lines <- "$$ Experiment Number:
$$ Associated Data:
FUNCTION 1
Scan 1
Retention Time 0.017
399.8112 184
399.8742 0
399.9372 152
Scan 2
Retention Time 0.021
399.8112 181
399.8742 1
399.9372 153"

# replace the next line with: Lines. <- readLines("myfile.dat")
Lines. <- readLines(textConnection(Lines))
Lines. <- grep("^[1-9][0-9. ]*$|Time", Lines., value = TRUE)
Lines. <- gsub("Retention", "", Lines.)
DF <- read.table(textConnection(Lines.), as.is = TRUE)
closeAllConnections()
f <- function(x) c(id = x[1,2], structure(x[-1,2], .Names = x[-1,1]))
out.by <- by(DF, cumsum(DF[,1] == "Time"), f)
as.data.frame(do.call(rbind, out.by))

We could alternately consider producing long format by replacing the function f with:

f <- function(x) data.frame(x[-1,], id = x[1,2])

On 3/1/07, Bart Joosen [EMAIL PROTECTED] wrote: Hi, I received an ASCII file containing the following information:

$$ Experiment Number:
$$ Associated Data:
FUNCTION 1
Scan 1
Retention Time 0.017
399.8112 184
399.8742 0
399.9372 152
Scan 2
Retention Time 0.021
399.8112 181
399.8742 1
399.9372 153
.

I would like to import this data into R as a data frame, with a column Time, the first numbers as column names, and the second numbers as data in the data frame:

Time   399.8112  399.8742  399.9372
0.017  184       0         152
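Bart's follow-up question above — whether readLines() can start at a given row — can be handled with an open connection, since a connection remembers its read position: lines read earlier are simply discarded. A minimal sketch (the 30-line temporary file is a made-up stand-in for the big data file):

```r
tmp <- tempfile()
writeLines(as.character(1:30), tmp)   # stand-in for the large data file

con <- file(tmp, "r")
invisible(readLines(con, n = 10))     # skip the first 10 lines
chunk <- readLines(con, n = 5)        # then read lines 11 to 15
close(con)
chunk
```

For a 600-700 MB file the same pattern splits the file into manageable pieces without ever holding the whole file in memory at once.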
Re: [R] Mitools and lmer
Yes, Thomas' solution fixed my mistake. Thank you. On 3/5/07, Thomas Lumley [EMAIL PROTECTED] wrote: Doug, it's mitools, not mltools. I wrote it. I think the problem is just that coef() is not the right function for extracting the fixed effects. Beth wants betas <- MIextract(model0, fun=fixef) -thomas
Re: [R] plot(): I want to display dates on X-axis.
You can also do it with the following:

plot(as.POSIXct(strptime(as.character(dat[,2]), "%Y%m%d")), dat[,1])

On 3/5/07, d. sarthi maheshwari [EMAIL PROTECTED] wrote: Hi, I want to display dates on the x-axis of my plot. I was trying to use the plot() command for the same and passing the values in the following manner: The variable dat is a data frame. The first column has numeric values and the second column has dates, e.g.

dat
     [,1]     [,2]
[1,]  300 20060101
[2,]  257 20060102
[3,]  320 20060103
[4,]  311 20060104
[5,]  297 20060105
[6,]  454 20060106
[7,]  360 20060107
[8,]  307 20060108

The command I am performing is:

plot(x=dat[1], y=as.character(dat[2]))

Kindly suggest some method by which I can perform my task of displaying the first column values on the y-axis against dates on the x-axis. -- Thanks & Regards, Sarthi M.

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
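Equivalently, as.Date() converts the yyyymmdd integers directly, and plot() then labels the x-axis with dates automatically. A sketch with made-up values in the same shape as dat:

```r
dat <- data.frame(value = c(300, 257, 320, 311),
                  date  = c(20060101, 20060102, 20060103, 20060104))

# convert yyyymmdd integers to Date objects
d <- as.Date(as.character(dat$date), format = "%Y%m%d")

# dates on the x-axis, values on the y-axis
plot(d, dat$value, type = "b", xlab = "Date", ylab = "Value")
```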
Re: [R] Error loading a dependency in a package: missing namespace?
Carlos J. Gil Bellosta [EMAIL PROTECTED] writes: import(methods, Biobase, outliers) * checking whether the package can be loaded ... ERROR Loading required package: Biobase Loading required package: tools Welcome to Bioconductor Vignettes contain introductory material. To view, type 'openVignette()' or start with 'help(Biobase)'. For details on reading vignettes, see the openVignette help page. Loading required package: outliers Error in loadNamespace(package, c(which.lib.loc, lib.loc), keep.source = keep.source) : in 'pcrAnalysis' classes for export not defined: pcrExprSet In addition: Warning message: package 'pcrAnalysis' contains no R code in: loadNamespace(package, c(which.lib.loc, lib.loc), keep.source = keep.source) Error: package/namespace load failed for 'pcrAnalysis' Execution halted It seems that the error is related to something having to do with namespaces. The thing is that package outliers does not have a NAMESPACE file. Could this be an issue? Yes, you cannot do import(pkg) in the NAMESPACE file if pkg doesn't itself have a NAMESPACE file. So try just removing that from your NAMESPACE file. + seth -- Seth Falcon | Computational Biology | Fred Hutchinson Cancer Research Center http://bioconductor.org
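Seth's suggestion, as a sketch of a corrected NAMESPACE file (an assumption about the package layout: only outliers lacks a namespace, so it is dropped from import() and attached another way):

```
import(methods, Biobase)
# 'outliers' has no NAMESPACE of its own, so it cannot be imported here;
# list it in the Depends: field of DESCRIPTION instead, and it will be
# attached to the search path when pcrAnalysis is loaded
```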
Re: [R] 0 * NA = NA
Ted Harding wrote: Is there any way to force 0 * NA to be 0 instead of NA? No (AFAIK), and it is pretty reasonable to define it this way. If you want to treat the NAs as zeros, use x[is.na(x)] <- 0. Doing it in precisely that way would have the problem that it would not give you NA when it should. For example:

x <- c(1, NA, 1)
wt <- c(2, 1, 1)

Then, after x[is.na(x)] <- 0, the result of x %*% wt should be NA, but your method would give 3.

That's precisely my thought - since you may have read my thoughts, it's time to recalibrate my aluminium helmet. But I also thought about something else. What is the meaning of NA? NA is a _missing_ value, and is.infinite(NA) returns FALSE [OTOH, is.finite(NA) returns FALSE too - this is weird]. A missing value times zero is zero. OTOH, 1/NA is NA, so NA could mean Inf. Maybe binary logic can't adequately handle such ideas :-/ If (NA == 0) is NA, then is.finite(NA) should be NA too...

if (NA == 0) 2 else 3  # gives an error

This is why I suggested a method which tests for corresponding elements of x = NA and y = 0, since what Alberto Monteiro wanted was 0*NA = 0 when that combination occurs, i.e.

"%*NA%" <- function(x, y){
    X <- x; X[is.na(x) & (y == 0)] <- 0
    Y <- y; Y[is.na(y) & (x == 0)] <- 0
    return(X %*% Y)
}

This method is fine. I had already done something similar. Of course, the problem begins to grow if we want, for example, to use elementary matrices to transform a matrix. The 2x2 matrix that switches two lines, rbind(c(0,1), c(1,0)), will not switch a matrix with NAs:

switch <- rbind(c(0,1), c(1,0))
testmatrix <- rbind(c(1,2,3,4), c(5,6,7,8))
switch %*% testmatrix  # ok
testmatrix[2,2] <- NA
switch %*% testmatrix  # not ok

But I digress... Alberto Monteiro
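A quick check of the %*NA% idea from the thread (the operator is re-typed below; the weight vectors are made-up examples): an NA that meets a zero weight contributes 0, while an NA against a nonzero weight still propagates, which is exactly the distinction Ted's counter-example called for.

```r
# operator as suggested in the thread: zero out NAs only where the
# corresponding element of the other operand is 0
"%*NA%" <- function(x, y){
    X <- x; X[is.na(x) & (y == 0)] <- 0
    Y <- y; Y[is.na(y) & (x == 0)] <- 0
    X %*% Y
}

x <- c(1, NA, 1)
x %*NA% c(2, 0, 1)   # NA is paired with weight 0, so it drops out: 3
x %*NA% c(2, 1, 1)   # NA meets a nonzero weight, so the result is NA
```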
Re: [R] Identifying last record in individual growth data over different time intervalls
Finally I would like to have a data.frame t2 which only contains the entries of the last measurements. You could also use aggregate to get the max year per plate, then join that back to the original dataframe using merge on year and plate (common columns in both dataframes).

x <- data.frame(id = 1:8,
                plate = c(15,15,15,20,20,33,43,43),
                year = c(2004,2005,2006,2004,2005,2004,2005,2006),
                height = c(0.40,0.43,0.44,0.90,0.94,0.15,0.30,0.38))
merge(x, aggregate(list(year = x$year), list(plate = x$plate), max))

  plate year id height
1    15 2006  3   0.44
2    20 2005  5   0.94
3    33 2004  6   0.15
4    43 2006  8   0.38
[R] Fwd: RFA and nsRFA
-- Forwarded message -- From: amna khan [EMAIL PROTECTED] Date: Feb 25, 2007 8:37 AM Subject: RFA and nsRFA To: [EMAIL PROTECTED], R-help@stat.math.ethz.ch Dear Sir, there are two packages for regional frequency analysis, RFA and nsRFA. Do both give the same results? If not, which one would you suggest? I am confused about this. Please guide me in this regard. AMINA -- AMINA SHAHZADI Department of Statistics GC University Lahore, Pakistan. Email: [EMAIL PROTECTED]
[R] Fwd: nsRFA
-- Forwarded message -- From: amna khan [EMAIL PROTECTED] Date: Feb 25, 2007 8:44 AM Subject: nsRFA To: R-help@stat.math.ethz.ch, [EMAIL PROTECTED] Dear Sir, I do not understand the HOMTESTS in package nsRFA. Is the vector x the data from all sites combined into one vector? How do I assign cod? Your help is really appreciated. Regards, AMINA -- AMINA SHAHZADI Department of Statistics GC University Lahore, Pakistan. Email: [EMAIL PROTECTED]
[R] Matrix/dataframe indexing
Hi all, I am hoping someone can help me out with this: If I have a dataframe of years and ages, and the first column and first row are filled with leading values:

Df:
     age1  age2  age3
Yr1  1.0   0.4   0.16
Yr2  1.5   0     0
Yr3  0.9   0     0
Yr4  1.0   0     0
Yr5  1.2   0     0
Yr6  1.4   0     0
Yr7  0.8   0     0
Yr8  0.6   0     0
Yr9  1.1   0     0

Now the rest of the cells need to be filled according to the previous year and age cell, so, arbitrarily, cell [2,2] should be the value in cell [1,1] * exp(0.3), and cell [2,3] should be the value in cell [1,2] * exp(0.3), etc. How do I write the for loop so that it will calculate the missing cell values over both dimensions of the dataframe? Thanks in advance. Cameron Guenther, Ph.D. 100 8th Ave. SE St. Petersburg, Fl 33701 727-896-8626 ext. 4305 [EMAIL PROTECTED]
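What the question describes is a diagonal recursion: each cell is the cell one year up and one age back, times exp(0.3). A sketch of the double for loop (the growth factor exp(0.3) and the 3x3 excerpt of Df are taken from the question; note that later years reuse already-updated cells, which is what filling "according to the previous year and age cell" implies):

```r
Df <- data.frame(age1 = c(1, 1.5, 0.9),
                 age2 = c(0.4, 0, 0),
                 age3 = c(0.16, 0, 0),
                 row.names = c("Yr1", "Yr2", "Yr3"))

for (i in 2:nrow(Df)) {        # rows: years, starting from Yr2
    for (j in 2:ncol(Df)) {    # columns: ages, starting from age2
        Df[i, j] <- Df[i - 1, j - 1] * exp(0.3)
    }
}
Df
```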
[R] Fwd: RFA
-- Forwarded message -- From: amna khan [EMAIL PROTECTED] Date: Feb 25, 2007 8:51 AM Subject: RFA To: R-help@stat.math.ethz.ch, [EMAIL PROTECTED] Dear Sir, in the following example, is the vector lmom an L-moment ratios vector? What is meant by size = northCascades[,1]? And what are the values in c(0.0104, 0.0399, 0.0405)? Please help me, I am unable to understand these from the help manual. Best Regards, AMINA

data(northCascades)
lmom <- c(1, 0.1103, 0.0279, 0.1366)
kappaParam <- kappalmom(lmom)
heterogeneity(500, 19, size = northCascades[,1], kappaParam,
              c(0.0104, .0339, .0405))
## The heterogeneity statistics given by Hosking for this case
## study are H1 = 0.62, H2 = -1.49 and H3 = -2.37
## Taking into account sample variability, results should be
## consistent

-- AMINA SHAHZADI Department of Statistics GC University Lahore, Pakistan. Email: [EMAIL PROTECTED]
[R] Interface to round robin databases (RRDtool)
I was wondering if anyone has created an interface to RRDtool, which is a round robin database that will store timeseries data and aggregate the data so that the storage footprint stays relatively small (older data summarized into larger segments). -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
[R] ANNOUNCEMENT: 20% Discount on R books from Chapman Hall/CRC Press
Take advantage of a 20% discount on the most recent R books from Chapman Hall/CRC! Chapman and Hall/CRC is pleased to offer our latest books on R - all available through our website at a 20% discount to users of the software. To take advantage of this permanent offer, that is valid across the board for all of our R books, simply visit http://www.crcpress.com/, choose your titles, and insert the online discount code - 585HH - in the 'Promotion Code' field at checkout. Please note: this offer is permanent but is currently in addition to the 10% promotion running on our website until March 17th. So all prices below are good until that date and represent a 28% discount! Standard shipping is also free! *** NEW TITLES *** Analysis of Correlated Data with SAS and R, Third Edition Mohamed M. Shoukri, King Faisal Specialist Hospital Res. Ctr, Riyadh, Saudi Arabia Mohammad A. Chaudhary, Department of International Health, Baltimore, MD, USA Publication Date: 5/17/2007 Number of Pages: 320 This bestselling resource is one of the first books to discuss the methodologies used for the analysis of clustered and correlated data. It focuses on the analysis of correlated data from epidemiologic and medical investigations, details the statistical analysis of cross-classified data, and covers time series, repeated measures, and logistic regression data. This new edition includes R code for almost all the examples and a CD-ROM with all the datasets and SAS and R code. 
Discounted Price: $64.79 / £35.99 For more details and to order: http://www.crcpress.com/shopping_cart/products/product_detail.asp?sku=C6196 *** Correspondence Analysis in Practice, Second Edition Michael Greenacre, Universitat Pompeu Fabra, Barcelona, Spain Publication Date: 5/7/2007 Number of Pages: 264 Presenting a comprehensive introduction that employs a practical approach to theory and applications, this new edition provides a descriptive and exploratory statistical technique to analyze simple two-way and multi-way tables. It features a large number of examples and case studies with an emphasis on applications in marketing and the social and environmental sciences. Divided into self-contained module sections, the book provides objectives and summaries of theory in each chapter. All analyses are performed using R through the ca package created by the author, a leading authority on the topic. Datasets are also available for download via the Web. Discounted Price: $57.56 / £28.79 For more details and to order: http://www.crcpress.com/shopping_cart/products/product_detail.asp?sku=C6161 *** Niche Modeling: Predictions from Statistical Distributions David Stockwell, University of California San Diego, La Jolla, California, USA Publication Date: 12/15/2006 Number of Pages: 224 Using theory, applications, and examples of inferences, this book demonstrates how to conduct and evaluate niche modeling projects in any area of application. It features a series of theoretical and practical exercises for developing and evaluating niche models using R. The author discusses applications of predictive modeling methods with reference to valid inferences from assumptions. He elucidates varied and simplified examples with rigor and completeness. Topics include geographic information systems, multivariate modeling, artificial intelligence methods, data handling, and information infrastructure. 
Discounted Price: $64.79 / £35.99 For more details and to order: http://www.crcpress.com/shopping_cart/products/product_detail.asp?sku=C4940 *** Linear Mixed Models: A Practical Guide Using Statistical Software Brady T. West, Kathleen B. Welch, Andrzej T. Galecki, with contributions from Brenda W. Gillespie, University of Michigan, Ann Arbor, USA Publication Date: 11/22/2006 Number of Pages: 376 Simplifying the often confusing array of software programs for fitting linear mixed models (LMMs), Linear Mixed Models: A Practical Guide Using Statistical Software provides a basic introduction to primary concepts, notation, software implementation, model interpretation, and visualization of clustered and longitudinal data. This easy-to-navigate reference details the use of procedures for fitting LMMs in five popular statistical software packages: SAS, SPSS, Stata, R/S-plus, and HLM. Discounted Price: $57.56 / £32.39 For more details and to order: http://www.crcpress.com/shopping_cart/products/product_detail.asp?sku=C4800 *** CURRENT BESTSELLERS! *** A Handbook of Statistical Analyses Using R Brian S. Everitt, Institute of Psychiatry, King's College, London, UK Torsten Hothorn, Institut für Medizininformatik, Biometrie und Epidemiologie, Erlangen, Germany Publication Date: 2/17/2006 Number of Pages: 275 From simple inference to recursive partitioning and cluster analysis, this book methodically leads readers through the necessary steps, commands, and interpretation of results - addressing theory and
[R] error message when using outer function
Dear R-users, I have two sets of code that appear to me to be equivalent, shown below, and yet I get the error message

Error in dim(robj) <- c(dX, dY) : dim<- : dims [product 4] do not match the length of object [1]

after executing the assignment to logdens2. Both functions post.a1 and post.a2 return the same values when run alone and not embedded in the function outer. I would appreciate help in understanding the differences between these two sets of code. The code in post.a1 is from Gelman, Carlin, Stern, and Rubin's Bayesian Data Analysis solutions, problem 3.5. I was trying to modify this code in post.a2 when I ran into this error.

post.a1 <- function(mu, sd, y){
    ldens <- 0
    for (i in 1:length(y)) ldens <- ldens + log(dnorm(y[i], mu, sd))
    ldens
}
y <- c(10, 10, 12, 11, 9)
mugrid <- c(10, 11)
sdgrid <- c(1, 1.2)
logdens1 <- outer(mugrid, sdgrid, post.a1, y)  # *** no error messages ***

post.a2 <- function(mu, sd, y) {
    ldens <- sum(log(dnorm(y, mu, sd)))
    ldens
}
y <- c(10, 10, 12, 11, 9)
mugrid <- c(10, 11)
sdgrid <- c(1, 1.2)
logdens2 <- outer(mugrid, sdgrid, post.a2, y)  # *** error message occurs here ***

Thank You! Grant Reinman e-mail: [EMAIL PROTECTED]
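The difference is that outer() does not loop over cells: it expands mugrid and sdgrid into vectors of length 4 and calls FUN once, expecting a length-4 result. post.a1 stays vectorized because dnorm() recycles mu and sd elementwise, while sum() in post.a2 collapses everything to a single number, hence the complaint about length 1 versus product 4. One standard fix is to wrap the scalar function in Vectorize(); a sketch reusing the definitions from the post:

```r
post.a1 <- function(mu, sd, y){
    ldens <- 0
    for (i in 1:length(y)) ldens <- ldens + log(dnorm(y[i], mu, sd))
    ldens
}
post.a2 <- function(mu, sd, y) sum(log(dnorm(y, mu, sd)))

y <- c(10, 10, 12, 11, 9)
mugrid <- c(10, 11)
sdgrid <- c(1, 1.2)

logdens1 <- outer(mugrid, sdgrid, post.a1, y)
# Vectorize() maps post.a2 over each (mu, sd) pair, passing y along unchanged
logdens2 <- outer(mugrid, sdgrid, Vectorize(post.a2, c("mu", "sd")), y)
all.equal(logdens1, logdens2)
```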
[R] Help on installing RScaLAPACK on Ubuntu
I am trying to install RScaLAPACK on Ubuntu 6.10 with LAM 7.0.x. Does anybody know a useful link to some how-to site about RScaLAPACK? By now I manage to get the package compiling, but the linker shows me lots of unresolved references:

sudo R CMD INSTALL RScaLAPACK_0.5.1.tar.gz --configure-args=--with-mpi=/usr/lib/lam:
* Installing *source* package 'RScaLAPACK' ...
configure: MPI_HOME=/usr/lib/lam .. is set
configure: BLACS_LIB=/usr/lib .. is set
configure: BLAS_LIB=/usr/lib .. is set
configure: SCALAPACK_LIB=/usr/lib .. is set
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ANSI C... none needed
checking for pthread_atfork in -lpthread... yes
checking for LAM-MPI... checking LAM-MPI Libraries at /usr/lib/lam...
configure: LAM-MPI lib detected @ /usr/lib/lam..
Configured Parameters ...
LIBS = -lscalapack -lblacsF77init -lblacsCinit -lblacs -lf77blas -latlas -llamf77mpi -lmpi -llam -lpthread
LDFLAGS = -L/usr/lib -L/usr/lib -L/usr/lib/lib -L/usr/lib/lam/lib
CFLAGS = -I/usr/lib/lam/include -g -O2 -std=gnu99
PALIBS = -lmpi -llam -lpthread
... *** ...
configure: creating ./config.status
config.status: creating src/Makefile
configure: creating ./config.status
config.status: creating src/Makefile
config.status: creating R/StartUpLam.R
** libs
** arch -
gcc -I/usr/share/R/include -I/usr/share/R/include -fpic -I/usr/lib/lam/include -g -O2 -std=gnu99 -c CRDriver.c -o CRDriver.o
gcc -I/usr/share/R/include -I/usr/share/R/include -fpic -I/usr/lib/lam/include -g -O2 -std=gnu99 -c CRscalapack.c -o CRscalapack.o
gfortran -fpic -g -O2 -c callpdgesv.f -o callpdgesv.o
gfortran -fpic -g -O2 -c callpdgeqrf.f -o callpdgeqrf.o
gfortran -fpic -g -O2 -c callpdgesvd.f -o callpdgesvd.o
gfortran -fpic -g -O2 -c callpdgemm.f -o callpdgemm.o
gfortran -fpic -g -O2 -c callpdpotrf.f -o callpdpotrf.o
gfortran -fpic -g -O2 -c callpdpotri.f -o callpdpotri.o
gfortran -fpic -g -O2 -c callpdsyevd.f -o callpdsyevd.o
gfortran -fpic -g -O2 -c CRcollectData.f -o CRcollectData.o
gfortran -fpic -g -O2 -c CRdistData.f -o CRdistData.o
gcc CRDriver.o CRscalapack.o callpdgesv.o callpdgeqrf.o callpdgesvd.o callpdgemm.o callpdpotrf.o callpdpotri.o callpdsyevd.o CRcollectData.o CRdistData.o -L/usr/lib -L/usr/lib -L/usr/lib/lib -L/usr/lib/lam/lib -lscalapack -lblacsF77init -lblacsCinit -lblacs -lf77blas -latlas -llamf77mpi -lmpi -llam -lpthread -I/usr/lib/lam/include -g -O2 -std=gnu99 -lg2c -o CRDriver
callpdgesv.o: In function `callpdgesv_':
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgesv.f:96: undefined reference to `blacs_pinfo_'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgesv.f:99: undefined reference to `blacs_get_'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgesv.f:100: undefined reference to `blacs_gridinit_'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgesv.f:101: undefined reference to `blacs_gridinfo_'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgesv.f:201: undefined reference to `blacs_gridexit_'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgesv.f:148: undefined reference to `_gfortran_st_write'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgesv.f:147: undefined reference to `_gfortran_transfer_character'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgesv.f:147: undefined reference to `_gfortran_transfer_integer'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgesv.f:148: undefined reference to `_gfortran_transfer_integer'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgesv.f:148: undefined reference to `_gfortran_st_write_done'
callpdgeqrf.o: In function `callpdgeqrf_':
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgeqrf.f:95: undefined reference to `blacs_pinfo_'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgeqrf.f:98: undefined reference to `blacs_get_'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgeqrf.f:99: undefined reference to `blacs_gridinit_'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgeqrf.f:100: undefined reference to `blacs_gridinfo_'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgeqrf.f:253: undefined reference to `blacs_gridexit_'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgeqrf.f:213: undefined reference to `blacs_barrier_'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgeqrf.f:163: undefined reference to `_gfortran_st_write'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgeqrf.f:163: undefined reference to `_gfortran_transfer_character'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgeqrf.f:163: undefined reference to `_gfortran_transfer_integer'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgeqrf.f:163: undefined reference to `_gfortran_st_write_done'
/tmp/R.INSTALL.FZN392/RScaLAPACK/src/callpdgeqrf.f:156: undefined reference to `_gfortran_st_write'
[R] generate random numbers for regression model
Hi, please help me with code for the following model:

y = f(x) + e,  e ~ N(0, sigma^2),  where f(x) = 1/(1 + x^2).

Generate {(x_i, y_i), i = 1, 2, ..., n}, where the x_i are ordered.

H(x_j) = sum_{i=1, i != j-1} y_{i+1}/(x_{i+1} - x_j) * (x_{i+1} - x_i),  j = 1, ..., n

Find {(x_i, H(x_i)), i = 1, ..., n}. Then find {(x_i, H(H(x_i))), i = 1, ..., n}.

I do appreciate your help. Thank you in advance, fatemah __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
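A minimal simulation sketch for the first step (assumptions of mine, not stated in the post: the parenthesization f(x) = 1/(1 + x^2), n = 50, sigma = 0.1, and x drawn uniformly then sorted):

```r
## Simulate y = f(x) + e with f(x) = 1/(1 + x^2), e ~ N(0, sigma^2).
## Assumed: n = 50, sigma = 0.1, x from U(-3, 3), sorted.
set.seed(1)
n     <- 50
sigma <- 0.1
f     <- function(x) 1 / (1 + x^2)
x     <- sort(runif(n, -3, 3))        # ordered x_i
y     <- f(x) + rnorm(n, 0, sigma)    # y_i = f(x_i) + e_i
plot(x, y); curve(f, add = TRUE)
```

The H() step is left out because its parenthesization in the post is ambiguous.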
Re: [R] 0 * NA = NA
On 05-Mar-07 Alberto Monteiro wrote:
Of course, the problem begins to grow if we want, for example, to use elementary matrices to transform a matrix. The 2x2 matrix that switches two lines, rbind(c(0,1), c(1,0)), will not switch a matrix with NAs:

switch <- rbind(c(0,1), c(1,0))
testmatrix <- rbind(c(1,2,3,4), c(5,6,7,8))
switch %*% testmatrix # ok
testmatrix[2,2] <- NA
switch %*% testmatrix # not ok

Indeed! -- which is the sort of reason I said "This is a bit of a tricky one, especially in a more general context." There is no straightforward extension of my %*NA% operator which deals with such a case, since the internal assignments

X <- x; X[(is.na(x)) & (y==0)] <- 0
Y <- y; Y[(is.na(Y)) & (x==0)] <- 0

fail because x (switch) and y (testmatrix) are non-conformable (one being 2x2, the other 2x4). Nor will it work if conformable, since then the 0's in switch take out the NA in the %*NA% operator:

testmatrix <- rbind(c(1,2), c(3,4))
testmatrix
     [,1] [,2]
[1,]    1    2
[2,]    3    4
switch %*NA% testmatrix
     [,1] [,2]
[1,]    3    4
[2,]    1    2   ## OK
testmatrix[2,2] <- NA
switch %*NA% testmatrix
     [,1] [,2]
[1,]    3    0
[2,]    1    2   ## Not OK!

So, if you want to simply multiply testmatrix by switch, with terms 0*NA = 0, then you're OK; but you can't then use the same operator for the purpose of switching rows, so you need a new operator just for that kind of purpose. Of course, for that specific purpose, index manipulation will do the job:

testmatrix
     [,1] [,2]
[1,]    1    2
[2,]    3   NA
testmatrix[(2:1),]
     [,1] [,2]
[1,]    3   NA
[2,]    1    2

but it then disconnects it from the correspondence between matrix multiplication and transformations.

Best wishes, Ted.

E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 05-Mar-07 Time: 19:43:23 -- XFMail --
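For what it's worth, one way to get both behaviours is to apply the 0*NA = 0 rule term by term inside the matrix product rather than pre-zeroing whole matrices. A sketch (the operator name %*na% and the looped implementation are mine, not Ted's %*NA%):

```r
## Termwise 0*NA = 0 matrix product: each product term is 0 whenever
## either factor is a (non-NA) 0; NA survives only when paired with a
## nonzero partner.
"%*na%" <- function(x, y) {
  stopifnot(ncol(x) == nrow(y))
  Z <- matrix(NA_real_, nrow(x), ncol(y))
  for (i in seq_len(nrow(x)))
    for (j in seq_len(ncol(y))) {
      term <- x[i, ] * y[, j]
      term[(!is.na(x[i, ]) & x[i, ] == 0) |
           (!is.na(y[, j]) & y[, j] == 0)] <- 0
      Z[i, j] <- sum(term)
    }
  Z
}
sw <- rbind(c(0, 1), c(1, 0))
sw %*na% rbind(c(1, 2), c(3, NA))   # switches rows, NA preserved
```

The explicit loops are slow for large matrices; the point is only the termwise rule.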
Re: [R] logistic regression on contingency table
On Mon, 2007-03-05 at 15:31 +, Dieter Menne wrote:
Bingshan Li bli1 at bcm.tmc.edu writes:

I am wondering if there is a way in R to fit a logistic regression on a contingency table. If I have the original data, I can transform the data into a design matrix and then call glm to fit the regression. But now I have a 2x3 contingency table with the first row for response 0 and the second row for response 1, and the columns are 3 levels of the predictor variable. The 3 levels are not ordinal, though, and indicator variables would be more appropriate.

From the documentation of glm: For binomial and quasibinomial families the response can also be specified as a factor (when the first level denotes failure and all others success) or as a two-column matrix with the columns giving the numbers of successes and failures.

Dieter Menne

Just to expand on Dieter's comments, one trick to taking this approach is to coerce the contingency table you are starting with to a data frame, and then specify a 'weights' argument to glm(). Taking some dummy example data in a 2D contingency table:

TAB
   X
Y   A B C
  0 1 9 2
  1 3 3 2

So we have X (IV) and Y (Response). Now, coerce TAB to a data frame. See ?as.data.frame.table and ?xtabs, which reverses the process back to a contingency table:

DFT <- as.data.frame(TAB)
DFT
  Y X Freq
1 0 A    1
2 1 A    3
3 0 B    9
4 1 B    3
5 0 C    2
6 1 C    2

As an FYI, this gets us back to 'TAB':

xtabs(Freq ~ ., DFT)
   X
Y   A B C
  0 1 9 2
  1 3 3 2

Now create the model, using 'Freq' for the case weights:

fit <- glm(Y ~ X, weights = Freq, data = DFT, family = binomial)
summary(fit)

Call:
glm(formula = Y ~ X, family = binomial, data = DFT, weights = Freq)

Deviance Residuals:
     1      2      3      4      5      6
-1.665  1.314 -2.276  2.884 -1.665  1.665

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)    1.099      1.155   0.951   0.3414
XB            -2.197      1.333  -1.648   0.0994 .
XC            -1.099      1.528  -0.719   0.4720
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 26.920 on 5 degrees of freedom
Residual deviance: 23.540 on 3 degrees of freedom
AIC: 29.54

Number of Fisher Scoring iterations: 4

An alternative to the above is to use a function that I just posted in the past week for another query, called expand.dft():

expand.dft <- function(x, na.strings = "NA", as.is = FALSE, dec = ".") {
  DF <- sapply(1:nrow(x), function(i)
               x[rep(i, each = x$Freq[i]), ],
               simplify = FALSE)
  DF <- subset(do.call(rbind, DF), select = -Freq)
  for (i in 1:ncol(DF)) {
    DF[[i]] <- type.convert(as.character(DF[[i]]),
                            na.strings = na.strings,
                            as.is = as.is, dec = dec)
  }
  DF
}

This takes the data frame table 'DFT' from above and converts it back to the raw observations:

DF <- expand.dft(DFT)
DF
    Y X
1   0 A
2   1 A
2.1 1 A
2.2 1 A
3   0 B
3.1 0 B
3.2 0 B
3.3 0 B
3.4 0 B
3.5 0 B
3.6 0 B
3.7 0 B
3.8 0 B
4   1 B
4.1 1 B
4.2 1 B
5   0 C
5.1 0 C
6   1 C
6.1 1 C

As an FYI, this gets us back to 'TAB':

table(DF)
   X
Y   A B C
  0 1 9 2
  1 3 3 2

So, now we can use the normal approach for glm():

fit2 <- glm(Y ~ X, data = DF, family = binomial)
summary(fit2)

Call:
glm(formula = Y ~ X, family = binomial, data = DF)

Deviance Residuals:
    Min      1Q  Median      3Q     Max
-1.6651 -0.7585 -0.7585  0.8632  1.6651

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)    1.099      1.155   0.951   0.3414
XB            -2.197      1.333  -1.648   0.0994 .
XC            -1.099      1.528  -0.719   0.4720
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 26.920 on 19 degrees of freedom
Residual deviance: 23.540 on 17 degrees of freedom
AIC: 29.54

Number of Fisher Scoring iterations: 4

Note of course that the DFs (degrees of freedom) are different, though the Null and Residual deviances and AIC are the same. Taking the latter approach will, of course, enable you to make subsequent manipulations on the raw data if you wish.
HTH, Marc Schwartz
Re: [R] error message when using outer function
Comments inside.

Reinman, Grant wrote:
Dear R-users, I have two sets of code that appear to me to be equivalent, shown below, and yet I get the error message

Error in dim(robj) <- c(dX, dY) : dim<- : dims [product 4] do not match the length of object [1]

after executing the assignment to logdens2. Both functions post.a1 and post.a2 return the same values when run alone and not embedded in the function outer. I would appreciate help in understanding the differences between these two sets of code. The code in post.a1 is from Gelman, Carlin, Stern, and Rubin's Bayesian Data Analysis solutions, problem 3.5. I was trying to modify this code in post.a2 when I ran into this error.

post.a1 <- function(mu,sd,y){
  ldens <- 0
  for (i in 1:length(y)) ldens <- ldens + log(dnorm(y[i],mu,sd))
  ldens}

y <- c(10,10,12,11,9)
mugrid <- c(10,11)
sdgrid <- c(1,1.2)
logdens1 <- outer(mugrid, sdgrid, post.a1, y) #*** no error messages ***

Actually pretty ugly coding (for a textbook) in my opinion... Try what you get with

y <- c(10,10,12,11,9)
mugrid <- c(10,11)
sdgrid <- c(1,1.2)
log(dnorm(y[1],mean=mugrid,sd=sdgrid))
[1] -0.9189385 -1.4484823

This is in perfect accordance with help(dnorm), which says you are allowed to specify a vector of means and SDs. Here 2 values were specified, so you obtain 2 density values. Now, adding a vector of length 2 to a constant:

#not run here#
ldens <- ldens + log(dnorm(y[i],mu,sd))

is again a vector of length 2. However, using sum() in your code below gives you just a single number, and R starts complaining because the dimensions don't match.

Petr

post.a2 <- function(mu,sd,y) {
  ldens <- sum(log(dnorm(y,mu,sd)))
  ldens }

y <- c(10,10,12,11,9)
mugrid <- c(10,11)
sdgrid <- c(1,1.2)
logdens2 <- outer(mugrid, sdgrid, post.a2, y) #*** error message occurs here ***

Thank You!
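A common follow-up fix (my suggestion, not from the thread): wrap the scalar-valued function in Vectorize() so that outer() gets the elementwise behaviour it expects:

```r
post.a2 <- function(mu, sd, y) sum(log(dnorm(y, mu, sd)))
y      <- c(10, 10, 12, 11, 9)
mugrid <- c(10, 11)
sdgrid <- c(1, 1.2)
## Vectorize over mu and sd so outer() can pass whole grids at once
logdens2 <- outer(mugrid, sdgrid, Vectorize(post.a2, c("mu", "sd")), y = y)
logdens2   # 2 x 2 matrix of summed log-densities
```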
Grant Reinman e-mail: [EMAIL PROTECTED]

--
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic
Re: [R] Non : Confidence intervals for p**2 ??
Öhagen Patrik Patrik.Ohagen at mpa.se writes:

Dear List, I was asked to calculate a confidence interval for p*p. Are there any standard techniques for calculating such an interval? Delta method? Thank you in advance!

If p is a generic value (i.e. not a probability) and you know the variance (and are willing to assume normality) then you can indeed use the delta method; there are a variety of other techniques if you have the original data: fitting profile confidence limits, various resampling methods including bootstrapping, etc. (See section 6 of chapter 7 at http://www.zoo.ufl.edu/emdbook for more details if you like.)

Ben Bolker
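If p happens to be a binomial proportion x/n, the delta-method interval Ben mentions can be sketched in a few lines (the data values here are hypothetical):

```r
## Delta-method 95% CI for p^2, assuming p = x/n from a binomial sample.
## g(p) = p^2, g'(p) = 2p, so var(p^2) ~ (2p)^2 * var(p).
x <- 40; n <- 100                      # hypothetical data
p <- x / n
var.p  <- p * (1 - p) / n
var.p2 <- (2 * p)^2 * var.p
p^2 + c(-1, 1) * qnorm(0.975) * sqrt(var.p2)   # approximate 95% CI
```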
[R] Scoping issue? [SUMMARY]
A BIG thanks to Luke Tierney and Jim Holtman (and perhaps others? I haven't seen the latest digest) for pointing out two fixes to some code that was troubling me. I have since found a third problem, and with that, have good code. Their comments are summarized below this fixed version of the code.

mmatplot <- function(colnum, x, y, addto = 0, titleroot = "Column #", ...){
  switch(class(y),
         array = y <- y[, colnum, ],
         list  = y <- sapply(X = y, FUN = function(d) d[, colnum])) # Thanks Luke!
  stopifnot(is.matrix(y))
  matplot(x, y, main = paste(titleroot, colnum), ...)
}

This function is a matplot wrapper, useful (I thought) within lapply or sapply to visually compare several like columns of several matrices. Arg y is either a list of matrices with an equal number of rows, or an array. The first arg, scalar colnum, gives the column of each matrix (or array slab) to plot. The fact that it is in first position allows the function to be used in lapply. par values and matplot args are accepted. Here is the tester function. It also needed fixing. I had called mapply instead of lapply by mistake, but using lapply arguments!

###
mmatplotTest <- function(){
  A <- array(data = rnorm(90), dim = c(10, 3, 3))
  L <- list(A[, , 1], A[, , 2], A[, , 3])
  oldmf <- par("mfrow")
  par(mfrow = c(2,3))
  # Test with class(y) == "array"
  lapply(X = 1:ncol(A), FUN = mmatplot, x = 1:nrow(A), y = A,
         titleroot = "Array, column #")
  # Test with class(y) == "list"
  lapply(1:ncol(L[[1]]), mmatplot, x = 1:nrow(L[[1]]), y = L,
         titleroot = "Listed matrices, column #")
  par(mfrow = oldmf)
}
mmatplotTest()

Regarding the original, broken version... Jim was first to point out that 'colnum' did not exist when my 'paste' call was made. I asked why, since I thought lazy evaluation should get around this problem. Luke explained:

In your test function there is no lexically visible definition for the `colnum` variable used in defining main, so that error is what you expect from lexical scoping.
Lazy evaluation dictates when (and if) the `main` argument is evaluated, but the environment in which it is evaluated is determined by the context where the expression is written in the code, i.e. within the function mmatplotTest.

Luke also pointed out the source of another error:

The error you get when you take out the `main` is coming from `subset` and is due to the fact that subset is one of those functions that uses non-standard evaluation for some of its arguments, in this case `select`. This makes it (slightly) easier to use at interactive top level but much more complicated to use within a function. You need to use another function in your sapply call.

I incorporated his suggested alternative above. Jim suggested sidestepping the entire problem by using explicit, R-level looping instead of lapply (mapply in my original flawed version!), asserting that in this case it is even faster, but that, in any case, most time is spent by the mmatplot function itself, rendering inconsequential the method of iterating it (also, I should add, rendering rather superfluous my mmatplot function altogether!).

Thanks again, Luke and Jim, and whoever else considered my problem!

-John Thaden
Little Rock, AR, USA
[R] factor analysis and pattern matrix
Hi, In a discussion of factor analysis in Using Multivariate Statistics by Tabachnick and Fidell, two matrices are singled out as important for interpreting an exploratory factor analysis (EFA) with an oblique promax rotation. One is the structure matrix. The structure matrix contains the correlations between variables and factors. However, these correlations may be inflated because some of the variance in a factor may not be unique to it. To address this and facilitate the interpretation of factors, the pattern matrix can be calculated, as it contains the unique correlations between variables and factors (that is, the variance shared among factors has been removed). Are the loadings returned from factanal() with a promax rotation the structure or the pattern matrix? How do I calculate whichever one of the matrices is not returned by factanal? Thanks, Steve
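For what it's worth, my understanding (treat this as an assumption to verify, not a definitive answer) is that the rotated loadings factanal() reports are the pattern matrix, and that the structure matrix is pattern %*% Phi, where Phi is the factor correlation matrix. A sketch of one commonly posted recipe, assuming a numeric data matrix X:

```r
## Sketch: recover pattern and structure matrices after a promax
## rotation. Assumes X is a numeric data matrix; the Phi formula below
## (from the promax rotation matrix) is the part to double-check.
fa  <- factanal(X, factors = 2, rotation = "none")
pm  <- promax(loadings(fa))
Phi <- solve(t(pm$rotmat) %*% pm$rotmat)   # factor correlation matrix
pattern <- unclass(pm$loadings)            # unique variable-factor loadings
struct  <- pattern %*% Phi                 # variable-factor correlations
```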
[R] Rbind with data frames -- column names question
As part of my work, I am trying to append matrices onto data frames. Naively I assumed that when rbinding a data.frame and a matrix, the matrix would be coerced and appended, keeping the names from the data frame. Clearly, I am not fully understanding the process by which rbind works. Example code:

A <- data.frame(1,1,1); names(A) = letters[1:3]; B <- matrix(0,2,3)
rbind(A,B)
Error in match.names(clabs, names(xi)) : names do not match previous names: V1, V2, V3
rbind(A,as.data.frame(B))
Error in match.names(clabs, names(xi)) : names do not match previous names: V1, V2, V3

Is there a right way to combine the two such that both end up having the same column names? I have tried to understand the deparse.level argument of rbind, but it doesn't seem to do what I'm asking. Thank you for any help you can give.

Gregg

--
Gregg Lind, M.S.
Division of Epidemiology and Community Health
School of Public Health
University of Minnesota, United States
[R] Rbind with data frames -- column names question
Gregg,

What about

A <- data.frame(1,1,1); names(A) = letters[1:3]; B <- matrix(0,2,3)
B <- as.data.frame(B)
names(B) <- names(A)
rbind(A,B)

-Cody
Re: [R] Rbind with data frames -- column names question
This takes care of things quite nicely. The other solutions (explicitly coercing things) work as well, but this seems to me the minimum necessary solution for my particular problem. (P.S.: I hope I responded in the correct way to ensure threading... to the main list address.)

A <- data.frame(1,1,1); names(A) = letters[1:3]; B <- matrix(0,2,3)
colnames(B) <- colnames(A)
rbind(A,B)

GL

jim holtman wrote:
colnames(B) <- colnames(A)

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?
Re: [R] Rbind with data frames -- column names question
I don't know if this is the right way, but I think this would work (I'm assuming you want the result to be a data frame):

data.frame(rbind(as.matrix(A),B))

You might get a warning about row names, but I think it works OK.

On 05/03/07, Gregg Lind [EMAIL PROTECTED] wrote:

--
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
[R] Linear programming with sparse matrix input format?
Hi. I am aware of three different R packages for linear programming: glpk, linprog, lpSolve. From what I can tell, if there are N variables and M constraints, all these solvers require the full NxM constraint matrix. Some linear solvers I know of (not in R) have a sparse matrix input format. Are there any linear solvers in R that have a sparse matrix input format? (Including the possibility of glpk, linprog, and lpSolve, in case I might have missed something in the documentation.) Thanks!

-- TMK --
212-460-5430 home
917-656-5351 cell
[R] Mixed effects multinomial regression and meta-analysis
R Experts: I am conducting a meta-analysis where the effect measures to be pooled are simple proportions. For example, consider this data from Fleiss/Levin/Paik's Statistical Methods for Rates and Proportions (2003, p. 189) on smokers:

Study    N   Event  P(Event)
1       86      83     0.965
2       93      90     0.968
3      136     129     0.949
4       82      70     0.854
Total  397     372

A test of heterogeneity for a table like this could simply be Pearson's chi-square test:

smoke.data <- matrix(c(83,90,129,70,3,3,7,12), ncol=2, byrow=F)
chisq.test(smoke.data, correct=T)
X-squared = 12.6004, df = 3, p-value = 0.005585

Now this test implies that the data are heterogeneous and that pooling might be inappropriate. This type of analysis could be considered a fixed effects analysis because it assumes that the 4 studies are all coming from one underlying population. But what if I wanted to do a mixed effects (fixed + random) analysis of data like this, possibly adjusting for an important covariate or two (assuming I had more studies, of course)... how would I go about doing it? One thought that I had would be to use a mixed effects multinomial logistic regression model, such as that reported by Hedeker (Stat Med 2003, 22: 1433), though I don't know if (or where) it is implemented in R. I am certain there are also other ways... So, my questions to the R experts are:

1) What method would you use to estimate or account for the between-study variance in a dataset like the one above that would also allow you to adjust for a variable that might explain the heterogeneity?
2) Is it implemented in R?

Brant Inman
Mayo Clinic
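This doesn't answer the multinomial question, but as a baseline, a random-effects (DerSimonian-Laird) pooling of the logit-transformed proportions from the table above can be done in base R (my sketch; the logit-scale normal approximation is an assumption, and it does not accommodate covariates):

```r
## Random-effects pooling of logit proportions (DerSimonian-Laird).
ev <- c(83, 90, 129, 70)           # events per study
n  <- c(86, 93, 136, 82)           # study sizes
y  <- log(ev / (n - ev))           # logit of each proportion
v  <- 1 / ev + 1 / (n - ev)        # approximate within-study variances
w  <- 1 / v
Q  <- sum(w * (y - sum(w * y) / sum(w))^2)        # heterogeneity statistic
tau2 <- max(0, (Q - (length(y) - 1)) /
               (sum(w) - sum(w^2) / sum(w)))      # DL between-study variance
wr <- 1 / (v + tau2)
pooled <- sum(wr * y) / sum(wr)                   # pooled logit
plogis(pooled)                                    # back-transform to a proportion
```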
[R] Document classes with tm
Does anyone have any tips for using the tm package to support auto-classifying textual documents? While tm works very well for parsing text documents and creating term-document matrices, it doesn't seem to support tracking document classes by default. Without a way to know the classes of your training documents, building a classifier is kind of a non-starter. I know I could just do this manually by reading in the classes from a csv, but I'm hoping there is a facility in tm for doing this that I'm just missing. Thanks, alan
Re: [R] How to override ordering of panels in xyplot()
On Mar 3, 2007, at 3:19 PM, Deepayan Sarkar wrote:

On 3/3/07, Michael Kubovy [EMAIL PROTECTED] wrote:
Dear r-helpers, I'm conditioning an xyplot on a variable whose levels are 'low', 'med', 'high'. How do I override the alphabetical ordering for the panels of the plot?

This has less to do with xyplot and more to do with the default of the 'levels' argument in the factor() function. Just make sure the levels are in the right order in your data when xyplot is called.

Unless one makes the factor ordered, reordering the levels of a factor does not seem to be a trivial matter. The only R function I've found that makes it easy is reorder_factor() (package:reshape). Or am I missing something?

_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O. Box 400400, Charlottesville, VA 22904-4400
Parcels: Room 102, Gilmer Hall, McCormick Road, Charlottesville, VA 22903
Office: B011 +1-434-982-4729
Lab: B019 +1-434-982-4751
Fax: +1-434-982-4766
WWW: http://www.people.virginia.edu/~mk9y/
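For the record, reordering levels without making the factor ordered only takes a call to factor() with explicit levels (d, g, x and y are hypothetical names standing in for the poster's data):

```r
## Put the conditioning variable's levels in the desired panel order.
library(lattice)
d$g <- factor(d$g, levels = c("low", "med", "high"))
xyplot(y ~ x | g, data = d)   # panels now appear low, med, high
```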
Re: [R] Linear programming with sparse matrix input format?
If you can reformulate your LP as an L1 problem, which is known to be possible without loss of generality, but perhaps not without loss of sleep, then you could use the sparse quantile regression functions in the quantreg package.

url: www.econ.uiuc.edu/~roger
Roger Koenker
email: [EMAIL PROTECTED]
Department of Economics
University of Illinois
Champaign, IL 61820
vox: 217-333-4558
fax: 217-244-6678

On Mar 5, 2007, at 5:30 PM, Talbot Katz wrote:
Re: [R] Heteroskedastic Time Series
Check the fSeries library.

On 3/5/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
Hi R-helpers, I'm new to time series modelling, but my requirement seems to fall just outside the capabilities of the arima function in R. I'd like to fit an ARMA model where the variance of the disturbances is a function of some exogenous variable. So something like:

Y_t = a_0 + a_1 * Y_(t-1) + ... + a_p * Y_(t-p) + b_1 * e_(t-1) + ... + b_q * e_(t-q) + e_t,

where e_t ~ N(0, sigma^2_t), and with the variance specified by something like

sigma^2_t = exp(beta_t * X_t),

where X_t is my exogenous variable. I would be very grateful if somebody could point me in the direction of a library that could fit this (or a similar) model. Thanks,

James Kirkby
Actuarial Maths and Stats
Heriot Watt University

--
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)
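If nothing packaged fits exactly, a rough two-stage approximation can be sketched in base R (my suggestion, not from the thread; y and x are the response series and exogenous variable, and this is not a proper joint maximum-likelihood fit):

```r
## Stage 1: fit the ARMA mean model.
## Stage 2: regress log squared residuals on the exogenous variable,
## approximating log sigma^2_t = beta * x_t from the post.
fit  <- arima(y, order = c(1, 0, 1))       # ARMA(1,1) for the mean
e2   <- residuals(fit)^2
vfit <- lm(log(e2 + 1e-8) ~ x)             # crude variance regression
```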
[R] different random effects for each level of a factor in lme
I have an interesting lme problem. The data is part of the Master's thesis of my friend, and she had some problems analysing this data, until one of her jurors proposed to use linear mixed-effects models. I'm trying to help her since she has no experience with R. I'm very used to R but have very little experience with lme. The group calls of one species of parrot were recorded at many localities on the mainland and on islands. Within the localities the parrots move in groups; several calls were recorded for each group, but the calls cannot be separated by individuals. We use the variable s1 to measure one property of the calls (the length of the first part of the call). We are interested in explaining the variability of the calls; one hypothesis is that variability of calls tends to be greater on islands compared with the mainland. So we have

s1   : the measure of interest
loc  : a grouping variable (locality)
grp  : a grouping variable (group), nested in loc
isla : a factor that identifies which localities are on islands and which are on the mainland (it is outer to loc)

I began with a simple model with fixed effects in isla (since there are some differences in the length of s1) and nested random effects:

f00 <- lme(s1~isla, data=s1.ap, random=~1|loc/grp)

My final model should have fixed effects in isla, different nested random effects for both levels of isla, and a different error per stratum, something like:

f11 <- lme(s1~isla, data=s1.ap, random=~isla|loc/grp,
           weights=varIdent(form=~1|isla))

or perhaps:

f11b <- lme(s1~isla, data=s1.ap, random=~isla-1|loc/grp,
            weights=varIdent(form=~1|isla))

Is this a valid formulation? I have seen that ~x|g1/g2 is usually used for modelling random effects in (the intercept and slope of) covariates and that ~1|g1/f1 is used to model interactions between grouping factors and treatment factors.
I fitted the above models (and a few other variants between the simpler and the more complex ones) and found f11 to be the best model; f11 and f11b are identical in terms of AIC or LR tests since they are the same model under different parametrizations (I guess...). Now, supposing I did everything right, I want to compare the variance decomposition on islands and mainland, so I use

VarCorr(f11)
            Variance  StdDev   Corr
loc = pdLogChol(isla)
(Intercept) 1643.5904 40.54122 (Intr)
islaT        962.2991 31.02095 -0.969
grp = pdLogChol(isla)
(Intercept)  501.7315 22.39936 (Intr)
islaT        622.5393 24.95074 -0.818
Residual     547.0888 23.38993

VarCorr(f11b)
          Variance  StdDev   Corr
loc = pdLogChol(isla - 1)
islaI     1643.4821 40.53988 islaI
islaT      168.6514 12.98659 0
grp = pdLogChol(isla - 1)
islaI      501.7357 22.39946 islaI
islaT      209.8698 14.48688 0
Residual   547.0871 23.38989

The variance for islands (islaI) is always greater than the one for the mainland (islaT), as expected, and the estimates for the Intercept in f11 are nearly equal to the estimates for islaI in f11b. However, the estimates for islaT are completely different. It seems to me that the estimates in f11b are the correct ones and the ones in f11 are obtained by a reparametrization of the variance-covariance matrix. Am I right?

I want to say what percentage of the variance is explained by each level, something like this:

tmp <- data.frame(I = c(1643.48, 501.74, 547.09),
                  T = c(168.65, 209.86, 0.7823097 * 547.09))
rownames(tmp) <- c("loc", "grp", "res")
t(t(tmp) / colSums(tmp))
            I         T
loc 0.6104349 0.2091125
grp 0.1863604 0.2602096
res 0.2032047 0.5306780

(0.7823097 was the result of varIdent for islaT.)

If I compare the sum of variances in f11b for each level of isla with the variances of the data frame, I get similar results:

colSums(tmp)
        I         T
2692.3100  806.5038
tapply(s1.ap$s1, s1.ap$isla, var)
        I         T
2417.1361  731.8165

So I guess this is the right way to interpret the variances in the fitted model. Or is it not?

thanks, JR

--
Dipl.-Biol.
JR Ferrer Paris ~~~ Laboratorio de Biología de Organismos --- Centro de Ecología Instituto Venezolano de Investigaciones Científicas (IVIC) Apdo. 21827, Caracas 1020-A República Bolivariana de Venezuela Tel: (+58-212) 504-1452 Fax: (+58-212) 504-1088 email: [EMAIL PROTECTED] clave-gpg: 2C260A95
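On the reparametrization question, a quick numeric check (my addition, using the VarCorr(f11) numbers quoted in the thread): in f11 the random effect for an islaT locality is Intercept + islaT, so its variance should equal Var(Int) + Var(islaT) + 2*Cov(Int, islaT) -- and that reproduces the islaT variances that VarCorr(f11b) reports, at both grouping levels.

```r
# loc level: Variance/Corr taken from VarCorr(f11) in the message above
v.int <- 1643.5904; v.slp <- 962.2991; r <- -0.969
v.int + v.slp + 2 * r * sqrt(v.int * v.slp)   # ~168.7 = islaT variance in f11b

# grp level
v.int <- 501.7315; v.slp <- 622.5393; r <- -0.818
v.int + v.slp + 2 * r * sqrt(v.int * v.slp)   # ~209.9 = islaT variance in f11b
```

So yes: the two fits are the same model, and f11b's parametrization gives the island and mainland variances directly.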
Re: [R] tournaments to dendrograms
I've had no response to the enquiry below, so I made a rather half-baked version in grid -- code and pdf are available here: http://www.econ.uiuc.edu/~roger/research/ncaa

Comments would be welcome. This is _the_ ubiquitous graphic this time of year in the US, so R should take a shot at it. My first attempt is rather primitive, but I have to say that Paul's grid package is superb.

url: www.econ.uiuc.edu/~roger   Roger Koenker
email: [EMAIL PROTECTED]        Department of Economics
vox: 217-333-4558               University of Illinois
fax: 217-244-6678               Champaign, IL 61820

On Feb 22, 2007, at 4:08 PM, roger koenker wrote:

Does anyone have (good) experience converting tables of tournament results into dendrogram-like graphics? Tables, for example, like this:

read.table(url("http://www.econ.uiuc.edu/~roger/research/ncaa/NCAA.d"))

Any pointers appreciated. RK
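A low-tech starting point (my sketch, not the grid code linked above): a small single-elimination bracket can be drawn by filling in an "hclust" object by hand and plotting it. The team names and round heights below are invented for illustration.

```r
# a 4-team bracket: two semifinals, then a final, encoded as hclust merges;
# negative entries are leaves (teams), positive entries refer to earlier merges
h <- list(merge  = rbind(c(-1, -2),   # semifinal 1: teams 1 and 2 meet
                         c(-3, -4),   # semifinal 2: teams 3 and 4 meet
                         c( 1,  2)),  # final: winners of the two semifinals
          height = c(1, 1, 2),        # round number used as plotting height
          order  = 1:4,
          labels = c("TeamA", "TeamB", "TeamC", "TeamD"))
class(h) <- "hclust"
plot(h)   # draws the bracket as a dendrogram
```

Building the merge matrix from a results table is then a bookkeeping exercise over the rounds.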
Re: [R] Linear programming with sparse matrix input format?
Thank you for the tip. Can quantreg actually handle mixed / integer programs?

-- TMK --
212-460-5430 home
917-656-5351 cell

----- Original Message -----
From: roger koenker [EMAIL PROTECTED]
To: Talbot Katz [EMAIL PROTECTED]
Cc: r-help R-help@stat.math.ethz.ch
Sent: Monday, March 05, 2007 9:08 PM
Subject: Re: [R] Linear programming with sparse matrix input format?

If you can reformulate your LP as an L1 problem, which is known to be possible without loss of generality, but perhaps not without loss of sleep, then you could use the sparse quantile regression functions in the quantreg package.

url: www.econ.uiuc.edu/~roger   Roger Koenker
email: [EMAIL PROTECTED]        Department of Economics
vox: 217-333-4558               University of Illinois
fax: 217-244-6678               Champaign, IL 61820

On Mar 5, 2007, at 5:30 PM, Talbot Katz wrote:

Hi. I am aware of three different R packages for linear programming: glpk, linprog, and lpSolve. From what I can tell, if there are N variables and M constraints, all these solvers require the full M x N constraint matrix. Some linear solvers I know of (not in R) have a sparse matrix input format. Are there any linear solvers in R that have a sparse matrix input format? (Including the possibility of glpk, linprog, and lpSolve, in case I might have missed something in the documentation.)

Thanks!
-- TMK --
Re: [R] Mixed effects multinomial regression and meta-analysis
R-Experts: I just realized that the example I used in my previous posting today is incorrect because it is a binary response, not a multilevel response (small, medium, large) such as my real life problem has. I apologize for the confusion. The example is incorrect, but the multinomial problem is real. Brant
Re: [R] How to override ordering of panels in xyplot()
On 3/5/07, Michael Kubovy [EMAIL PROTECTED] wrote:

Unless one makes the factor ordered, reordering the levels of a factor does not seem to be a trivial matter. The only R function I've found that makes it easy is reorder_factor() (package:reshape). Or am I missing something?

Don't think in terms of reordering factors; just create a new factor, using a call to factor() with a suitable 'levels' argument. It's perfectly OK to call factor() on a factor, e.g.:

foo = factor(letters[1:4])
foo
[1] a b c d
Levels: a b c d
factor(foo, levels = rev(letters[1:4]))
[1] a b c d
Levels: d c b a

-Deepayan
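Applied back to the xyplot question, releveling the conditioning factor is all it takes to reorder the panels. A small sketch with made-up data (variable names are my own):

```r
library(lattice)
set.seed(1)
d <- data.frame(x = rnorm(40), y = rnorm(40),
                g = factor(rep(c("a", "b", "c", "d"), each = 10)))
p1 <- xyplot(y ~ x | g, data = d)    # panels ordered a, b, c, d
d$g2 <- factor(d$g, levels = c("d", "c", "b", "a"))
p2 <- xyplot(y ~ x | g2, data = d)   # panels ordered d, c, b, a
print(p2)                            # draw the reordered version
```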
[R] Is there a quick way to count the number of times each element in a vector appears?
Hi there,

I'm writing a function that calculates the probability of different outcomes of dice rolls (e.g., the sum of the highest three rolls of five six-sided dice). I'm using the combinations function from the gtools package, which is great: it gives me a matrix with all of the possible combinations (with repetitions allowed). Now I want to count the number of times each element appears in each arrangement so I can calculate the number of permutations of that arrangement. E.g., if I get output like:

combinations(3, 3, rep = TRUE)
      [,1] [,2] [,3]
 [1,]    1    1    1
 [2,]    1    1    2
 [3,]    1    1    3
 [4,]    1    2    2
 [5,]    1    2    3
 [6,]    1    3    3
 [7,]    2    2    2
 [8,]    2    2    3
 [9,]    2    3    3
[10,]    3    3    3

I'd like to be able to determine that the first row has 3 repetitions, yielding 3!/3! = 1 permutation, while the second row has 2 repetitions, yielding 3!/2! = 3 permutations, etc. (This gets harder when there are large numbers of dice with many faces.) I know there are simple things to do, like iterating over the rows with for loops, but I've heard that for loops are sub-optimal in R, and I'd like to see what an elegant solution would look like. E.g., I might like to use sapply() with whatever function I come up with; I thought of using something like duplicated() and just counting the number of TRUEs that are returned for each vector (since the elements are always returned in non-decreasing order), but I'm optimistic that there is a better (faster/cleaner) way.

So here is my question in a nutshell: does anyone have ideas for how I might efficiently process a matrix like that returned by a call to combinations(n, r, rep=TRUE) to determine the number of repetitions of each element in each row of the matrix? If so, I'd love to hear them!

Thanks very much for your time,
Dylan Arena (Statistics M.S. student)
Re: [R] Identifying last record in individual growth data over different time intervalls
Hi

jim holtman wrote:

What is wrong with the method that you have? It looks reasonably efficient.

Actually there is nothing wrong with the approach I am using - it just seemed quite complicated, and I assumed that there was an easier approach around. The dataset is not so large that I really have to worry about efficiency.

Thanks a lot, Rainer

As with other languages, there are always other ways of doing it. Here is another to consider, but it is basically the same:

sapply(split(t, t$plate), function(x) x$id[which.max(x$year)])
     15      20      33      43      44      47      64     D72    S200    S201    S202    S203    S204
2006001 2006003 2006005 2006007 2006008 2006009 2006014 2006015 2006016 2006017 2004095 2006019 2006020
   S205    S206    S207    S208    S209    S210    S211    S212    S213    S214    S215    S216    S217
2006021 2006022 2006023 2006024 2006025 2006026 2006027 2006028 2006029 2006030 2006031 2006032 2006033
   S218    S219    S220    S222    S223    S224
2006034 2006035 2006036 2006037 2006038 2006039

On 3/5/07, Rainer M. Krug [EMAIL PROTECTED] wrote:

Hi

I have a data frame t which contains size measurements of individual plants, identified by the field plate. It contains, among others, a field year indicating the year in which the individual was measured, and the height. The number of measurements ranges from 1 to 4 measurements in different years. My problem is that I need the LAST measurement. I only came up with the solution below, which is probably way too complicated, but I can't think of another solution. Does anybody have an idea how to do this more effectively? Finally I would like to have a data.frame t2 which only contains the entries of the last measurements.

Thanks in advance, Rainer

--
NEW EMAIL ADDRESS AND ADDRESS: [EMAIL PROTECTED] [EMAIL PROTECTED] WILL BE DISCONTINUED END OF MARCH
Rainer M. Krug, Dipl. Phys.
(Germany), MSc Conservation Biology (UCT)
Leslie Hill Institute for Plant Conservation
University of Cape Town
Rondebosch 7701
South Africa
Fax: +27 - (0)86 516 2782
Fax: +27 - (0)21 650 2440 (w)
Cell: +27 - (0)83 9479 042
Skype: RMkrug
email: [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: [R] Identifying last record in individual growth data over different time intervalls
Hi Chris

Chris Stubben wrote:

Finally I would like to have a data.frame t2 which only contains the entries of the last measurements.

You could also use aggregate to get the max year per plate, then join that back to the original data frame using merge on year and plate (common columns in both data frames).

Thanks for the idea to use aggregate and merge - as I like SQL, this seems to be a nice approach.

Rainer

x <- data.frame(id = 1:8,
                plate  = c(15, 15, 15, 20, 20, 33, 43, 43),
                year   = c(2004, 2005, 2006, 2004, 2005, 2004, 2005, 2006),
                height = c(0.40, 0.43, 0.44, 0.90, 0.94, 0.15, 0.30, 0.38))
merge(x, aggregate(list(year = x$year), list(plate = x$plate), max))
  plate year id height
1    15 2006  3   0.44
2    20 2005  5   0.94
3    33 2004  6   0.15
4    43 2006  8   0.38
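Another base-R route to the same result (my addition, not from the thread): keep the rows whose year equals the per-plate maximum, using ave() instead of aggregate()/merge().

```r
x <- data.frame(id = 1:8,
                plate  = c(15, 15, 15, 20, 20, 33, 43, 43),
                year   = c(2004, 2005, 2006, 2004, 2005, 2004, 2005, 2006),
                height = c(0.40, 0.43, 0.44, 0.90, 0.94, 0.15, 0.30, 0.38))
# keep each row whose year is the maximum within its plate group
last <- x[x$year == ave(x$year, x$plate, FUN = max), ]
last
#   id plate year height
# 3  3    15 2006   0.44
# 5  5    20 2005   0.94
# 6  6    33 2004   0.15
# 8  8    43 2006   0.38
```

Unlike the merge() version, this also keeps the original row order and the id column's position.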
Re: [R] Is there a quick way to count the number of times each element in a vector appears?
Is this what you mean?

tmp <- combinations(3, 3, rep = TRUE)
colSums(apply(tmp, 1, duplicated)) + 1

b

On Mar 6, 2007, at 1:16 AM, Dylan Arena wrote:

Hi there, I'm writing a function that calculates the probability of different outcomes of dice rolls (e.g., the sum of the highest three rolls of five six-sided dice). ...
Re: [R] Is there a quick way to count the number of times each element in a vector appears?
Sorry, I forgot to mention that you will need an extra test |-)

tmp <- combinations(3, 3, rep = TRUE)
out <- colSums(apply(tmp, 1, duplicated)) + 1
out[out == 1] <- 0

But now, re-reading your message, you say "(...) want to count the number of times each element appears in each arrangement (...)", so

apply(tmp, 1, function(v) table(factor(v, levels = 1:3)))

might be what you actually meant. Sorry for the confusion,

b

On Mar 6, 2007, at 2:00 AM, Benilton Carvalho wrote:

is this what you mean? tmp <- combinations(3, 3, rep = TRUE); colSums(apply(tmp, 1, duplicated)) + 1 ...
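Putting the two suggestions together (my sketch, assuming the gtools package as in the thread): table() per row gives the repetition counts, and the multinomial coefficient r!/prod(counts!) then gives the number of permutations of each arrangement. The counts sum to 3^3, since every ordered roll is represented exactly once.

```r
library(gtools)   # for combinations()
tmp <- combinations(3, 3, repeats.allowed = TRUE)
# repetition counts per row via table(), then the multinomial coefficient
perms <- apply(tmp, 1, function(v) factorial(length(v)) / prod(factorial(table(v))))
perms        # 1 3 3 3 6 3 1 3 3 1
sum(perms)   # 27 = 3^3, a quick sanity check
```

So row (1,1,1) yields 3!/3! = 1 permutation, row (1,1,2) yields 3!/2! = 3, and row (1,2,3) yields 3! = 6.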
[R] optima setting of multiple response
Dears: I have a question about multiple responses and finding the optimal settings. There are 7 responses (y1, y2, y3, y4, y5, y7, y8) and 5 variables (x1-x5):

y1 = 0.3567 + 0.0154*x1 - 0.0003*x2 + 0.2295*x3 - 0.0082*x4
y2 = 278.6814 - 4.3832*x1 + 0.0831*x2 - 24.3953*x3 + 8.1404*x4
y3 = 8.9813 - 0.0025*x2 - 0.1746*x3 + 0.0560*x4 + 0.0346*x5
y4 = 220.216 + 1.204*x2 + 53.634*x4 - 15.473*x5
y5 = 1.1404 + 0.0644*x3 - 0.0278*x4 - 0.0044*x5
y7 = 8.9155 - 0.0042*x2 - 0.1647*x3 - 0.2026*x4 + 0.0538*x5
y8 = -22.5899 + 0.0719*x2 + 4.2494*x4

subject to

0.783 <= y1 <= 0.957
324 <= y2 <= 396
8.1 <= y3 <= 9.9
0.9 <= y4 <= 1.1
0.585 <= y5 <= 0.715
8.1 <= y7 <= 9.9
0.9 <= y8 <= 1.1
3 <= x1 <= 6
260 <= x2 <= 460
2 <= x3 <= 4
1 <= x4 <= 10
35 <= x5 <= 45

Which function or package can I use? I have tried the lp function but there is no solution. Thanks for your answer.

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
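Since every y is linear in x, this is a pure linear feasibility problem that lp() from lpSolve can check. A sketch (my illustration) for the y1 constraint plus the box on x only -- stacking the analogous rows for all seven responses works the same way; if lp() then returns status 2, the full system really has no feasible point and some bounds must be relaxed:

```r
library(lpSolve)
# y1's coefficients on x1..x5; the intercept 0.3567 is moved into the bounds
C  <- matrix(c(0.0154, -0.0003, 0.2295, -0.0082, 0), nrow = 1)
lo <- 0.783 - 0.3567
hi <- 0.957 - 0.3567
# two-sided constraints a <= C x <= b become a ">=" row and a "<=" row
const.mat <- rbind(C, C, diag(5), diag(5))
const.dir <- c(">=", "<=", rep(">=", 5), rep("<=", 5))
const.rhs <- c(lo, hi, c(3, 260, 2, 1, 35), c(6, 460, 4, 10, 45))
res <- lp("min", rep(0, 5), const.mat, const.dir, const.rhs)  # zero objective: pure feasibility
res$status    # 0 = feasible, 2 = no feasible solution exists
res$solution
```

Once the feasible set is confirmed nonempty, the nonlinear objective from the related posting (terms like x1*x2) needs a nonlinear solver such as constrOptim() rather than lp().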
Re: [R] delete selecting rows and columns
I mean something like, in MATLAB: matrix(sel_r,:) = []

jastar wrote:

Hi, I'm working with a big square matrix (15k x 15k) and I have some trouble. I want to delete selected rows and columns. I'm using something like this:

sel_r <- c(15, 34, 384, 985, 4302, 6213)
sel_c <- c(3, 151, 324, 3384, 7985, 14302)
matrix <- matrix[-sel_r, -sel_c]

but it works very slowly. Does anybody know how to do it in a faster way? Thanks
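R has no in-place deletion like the MATLAB idiom above: any deletion makes a copy of the matrix. A sketch (my addition, with a smaller matrix for illustration) of the equivalent logical-mask indexing, which performs the same single copy:

```r
m <- matrix(rnorm(1e6), 1000, 1000)        # stand-in for the 15k x 15k matrix
sel_r <- c(15, 34, 384, 985)
sel_c <- c(3, 151, 324, 999)
keep_r <- !(seq_len(nrow(m)) %in% sel_r)   # logical mask of rows to keep
keep_c <- !(seq_len(ncol(m)) %in% sel_c)
m2 <- m[keep_r, keep_c]
dim(m2)   # 996 996
```

(As an aside, naming the object `matrix` shadows the base function `matrix()`; a neutral name avoids confusion.)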