[R] browser() can not stop the execution
I need to use browser() to stop a while loop to input some value for the loop. But the browser() just will not stop until the last line of the code. Does anyone know the possible reason? I use ggobi in the loop, and open a few ggobi windows before the browser(), will that be the reason? Thanks A LOT! -- View this message in context: http://old.nabble.com/browser%28%29-can-not-stop-the-execution-tp26356069p26356069.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] remove row if the column "date_abandoned" has a date in it
Sorry David, I'm really new to R (my first week) and appreciate your help. Also, I don't always know what info to give people on the forum (although I'm starting to catch the drift). Here's what I get...

summary(new_data4$date_abandoned)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   1601    1998    2001    1993    2004    2009  315732

ls()
[1] "data"      "new_data"  "new_data2" "new_data3" "new_data4"

small <- head(new_data4, 20)
dump("small", 20)
Error in dump("small", 20) : cannot write to this connection

frenchcr

David Winsemius wrote:
> On Nov 14, 2009, at 5:24 PM, frenchcr wrote:
>
>> I tried the following but it does the opposite of what I want:
>>
>> new_data5 <- subset(new_data4, date_abandoned > 01010000)
>>
>> I want to remove the rows with dates and leave just the rows without a date. This removes all the rows that don't have a date in the date_abandoned column ...on a positive note, as I did this next...
>>
>> dim(new_data5)
>> [1] 263 80
>>
>> I now know that I have 263 dates in that column :)
>> I want to remove the 263 rows with dates and leave just the rows without a date.
>
> Come on, frenchcr. Stop making us guess. Give us enough information to work with. You asked for something which I construed as saying you wanted dates greater than the first day of the year 101. You did not address this question.
>
> What do you get with str(new_data4) and summary(new_data4$date_abandoned)? In order to know what sort of comparison to use we need to know what the data looks like. Even better if you offered the output from:
>
> small <- head(new_data4, 20)
> dump("small", 20)
>
> --
> David
>
>> David Winsemius wrote:
>>> On Nov 14, 2009, at 1:21 PM, frenchcr wrote:
>>>
>>>> I want to go through a column in data called
>>>
>>> Bad name for a data.frame. Fortunes, dog and all that.
>>>
>>>> "date_abandoned" (data["date_abandoned"]) and remove all the rows that have numbers greater than 1,010,000.
>>>
>>> Are you doing archeology? Given what you say next I wondered what range you were really asking for.
>>>
>>>> The dates are in the format 20091114 so I'm just going to treat them as numbers for clean up purposes. I know that I use subset but not sure how to proceed from there.
>>>>
>>>> subdata <- subset(data, date_abandoned > 01010000)
>>>
>>> The problem with 1010000 is that your specified minimum point had an insufficient number of places to be in YYYYMMDD format.
>>>
>>> --
>>> David Winsemius, MD
>>> Heritage Laboratories
>>> West Hartford, CT

-- View this message in context: http://old.nabble.com/remove-row-if-the-column-%22date_abandoned%22-has-a-date-in-it-tp26352457p26355689.html Sent from the R help mailing list archive at Nabble.com.
[R] R crashing
Hello,

This is what I am trying to do: I wrote a little function that takes addresses (coordinates) as input, and returns the road distance between every two points using Google Maps. The catch is, there are 2000 addresses, so I have to get around 2x10^6 distances. On my first go, this is what I did:

#
getRoadDist = function(X,complete=F){
  # X must be a matrix or data frame of coordinates; lat and lon
  require(RCurl)
  Y = apply( X, 1, function(x){ paste(x[1], ",", x[2], sep="") } )
  grid = expand.grid(Y,Y,KEEP.OUT.ATTRS=F)
  grid = apply(grid,1,function(x){paste(x[1],"&daddr=",x[2],sep="")})
  grid = matrix(grid,ncol=length(Y),dimnames=list(names(Y),names(Y)))
  grid[upper.tri(grid,T)] = NA
  Distances = function(x){
    if (is.na(x)) {
      NA
    } else {
      URL = getURL(paste("http://maps.google.com/maps?saddr=",x,sep=""))
      y = strsplit(URL, "<div><b>")
      y = strsplit(y[[1]][2], "&#160;mi</b>")[[1]][1]
      as.numeric(y)
    }
  }
  dists = sapply(grid,Distances)
  dists = matrix(dists,ncol=ncol(grid),dimnames=dimnames(grid))
  if (complete) {
    diag(dists)=0
    dists[upper.tri(dists)]=dists[lower.tri(dists)]
    dists
  } else {
    dists
  }
}
#

But R was crashing after 1 hour or so: it either said "Reached total allocation of 1535Mb" or became unresponsive. Then I tried to modify the procedure to avoid big matrices. What I did was get the distances and, one by one, append them to a file, in the hope that this would use less memory:

##
# X is the matrix of addresses, as before
require(RCurl)
Y = apply( X, 1, function(x){ paste(x[1], ",", x[2], sep="") } )
grid = expand.grid(Y,Y,KEEP.OUT.ATTRS=F)
grid = apply(grid,1,function(x){paste(x[1],"&daddr=",x[2],sep="")})
grid = matrix(grid,ncol=length(Y),dimnames=list(names(Y),names(Y)))
grid[upper.tri(grid,T)] = NA
Distances = function(x){
  if (is.na(x)) {
    NA
  } else {
    URL = getURL(paste("http://maps.google.com/maps?saddr=",x,sep=""))
    y = strsplit(URL, "<div><b>")
    y = strsplit(y[[1]][2], "&#160;mi</b>")[[1]][1]
    as.numeric(y)
  }
}
grid2=grid[!is.na(grid)]
n = length(grid2)
for (i in 1:n) {
  temp = Distances(grid2[i])
  write.table(temp,"distances.csv",col.names=F,row.names=F,append=T)
}
##

But R still crashes after 2 hours (all I got was around 20,000 distances). It doesn't really matter how long this will take me (I can always use more than one machine), but I'd really like to get this done. Any thoughts?

Many many thanks,
Dimitri
Re: [R] R crashing
On Sun, Nov 15, 2009 at 11:10 AM, Dimitri Szerman dimitri...@gmail.com wrote:
> Hello,
>
> This is what I am trying to do: I wrote a little function that takes addresses (coordinates) as input, and returns the road distance between every two points using Google Maps. The catch is, there are 2000 addresses, so I have to get around 2x10^6 distances. On my first go, this is what I did:

I hope on your first go you didn't run it with 2000 addresses. You did test it with 13 addresses first, didn't you?

Another idea is to replace your Distances function with a function that returns runif(1). This will either make your code fail much, much quicker or identify that the problem is in the Distances function (some memory leak there).

Also, you should check the return value from your Google query. I've seen Google get a bit upset about repeated automated queries and return a message saying "This looks like an automated query" and a CAPTCHA test.

> grid2=grid[!is.na(grid)]
> n = length(grid2)
> for (i in 1:n) {
>   temp = Distances(grid2[i])
>   write.table(temp,"distances.csv",col.names=F,row.names=F,append=T)
> }

This won't work - you're overwriting distances.csv with the new value of 'temp' every time. Another good reason to test with 13 values before waiting and failing after six hours, and then having to hammer Google's map server again.

I'd write this as a simple loop, and dump all the apply stuff. And rewrite Distance to be a function of two lat-longs:

Distance=function(lat1,lon1,lat2,lon2){
  return(distance)
}

Then (untested):

Dmat = matrix(NA,nrow(X),nrow(X))
for(i in 2:nrow(X)){
  for(j in 1:i){
    d = Distance(X[i,1],X[i,2],X[j,1],X[j,2])
    Dmat[i,j]=d
  }
}

I'm not sure apply wins much here.

Barry
Re: [R] R crashing
2009/11/15 Barry Rowlingson b.rowling...@lancaster.ac.uk

> On Sun, Nov 15, 2009 at 11:10 AM, Dimitri Szerman dimitri...@gmail.com wrote:
>> Hello,
>>
>> This is what I am trying to do: I wrote a little function that takes addresses (coordinates) as input, and returns the road distance between every two points using Google Maps. The catch is, there are 2000 addresses, so I have to get around 2x10^6 distances. On my first go, this is what I did:
>
> I hope on your first go you didn't run it with 2000 addresses. You did test it with 13 addresses first, didn't you?

I did, and it worked well.

> Another idea is to replace your Distances function with a function that returns runif(1). This will either make your code fail much, much quicker or identify that the problem is in the Distances function (some memory leak there).
>
> Also, you should check the return value from your Google query. I've seen Google get a bit upset about repeated automated queries and return a message saying "This looks like an automated query" and a CAPTCHA test.

Mmmm, I wasn't aware of that.

>> grid2=grid[!is.na(grid)]
>> n = length(grid2)
>> for (i in 1:n) {
>>   temp = Distances(grid2[i])
>>   write.table(temp,"distances.csv",col.names=F,row.names=F,append=T)
>> }
>
> This won't work - you're overwriting distances.csv with the new value of 'temp' every time.

No, I am not, because append=TRUE. I did this, and I managed to get 20,000 distances or so.

> Another good reason to test with 13 values before waiting and failing after six hours, and then having to hammer Google's map server again.
>
> I'd write this as a simple loop, and dump all the apply stuff. And rewrite Distance to be a function of two lat-longs:
>
> Distance=function(lat1,lon1,lat2,lon2){
>   return(distance)
> }
>
> Then (untested):
>
> Dmat = matrix(NA,nrow(X),nrow(X))
> for(i in 2:nrow(X)){
>   for(j in 1:i){
>     d = Distance(X[i,1],X[i,2],X[j,1],X[j,2])
>     Dmat[i,j]=d
>   }
> }
>
> I'm not sure apply wins much here.

Thanks. The reason I didn't want to do something like that is that, in the event of a crash, I'll lose everything that was done. That's why I thought of appending the results often.

> Barry
Re: [R] browser() can not stop the execution
How are you executing your script? Are you doing a cut/paste into the command window? On Windows using Tinn-R, this procedure will invoke 'browser', and it will then continue to read the rest of the script, since that is coming from the standard input. The way around it is to put the script in a file and then 'source' it. Try this and report back.

On Sun, Nov 15, 2009 at 1:22 AM, chao83 chaohan1...@yahoo.com wrote:
> I need to use browser() to stop a while loop to input some value for the loop. But the browser() just will not stop until the last line of the code. Does anyone know the possible reason? I use ggobi in the loop, and open a few ggobi windows before the browser(), will that be the reason?
>
> Thanks A LOT!

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
Re: [R] naive collinear weighted linear regression
Peter Dalgaard <p.dalgaard at biostat.ku.dk> writes:

> The point is that R (as well as almost all other mainstream statistical software) assumes that a weight means that the variance of the corresponding observation is the general variance divided by the weight factor. The general variance is still determined from the residuals, and if they are zero to machine precision, well, there you go. I suspect you get closer to the mark with glm, which allows you to assume that the dispersion is known:
>
> summary(glm(y~x,family=gaussian),dispersion=0.3^2)
>
> or
>
> summary(glm(y~x,family=gaussian,weights=1/error^2),dispersion=1)

Excellent; any of these commands provides Std. Errors which now coincide with my naive expectation: though the data fall perfectly in a straight line, since they have some associated uncertainties (only) in the response variables (homoskedasticity), the estimated coefficients should have some kind of nonvanishing uncertainties as well, should they not?

Now, forgive me, but I did not get the explanation for the distinct meanings of Std. Error when calling simply summary(lm(y~x,weights=1/error^2)), which I had done before, and your suggested calls; could you rephrase and dwell a little bit more upon this point? What exactly does the option dispersion mean?

Also, could you suggest some specific reference for me to read about this? I have your excellent book "Introductory Statistics with R", 1st edition, but was not able (perhaps I have missed some point) to find this kind of distinction there... Is this theme specifically what statisticians call generalized linear models (glm) as opposed to (ordinary) linear models? If so, which good references could you please suggest? I thought of the following books and would feel much obliged should you give me your impressions about them, if any, or about any other relevant references at all:

1) Faraway, Linear models with R
2) Faraway, Extending the linear model with R: generalized linear...
3) Fox, An R and S-Plus companion..
4) Uusipaikka, Confidence intervals in generalized linear regression models

Thank you very much!!
Re: [R] R crashing
On Sun, Nov 15, 2009 at 11:57 AM, Dimitri Szerman dimitri...@gmail.com wrote:

> Thanks. The reason I didn't want to do something like that is that, in the event of a crash, I'll lose everything that was done. That's why I thought of appending the results often.

Oops yes, I missed the 'append=TRUE' flag. That's a good idea.

Last time I did something similar to this I used a relational database for saving. I created a table of all the i,j pairs with columns i, j, distance and 'ok'. 'ok' was set to False initially. Then I'd query the db for a row with 'ok=False', and go about getting the distance. If I got a good distance back I set 'ok=True' and never bothered getting that again.

This was in Python with SQLite as the database engine, but you can do something similar in R. With a distributed database you could easily split the queries between as many servers as you can get your hands on.

Barry
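The bookkeeping Barry describes can be sketched in base R with a plain CSV file standing in for the database. This is only a sketch under stated assumptions: get_distance is a hypothetical placeholder (it returns runif(1), the stub suggested earlier in the thread, rather than querying Google), and fetch_distances and the file layout (columns i, j, d) are made up for illustration.

```r
# Hypothetical placeholder for the real lookup; should return NA on failure.
get_distance <- function(lat1, lon1, lat2, lon2) runif(1)

# Fetch distances for all pairs i > j, appending each result to `file`,
# so a crash loses at most one lookup. Re-running resumes where it stopped.
fetch_distances <- function(X, file) {
  done <- character(0)
  if (file.exists(file)) {
    prev <- read.csv(file, header = FALSE, col.names = c("i", "j", "d"))
    done <- paste(prev$i, prev$j)              # pairs already stored
  }
  for (i in 2:nrow(X)) {
    for (j in 1:(i - 1)) {
      if (paste(i, j) %in% done) next          # already fetched, skip
      d <- get_distance(X[i, 1], X[i, 2], X[j, 1], X[j, 2])
      if (is.na(d)) next                       # failed, retry on next run
      write.table(data.frame(i, j, d), file, sep = ",",
                  col.names = FALSE, row.names = FALSE, append = TRUE)
    }
  }
  read.csv(file, header = FALSE, col.names = c("i", "j", "d"))
}

# Small test run with three made-up coordinates
X <- cbind(lat = c(51.5, 48.9, 52.5), lon = c(-0.1, 2.4, 13.4))
out <- fetch_distances(X, tempfile(fileext = ".csv"))
```

For 2000 addresses the %in% lookup against a long character vector would become slow; a real database, as described above, handles that bookkeeping better.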
[R] Presentation of data in Graphical format
Hello

My data contains the following columns:

1st column: Posts (GM, Secretary, AM, Office Boy)
2nd column: Dept (Finance, HR, ...)
3rd column: Tasks (Open the door, Fix an appointment, Fill the register, etc.) depending on the post
4th column: Average time required to do the task

So the sample data would look like

Posts        Dept      Task                 Average time
Office Boy   HR        Open the door        00:00:09
Secretary    Finance   Fix an appointment   00:00:30
...

I am trying to represent this data in graphical format. I tried graphs like the mosaic plot, etc., but they do not represent the data correctly. My aim is to check the amount of time and its variability for groups of tasks.

Thank you in advance

Regards
Sunita

-- View this message in context: http://old.nabble.com/Presentation-of-data-in-Graphical-format-tp26358857p26358857.html Sent from the R help mailing list archive at Nabble.com.
[R] lme model specification
Dear all

This is a question of model specification in lme for which I'd greatly appreciate some guidance.

Suppose I have data in long format

gene  treatment  rep  Y
1     1          1    4.32
1     1          2    4.67
1     1          3    5.09
..    ..         ..   ..
1     4          1    3.67
1     4          2    4.64
1     4          3    4.87
..    ..         ..   ..
2000  1          1    5.12
2000  1          2    2.87
2000  1          3    7.23
..    ..         ..   ..
2000  4          1    2.48
2000  4          2    3.93
2000  4          3    5.17

that is, I have data Y_{gtr} for g (gene) = 1,...,2000, t (treatment) = 1,...,4 and r (replicate) = 1,...,3.

I would like to fit the following linear mixed model using lme

Y_{gtr} = \mu_{g} + W_{gt} + Z_{gtr}

where the \mu_{g}'s are fixed gene effects, W_{gt} ~ N(0, \sigma^{2}) are gene-treatment interactions, and Z_{gtr} ~ N(0, \tau^{2}) are residual errors. (Yes, I know I'm specifying an interaction between gene and treatment without specifying a treatment main effect! There is good reason for this.)

I know that specifying

model.1 <- lme(Y ~ -1 + factor(gene), data=data, random= ~1|gene/treatment)

fits

Y_{gtr} = \mu_{g} + U_{g} + W_{gt} + Z_{gtr}

with \mu_{g}, W_{gt} and Z_{gtr} as previous and U_{g} ~ N(0, \gamma^{2}), but I do NOT want to specify a random gene effect. I have scoured Bates and Pinheiro without coming across a parallel example.

Any help would be greatly appreciated

Best

Gerwyn Green
School of Health and Medicine
Lancaster University
Re: [R] naive collinear weighted linear regression
David Winsemius <dwinsemius at comcast.net> writes:

> It's really not that difficult to get the variance covariance matrix. What is not so clear is why you think differential weighting of a set that has a perfect fit should give meaningfully different results than a fit that has no weights.

Again, David, what I have in mind is: since there are errors or uncertainties in the response variables (despite the perfect collinearity of the data), which I assume are Gaussian, if I make a large enough number of simulations of four response values, there will undoubtedly be a dispersion in the best-fit intercept and slope obtained from a usual unweighted least squares procedure, right? Then, if I calculate the arithmetic mean of these simulated intercepts and slopes, I would certainly find that they are 0 and 2, respectively. However, and THAT IS THE POINT, there will also be a standard deviation associated with each one of these two coefficients, right? That is what I would assign as the measure of uncertainty in the estimation of the coefficients.

This is not, as Dalgaard has pointed out, what the simple command summary(lm(y~x,weights=1/err^2)) provides in its Std. Error. However, as Dalgaard also recalled, the command summary(glm(y~x,family=gaussian,weights=1/err^2),dispersion=1) does provide Std. Errors in the coefficients which look plausible (at least to me) and, at any rate, which do coincide with results from other packages (Numerical Recipes, ROOT and possibly GSL...)

> ?lm
> ?vcov
>
> y <- c(2,4,6,8) # response vect
> fit_mod <- lm(y~x,weights=1/error^2)
> Error in eval(expr, envir, enclos) : object 'error' not found
> error <- c(0.3,0.3,0.3,0.3)
> fit_mod <- lm(y~x,weights=1/error^2)
> vcov(fit_mod)
>               (Intercept)             x
> (Intercept)  2.396165e-30 -7.987217e-31
> x           -7.987217e-31  3.194887e-31
>
> Numerically those are effectively zero.
>
> fit_mod <- lm(y~x)
> vcov(fit_mod)
>             (Intercept) x
> (Intercept)           0 0
> x                     0 0
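The standard errors obtained with a fixed dispersion can be checked against the textbook weighted least squares formula, where the coefficient covariance is (X'WX)^{-1} with W = diag(1/error^2). The following is a sketch only: the thread shows y but not x, so x <- 1:4 is an assumption (it is consistent with the intercept 0 and slope 2 mentioned above).

```r
x <- 1:4                      # assumed predictor; the thread only shows y
y <- c(2, 4, 6, 8)
error <- rep(0.3, 4)          # known measurement errors on y
w <- 1 / error^2

# glm with the dispersion fixed at 1: the weights are taken as known 1/variance
fit <- glm(y ~ x, family = gaussian, weights = w)
se_glm <- summary(fit, dispersion = 1)$coefficients[, "Std. Error"]

# The same quantity from the weighted least squares formula (X' W X)^{-1}
X <- cbind(1, x)
se_formula <- sqrt(diag(solve(t(X) %*% diag(w) %*% X)))

print(se_glm)
print(se_formula)
```

Both should come out at roughly 0.367 for the intercept and 0.134 for the slope. They depend only on x and the stated errors, not on the residuals, which is why they stay nonzero even for a perfect fit.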
Re: [R] remove row if the column "date_abandoned" has a date in it
On Nov 14, 2009, at 8:43 PM, frenchcr wrote:

> Sorry David, I'm really new to R (my first week) and appreciate your help. Also, I don't always know what info to give people on the forum (although I'm starting to catch the drift). Here's what I get...
>
> summary(new_data4$date_abandoned)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
>    1601    1998    2001    1993    2004    2009  315732

So new_data4$date_abandoned is not of type Date and is instead a character vector. If you are resisting turning it into a date and want to work with characters, you can; you just need to deal somehow with the items that are not 8 characters wide. What does 315732 represent? How were we supposed to interpret the starting date you gave of 01010000?

nchar(1010000)
[1] 7

What does table(nchar(new_data4$date_abandoned)) give you?

> ls()
> [1] "data"      "new_data"  "new_data2" "new_data3" "new_data4"
>
> small <- head(new_data4, 20)
> dump("small", 20)
> Error in dump("small", 20) : cannot write to this connection

Well, sorry, I meant to type dump("small", stdout()) ... As per the Posting Guide.

--
David.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
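For the stated goal (drop the rows that have a date, keep the rows without one), a minimal sketch follows. It assumes the missing dates are stored as NA, as the NA's count in the summary() output suggests; the toy data frame and its id column are made up for illustration.

```r
# Toy stand-in for new_data4; NA marks "no date" (an assumption)
new_data4 <- data.frame(id = 1:5,
                        date_abandoned = c(NA, 20091114, NA, 19980301, NA))

# Keep only the rows WITHOUT a date
no_date <- new_data4[is.na(new_data4$date_abandoned), ]

# Number of rows that had a date and were dropped
n_dated <- sum(!is.na(new_data4$date_abandoned))

print(no_date)
print(n_dated)
```

subset(new_data4, is.na(date_abandoned)) gives the same rows. A comparison such as date_abandoned > 01010000 evaluates to NA for the undated rows, and subset() treats NA as FALSE, which would explain why the earlier attempt kept only the dated rows.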
[R] update.lm question
Hello,

at the Rgui command line I can easily remove a term from a fitted lm object, like

fit <- lm(y~x1+x2+x3, data=myData)
update(fit, .~.-x1)

However, I would like to do this in a function with the term given as a string, like

removeTerm <- function(linModel, termName) { ??? }
removeTerm(fit, "x1")

but I cannot fill in the ???. I already tried

removeTerm <- function(linModel, termName) { update(linModel, .~. - termName) }
removeTerm <- function(linModel, termName) { update(linModel, .~. - as.name(termName)) }
removeTerm <- function(linModel, termName) { update(linModel, .~. - eval(termName)) }
removeTerm <- function(linModel, termName) { update(linModel, .~. - eval.parent(termName)) }
removeTerm <- function(linModel, termName) { update(linModel, .~. - get(termName)) }

but these attempts produce error messages.

Can you advise me here?

Kind regards,
Karsten
Re: [R] Error running lda example from Help File (MASS library )
On Nov 14, 2009, at 7:02 PM, Greg Riddick wrote:

> Hello all,
>
> I'm trying to run lda() from the MASS library but the Help example generates the following error:
>
> #Code from example in lda Help file

no code included

> # Resulting Error
> Error in if (targetlist[i] == stringname) { : argument is of length zero

Cannot reproduce on setup possibly similar to yours:

sessionInfo()
R version 2.10.0 Patched (2009-10-29 r50258)
x86_64-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines stats graphics grDevices utils datasets methods base

other attached packages:
[1] MASS_7.3-3 plyr_0.1.9 survey_3.18 Design_2.3-0 Hmisc_3.7-0
[6] survival_2.35-7 lattice_0.17-26

loaded via a namespace (and not attached):
[1] cluster_1.12.1 grid_2.10.0 tools_2.10.0

> My Current R Installation:
> MacOSX: 10.5.8
> R: 2.10.0

What is your sessionInfo(), as requested in the Posting Guide?

> --
> Gregory Riddick, PhD.
> CRTA Research Fellow
> National Institutes of Health
> National Cancer Institute, Neuro-Oncology Branch
> http://home.ccr.cancer.gov/nob/
> 37 Convent Drive
> Building 37, Room 1142
> Bethesda, MD 20892-8202

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] update.lm question
You need to review:

?update
?update.formula
?as.formula

The way should be clear at that point. If not, then include a reproducible data example to work with.

--
David

On Nov 15, 2009, at 9:23 AM, Karsten Weinert wrote:

> Hello,
>
> at the Rgui command line I can easily remove a term from a fitted lm object, like
>
> fit <- lm(y~x1+x2+x3, data=myData)
> update(fit, .~.-x1)
>
> However, I would like to do this in a function with the term given as a string, like
>
> removeTerm <- function(linModel, termName) { ??? }
> removeTerm(fit, "x1")
>
> but I cannot fill in the ???. I already tried
>
> removeTerm <- function(linModel, termName) { update(linModel, .~. - termName) }
> removeTerm <- function(linModel, termName) { update(linModel, .~. - as.name(termName)) }
> removeTerm <- function(linModel, termName) { update(linModel, .~. - eval(termName)) }
> removeTerm <- function(linModel, termName) { update(linModel, .~. - eval.parent(termName)) }
> removeTerm <- function(linModel, termName) { update(linModel, .~. - get(termName)) }
>
> but these attempts produce error messages.
>
> Can you advise me here?
>
> Kind regards,
> Karsten

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
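One way to follow those help pages is to build the update formula as a string and convert it with as.formula() before passing it to update(). This is a sketch, not necessarily the solution David had in mind; the simulated myData is made up so that the example is self-contained.

```r
# Remove a term, given as a character string, from a fitted model
removeTerm <- function(linModel, termName) {
  # e.g. termName = "x1" gives the formula  . ~ . - x1
  update(linModel, as.formula(paste(". ~ . -", termName)))
}

# Made-up data for illustration
set.seed(1)
myData <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
myData$y <- with(myData, 1 + x1 + 2 * x2 + rnorm(20))

fit  <- lm(y ~ x1 + x2 + x3, data = myData)
fit2 <- removeTerm(fit, "x1")
print(formula(fit2))
```

The earlier attempts fail because update() treats termName (or as.name(termName), get(termName), ...) literally as part of the formula instead of evaluating it first.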
[R] where is a value in my list
I have got a list:

lista=list()
a=c(2,4,5,5,6)
b=c(3,5,4,2)
c=c(1,1,1,8)
lista[[1]]=a
lista[[2]]=b
lista[[3]]=c

lista
[[1]]
[1] 2 4 5 5 6

[[2]]
[1] 3 5 4 2

[[3]]
[1] 1 1 1 8

I would like to know where the number 5 is (in which line). For example, I have got a loop:

k= vector(mode = "integer", length = 3)
for(i in 1:3) {
  for (j in 1:length(lista[[i]])){
    if ((lista[[i]][j])==5 k[i]= [i])
  }
}

This loop is wrong, but I would like to get in my vector k something like this:

k = lista[[1]][1], lista[[2]][1]

...or something similar

-- View this message in context: http://old.nabble.com/where-is-a-value-in-my-list-tp26359843p26359843.html Sent from the R help mailing list archive at Nabble.com.
[R] Replace positive with log and zero or negative with 0
This is a very simple question, but I couldn't form a site search query that would return a reasonable result set.

Say I have a vector:

x <- c(0,2,3,4,5,-1,-2)

I want to replace all of the values in 'x' with the log of x. Naturally this runs into problems, since some of the values are negative or zero. So how can I replace all of the positive elements of x with log(x) and the rest with zero?

Thank you.

Kevin
Re: [R] Presentation of data in Graphical format
Google "R graph gallery"
Google "R ggplot2"
Google "R lattice"

and good luck

milton

On Sun, Nov 15, 2009 at 7:48 AM, Sunita22 sunita...@gmail.com wrote:

> Hello
>
> My data contains the following columns:
>
> 1st column: Posts (GM, Secretary, AM, Office Boy)
> 2nd column: Dept (Finance, HR, ...)
> 3rd column: Tasks (Open the door, Fix an appointment, Fill the register, etc.) depending on the post
> 4th column: Average time required to do the task
>
> So the sample data would look like
>
> Posts        Dept      Task                 Average time
> Office Boy   HR        Open the door        00:00:09
> Secretary    Finance   Fix an appointment   00:00:30
> ...
>
> I am trying to represent this data in graphical format. I tried graphs like the mosaic plot, etc., but they do not represent the data correctly. My aim is to check the amount of time and its variability for groups of tasks.
>
> Thank you in advance
>
> Regards
> Sunita
Re: [R] where is a value in my list
On Nov 15, 2009, at 10:01 AM, Grzes wrote:

> I have got a list:
>
>     lista = list()
>     a = c(2,4,5,5,6)
>     b = c(3,5,4,2)
>     c = c(1,1,1,8)
>     lista[[1]] = a
>     lista[[2]] = b
>     lista[[3]] = c
>
>     lista
>     [[1]]
>     [1] 2 4 5 5 6
>
>     [[2]]
>     [1] 3 5 4 2
>
>     [[3]]
>     [1] 1 1 1 8
>
> I would like to know where the number 5 is (which line).
>
> For example I have got a loop:
>
>     k = vector(mode = "integer", length = 3)
>     for(i in 1:3) {
>       for (j in 1:length(lista[[i]])) {
>         if ((lista[[i]][j])==5 k[i]= [i])
>       }
>     }
>
> This loop is wrong, but I would like to get in my vector k something like this:
>
>     k = lista[[1]][1], lista[[2]][1] ...
>
> or something similar.

I am a bit confused, since clearly lista[[1]][1] does _not_ == 5. It's also unclear what type of output you expect ... character, list, numeric?

See if these take you any further to your vaguely expressed goal:

    > lapply(lista, "%in%", 5)
    [[1]]
    [1] FALSE FALSE  TRUE  TRUE FALSE

    [[2]]
    [1] FALSE  TRUE FALSE FALSE

    [[3]]
    [1] FALSE FALSE FALSE FALSE

    > lapply(lista, function(x) which(x == 5) )
    [[1]]
    [1] 3 4

    [[2]]
    [1] 2

    [[3]]
    integer(0)

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] Relase positive with log and zero of negative with 0
G'day Kevin,

On Sun, 15 Nov 2009 7:18:18 -0800 rkevinbur...@charter.net wrote:

> This is a very simple question, but I couldn't form a site search query that would return a reasonable result set.
>
> Say I have a vector:
>
>     x <- c(0,2,3,4,5,-1,-2)
>
> I want to replace all of the values in 'x' with the log of x. Naturally this runs into problems, since some of the values are negative or zero. So how can I replace all of the positive elements of x with log(x) and the rest with zero?

If you do not mind a warning message:

    R> x <- c(0,2,3,4,5,-1,-2)
    R> x <- ifelse(x <= 0, 0, log(x))
    Warning message:
    In log(x) : NaNs produced
    R> x
    [1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 0.0000000 0.0000000

If you do mind, then:

    R> x <- c(0,2,3,4,5,-1,-2)
    R> ind <- x > 0
    R> x[!ind] <- 0
    R> x[ind] <- log(x[ind])
    R> x
    [1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 0.0000000 0.0000000

HTH.

Cheers,

Berwin

Berwin A Turlach
School of Maths and Stats (M019)
The University of Western Australia
35 Stirling Highway, Crawley WA 6009, Australia
Tel.: +61 (8) 6488 3338 (secr), +61 (8) 6488 3383 (self)
FAX: +61 (8) 6488 1028
e-mail: ber...@maths.uwa.edu.au
http://www.maths.uwa.edu.au/~berwin
Re: [R] Relase positive with log and zero of negative with 0
On Nov 15, 2009, at 10:18 AM, rkevinbur...@charter.net wrote:

> This is a very simple question, but I couldn't form a site search query that would return a reasonable result set.
>
> Say I have a vector:
>
>     x <- c(0,2,3,4,5,-1,-2)
>
> I want to replace all of the values in 'x' with the log of x. Naturally this runs into problems, since some of the values are negative or zero. So how can I replace all of the positive elements of x with log(x) and the rest with zero?

    > x <- c(0,2,3,4,5,-1,-2)
    > x <- ifelse(x > 0, log(x), 0)
    Warning message:
    In log(x) : NaNs produced
    > x
    [1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 0.0000000 0.0000000

The warning is harmless, as you can see, but if you wanted to avoid it, then:

    x[x <= 0] <- 0; x[x > 0] <- log(x[x > 0])

In the second command, you need to have the logical test on both sides of the assignment so that the replacement values stay in step with the positions being replaced.

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
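The warning-free approach from both replies can be wrapped in a small function. This is only a sketch: the name `log_or_zero()` is made up here, and keeping NA inputs as NA is an assumption the original question does not address.

```r
# Replace positive elements by their log and everything else by zero,
# without ever calling log() on a non-positive number.
log_or_zero <- function(x) {
  out <- numeric(length(x))        # starts as all zeros
  pos <- !is.na(x) & x > 0         # positions that are safe to log
  out[pos] <- log(x[pos])          # same index on both sides of the assignment
  out[is.na(x)] <- NA              # assumption: missing values stay missing
  out
}

x <- c(0, 2, 3, 4, 5, -1, -2)
print(log_or_zero(x))
```

Because log() only ever sees the positive elements, no NaN is produced and no warning is raised.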
Re: [R] package tm fails to remove the with remove stopwords
On Thu, Nov 12, 2009 at 11:29:50AM -0500, Mark Kimpel wrote:

> I am using code that previously worked to remove stopwords using package tm.

Thanks for reporting. This is a bug in the removeWords() function in tm version 0.5-1 available from CRAN:

> require(tm)
> myDocument <- c("the rain in Spain", "falls mainly on the plain", "jack and jill ran up the hill", "to fetch a pail of water")
> text.corp <- Corpus(VectorSource(myDocument))
> # text.corp <- tm_map(text.corp, stripWhitespace)
> text.corp <- tm_map(text.corp, removeNumbers)
> text.corp <- tm_map(text.corp, removePunctuation)
> ## text.corp <- tm_map(text.corp, stemDocument)
> text.corp <- tm_map(text.corp, removeWords, c("the", stopwords("english")))
> dtm <- DocumentTermMatrix(text.corp)
> dtm
> dtm.mat <- as.matrix(dtm)
> dtm.mat
>
> > dtm.mat
>     Terms
> Docs falls fetch hill jack jill mainly pail plain rain ran spain the water
>    1     0     0    0    0    0      0    0     0    1   0     1   1     0
>    2     1     0    0    0    0      1    0     1    0   0     0   0     0
>    3     0     0    1    1    1      0    0     0    0   1     0   0     0
>    4     0     1    0    0    0      0    1     0    0   0     0   0     1

The function removeWords() fails to remove patterns at the beginning or at the end of a line. This bug is fixed in the latest development version on R-Forge, and the fix will be included in the next CRAN release.

Please see https://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/inst/NEWS?root=tm&view=markup for a list of all bug fixes and changes between each tm version.

Best regards, Ingo Feinerer

--
Ingo Feinerer
Vienna University of Technology
http://www.dbai.tuwien.ac.at/staff/feinerer
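To illustrate the kind of edge case described above (a stopword at the very start or end of a line), here is a base-R sketch that removes whole words using word-boundary anchors. It is not tm's implementation; `remove_words()` is a hypothetical helper, and it assumes the words contain no regular-expression metacharacters.

```r
# Remove whole words from each string; "\\b" marks a word boundary, so a
# stopword is matched at the start, in the middle, or at the end of a line.
remove_words <- function(x, words) {
  pattern <- sprintf("\\b(%s)\\b", paste(words, collapse = "|"))
  stripped <- gsub(pattern, "", x, perl = TRUE)
  # Collapse the whitespace left behind and trim both ends.
  trimws(gsub("\\s+", " ", stripped))
}

docs <- c("the rain in spain", "falls mainly on the plain")
cleaned <- remove_words(docs, c("the", "in", "on"))
print(cleaned)
```

Note that "in" is removed as a word but survives inside "rain", "spain" and "plain", because there is no word boundary in front of it there.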
Re: [R] sum(row1==y) if row2=x
Thanks to all. R is fantastic, but ... it is not easy to know all possible terms ;-)

Knut
Re: [R] update.lm question
Hello David,

of course I read those help pages and searched the r-help archive. I agree that the way should be clear, but I am stuck and looking for help. Here is reproducible code:

    removeTerm1 <- function(linModel, termName) { update(linModel, .~. - termName) }
    removeTerm2 <- function(linModel, termName) { update(linModel, .~. - as.name(termName)) }
    removeTerm3 <- function(linModel, termName) { update(linModel, .~. - eval(termName)) }
    removeTerm4 <- function(linModel, termName) { update(linModel, .~. - eval.parent(termName)) }
    removeTerm5 <- function(linModel, termName) { update(linModel, .~. - get(termName)) }

    myData <- data.frame(x1=rnorm(10), x2=rnorm(10), x3=rnorm(10), y=rnorm(10))
    fit <- lm(y~x1+x2+x3, data=myData)

    # all this does not work, as I am expecting the function to return
    # a lm object with formula y~x2+x3
    removeTerm1(fit, "x1")
    removeTerm2(fit, "x1")
    removeTerm3(fit, "x1")
    removeTerm4(fit, "x1")
    removeTerm5(fit, "x1")

Any help appreciated,

kind regards,
Karsten Weinert

2009/11/15 David Winsemius dwinsem...@comcast.net:

> You need to review:
>
>     ?update
>     ?update.formula
>     ?as.formula
>
> The way should be clear at that point. If not, then include a reproducible data example to work with.
>
> --
> David
Re: [R] update.lm question
On 15/11/2009 9:23 AM, Karsten Weinert wrote:

> Hello,
>
> at the Rgui command line I can easily remove a term from a fitted lm object, like
>
>     fit <- lm(y~x1+x2+x3, data=myData)
>     update(fit, .~.-x1)
>
> However, I would like to do this in a function with the term given as a string, like
>
>     removeTerm <- function(linModel, termName) { ??? }
>     removeTerm(fit, "x1")
>
> but I cannot fill in the ???. I already tried
>
>     removeTerm <- function(linModel, termName) { update(linModel, .~. - termName) },
>     removeTerm <- function(linModel, termName) { update(linModel, .~. - as.name(termName)) },
>     removeTerm <- function(linModel, termName) { update(linModel, .~. - eval(termName)) },
>     removeTerm <- function(linModel, termName) { update(linModel, .~. - eval.parent(termName)) },
>     removeTerm <- function(linModel, termName) { update(linModel, .~. - get(termName)) },
>
> but these attempts produce error messages. Can you advise me here?

There are two problems:

1. .~. is different from . ~ ..

2. You need to construct the formula . ~ . - x1, and none of your expressions do that. You need to use substitute() or bquote() to edit a formula. For example, I think both of these should work:

    removeTerm <- function(linModel, termName)
      update(linModel, bquote(. ~ . - .(as.name(termName))))

    removeTerm <- function(linModel, termName)
      update(linModel, substitute(. ~ . - x, list(x=as.name(termName))))

Duncan Murdoch
Re: [R] R crashing
So here's the funny thing: I've now run my function 5 times, and each time R crashes after I have got around 20,000 distances. It could be Google, but then as soon as I quit and launch R again, I manage to get another 20,000 distances. So maybe it does have something to do with memory usage. Do you think that adding gc(reset=T) at the end of each loop would do any good?

Thanks again,
Dimitri

2009/11/15 Barry Rowlingson b.rowling...@lancaster.ac.uk

> On Sun, Nov 15, 2009 at 11:57 AM, Dimitri Szerman dimitri...@gmail.com wrote:
>
> > Thanks. The reason I didn't want to do something like that is because, in the event of a crash, I'll lose everything that was done. That's why I thought of appending the results often.
>
> Oops yes, I missed the 'append=TRUE' flag. That's a good idea.
>
> Last time I did something similar to this I used a relational database for saving. I created a table of all the i,j pairs with columns i, j, distance and 'ok'. 'ok' was set to False initially. Then I'd query the db for a row with 'ok=False', and go about getting the distance. If I got a good distance back I set 'ok=True' and never bothered getting that again.
>
> This was in Python with SQLite as the database engine, but you can do something similar in R. With a distributed database you could easily split the queries between as many servers as you can get your hands on.
>
> Barry
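The "append often and never redo a finished pair" idea discussed above can be sketched in plain R with a CSV file standing in for Barry's database table. Everything named here is hypothetical: `get_distance()` is a stand-in for the real (slow, crash-prone) lookup, and `run_batch()` and the file layout are made up for illustration.

```r
# Hypothetical stand-in for the slow, crash-prone distance lookup.
get_distance <- function(i, j) abs(i - j)

results_file <- tempfile(fileext = ".csv")   # use a permanent path in real use

run_batch <- function(pairs, file) {
  # Read what earlier runs already saved, if anything.
  done <- if (file.exists(file)) {
    read.csv(file, header = FALSE, col.names = c("i", "j", "distance"))
  } else {
    data.frame(i = integer(0), j = integer(0), distance = numeric(0))
  }
  done_keys <- paste(done$i, done$j)
  for (r in seq_len(nrow(pairs))) {
    i <- pairs$i[r]
    j <- pairs$j[r]
    if (paste(i, j) %in% done_keys) next     # already have this pair
    d <- get_distance(i, j)
    # Append one line per result so a crash loses at most the current pair.
    write.table(data.frame(i, j, d), file, sep = ",", append = TRUE,
                row.names = FALSE, col.names = FALSE)
  }
  read.csv(file, header = FALSE, col.names = c("i", "j", "distance"))
}

pairs  <- expand.grid(i = 1:3, j = 1:3)
first  <- run_batch(pairs[1:4, ], results_file)  # simulate an interrupted run
second <- run_batch(pairs, results_file)         # resumes, skipping the 4 saved pairs
print(second)
```

Restarting R and calling run_batch() again with the same file picks up where the previous session stopped, which matches the pattern of getting "another 20,000 distances" per session without repeating work.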
Re: [R] where is a value in my list
But that's not what I want. I need to get the number of each line of my list that 5 is in: list[[1]][1] and list[[2]][1], so I would like to get a vector k = 1,2
Re: [R] where is a value in my list
On Nov 15, 2009, at 10:47 AM, Grzes wrote:

> But that's not what I want. I need to get the number of each line of my list that 5 is in: list[[1]][1] and list[[2]][1], so I would like to get a vector k = 1,2

I am sorry, I do not understand what you want. The second solution offered gave you numbers (and they were the positions of the 5's, rather than positions that do not hold a 5, as in your offered solution).

If you just want to know which list elements contain a 5, but not the position within each element (which was not what you appeared to be asking):

    > which(sapply(lista, function(x) any(x == 5)))
    [1] 1 2

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
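The two readings of the question in this thread (which list elements contain the value, versus where inside each element it sits) can be put side by side. This is a sketch only; the helper names `elements_with()` and `positions_of()` are made up here.

```r
lista <- list(c(2, 4, 5, 5, 6), c(3, 5, 4, 2), c(1, 1, 1, 8))

# Which list elements contain the value at least once?
elements_with <- function(lst, value) {
  which(vapply(lst, function(x) any(x == value), logical(1)))
}

# Where inside each element does the value sit?
positions_of <- function(lst, value) {
  lapply(lst, function(x) which(x == value))
}

k <- elements_with(lista, 5)
print(k)
print(positions_of(lista, 5))
```

vapply() is used instead of sapply() so that the result is guaranteed to be a logical vector even when the list is empty.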
Re: [R] update.lm question
On Nov 15, 2009, at 11:15 AM, Karsten Weinert wrote:

> Hello David,
>
> of course I read those help pages and searched the r-help archive.

But did you look at the as.formula page?

> I agree that the way should be clear, but I am stuck and looking for help. Here is reproducible code

And here is what seemed like the obvious extension of your example, but using as.formula:

    myData <- data.frame(x1=rnorm(10), x2=rnorm(10), x3=rnorm(10), y=rnorm(10))
    fit <- lm(y~x1+x2+x3, data=myData)

    remvterm <- function(ft, termname) update(ft, as.formula(paste(".~.-", termname, sep="")))

    > remvterm(fit, "x3")

    Call:
    lm(formula = y ~ x1 + x2, data = myData)

    Coefficients:
    (Intercept)           x1           x2
        -0.2598      -0.0290      -0.2645

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
[R] JGR GUI for R-2.10.0 Help Print
I have updated R 2.9.1 to 2.10.0 and JGR GUI to 1.7. I am running Windows XP. I can't seem to get the JGR Print or Help functions to work. The system locks and requires me to stop the process. In the past I have preferred the operation and feel of JGR GUI.

I realize that this help forum is for R, but I am hoping that some other R user is a JGR GUI user and might have a hint about this.

At one point I received the following:

    Loading required package: rJava
    Loading required package: JavaGD
    Loading required package: iplots

    Attaching package: 'utils'

    The following object(s) are masked from package:rJava :

     head, str, tail

    starting httpd help server ...Error in tools:::startDynamicHelp() : could not find function "runif"
    Loading required package: stats
    Loading required package: graphics
    Loading Tcl/Tk interface ... done
    During startup - Warning message:
    package JGR in options("defaultPackages") was not found
    Loading required package: JGR
    starting httpd help server ... done
    > q()
[R] resampling problem counting number of means above a specific value
I am trying to modify some code from Good 2005.

I am trying to resample the mean of 8 values and then count how many times the resampled mean is greater than 10. But my count of means above 10 is coming out as zero, which I know isn't correct.

I would appreciate it if someone could look at the code below and tell me what I am doing wrong.

Many thanks,

Graham

    > LL <- c(12.5,17,12,11.5,9.5,15.5,16,14)
    > N <- 1000
    > n <- length(LL)
    > threshold <- 10
    > cnt <- 0
    > for(i in 1:N){
    + LLb <- sample (LL, n, replace=TRUE)
    + if (mean(LLb)<=threshold) cnt<-cnt+1
    + }
    > cnt
    [1] 0
Re: [R] update.lm question
On 15/11/2009 11:28 AM, Duncan Murdoch wrote:

> There are two problems:
>
> 1. .~. is different from . ~ ..

Oops, wrong. Those are the same. Sorry...

Duncan Murdoch

> 2. You need to construct the formula . ~ . - x1, and none of your expressions do that. You need to use substitute() or bquote() to edit a formula. For example, I think both of these should work:
>
>     removeTerm <- function(linModel, termName)
>       update(linModel, bquote(. ~ . - .(as.name(termName))))
>
>     removeTerm <- function(linModel, termName)
>       update(linModel, substitute(. ~ . - x, list(x=as.name(termName))))
>
> Duncan Murdoch
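For reference, the three ways of building the formula that came up in this thread (bquote(), substitute(), and pasting text into as.formula()) can be run side by side. This is a sketch; the function names below are made up to keep the variants apart, and set.seed() is only there to make the toy data repeatable.

```r
set.seed(1)
myData <- data.frame(x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10), y = rnorm(10))
fit <- lm(y ~ x1 + x2 + x3, data = myData)

# Variant 1: splice the term name into the formula with bquote().
removeTermBquote <- function(linModel, termName) {
  update(linModel, bquote(. ~ . - .(as.name(termName))))
}

# Variant 2: the same idea with substitute().
removeTermSubstitute <- function(linModel, termName) {
  update(linModel, substitute(. ~ . - x, list(x = as.name(termName))))
}

# Variant 3: build the formula from text.
removeTermPaste <- function(linModel, termName) {
  update(linModel, as.formula(paste(". ~ . -", termName)))
}

fit2 <- removeTermBquote(fit, "x1")
print(formula(fit2))
```

The text-based variant is the shortest, while the bquote() and substitute() variants avoid a round trip through the parser and so also cope with term names that are not syntactically valid R names.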
Re: [R] resampling problem counting number of means above a specific value
On Nov 15, 2009, at 12:12 PM, Graham Smith wrote:

> I am trying to modify some code from Good 2005.
>
> I am trying to resample the mean of 8 values and then count how many times the resampled mean is greater than 10. But my count of means above 10 is coming out as zero, which I know isn't correct.

If that is your goal, then why are you using <= and not > in your test?

    > for(i in 1:N){
    + LLb <- sample (LL, n, replace=TRUE)
    + if (mean(LLb) > threshold) cnt<-cnt+1
    + }
    > cnt
    [1] 1000

> I would appreciate it if someone could look at the code below and tell me what I am doing wrong.
>
> Many thanks,
>
> Graham
>
>     > LL <- c(12.5,17,12,11.5,9.5,15.5,16,14)
>     > N <- 1000
>     > n <- length(LL)
>     > threshold <- 10
>     > cnt <- 0
>     > for(i in 1:N){
>     + LLb <- sample (LL, n, replace=TRUE)
>     + if (mean(LLb)<=threshold) cnt<-cnt+1
>     + }
>     > cnt
>     [1] 0

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] update.lm question
Thanks Duncan and David for opening my eyes :-). It took quite a while, but I think I learned a lot about lm today.

I used your advice to produce added variable plots as mentioned here [1], [2]. I would bet someone did it in R already (maybe leverage.plot in car), but it was worth doing it myself.

Kind regards,
Karsten.

[1] http://www.minitab.com/support/documentation/answers/AVPlots.pdf
[2] http://www.mathworks.com/access/helpdesk/help/toolbox/stats/addedvarplot.html

    # Added variable plot for one term of a fitted lm.
    # The default axis labels use "andere", German for "others".
    plotAddedVar.lm <- function(
        linModel, termName, main="",
        xlab=paste(termName, "| andere"),
        ylab=paste(colnames(linModel$model)[1], "| andere"),
        cex=0.7, ...) {
      oldpar <- par(no.readonly = TRUE); on.exit(par(oldpar))
      par(mar=c(3,4,0.4,0)+0.1, las=1, cex=cex)

      # residuals of the response and of the term, each regressed on the other terms
      yData = residuals(update(linModel, substitute(. ~ . - x, list(x=as.name(termName)))))
      xData = residuals(update(linModel, substitute(x ~ . - x, list(x=as.name(termName)))))

      plot(xData, yData, main=main, xlab="", ylab="")
      mtext(side=2, text=ylab, line=3, las=0, cex=cex)
      mtext(side=1, text=xlab, line=2, las=0, cex=cex)
      abline(h=0)
      abline(a=0, b=coefficients(linModel)[termName], col="blue")
    }

    plotAddedVar <- function(linModel,...) UseMethod("plotAddedVar")
Re: [R] resampling problem counting number of means above a specific value
try the following:

    LL <- c(12.5,17,12,11.5,9.5,15.5,16,14)
    n <- length(LL)
    N <- 1000
    threshold <- 10

    smpls <- sample(LL, N*n, replace = TRUE)
    dim(smpls) <- c(n, N)
    cnt <- sum(colMeans(smpls) > threshold)
    cnt

I hope it helps.

Best,
Dimitris

Graham Smith wrote:

> I am trying to modify some code from Good 2005.
>
> I am trying to resample the mean of 8 values and then count how many times the resampled mean is greater than 10. But my count of means above 10 is coming out as zero, which I know isn't correct.
>
> I would appreciate it if someone could look at the code below and tell me what I am doing wrong.

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center
Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Re: [R] resampling problem counting number of means above a specific value
David,

Thanks, it's me getting mixed up: I actually meant less than or equal to 10.

That apart, I guess the code is OK. I just expected, especially as I increased N, that I might have got some means less than 10, but having gone back to it, I see I need a million iterations before getting two means less than 10.

It seems I misjudged the probabilities.

Thanks again.

Graham

2009/11/15 David Winsemius dwinsem...@comcast.net:

> If that is your goal, then why are you using <= and not > in your test?
>
>     > for(i in 1:N){
>     + LLb <- sample (LL, n, replace=TRUE)
>     + if (mean(LLb) > threshold) cnt<-cnt+1
>     + }
>     > cnt
>     [1] 1000
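The rarity Graham observed can be checked by hand for this particular data set. The sketch below estimates the proportion by simulation and compares it with an exact count; `prop_at_or_below()` is a made-up helper, and the exact count relies on the reasoning in the comments (a resample mean of at most 10 means a sum of at most 80, which forces at least six copies of 9.5).

```r
LL <- c(12.5, 17, 12, 11.5, 9.5, 15.5, 16, 14)
threshold <- 10

# Proportion of bootstrap means at or below the threshold.
prop_at_or_below <- function(x, threshold, N) {
  means <- replicate(N, mean(sample(x, length(x), replace = TRUE)))
  mean(means <= threshold)
}

set.seed(42)
p_sim <- prop_at_or_below(LL, threshold, N = 10000)
print(p_sim)

# Exact probability: count the ordered resamples of size 8 with sum <= 80.
#   eight 9.5s:                               1 resample
#   seven 9.5s + one of 11.5, 12 or 12.5:     8 * 3 resamples
#   six 9.5s + two 11.5s:                     choose(8, 2) resamples
p_exact <- (1 + 8 * 3 + choose(8, 2)) / 8^8
print(p_exact)
```

Under that count the exact probability is 53 / 8^8, roughly 3 in a million, which is in line with seeing only a couple of such means in a million iterations.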
Re: [R] lme model specification
On Sun, Nov 15, 2009 at 7:19 AM, Green, Gerwyn (greeng6) g.gre...@lancaster.ac.uk wrote:

> Dear all
>
> this is a question of model specification in lme for which I'd greatly appreciate some guidance.
>
> Suppose I have data in long format
>
>     gene  treatment  rep  Y
>     1     1          1    4.32
>     1     1          2    4.67
>     1     1          3    5.09
>     ...
>     1     4          1    3.67
>     1     4          2    4.64
>     1     4          3    4.87
>     ...
>     2000  1          1    5.12
>     2000  1          2    2.87
>     2000  1          3    7.23
>     ...
>     2000  4          1    2.48
>     2000  4          2    3.93
>     2000  4          3    5.17
>
> that is, I have data Y_{gtr} for g (gene) = 1,...,2000, t (treatment) = 1,...,4 and r (replicate) = 1,...,3.
>
> I would like to fit the following linear mixed model using lme
>
>     Y_{gtr} = \mu_{g} + W_{gt} + Z_{gtr}
>
> where the \mu_{g}'s are fixed gene effects, W_{gt} ~ N(0, \sigma^{2}) are gene-treatment interactions, and Z_{gtr} ~ N(0, \tau^{2}) are residual errors. (Yes, I know I'm specifying an interaction between gene and treatment without specifying a treatment main effect! There is good reason for this.)

You are going to end up estimating 2000 fixed-effects parameters for gene, which will take up a lot of memory (one copy of the model matrix for the fixed effects will be 24000 by 2000 double precision numbers, or about 400 MB). You might be able to fit that in lme as

    lme(Y ~ -1 + factor(gene), data = data, random = ~ 1|gene:treatment)

but it will probably take a long time or run out of memory.

There is an alternative, which is to use the development branch of the lme4 package that allows for a sparse model matrix for the fixed-effects parameters. Or ask yourself if you really need to model the genes as fixed effects instead of random effects. We have seen situations where users do not want the shrinkage involved with random effects, but it is rare.

If you want to follow up on the development branch (for which binary packages are not currently available, i.e. you need to compile it yourself) then we can correspond off-list.

> I know that specifying
>
>     model.1 <- lme(Y ~ -1 + factor(gene), data=data, random= ~1|gene/treatment)
>
> fits
>
>     Y_{gtr} = \mu_{g} + U_{g} + W_{gt} + Z_{gtr}
>
> with \mu_{g}, W_{gt} and Z_{gtr} as previous and U_{g} ~ N(0, \gamma^{2}), but I do NOT want to specify a random gene effect. I have scoured Bates and Pinheiro without coming across a parallel example.
>
> Any help would be greatly appreciated
>
> Best
>
> Gerwyn Green
> School of Health and Medicine
> Lancaster University
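A small-scale sketch of the model discussed above, together with the memory arithmetic. The simulated data, the variable `gt` and all parameter values are assumptions made for illustration; building the gene-by-treatment cell as an explicit factor is a choice made here so that the grouping is a single plain variable, not something taken from the reply.

```r
library(nlme)

# Memory needed for a dense 24000 x 2000 matrix of doubles (8 bytes each).
dense_bytes <- 24000 * 2000 * 8
print(dense_bytes / 1e6)   # size in megabytes

# Small simulated data set: 5 genes, 4 treatments, 3 replicates.
set.seed(1)
d <- expand.grid(rep = 1:3, treatment = factor(1:4), gene = factor(1:5))
d$gt <- interaction(d$gene, d$treatment, drop = TRUE)  # one level per gene-treatment cell
mu <- c(4, 5, 6, 7, 8)[d$gene]                          # fixed gene effects mu_g
w  <- rnorm(nlevels(d$gt))[d$gt]                        # W_gt ~ N(0, 1)
d$Y <- mu + w + rnorm(nrow(d), sd = 0.5)                # Z_gtr ~ N(0, 0.25)

# Fixed effect per gene, random intercept per gene-treatment cell,
# and no separate random gene effect.
fit <- lme(Y ~ -1 + gene, data = d, random = ~ 1 | gt)
print(VarCorr(fit))
```

With 2000 genes the same call would need the large dense model matrix mentioned in the reply, so this toy version is only meant to show the shape of the specification.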
Re: [R] resampling problem counting number of means above a specific value
Dimitris,

Thanks, I shall give this a try as an alternative.

Graham
Re: [R] gregmisc library (Mandriva)
On Sat, Oct 3, 2009 at 9:18 AM, chi ball c...@hotmail.it wrote:

> Hi, I'm not able to find an rpm of the gregmisc library (2.0.0) for Linux Mandriva 2008 Spring. Any suggestion? Thanks

If you don't find an up-to-date RPM, either you have to learn how to build an RPM or just install the package yourself. You can install the package yourself in a number of ways; I think the R FAQ outlines them.

To update and download a whole bunch of packages, I use a script. It should be easy for you to see how this works. I scan the system to update what packages there are, and then install a lot of others if they are not installed yet.

As root, run

    R CMD BATCH R_installFaves-2.R

or inside R as root you could type

    source("R_installFaves-2.R")

On Ubuntu, if you do this as root it installs the packages into /usr/local/lib/R, but on Fedora it installs them under /usr/lib/R. I do not know where they will go with Mandriva.

I used to run the script to get ALL packages, but when the CRAN list accumulated more than 600 packages, my systems just spent all day building packages. So I had to narrow my sights.

I'm looking at administering a cluster computer on which I'll need to make RPMs for many packages, so I'm in the same boat as you are if you are wanting RPMs. You could check back with me in about a month to find out if I have packages for you.

pj

I think gregmisc is a bundle, and those are deprecated. Instead, you install gdata, gmodels, and so forth.

--
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas
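The script R_installFaves-2.R itself is not shown in the message, so the following is only a guess at the general pattern it describes (update what is there, then install whatever is still missing). The helper `missing_packages()` is made up, and listing gplots and gtools next to gdata and gmodels is an assumption about what else belonged to the old bundle.

```r
# Return the subset of `wanted` that is not installed in any library on
# the current library path.
missing_packages <- function(wanted) {
  setdiff(wanted, rownames(installed.packages()))
}

wanted <- c("gdata", "gmodels", "gplots", "gtools")
to_install <- missing_packages(wanted)
print(to_install)

# Left commented out because both calls need network access and, for a
# system-wide library, root privileges:
# update.packages(ask = FALSE)
# if (length(to_install) > 0) install.packages(to_install)
```

Saved to a file, a script like this could be run with R CMD BATCH or source() in the way the message describes.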
[R] Help with unstack() function
Hi Everyone,

I am trying to understand the unstack() function but after struggling for two days, I have given up. More specifically, I am trying the exercises at the end of Chapter 1 of "Data Analysis and Graphics Using R" by Maindonald and Braun, 2nd ed. Exercise 18 (p. 41) asks to unstack the Rabbit data frame from the MASS package to get a certain data frame that is shown in the exercise. The authors suggest using unstack() three times, but I am so new to R that I have absolutely no clue as to what is to be done each of those times.

Sadly for me, the help page for unstack() does not give much help either. For example, the statement in the help page regarding the argument form, "a two-sided formula whose left side evaluates to the vector to be unstacked and whose right side evaluates to the indicator of the groups to create", is very cryptic to me.

Basically, I have tried things like:

    unstack(Rabbit, Dose ~ Animal)

but notice that what I get is a data frame in which the other columns of the Rabbit data frame disappear.

I would appreciate it if someone could help me understand this function. On page 17 of the same book, there is an example of the unstack() function, but that one uses a very simple data frame (only two columns). I would like to know how to handle more complex data frames as in the exercise.

BTW, this is not a school assignment; I am learning R using this book on my own.

Thanks for any help.

Regards,
Tariq
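The formula argument can be illustrated on a small made-up data frame. This is a sketch, not the book's solution: the data frame `long` and its columns `value`, `group` and `extra` are hypothetical. It shows that unstack() only uses the two variables named in the formula, which is why the other columns seem to disappear, and that a second variable can be unstacked by a separate call with the same grouping.

```r
# Hypothetical long-format data: one measurement column, one grouping column,
# plus an extra column that the first unstack() call does not use.
long <- data.frame(
  value = c(10, 11, 12, 20, 21, 22),
  group = rep(c("a", "b"), each = 3),
  extra = letters[1:6],
  stringsAsFactors = FALSE
)

# Left side: the vector to spread out; right side: the groups that become columns.
wide <- unstack(long, value ~ group)

# A second call spreads a different variable over the same groups.
wide_extra <- unstack(long, extra ~ group)
```

Each call returns one column per group, so repeating the call for each variable of interest and combining the results is one way to build a wider table.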
[R] wilcox.test loop through variable names
Often I perform the same task on a series of variables in a dataframe, by looping through a character vector that holds the names and using paste(), eval(), and parse() inside the loop. For instance:

    rm(environmental)
    thesevars<-names(environmental)
    environmental$ToyReal <-rnorm(nrow(environmental))
    environmental$ToyDichot<- environmental$ToyReal < 0.53
    tableOfResults<-data.frame(var=thesevars)
    tableOfResults$p_wilcox <- NA
    tableOfResults$Beta_lm <- NA
    rownames(tableOfResults)<-thesevars
    for( thisvar in thesevars) {
      thiscommand<- paste("thiswilcox <- wilcox.test (", thisvar, "~ ToyDichot , data=environmental)")
      eval(parse(text=thiscommand))
      tableOfResults[thisvar, "p_wilcox"] <- thiswilcox$p.value
      thislm<-lm( environmental[ c( "ToyReal", thisvar )])
      tableOfResults[thisvar, "Beta_lm"] <- coef(thislm)[thisvar]
    }
    print(tableOfResults)

Of course, the loop above is a toy example. In real life I might first figure out whether the variable is continuous, dichotomous, or categorical taking on several values, then perform an operation depending on its type.

The use of paste(), eval(), and parse() seems awkward. As Gabor Grothendieck showed (http://tolstoy.newcastle.edu.au/R/e8/help/09/11/4520.html), if we are calling a regression function such as lm() we can avoid using paste(), as shown above. But is there a way to avoid paste() and eval() when one uses t.test() or wilcox.test()?

Thanks

Jacob A. Wegelin
Department of Biostatistics
Virginia Commonwealth University
Richmond VA 23298-0032
U.S.A.
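One possible way to avoid paste(), eval() and parse() here is to build the formula with reformulate() and pass it to wilcox.test(). This is a sketch under assumptions: the data frame `dat`, its columns `x1`, `x2`, `x3` and `grp`, and the seed are made up for illustration and stand in for the data in the question.

```r
# Made-up data: three numeric variables and one two-level grouping variable.
set.seed(42)
dat <- data.frame(x1 = rnorm(40), x2 = rnorm(40), x3 = rnorm(40))
dat$grp <- rep(c(TRUE, FALSE), 20)
vars <- c("x1", "x2", "x3")

# reformulate("grp", response = v) builds the formula  v ~ grp  from strings,
# so no command text has to be pasted together and parsed.
p_wilcox <- sapply(vars, function(v) {
  wilcox.test(reformulate("grp", response = v), data = dat)$p.value
})
```

The same pattern should carry over to t.test(), since it also has a formula method; the result here is a named numeric vector with one p-value per variable.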
Re: [R] multivariate meta-analysis with the metafor package
Dear Antonio,

Yes, I am currently working on these extensions. It will take some time before those functions are sufficiently tested and documented and can become part of the metafor package. My goal is to do so within the next year, but I cannot give any specific date for when this will be finished.

To answer your second question, I will have to take a closer look at the mvmeta command (I wasn't aware of this command until you mentioned it). I will try to do so when I get a chance. I guess an immediate advantage of an rma.multi (or whatever it will be called) command is that it will run under R =)

Best,

--
Wolfgang Viechtbauer
http://www.wvbauer.com/
Department of Methodology and Statistics
School for Public Health and Primary Care
Maastricht University, P.O. Box 616
6200 MD Maastricht, The Netherlands
Tel: +31 (0)43 388-2277
Office Location: Room B2.01 (second floor), Debyeplein 1 (Randwyck)

----Original Message----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of antonio.gasparr...@lshtm.ac.uk
Sent: Friday, November 13, 2009 19:47
To: r-help@r-project.org
Subject: [R] multivariate meta-analysis with the metafor package

Dear Wolfgang Viechtbauer and R users,

I have a few questions regarding the development of the package 'metafor'. As you suggested, I post to the R-help mailing list.

I read you're planning an extension of this method to the multivariate case. I think it would be a useful tool. I'm currently performing some analyses with R on multiple outcomes, using the Stata command mvmeta to get meta-analytic multivariate estimates, then coming back to R to use these results. Obviously, it's irritating to switch software every time.

Briefly:
- Are you still planning this extension? And in this case, do you have a planned date?
- What are likely to be the advantages and limitations of a potential 'rma.multi' if compared to Stata's 'mvmeta'?

Thank you for your time

Regards,

Antonio Gasparrini
Public and Environmental Health Research Unit (PEHRU)
London School of Hygiene & Tropical Medicine
Keppel Street, London WC1E 7HT, UK
Office: 0044 (0)20 79272406 - Mobile: 0044 (0)79 64925523
Skype contact: a.gasparrini
http://www.lshtm.ac.uk/people/gasparrini.antonio ( http://www.lshtm.ac.uk/pehru/ )
Re: [R] re move row if the column date_abandoned has a date in it
this works perfectly...

    new_data5 <- new_data4[nchar(new_data4$date_abandoned) != 8, ]

...and i can now think of a few different ways to manipulate my data with what ive learned from these tricks, thanks a lot David!

David Winsemius wrote:
> On Nov 15, 2009, at 11:00 AM, frenchcr wrote:
>> Yes they are not in date format, theyre just characters. the earliest date is 1601 i originally had one of 0101 00 00 (101 years BC)...this was a software problem.
>>
>>     table(nchar(new_data4$date_abandoned))
>>          2      8
>>     315732    263
>>
>> The 315732 are empty fields i thought. They are actually 2 characters wide. The 263 are dates, i want to remove their rows.
>
> If you want to remove the ones that are _not_ 8 characters long, then:
>
>     new_data5 <- new_data4[nchar(new_data4$date_abandoned) != 8, ]
>
> or:
>
>     new_data5 <- subset(new_data4, date_abandoned != 8)
>
> -- David.
>
>> David Winsemius wrote:
>>> On Nov 14, 2009, at 8:43 PM, frenchcr wrote:
>>>> sorry David, im really new to R (my first week) and appreciate your help. Also I dont always know what info to give people on the forum (although im starting to catch the drift). heres what i get...
>>>>
>>>>     summary(new_data4$date_abandoned)
>>>>        Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
>>>>        1601    1998    2001    1993    2004    2009  315732
>>>
>>> So new_data4$data_abandoned is not of type Date and is instead a character vector. If you are resisting turning it into a date and want to work with characters, you can, you just need to deal somehow with the items that are not 8 characters wide. What does 315732 represent? How were we supposed to interpret the starting date you gave of 0101?
>>>
>>>     nchar(101)
>>>     [1] 7
>>>
>>> What does table(nchar(new_data4$date_abandoned)) give you?
>>>
>>>>     ls()
>>>>     [1] "data" "new_data" "new_data2" "new_data3" "new_data4"
>>>>     small <- head(new_data4, 20)
>>>>     dump(small, 20)
>>>>     Error in dump(small, 20) : cannot write to this connection
>>>
>>> Well, sorry, I meant to type dump(small, stdout()) ... As per the Posting Guide.
>>>
>>> -- David.
David Winsemius wrote: On Nov 14, 2009, at 5:24 PM, frenchcr wrote: I tried the following but it does the opposite of what i want: new_data5 - subset(new_data4, date_abandoned 0101) I want to remove the rows with dates and leave just the rows without a date. This removes all the rows that dont have a date in the date_abandoned column ...on a positive note, as i did this next... dim(new_data5) [1] 263 80 i now know that i have 263 dates in that column :) I want to remove the 263 rows with dates and leave just the rows without a date. Con=me on frenchcr. Stop making us guess. Give us enough information to work with. You asked for something which I construed as saying you wanted dates greater than the the first day of the year 101. You did not address this question. What do you get with str(new_data4) and summary(new_data4$date_abandoned) ? In order to know what sort of comparison to use we need to know what the data looks like. Even better if you offered the output from: small - head(new_data4, 20) dump(small, 20), -- David David Winsemius wrote: On Nov 14, 2009, at 1:21 PM, frenchcr wrote: I want to go through a column in data called Bad name for a data.frame. Fortunes, dog and all that. date_abandoneddata[date_abandoned]and remove all the rows that have numbers greater than 1,010,000. Are you doing archeology? Given what you say next I wondered what range you were really asking for. The dates are in the format 20091114 so i'm just going to treat them as numbers for clean up purposes. I know that i use subset but not sure how to proceed from there. subdata - subset(data, date_abandoned 0101() The problem with 101 is that your specified minimum point had an insufficient number of places to be in MMDD format. 
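The row filter that settled this thread can be tried on a small stand-in data frame. This is a sketch: the data frame `d` and its `id` column are hypothetical, and the two-character blanks mimic the "empty" fields described above.

```r
# Hypothetical stand-in for new_data4: dates are 8-character strings,
# "empty" fields are two blanks.
d <- data.frame(
  id = 1:5,
  date_abandoned = c("  ", "20091114", "  ", "16010101", "  "),
  stringsAsFactors = FALSE
)

# Keep only the rows whose date field is NOT 8 characters wide,
# i.e. drop every row that carries a date.
no_date <- d[nchar(d$date_abandoned) != 8, ]

# Equivalent form with subset(); nchar() is applied to the column by name.
no_date2 <- subset(d, nchar(date_abandoned) != 8)
```

Both forms keep rows 1, 3 and 5 here. If the column were a factor rather than character, it would need as.character() before nchar().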
[R] Finding Largest(or smallest) values
Hi,

I am trying to find a model of best fit, with 8 parameters. As such, I have created a table with 2^8=256 rows with every variable either in or out of the model (denoted by 1 or 0), and for each row, I have computed the Adjusted R^2, AIC, CP and Press. I know I can use the leaps package to find the best model (for every number of parameters n=1...8) for the Adjusted R2 and CP, but not for AIC and PRESS. I was wondering if anyone has any code to, say, find the minimum press when the model has 1 parameter, 2 parameters, 3 parameters, etc... I have attached a copy of my table below for reference. Thank you for your help!

    DF
       P X1 X2 X3 X4 X5 X6 X7 X8   Adjusted R2         AIC        CP    PRESS
     1 0  0  0  0  0  0  0  0  0          0.00  0.99464282 14.367679 95.01075
     2 1  1  0  0  0  0  0  0  0  0.0246074998 -0.36362381 12.634643 93.70900
     3 1  0  1  0  0  0  0  0  0  0.0382916889 -1.69172730 11.194739 92.46010
     4 1  0  0  1  0  0  0  0  0 -0.0005184148  2.02713507 15.278491 96.11336
     5 1  0  0  0  1  0  0  0  0     -0.002134  2.24957370 15.527913 96.38092
     6 1  0  0  0  0  1  0  0  0  0.0272855281 -0.62206426 12.352850 92.64962
     7 1  0  0  0  0  0  1  0  0  0.0111851503  0.92108831 14.046995 94.47172
     8 1  0  0  0  0  0  0  1  0  0.0020990063  1.78090275 15.003075 94.60974
     9 1  0  0  0  0  0  0  0  1 -0.0108209410  2.99012118 16.362563 97.09546
    10 2  1  1  0  0  0  0  0  0  0.0810965349 -4.99889125  7.639659 89.49671
    11 2  1  0  1  0  0  0  0  0  0.0266267059  0.41424802 13.308890 94.64288
    12 2  0  1  1  0  0  0  0  0  0.0315413664 -0.06156976 12.797371 94.04593
    13 2  1  0  0  1  0  0  0  0  0.0145736386  1.57108162 14.563375 95.37943
    14 2  0  1  0  1  0  0  0  0  0.0479608482 -1.66893313 11.088428 92.39152
    15 2  0  0  1  1  0  0  0  0  0.0005123888  2.90290718 16.026873 97.24004
    16 2  1  0  0  0  1  0  0  0  0.0541222887 -2.27926275 10.447144 91.16842
    17 2  0  1  0  0  1  0  0  0  0.0754976703 -4.42788850  8.222390 88.96712
    18 2  0  0  1  0  1  0  0  0  0.0241535771  0.65277858 13.566293 93.96203
    19 2  0  0  0  1  1  0  0  0  0.0275121981  0.32869589 13.216727 93.77620
    20 2  1  0  0  0  0  1  0  0  0.0245589269  0.61372449 13.524104 94.06972
    21 2  0  1  0  0  0  1  0  0  0.0622492592 -3.09039933  9.601287 90.89229
    22 2  0  0  1  0  0  1  0  0  0.0085444331  2.14445635 15.190896 95.73424
    23 2  0  0  0  1  0  1  0  0  0.0089907346  2.10213293     15.15 95.79122
    24 2  0  0  0  0  1  1  0  0  0.0407860378 -0.96318114 11.835183 91.76667
    25 2  1  0  0  0  0  0  1  0  0.0244877014  0.62058800 13.531518 93.36514
    26 2  0  1  0  0  0  0  1  0  0.0407457114 -0.95922936 11.839381 92.16665
    27 2  0  0  1  0  0  0  1  0 -0.0047937805  3.40062281 16.579140     96.5
    28 2  0  0  0  1  0  0  1  0  0.0020858488  2.75480952 15.863107 95.82804
    29 2  0  0  0  0  1  0  1  0  0.0248637496  0.58434515 13.492378 92.73311
    30 2  0  0  0  0  0  1  1  0  0.0171446547  1.32551143 14.295783 93.74595
    31 2  1  0  0  0  0  0  0  1  0.0138889013  1.63637615 14.634643 95.78017
    32 2  0  1  0  0  0  0  0  1  0.0277245197  0.30817080 13.194629 94.26947
    33 2  0  0  1  0  0  0  0  1 -0.0115063328  4.02650409 17.277784 98.26103
    34 2  0  0  0  1  0  0  0  1 -0.0130170902  4.16679510 17.435024 98.57288
    35 2  0  0  0  0  1  0  0  1  0.0193924225  1.11028937 14.061835 94.52438
    36 2  0  0  0  0  0  1  0  1  0.0004693207  2.90695757 16.031355 96.66280
    37 2  0  0  0  0  0  0  1  1 -0.0087026647  3.76559547 16.985978 96.73581
    38 3  1  1  1  0  0  0  0  0  0.0754624308 -3.46299014  9.168628 90.98724
    39 3  1  1  0  1  0  0  0  0  0.0761667614 -3.53462845  9.096127 90.47049
    40 3  1  0  1  1  0  0  0  0  0.0179377080  2.21094883 15.090020 96.18373
    41 3  0  1  1  1  0  0  0  0  0.0443170822 -0.34853563 12.374620 93.90532
    42 3  1  1  0  0  1  0  0  0  0.1233192905 -8.45916839  4.242412 85.48181
    43 3  1  0  1  0  1  0  0  0  0.0532048878 -1.22682152 11.459741 92.37654
    44 3  0  1  1  0  1  0  0  0  0.0668617200 -2.59257716 10.053955 90.62446
    45 3  1  0  0  1  1  0  0  0  0.0452547720 -0.44081112 12.278098 92.71992
    46 3  0  1  0  1  1  0  0  0  0.0924857470 -5.20992520  7.416308 88.22170
    47 3  0  0  1  1  1  0  0  0  0.0281772097  1.22570975 14.036002 94.88753
    48 3  1  1  0  0  0  1  0  0  0.0890014298 -4.84971194  7.774972 89.34392
    49 3  1  0  1  0  0  1  0  0  0.0245049716  1.58023923 14.414009 95.16454
    50 3  0  1  1  0  0  1  0  0  0.0535554539 -1.26163298 11.423655 92.56181
    51 3  1  0  0  1  0  1  0  0  0.0152798441  2.46500779 15.363611 95.72452
    52 3  0  1  0  1  0  1  0  0  0.0755765040 -3.47458896  9.156886 90.73695
    53 3  0  0  1  1  0  1  0  0  0.0099403328      2.9710 15.913241 96.85582
    54 3  1  0  0  0  1  1  0  0  0.0555455543 -1.45949602 11.218801 91.36679
    55 3  0  1  0  0  1  1  0  0  0.1041509877 -6.42603949  6.215530 86.77444
    56 3  0  0  1  0  1  1  0  0  0.0357196889  0.49331421 13.259606 93.21327
    57 3  0  0  0  1  1  1  0  0
Re: [R] where is a value in my list
It's excellent! Now, if I have a vector k=c( TRUE, TRUE, FALSE), how may I get lines from the list? which (list ?? k) ?

David Winsemius wrote:
> On Nov 15, 2009, at 10:47 AM, Grzes wrote:
>> But it's not what I want. I need to get the number of the line of my list that 5 is in: list[[1]][1] and list[[2]][1], so I would like to get a vector k = 1,2
>
> I am sorry. I do not understand what you want. The second solution offered gave you numbers (and they were the numbers that were for 5's rather than ones that were not for 5's, as in your offered solution). If you just want to know which lists contain a 5, but not the position within the list (which was not what you appeared to be asking..):
>
>     which(sapply(lista, function(x) any(x == 5)))
>     [1] 1 2
>
>> David Winsemius wrote:
>>> On Nov 15, 2009, at 10:01 AM, Grzes wrote:
>>>> I have got a list:
>>>>
>>>>     lista=list()
>>>>     a=c(2,4,5,5,6)
>>>>     b=c(3,5,4,2)
>>>>     c=c(1,1,1,8)
>>>>     lista[[1]]=a
>>>>     lista[[2]]=b
>>>>     lista[[3]]=c
>>>>     lista
>>>>     [[1]]
>>>>     [1] 2 4 5 5 6
>>>>
>>>>     [[2]]
>>>>     [1] 3 5 4 2
>>>>
>>>>     [[3]]
>>>>     [1] 1 1 1 8
>>>>
>>>> I would like to know where is number 5 (which line)? For example I have got a loop:
>>>>
>>>>     k= vector(mode = "integer", length = 3)
>>>>     for(i in 1:3) { for (j in 1:length(lista[[i]])){ if ((lista[[i]][j])==5 k[i]= [i]) } }
>>>>
>>>> This loop is wrong but I would like to get in my vector k sth like this: k = lista[[1]][1], lista[[2]][1] ...or sth similar
>>>
>>> I am a bit confused, since clearly lista[[1]][1] does _not_ == 5. It's also unclear what type of output you expect ... character, list, numeric? See if these take you any further to your vaguely expressed goal:
>>>
>>>     lapply(lista, "%in%", 5)
>>>     [[1]]
>>>     [1] FALSE FALSE  TRUE  TRUE FALSE
>>>
>>>     [[2]]
>>>     [1] FALSE  TRUE FALSE FALSE
>>>
>>>     [[3]]
>>>     [1] FALSE FALSE FALSE FALSE
>>>
>>>     lapply(lista, function(x) which(x == 5) )
>>>     [[1]]
>>>     [1] 3 4
>>>
>>>     [[2]]
>>>     [1] 2
>>>
>>>     [[3]]
>>>     integer(0)
>>>
>>> --
>>> David Winsemius, MD
>>> Heritage Laboratories
>>> West Hartford, CT
Re: [R] where is a value in my list
The term "line" in R refers to a sequence of text ending in an EOL marker, which I doubt is what you meant. Please stop referring to the items or elements of a list as "lines" if you hope to communicate with R users.

    > lista[1:2]
    [[1]]
    [1] 2 4 5 5 6

    [[2]]
    [1] 3 5 4 2

    > lista[ which(sapply(lista, function(x) any(x == 5))) ]
    [[1]]
    [1] 2 4 5 5 6

    [[2]]
    [1] 3 5 4 2

-- David.

On Nov 15, 2009, at 11:57 AM, Grzes wrote:
> It's excellent! Now, if I have a vector k=c( TRUE, TRUE, FALSE), how may I get lines from the list? which (list ?? k) ?
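The pieces of this thread can be put together in one short sketch. The names `has5`, `idx` and `picked` are assumptions added for illustration; the list itself is the one from the original question. It also shows that a logical vector such as k = c(TRUE, TRUE, FALSE) can index the list directly.

```r
# The list from the question, built in one step.
lista <- list(c(2, 4, 5, 5, 6), c(3, 5, 4, 2), c(1, 1, 1, 8))

# One logical flag per list element: does it contain a 5 anywhere?
has5 <- sapply(lista, function(x) any(x == 5))

# Positions of the elements that contain a 5.
idx <- which(has5)

# A logical vector selects list elements directly; no which() is needed.
picked <- lista[has5]

# Positions of the 5s inside each element (integer(0) where there is none).
pos <- lapply(lista, function(x) which(x == 5))
```

Single brackets keep the result a list, so `picked` holds the first two vectors; double brackets would extract a single element instead.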
[R] how to permute, simulate Markov chain
Hi all,

I am new to R. Can someone please give me some hints on how to do the following things:

1- Get ONE permutation of a set. I have looked at the gregmisc package's permutations() method, but I just want to get one permutation at a time.

2- Simulate a Markov chain in R. For instance, I want to simulate the simple random walk problem, in which a person can walk randomly around 4 places. I know how to set up the transition matrix in R. I'm stuck at what to do next.

I'm grateful if someone can give me a hint or a pointer.

Thanks.

Martin
Re: [R] how to permute, simulate Markov chain
On Nov 15, 2009, at 4:11 PM, martin08 wrote:
> Hi all, I am new to R. Can someone please give me some hints on how to do the following things:
>
> 1- Get ONE permutation of a set. I have looked at the gregmisc package's permutations() method, but I just want to get one permutation at a time.

Assuming that set1 is your set object and x is <= the number of permutations of z things from set1, then permutations(set1, z)[x] will give you the x-th permutation. The "[" operator/function can often be appended to the end of a function call to extract the desired subset.

> 2- Simulate a Markov chain in R. For instance, I want to simulate the simple random walk problem, in which a person can walk randomly around 4 places. I know how to set up the transition matrix in R. I'm stuck at what to do next.

Learn to search for yourself:

http://search.r-project.org/cgi-bin/namazu.cgi?query=%22transition+matrix%22+markov&max=100&result=normal&sort=score&idxname=Rhelp08&idxname=Rhelp02

> I'm grateful if someone can give me a hint or a pointer. Thanks. Martin

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
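As a complement to the pointers above, here is a minimal base-R sketch of both tasks. It is an illustration under assumptions rather than the approach from the reply: sample() is used to draw one random permutation, and the four place names, the transition matrix `P`, the starting state and the seed are all made up.

```r
set.seed(123)                       # assumed seed, for repeatability
set1 <- c("A", "B", "C", "D")       # four hypothetical places

# 1- One random permutation of the set: sample() without replacement.
one_perm <- sample(set1)

# 2- Random walk on a ring of four places: from each place the walker
#    moves to either neighbour with probability 1/2.
P <- matrix(c(0,   0.5, 0,   0.5,
              0.5, 0,   0.5, 0,
              0,   0.5, 0,   0.5,
              0.5, 0,   0.5, 0),
            nrow = 4, byrow = TRUE, dimnames = list(set1, set1))

n_steps <- 20
path <- character(n_steps)
path[1] <- "A"                      # assumed starting state
for (i in 2:n_steps) {
  # The row of P for the current state gives the next-state probabilities.
  path[i] <- sample(set1, 1, prob = P[path[i - 1], ])
}
```

Each step only looks at the current state's row of the transition matrix, which is what makes the simulated path a Markov chain; `table(path)` gives a rough picture of how often each place was visited.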
[R] How to get the string '\'?
I can not get the string '\'. Could somebody let me know how to get it?

    > print('\')
    +
    +
    > print('\\')
    [1] "\\"
Re: [R] Finding Largest(or smallest) values
There are probably better ways, but can't you subset each parameter? So create new variables for parameter 1, 2, ... and look at the summary data for those, which will include a min and max for all variables.

Joe King
206-913-2912
j...@joepking.com
"Never throughout history has a man who lived a life of ease left a name worth remembering." --Theodore Roosevelt

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of alphaace
Sent: Sunday, November 15, 2009 8:54 AM
To: r-help@r-project.org
Subject: [R] Finding Largest(or smallest) values

Hi, I am trying to find a model of best fit, with 8 parameters. As such, I have created a table with 2^8=256 rows with every variable either in or out of the model (denoted by 1 or 0), and for each row, I have computed the Adjusted R^2, AIC, CP and Press. I know I can use the leaps package to find the best model (for every number of parameters n=1...8) for the Adjusted R2 and CP, but not for AIC and PRESS. I was wondering if anyone has any code to, say, find the minimum press when the model has 1 parameter, 2 parameters, 3 parameters, etc... I have attached a copy of my table below for reference. Thank you for your help!
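The subsetting idea can be sketched by splitting the table on the number of parameters and keeping the row with the smallest PRESS in each group. This is a sketch under assumptions: `tab` holds only five rows copied from the posted table, and the `model` label column is hypothetical, added to make the result readable.

```r
# A few rows from the posted table: P is the number of parameters in the model.
tab <- data.frame(
  P     = c(1, 1, 2, 2, 3),
  model = c("X1", "X2", "X1+X2", "X2+X5", "X1+X2+X5"),   # hypothetical labels
  PRESS = c(93.70900, 92.46010, 89.49671, 88.96712, 85.48181),
  stringsAsFactors = FALSE
)

# Split by model size, pick the minimum-PRESS row in each piece, re-stack.
best <- do.call(rbind, lapply(split(tab, tab$P), function(d) {
  d[which.min(d$PRESS), ]
}))
```

The same call works for AIC by swapping the column name, and which.max() would give the largest value instead. On the full 256-row table the grouping column would be whichever column holds the parameter count.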
Re: [R] How to get the string '\'?
?cat

    > cat("\\")
    \

On Nov 15, 2009, at 5:30 PM, Peng Yu wrote:
> I can not get the string '\'. Could somebody let me know how to get it?
>
>     > print('\')
>     +
>     +
>     > print('\\')
>     [1] "\\"

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
[R] Basic Question about local
I have some beginner's questions regarding local. In the docs, it says that local evaluates an expression in a local environment.

Q1: why is B different from A? In B, is a<-a+1 getting evaluated before eval proceeds?

    #A
    a=0
    eval(quote(a<-a+1),new.env())
    a # 0

    #B
    a=0
    eval(a<-a+1,new.env())
    a # 1

Q2: Why does mlocal behave differently?

    #C
    local
    #function (expr, envir = new.env())
    #eval.parent(substitute(eval(quote(expr), envir)))
    #<environment: namespace:base>
    a=0
    local(a<-a+1)
    a #0

    mlocal <- function (expr, envir = new.env()) eval(quote(expr), envir)
    a=0
    mlocal(a<-a+1)
    a #1

Thank you

S
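A sketch that reproduces both observations and hints at why, as far as I understand R's lazy evaluation of arguments. The variable names `a_quoted`, `a_unquoted`, `a_mlocal`, `a_mlocal2` and the function `mlocal2` are assumptions added for illustration.

```r
# Case A: quote() hands eval() the unevaluated call, so the assignment
# happens inside the new environment and the outer a is untouched.
a <- 0
eval(quote(a <- a + 1), new.env())
a_quoted <- a

# Case B: the argument a <- a + 1 is evaluated where eval() was called,
# before eval() does anything, so the outer a is changed.
a <- 0
eval(a <- a + 1, new.env())
a_unquoted <- a

# mlocal: quote(expr) is just the symbol `expr`. Evaluating that symbol
# forces the argument in the caller's environment, as in case B.
mlocal <- function(expr, envir = new.env()) eval(quote(expr), envir)
a <- 0
mlocal(a <- a + 1)
a_mlocal <- a

# mlocal2: substitute(expr) recovers the call the user typed, which is
# then evaluated inside envir, as in case A.
mlocal2 <- function(expr, envir = new.env()) eval(substitute(expr), envir)
a <- 0
mlocal2(a <- a + 1)
a_mlocal2 <- a
```

Under this reading, the substitute() step in the definition of local() is what lets the real expression, rather than the symbol `expr`, reach eval().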
[R] pairs
Hi, All,

I have an n by m matrix with each entry between 1 and 15000. I want to know the frequency of each pair in 1:15000 that occur together in rows. So for example, if the matrix is

    2 5 1 6
    1 7 8 2
    3 7 6 2
    9 8 5 7

Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to return the value 2 for this pair as well as that for all pairs. Is there a fast way to do this avoiding loops? Loops take too long.

Thank you,

Cindy
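One loop-free possibility is to build a row-by-value indicator matrix and take its cross-product, so that entry (i, j) counts the rows containing both value i and value j. This is a sketch under assumptions: the names `vals`, `inc` and `co` are made up, and the dense matrix shown here may be too large for 15000 distinct values, where a sparse representation would probably be needed.

```r
# The example matrix from the question.
m <- matrix(c(2, 5, 1, 6,
              1, 7, 8, 2,
              3, 7, 6, 2,
              9, 8, 5, 7), nrow = 4, byrow = TRUE)

# Indicator matrix: one row per matrix row, one column per distinct value,
# 1 where the value occurs in that row (repeats within a row count once).
vals <- sort(unique(as.vector(m)))
inc <- matrix(0L, nrow(m), length(vals),
              dimnames = list(NULL, as.character(vals)))
inc[cbind(as.vector(row(m)), match(as.vector(m), vals))] <- 1L

# t(inc) %*% inc: co[i, j] is the number of rows containing both i and j.
co <- crossprod(inc)
```

Here `co["2", "6"]` is the count for the pair (2,6); the diagonal holds the number of rows in which each value appears at all.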
Re: [R] How to get the string '\'?
My question was from replacing a pattern by '\\'. How to replace '/' in string by '\'?

    string='abc/efg'
    gsub('/','\\',string)

On Sun, Nov 15, 2009 at 5:07 PM, David Winsemius dwinsem...@comcast.net wrote:
> ?cat
>
>     > cat("\\")
>     \
>
> On Nov 15, 2009, at 5:30 PM, Peng Yu wrote:
>> I can not get the string '\'. Could somebody let me know how to get it?
>>
>>     > print('\')
>>     +
>>     +
>>     > print('\\')
>>     [1] "\\"
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
Re: [R] How to get the string '\'?
The regular expression replacement needs the '\' doubled again, so try this:

    gsub('/','\\\\',string)

On Mon, Nov 16, 2009 at 7:35 AM, Peng Yu pengyu...@gmail.com wrote:
> My question was from replacing a pattern by '\\'. How to replace '/' in string by '\'?
>
>     string='abc/efg'
>     gsub('/','\\',string)
Re: [R] How to get the string '\'?
On Nov 15, 2009, at 6:35 PM, Peng Yu wrote:

My question was from replacing a pattern by '\\'. How to replace '/' in string by '\'?

    string='abc/efg'
    gsub('/','\\',string)

No, that was most definitely _not_ your posed question. If you want now to change your question and supply a reproducible example, that's fine, just don't claim that your mind should have been read more properly than it was, please.

The problem with your _second_ question is that the printed representation of \ is a problem because of its special use as an escape symbol. So sometimes it needs to be displayed as \\. What gets written to the screen may be different than the internal representation. Look at the results of:

    string='abc/efg'
    cat(gsub('/','\\\\',string), file="test.txt")

You should see: abc\efg

...although at the screen you would see:

    string='abc/efg'
    gsub('/','\\\\',string)
    [1] "abc\\efg"

The first \ escapes the second \, which in turn allows whatever follows to be interpreted as escaped, while the third \ escapes the 4th \ so that it can be examined by the R interpreter as a real \.

-- David.

David Winsemius, MD Heritage Laboratories West Hartford, CT
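A minimal sketch of the escaping rules discussed in this thread, in case a runnable version helps. The names res and res2 are only illustrative, and chartr() is an alternative not mentioned in the thread; it is shown because it swaps single characters without involving regular expressions.

```r
string <- "abc/efg"

# In a regex replacement a literal backslash must itself be escaped,
# so the R source needs four backslashes to produce one.
res <- gsub("/", "\\\\", string)

# chartr() translates characters one-for-one, so "\\" (a single
# backslash once parsed) is enough here.
res2 <- chartr("/", "\\", string)

print(res)      # print() shows the escaped form: [1] "abc\\efg"
cat(res, "\n")  # cat() shows the actual characters: abc\efg
nchar(res)      # 7, the backslash is one character
```

The difference between print() and cat() is the same screen-versus-internal distinction described above.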
Re: [R] pairs
Use %in% to check for the presence of the numbers in a row and apply() to efficiently execute the test for each row:

    tstMatrix <- matrix( c(2,5,1,6, 1,7,8,2, 3,7,6,2, 9,8,5,7), nrow=4, byrow=T )

    matches <- apply( tstMatrix, 1, function( row ){
      if( 2 %in% row && 6 %in% row ){
        return( 2 )
      } else {
        return( 0 )
      }
    })

    matches
    [1] 2 0 2 0

If you have more than one pair, it gets a little tricky. Say you are also looking for the pair (7,8). Store them as a list:

    pairList <- list( c(2,6), c(7,8) )

Then use sapply() to efficiently iterate over the pair list and execute the apply() test:

    matchMatrix <- sapply( pairList, function( pair ){
      matches <- apply( tstMatrix, 1, function( row ){
        if( pair[1] %in% row && pair[2] %in% row ){
          return( pair[1] )
        } else {
          return( 0 )
        }
      })
      return( matches )
    })

    matchMatrix
         [,1] [,2]
    [1,]    2    0
    [2,]    0    7
    [3,]    2    0
    [4,]    0    7

If you're looking to apply the above method to every possible permutation of 2 numbers that may be generated from the range of numbers 1:15000... that's 225,000,000 pairs. expand.grid() can generate the required pair list, but that step alone causes a memory allocation of ~6 GB on my machine.

If you don't have a pile of CPU cores and RAM at your disposal, you can probably:

1. Restrict the upper end of your range to the maximal entry present in your matrix since all other combinations have zero occurrences.
2. Break the list of pairs up into several sublists, run the tests, and aggregate the results.

Either way, the analysis will take some time despite the efficiencies of the apply family of functions due to the sheer size of the problem. If you have more than one CPU, I would recommend taking a look at parallelized apply functions, perhaps using a package like snowfall, as the testing of the pairs is an embarrassingly parallel problem.

Hopefully I'm misunderstanding the scope of your problem.

Good luck!

-Charlie

Charlie Sharpsteen Undergraduate Environmental Resources Engineering Humboldt State University -- View this message in context: http://old.nabble.com/pairs-tp26364801p26365206.html
Re: [R] pairs
Hope this helps:

    m <- matrix(c(2,1,3,9,5,7,7,8,1,8,6,5,6,2,2,7),4,4)
    p <- c(2, 6)
    apply(m == p[1], 1, any) & apply(m == p[2], 1, any)
    [1]  TRUE FALSE  TRUE FALSE

If you want the number of rows which contain the pair, sum() could be used:

    sum(apply(m == p[1], 1, any) & apply(m == p[2], 1, any))
    [1] 2
Re: [R] How to get the string '\'?
On Sun, Nov 15, 2009 at 6:05 PM, David Winsemius dwinsem...@comcast.net wrote:

No, that was most definitely _not_ your posed question. If you want now to change your question and supply a reproducible example, that's fine, just don't claim that your mind should have been read more properly than it was, please.

Sorry for the misunderstanding. I realized that the answer to the first question could not solve my original question (but I thought it could). So I stated my original question.
[R] Normalization of Data
Hi All

I am looking for some resource to learn data normalization. I understand I am talking very broad here, I need something like a primer to give me a jump start. If you happen to know any good resource please do let me know.

Cheers,
-Abhi
Re: [R] pairs
Hi, Charlie,

Thank you for the reply. Maybe I don't need the frequency of each pair. I only need the top, say 50, pairs with the highest frequency. Is there any way to avoid calculating the frequency for all the pairs?

Thanks,
Cindy
Re: [R] pairs
I could of course be wrong, but have you yet specified the number of columns for this pairing exercise?

On Nov 15, 2009, at 5:26 PM, cindy Guo wrote:

Is there a fast way to do this avoiding loops? Loops take too long.

and provide commented, minimal, self-contained, reproducible code.
^^

David Winsemius, MD Heritage Laboratories West Hartford, CT
[R] How to name a tag in a list or a data.frame from a string?
Suppose I have a string variable

    string='some_string'

Now I want to have a list, where tag is the same as the string in the variable string. I'm wondering if this is possible in R.

    list(tag=1:3)
    data.frame(tag=1:3)
Re: [R] pairs
Assuming that the number of columns is 4, then consider this approach:

    prs <- scan()
    1: 2 5 1 6
    5: 1 7 8 2
    9: 3 7 6 2
    13: 9 8 5 7
    17:
    Read 16 items
    prmtx <- matrix(prs, 4,4, byrow=T)

    # Now make copies of x.y and y.x
    pair.str <- sapply(1:nrow(prmtx), function(z) c(
        apply(combn(prmtx[z,], 2), 2, function(x) paste(x[1],x[2], sep=".")),
        apply(combn(prmtx[z,], 2), 2, function(x) paste(x[2],x[1], sep="."))) )
    tpair <- table(pair.str)

    # This then gives you a duplicated list
    tpair[tpair>1]
    pair.str
    1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7
      2   2   2   2   2   2   2   2

    # So only take the first half of the pairs:
    head(tpair[tpair>1], sum(tpair>1)/2)
    pair.str
    1.2 2.1 2.6 2.7
      2   2   2   2

-- David.

David Winsemius, MD Heritage Laboratories West Hartford, CT
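One possible variation on the approach above, offered as a sketch rather than a tested solution for the full 15000-value problem: if each row's values are sorted before pairing, every unordered pair gets a single key, so the x.y and y.x duplicates never arise and no halving step is needed (with the example data, the first half of the duplicated table holds both 1.2 and 2.1 but not 7.8). Sorting the table then gives the most frequent pairs directly. The names m, pair_keys, and counts are illustrative.

```r
m <- matrix(c(2,5,1,6, 1,7,8,2, 3,7,6,2, 9,8,5,7), nrow = 4, byrow = TRUE)

# For each row, build one key per unordered pair of distinct values,
# always writing the smaller value first.
pair_keys <- unlist(lapply(seq_len(nrow(m)), function(i) {
  vals <- sort(unique(m[i, ]))
  if (length(vals) < 2) return(character(0))  # no pair possible in this row
  combn(vals, 2, FUN = paste, collapse = ".")
}))

# Count how many rows contain each pair, most frequent first.
counts <- sort(table(pair_keys), decreasing = TRUE)

# The top of the table holds the most frequent pairs.
head(counts, 4)
```

Only pairs that actually occur are counted, so nothing is computed for the many pairs in 1:15000 that never appear together.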
Re: [R] How to name a tag in a list or a data.frame from a string?
On 15/11/2009 8:15 PM, Peng Yu wrote:

Suppose I have a string variable string='some_string' Now I want to have a list, where tag is the same as the string in the variable string. I'm wondering if this is possible in R.

The most straightforward way is

    x <- list(1:3)
    names(x) <- string

    y <- data.frame(dummy=1:3)
    names(y) <- string

You can also build expressions and parse and evaluate them, but the lines above are the easiest way.

Duncan Murdoch
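The same idea can be written in one step with setNames() from the stats package, which is not mentioned in the reply but does the same assignment of names. A small sketch, with x and y as illustrative names:

```r
string <- "some_string"

# setNames() returns its first argument with names() set to the second.
x <- setNames(list(1:3), string)
y <- setNames(data.frame(1:3), string)

names(x)
names(y)
x[[string]]  # elements can then be looked up by the same string
```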
[R] How to make all elements all elements lower-cap ?
I have a vector of letters like c("a", "B", "c"). Is there any R function to force all elements to lower-case? Thanks, -- View this message in context: http://old.nabble.com/How-to-make-all-elements-all-elements-lower-cap---tp26365794p26365794.html
Re: [R] How to make all elements all elements lower-cap ?
try this:

    x <- c("a", "B", "c")
    ?tolower
    tolower(x)
    [1] "a" "b" "c"

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
Re: [R] How to make all elements all elements lower-cap ?
Hi Ron,

Yes. Take a look at ?tolower

HTH,
Jorge
[R] How to generate dependency file that can be used by gnu make?
gcc has options like -MM, which can generate the dependence files for a C/C++ file that can be used by gnu make. I'm wondering if there is a tool that can generate a dependence file for an R script. For example, I have an R script test.R

    #test.R
    load('input.RData')
    save.image('output.RData')

I want to generate a dependence file like the following. Is there a tool to do so?

    output.RData: test.R input.RData
Re: [R] wilcox.test loop through variable names
On Sun, 15 Nov 2009 14:33 -0500, Jacob Wegelin jacobwege...@fastmail.fm wrote:

Often I perform the same task on a series of variables in a dataframe, by looping through a character vector that holds the names and using paste(), eval(), and parse() inside the loop. For instance:

    rm(environmental)
    thesevars <- names(environmental)
    environmental$ToyReal <- rnorm(nrow(environmental))
    environmental$ToyDichot <- environmental$ToyReal < 0.53
    tableOfResults <- data.frame(var=thesevars)
    tableOfResults$p_wilcox <- NA
    tableOfResults$Beta_lm <- NA
    rownames(tableOfResults) <- thesevars
    for( thisvar in thesevars) {
        thiscommand <- paste("thiswilcox <- wilcox.test (", thisvar, "~ ToyDichot , data=environmental)")
        eval(parse(text=thiscommand))
        tableOfResults[thisvar, "p_wilcox"] <- thiswilcox$p.value
        thislm <- lm( environmental[ c( "ToyReal", thisvar )])
        tableOfResults[thisvar, "Beta_lm"] <- coef(thislm)[thisvar]
    }
    print(tableOfResults)

Of course, the loop above is a toy example. In real life I might first figure out whether the variable is continuous, dichotomous, or categorical taking on several values, then perform an operation depending on its type.

The use of paste(), eval(), and parse() seems awkward. As Gabor Grothendieck showed (http://tolstoy.newcastle.edu.au/R/e8/help/09/11/4520.html), if we are calling a regression function such as lm() we can avoid using paste(), as shown above. But is there a way to avoid paste() and eval() when one uses t.test() or wilcox.test()?

Here is a solution:

    rm(environmental)
    thesevars <- names(environmental)
    environmental$ToyReal <- rnorm(nrow(environmental))
    environmental$ToyDichot <- environmental$ToyReal < 0.53
    ThisList <- lapply( environmental[thesevars], function( OneVar ) {
        c( p_wilcox = wilcox.test( OneVar ~ environmental$ToyDichot )$p.value
         , Beta_lm = as.numeric(coef(lm( environmental$ToyReal ~ OneVar ))["OneVar"]) )
    } )
    do.call(rbind, ThisList)

Jacob A. Wegelin Department of Biostatistics Virginia Commonwealth University Richmond VA 23298-0032 U.S.A.
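Another way to keep the formula interface without eval(parse()), sketched under the assumption that the variable names are syntactically valid: reformulate() from the stats package builds a formula from character strings. The data frame d and its columns a, b, and grp are made up for illustration and are not the environmental data.

```r
set.seed(1)
# Hypothetical data: two numeric variables and a two-level grouping variable.
d <- data.frame(a = rnorm(40), b = rnorm(40), grp = rep(c(TRUE, FALSE), 20))
vars <- c("a", "b")

pvals <- sapply(vars, function(v) {
  f <- reformulate("grp", response = v)  # builds the formula  v ~ grp
  wilcox.test(f, data = d)$p.value
})

pvals  # one p-value per variable, named after the variable
```

Because the formula is built as an object rather than as text to be parsed, the same pattern should carry over to t.test() and other functions with a formula method.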
Re: [R] pairs
Hi, David,

The matrix has 20 columns. Thank you very much for your help. I think it's right, but it seems I need some time to figure it out. I am a green hand. There are so many functions here I never used before. :)

Cindy
Re: [R] Normalization of Data
What do you mean by normalization? Maybe you are looking for the scale function in R.

On Sun, 15-11-2009 at 16:29 -0800, Abhishek Pratap wrote:

Hi All I am looking for some resource to learn data normalization. I understand I am talking very broad here, I need something like a primer to give me a jump start. If you happen to know any good resource please do let me know. Cheers, -Abhi
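For reference, a short sketch of what scale() does with its default arguments: each column is centered on its mean and divided by its standard deviation. The matrix m is made-up data, and whether this kind of standardization is the normalization wanted here is an open question.

```r
set.seed(42)
# Hypothetical measurements on two very different scales.
m <- cbind(a = rnorm(20, mean = 10, sd = 3),
           b = rnorm(20, mean = -2, sd = 0.5))

z <- scale(m)  # center = TRUE, scale = TRUE by default

colMeans(z)       # approximately 0 for every column
apply(z, 2, sd)   # 1 for every column
```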
[R] How to use SQL code in R
Dear All,

How to use SQL code in R?

Thanks!
Re: [R] Simple if else statement problem
Hi

r-help-boun...@r-project.org wrote on 13.11.2009 18:54:05:

Ok Jim it worked, thank you! It's funny because it worked with the first syntax in some cases...

You can use another approach in this case:

    P <- max(c(P1,P2))

Regards
Petr

anna_l wrote:

Hello, I am getting an error with the following code:

    if( P2 > P1)
    + {
    + P <- P2
    + }
    else
    Error: unexpected 'else' in "else"
    {
    + P <- P1
    + }

I checked the syntax so I don't understand, I have other if else statements with the same syntax working. Thanks in advance -- View this message in context: http://old.nabble.com/Simple-if-else-statement-problem-tp26340336p26340642.html
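For reference, a sketch of the usual explanation for this error: at the top level R evaluates each complete statement as soon as its line ends, so an if block whose closing brace ends a line is already finished before else is read. Inside a function or another pair of braces the whole block is parsed first, which would explain why the same layout works in some cases. Keeping "} else {" on one line avoids the problem. The values of P1 and P2 are made up.

```r
P1 <- 3
P2 <- 5

# "else" sits on the same line as the closing brace, so the parser
# knows the if statement is not finished yet.
if (P2 > P1) {
  P <- P2
} else {
  P <- P1
}

# The shortcut suggested in the reply gives the same value.
P_max <- max(P1, P2)
```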
Re: [R] Normalization of Data
Hey

Sorry if it was not very clear, I assumed a few things. Normalization is actually subjective. I am basically comparing data from two similar experiments and would like to normalize for the systematic errors which might affect the actual result and inference.

What I am looking for is some examples where people have normalized data, and some standard methods of doing so.

Thanks,
-Abhi
[R] ARMAX model fitting with arima
I am trying to understand how to fit an ARMAX model with the arima function from the stats package. I tried the simple data below, where the time series (vector x) is generated by filtering a step function (vector u, the exogenous signal) through a lowpass filter with AR coefficient equal to 0.8. The input gain is 0.3 and there is a 0.01 normal white noise added to the output:

    x <- u <- c (rep (0, 50), rep (1, 50))
    x [1] <- 0
    set.seed (0)
    for (i in 2 : length (x)) {
        x [i] <- 0.3 * u [i] + 0.8 * x [i - 1] + 0.01 * rnorm (1)
    }

Then, I fit the model:

    arima (x, c (1, 0, 0), xreg = u, include.mean = FALSE, method = "ML")

    Coefficients:
             ar1       u
          0.9988  0.2995

Why don't I get ar1 close to 0.8? If I use lm to regress the data, it works:

    lm (x [2 : length (x)] ~ x [1 : (length (x) - 1)] + u [2 : length (u)] - 1)

    Coefficients:
    x[1:(length(x) - 1)]        u[2:length(u)]
                  0.7989                0.3015

Any help will be appreciated.

Best,
-- Rafael Laboissiere
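A possible explanation, going by the arima help page: when xreg is supplied, arima fits a regression on xreg with ARMA errors, that is x[t] = b * u[t] + e[t] with e[t] following the AR(1) process. That is a different model from the recursion x[t] = 0.3 * u[t] + 0.8 * x[t-1] + noise used to generate the data, so ar1 need not come out near 0.8. The sketch below only reruns the simulation and the lagged regression from the post; the names n, fit, and est are illustrative.

```r
set.seed(0)
u <- c(rep(0, 50), rep(1, 50))   # step input
x <- numeric(length(u))          # x[1] starts at 0

# Same recursion as in the post: the lagged output enters directly.
for (i in 2:length(x)) {
  x[i] <- 0.3 * u[i] + 0.8 * x[i - 1] + 0.01 * rnorm(1)
}

# Regressing x[t] on x[t-1] and u[t] (no intercept) matches that recursion.
n <- length(x)
fit <- lm(x[2:n] ~ x[1:(n - 1)] + u[2:n] - 1)
est <- unname(coef(fit))
est  # first value estimates the AR coefficient, second the input gain
```

If the lagged-output form is what is wanted, a least-squares fit of this kind estimates it directly, whereas the arima call estimates the persistence of the regression residuals.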