Re: [R] Conditional looping over a set of variables in R
Adrienne - this solves the problem nicely. Thanks for your help.

David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com

From: wootten.adrie...@gmail.com [mailto:wootten.adrie...@gmail.com] On Behalf Of Adrienne Wootten
Sent: Friday, October 22, 2010 9:09 AM
To: David Herzberg
Cc: r-help@r-project.org
Subject: Re: [R] Conditional looping over a set of variables in R

David,

Here I'm referring to your data as testmat, a matrix of 140 columns and 1500 rows, but the same or similar notation can be applied to data frames in R. If I understand correctly, you are looking for the first response (column) where you got a value of 1. I'm also assuming that since your missing values are characters, your two numeric values are also characters. Keeping all this in mind, try something like this:

    first = rep(NA, nrow(testmat)) # your extra variable, which will eventually contain the first correct response for each case (NA if there is none)
    for(i in 1:nrow(testmat)){
      c = 1
      while( c <= ncol(testmat) ){ # never step past the last column
        if( testmat[i,c] == 1 ){
          first[i] = c
          break # will exit the while loop once it finds the first correct answer, and then jump to the next case
        } else {
          c = c + 1 # proceed to the next column if not
        }
      }
    }

Hope this helps you out a bit.

Adrienne Wootten
NCSU

On Fri, Oct 22, 2010 at 11:33 AM, David Herzberg dav...@wpspublish.com wrote:

Here's the problem I'm trying to solve in R: I have a data frame that consists of about 1500 cases (rows) of data from kids who took a test of listening comprehension. The columns are their scores (1 = correct, 0 = incorrect, . = missing) on 140 test items. The items are numbered sequentially and are ordered by increasing difficulty as you go from left to right across the columns. I want R to go through the data and find the first correct response for each case.
Because of basal and ceiling rules, many cases have missing data on many items before the first correct response appears. For each case, I want R to evaluate the item responses sequentially starting with item 1. If the score is 0 or missing, proceed to the next item and evaluate it. If the score is 1, stop the operation for that case, record the item number of that first correct response in a new variable, proceed to the next case, and restart the operation.

In SPSS, this operation would be carried out with LOOP, VECTOR, and DO IF, as follows (assuming the data set is already loaded):

    * DECLARE A NEW VARIABLE TO HOLD THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE, SET IT EQUAL TO 0.
    numeric LCfirst1.
    comp LCfirst1 = 0.

    * DECLARE A VECTOR TO HOLD THE 140 ITEM RESPONSE VARIABLES.
    vector x=LC1a_score to LC140a_score.

    * SET UP A LOOP THAT WILL RUN FROM 1 TO 140, AS LONG AS LCfirst1 = 0. #i IS AN INDEX VARIABLE THAT INCREASES BY 1 EACH TIME THE LOOP RUNS.
    loop #i=1 to 140 if (LCfirst1 = 0).

    * SET UP A CONDITIONAL TRANSFORMATION THAT IS EVALUATED FOR EACH ELEMENT OF THE VECTOR. THUS, WHEN #i = 1, THE EXPRESSION EVALUATES THE FIRST ELEMENT OF THE VECTOR (THAT IS, THE FIRST OF THE 140 ITEM RESPONSES). AS THE LOOP RUNS AND #i INCREASES, SUBSEQUENT VECTOR ELEMENTS ARE EVALUATED. THE do if STATEMENT RETAINS CONTROL AND KEEPS LOOPING THROUGH THE VECTOR UNTIL A '1' IS ENCOUNTERED.
    + do if x(#i) = 1.

    * WHEN A '1' IS ENCOUNTERED, CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF THAT VECTOR ELEMENT TO '99'.
    + comp x(#i) = 99.

    * AND THEN CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF LCfirst1 TO THE CURRENT INDEX VALUE, THUS CAPTURING THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE FOR THAT CASE. CHANGING THE VALUE OF LCfirst1 ALSO CAUSES THE LOOP TO STOP EXECUTING FOR THAT CASE, AND THE PROGRAM MOVES TO THE NEXT CASE AND RESTARTS THE LOOP.
    + comp LCfirst1 = #i.
    + end if.
    end loop.
    exe.
After several hours of trying to translate this procedure to R, I'm stumped. I played around with creating a list to hold the item response variables (analogous to 'vector' in SPSS), but when I tried to use the list in an R procedure, I kept getting a warning along the lines of 'the list contains 1 element, only the first element will be used'. So perhaps a list is not the appropriate class to 'hold' these variables? It seems that some nested arrangement of 'for', 'while' and/or 'lapply' will allow me to recreate the operation described above? How do I set up the indexing operation analogous to 'loop #i' in SPSS?

Any help is appreciated, and I'm happy to provide more information if needed.

David S. Herzberg, Ph.D.
Vice President, Research and Development
Western Psychological Services
12031 Wilshire Blvd.
Los Angeles, CA 90025-1251
Phone: (310)478-2061 x144
FAX: (310)478-7838
email: dav...@wpspublish.com
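For readers of the archive, a vectorised alternative to the explicit loop might look like the sketch below. It assumes the responses are held in a character matrix called testmat (the name used in the reply above) with "1", "0" and "." as values. The small matrix is made-up data, and first_correct is a hypothetical helper name, not something from the thread.

```r
# Toy data: 4 cases x 5 items, "." marks a missing response (made-up values)
testmat <- matrix(c(".", "0", "1", "0", "1",
                    "1", "0", "0", "1", ".",
                    ".", ".", "0", "0", "0",
                    "0", "0", "0", "0", "1"),
                  nrow = 4, byrow = TRUE)

# Hypothetical helper: column index of the first "1" in each row, NA if none
first_correct <- function(m) {
  hit <- m == "1"              # logical matrix, TRUE where the item is correct
  hit[is.na(hit)] <- FALSE     # treat genuine NAs like missing responses
  apply(hit, 1, function(r) which(r)[1])  # which() is empty -> NA for that row
}

first <- first_correct(testmat)
first
```

For a data frame, the item columns would first be converted with as.matrix() on the relevant subset of columns; cases without any correct response come back as NA rather than 0.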
Re: [R] Long model formulae
Here is a dodge I often use. This is a mock-up example.

___
bar <- data.frame(matrix(rnorm(1001), nrow = 1))
names(bar)[1] <- "y" ## say
head(bar[,1:5])
nbar <- names(bar)
form <- as.formula(paste(nbar[1], "~", paste(nbar[-1], collapse = "+")))
fitModel <- substitute(tm <- rpart(FORM, data = DATA), list(FORM = form, DATA = quote(bar)))
fitModel ## the screen quietly erupts...
library(rpart)
eval(fitModel) ## to do the job.
___

The advantage of proceeding this way is that the object you create, tm, has a meaningful (but large!) formula in it and the name of the dataframe from which the variables come. This makes it easy, e.g., to use manipulation tools on it.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of James Hirschorn
Sent: Sunday, 24 October 2010 11:51 AM
To: r-help@r-project.org
Subject: [R] Long model formulae

What is a good way to enter a very long model formula? For example:

y ~ Input.2 + Input.3 + ... + Input.1000

(assuming the corresponding dataframe has many other columns). Is there a way to convert a character string to a formula? Are there command line expansions in R besides the simple '.'?

Thanks.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
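As a complement to the paste()/as.formula() approach above, base R's reformulate() builds the same kind of formula from a character vector of term labels. The sketch below uses a made-up data frame; the names dat, preds and the column "other" are illustrative assumptions only.

```r
# Made-up data: response y, predictors Input.2 ... Input.6, plus an
# unrelated column that should stay out of the model
set.seed(1)
dat <- data.frame(matrix(rnorm(60), nrow = 10))
names(dat) <- c("y", paste0("Input.", 2:6))
dat$other <- letters[1:10]

# Select the predictor columns by name pattern, then build the formula
preds <- grep("^Input\\.", names(dat), value = TRUE)
form  <- reformulate(preds, response = "y")
form

fit <- lm(form, data = dat)
```

Another option that avoids building a formula at all is to subset the data frame and use the dot, e.g. lm(y ~ ., data = dat[c("y", preds)]).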
Re: [R] Feedback on you manual
Okay, I've been convinced of both the feasibility and the desirability of a pdf. However, such a thing is unlikely to appear very soon.

On 23/10/2010 20:23, 刘力平 wrote:

Hi, all:

My opinion is that providing a PDF tutorial is important, but not one web page containing everything. From the perspective of end-user programming, most people will read your tutorial quickly to get the principles of R. Once they start working with R, they need to come back often to look up details they have seen but cannot remember. Most end-users spend much more time searching than reading the tutorial, so providing a tutorial that is easy to search is very important. They can search by going down the link tree, by using a search utility provided by the website (like cplusplus.com), or with Google. Even so, a PDF file is better than a single long web page: at least the user can remember a page number. Seldom can one remember the exact scroll position.

best
Liping Liu

On Sat, Oct 23, 2010 at 4:28 AM, Patrick Burns pbu...@pburns.seanet.com wrote:

No (and I have an excuse). It is a tree of pages rather than a single document. My impression is that a pdf needs to be linear.

On 23/10/2010 10:09, Liviu Andronic wrote:

(off-topic) Dear Patrick

On Sat, Oct 23, 2010 at 10:57 AM, Patrick Burns pbu...@pburns.seanet.com wrote:

Perhaps 'Some hints for the R beginner' http://www.burns-stat.com/pages/Tutor/hints_R_begin.html

Do you provide a PDF version of this human-friendly introduction to R? :)
Regards
Liviu

is closer to what you have in mind. It includes links to other documents that are possibly along the lines you seek.

On 23/10/2010 07:18, 刘力平 wrote:

Dear Sir/Madam:

Many thanks for the R project and your contribution. I am Liping Liu, a beginner with R. Recently, I have been using R a lot. I wish you could improve the manual by making it search engine friendly. The Introduction to R page is too long.
I am often redirected to this page by Google, but I still cannot easily find the content I need. Could you please make it structured: one page concentrated on a small topic, with all these pages linked together? Indeed the Weka tutorial is much better than R's, in my view. And I cannot find an entrance to the references. I would appreciate it if you took my feedback seriously.

best,
Liping Liu

--
Patrick Burns
pbu...@pburns.seanet.com
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner' and 'The R Inferno')
Re: [R] Feedback on you manual
On 24/10/2010 04:35, Steve Lianoglou wrote:

[ ... ]

Speaking as an older R-user to a new R-user, I can understand some of the frustrations you feel with coming up to speed with a new programming language. One piece of advice I have for you is that you should actually take the time to read through the entirety of that An Introduction to R page you keep stumbling upon. You can get through most of it in a night, and having read through it you'll likely be able to jump to relevant places of it when google sends you there as a result of one of your queries.

This is good advice for a set of people, but I think quite bad advice for a lot of people. An Introduction to R makes a LOT of assumptions about the reader's knowledge. To most of us who have been using R (or other computer languages) for a while, those assumptions are invisible. If you look at 'Introduction' through the eyes of a complete novice, it is really, really scary. If they were to think this was the only way to learn R, many will give up well before the night gets under way.

[ ... ]

Hope that helps,
-steve

--
Patrick Burns
pbu...@pburns.seanet.com
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner' and 'The R Inferno')
[R] gamma glm - using of weights gives error
Dear R-users,

I am trying to use the following code to do a gamma regression:

    glm(x1 / x2 ~ x3 + x4 + x5 + x6 + x7 + x8, family=Gamma(link=log), weights=x2)

but here I get the error

    Error: NA/NaN/Inf in foreign function call (arg 1)
    In addition: Warning message:
    step size truncated due to divergence

x2 has integer values ranging from 1 to 6. If I instead do

    glm(x1 / x2 ~ x3 + x4 + x5 + x6 + x7 + x8, family=Gamma(link=log))

without using the weights argument, I get no error.

So far I don't really understand what this argument does. What is the difference in the results? What can I do to locate the root of the error, and how can I avoid it? Can I do the regression without the weights argument?

Thanks and best regards
Andreas
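The question above can be explored on simulated data. The sketch below is only an illustration under stated assumptions: it assumes x1 is a total over x2 units, so that x1 / x2 is a mean of x2 gamma draws and the prior weights are proportional to x2. All data and names (d, ybar, fit_u, fit_w) are made up. It also shows two things that may help when a weighted fit diverges: checking the response and weights for non-finite or non-positive values, and passing the unweighted coefficients as starting values through glm's start argument.

```r
set.seed(123)
n  <- 300
x3 <- rnorm(n)
x2 <- sample(1:6, n, replace = TRUE)   # group sizes, 1 to 6 as in the post
mu <- exp(0.5 + 0.3 * x3)              # true mean under the log link
shape <- 2

# The mean of x2 iid Gamma(shape, rate = shape / mu) draws is itself
# Gamma(shape * x2, rate = shape * x2 / mu): same mean, variance shrinks with x2
ybar <- rgamma(n, shape = shape * x2, rate = shape * x2 / mu)
d <- data.frame(ybar, x2, x3)

# Sanity checks worth running on real data before fitting
stopifnot(all(is.finite(d$ybar)), all(d$ybar > 0), all(d$x2 > 0))

# Unweighted fit: every row counts equally
fit_u <- glm(ybar ~ x3, family = Gamma(link = "log"), data = d)

# Weighted fit: rows based on more units count more; start from fit_u
fit_w <- glm(ybar ~ x3, family = Gamma(link = "log"), data = d,
             weights = x2, start = coef(fit_u))

rbind(unweighted = coef(fit_u), weighted = coef(fit_w))
```

Under these assumptions both fits estimate the same coefficients; the weights mainly change how much each row contributes and therefore the standard errors. Whether that interpretation applies to the original data depends on what x1 and x2 actually are, which the post does not say.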
Re: [R] Feedback on you manual
On Sun, Oct 24, 2010 at 10:33 AM, Patrick Burns pbu...@pburns.seanet.com wrote:

If you look at 'Introduction' through the eyes of a complete novice, it is really, really scary.

Ditto. I had no programming background prior to learning R, and after half a minute glancing through the 'Intro to R' I decided that this wasn't my kind of intro, and went hunting for other resources. Currently I feel relatively comfy in R, and I still haven't read the 'Intro'.

Regards
Liviu

If they were to think this was the only way to learn R, many will give up well before the night gets under way.

[ ... ]

Hope that helps,
-steve

--
Patrick Burns
pbu...@pburns.seanet.com
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner' and 'The R Inferno')

--
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
http://goodies.xfce.org/projects/applications/xfce4-dict#speed-reader
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail
Re: [R] Bayesian constrained regression method?
Hello Jim,

Please reply to the list - you'll have a much better chance of getting useful suggestions.

OK so some additional info. I know each of the X2 is in (0,1). Is there any method available?

I don't think that's sufficient to estimate b, at least not in my experience of fitting Bayesian models with MCMC. To get any sort of precise posterior for b I think you would need to know that, for instance, X2 is correlated with X1 in some way, or that it can be described by a particular Beta distribution etc. I'd be happy to be corrected by others here who know much more than I do, but if the best prior you can come up with for X2 is uniform in (0,1) I think you have insufficient information to proceed.

Michael

On 24 October 2010 09:28, Jim Silverton jim.silver...@gmail.com wrote:

I am trying to estimate the parameter b. I have Y and X1 which I know and they are both random. However, I also have X2 which I don't know and is also random. I want to estimate b from the model:

Y = b*X1 + ( 1 - b ) * X2

so my constraints are

Can anyone offer some suggestions? The values of Y and X1 are both p-values so they are constrained in (0,1).

OK so some additional info. I know each of the X2 is in (0,1). Is there any method available?

Jim

On Sat, Oct 23, 2010 at 8:31 AM, Michael Bedward michael.bedw...@gmail.com wrote:

Hi Jim,

You don't mention whether you have any prior information regarding X2 that can be used to constrain values imputed for it. I think you will need some because without it values sampled for b and X2 respectively will just see-saw against each other.

Michael

On 22 October 2010 18:37, Jim Silverton jim.silver...@gmail.com wrote:

Hello everyone,

I am trying to estimate the parameter b. I have Y and X1 which I know and they are both random. However, I also have X2 which I don't know and is also random. I want to estimate b from the model:

Y = b*X1 + ( 1 - b ) * X2

Can anyone offer some suggestions? The values of Y and X1 are both p-values so they are constrained in (0,1).
--
Thanks, Jim.
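The see-saw between b and X2 described in this thread can be made concrete with a small simulation. The sketch below is an illustration on made-up data, not a method proposed by either poster: for any candidate b, the model implies X2 = (Y - b*X1) / (1 - b), and the only information available is that every implied X2 must lie in (0,1). The names implied_x2 and feasible are hypothetical helpers.

```r
set.seed(42)
n <- 20
b_true <- 0.3
x1 <- runif(n)
x2 <- runif(n)                      # unobserved in the real problem
y  <- b_true * x1 + (1 - b_true) * x2

# X2 values implied by a candidate b, obtained by solving the model for X2
implied_x2 <- function(b) (y - b * x1) / (1 - b)

# A candidate b is compatible with the constraint if all implied X2 are in [0, 1]
feasible <- function(b) {
  z <- implied_x2(b)
  all(z >= 0 & z <= 1)
}

grid <- seq(0.001, 0.999, by = 0.001)
ok <- vapply(grid, feasible, logical(1))
range(grid[ok])                     # candidate values the constraint cannot rule out
```

The range constraint can only narrow b down to the set of values flagged in ok; within that set every value reproduces Y exactly, which is why additional prior information about X2 is needed to separate them.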
[R] Contour Plot on a non Rectangular Grid
Dear All,

I would like to plot a scalar (e.g. a temperature) on a non-rectangular domain (or even better: I would simply like to be able to draw a contour plot on an arbitrary 2D domain). I wonder if there is any tool to achieve that with R. I did some online searching, in particular in the list archives, and found several queries similar to this one, but was not able to find any conclusive answer.

I am interested in the following 2 options:

(1) Just read a file of the form

    x1 y1 z1
    x2 y2 z2
    ... ... ...
    xn yn zn

where the set of {xi} and {yi} are coordinates on an arbitrary domain and {zi} are the values of the scalar for the corresponding {x,y} coordinates.

(2) Sometimes the domain where I want to draw a contour plot is nothing too fancy and the scalar itself is given by an analytical function. Consider e.g. the case of a circle of radius R=pi/2 centered about the origin and a function like z=f(x,y)=abs(cos(y)).

NB: in this case a satisfactory solution could be to plot z on a rectangular grid and then clip a circular region.

To fix the ideas, the final result in this case (with a colorjet map) should look like this: http://dl.dropbox.com/u/5685598/scalar_plot.pdf

Any suggestion is appreciated.

Many thanks
Lorenzo
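The approach mentioned in the NB for option (2), evaluating on a rectangular grid and then clipping to the circle, can be sketched in base R by setting grid cells outside the domain to NA, which image() and contour() leave blank. The grid resolution and the colour palette below are arbitrary choices for illustration.

```r
R <- pi / 2
x <- seq(-R, R, length.out = 201)
y <- seq(-R, R, length.out = 201)

# Evaluate z = |cos(y)| on the full rectangular grid
z <- outer(x, y, function(x, y) abs(cos(y)))

# Mask everything outside the circle of radius R; NA cells are not drawn
outside <- outer(x, y, function(x, y) x^2 + y^2 > R^2)
z[outside] <- NA

image(x, y, z, asp = 1, col = heat.colors(64), xlab = "x", ylab = "y")
contour(x, y, z, add = TRUE)
```

The same masking idea works for any domain that can be described by a logical test on (x, y); only the definition of outside changes.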
Re: [R] Contour Plot on a non Rectangular Grid
On 24-Oct-10 11:30:57, Lorenzo Isella wrote:

[ ... ]

I am interested in the following 2 options (1) just read a file of the form x1 y1 z1 x2 y2 z2 ... ... ... xn yn zn where the set of {xi} and {yi} are coordinates on an arbitrary domain and {zi} are the values of the scalar for the corresponding {x,y} coordinates.

[ ... ]

For your option (1), the fundamental issue is interpolation. There are many methods for this, with different properties! An R Site Search on interpolation yields a lot of hits. One (which is fairly basic, but may suit your purposes) is the interpp() function in package akima:

http://finzi.psych.upenn.edu/R/library/akima/html/interpp.html

Hoping this helps,
Ted.

E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 24-Oct-10 Time: 12:51:03
-- XFMail --
Re: [R] Contour Plot on a non Rectangular Grid
On 10/24/2010 01:51 PM, (Ted Harding) wrote:

[ ... ]

For your option (1), the fundamental issue is interpolation. There are many methods for this, with different properties! An R Site Search on interpolation yields a lot of hits. One (which is fairly basic, but may suit your purposes) is the interpp() function in package akima:

http://finzi.psych.upenn.edu/R/library/akima/html/interpp.html

Hoping this helps,
Ted.

Hi,

And thanks for helping. I am anyway a bit puzzled, since case (1) is not only a matter of interpolation.
Probably the point I did not make clear (my fault) is that case (1) in my original email does not refer to an irregular grid on a rectangular domain; the set of (x,y) coordinates could represent, e.g., a flat metal slab along which I have temperature measurements. The slab could be, e.g., elliptical or any other funny shape. What also matters is that the final outcome should not look rectangular: by eye one should be able to tell the shape of the slab. Case (1) is a generalization of case (2) where I have an analytical expression neither for the surface nor for the scalar.

Cheers
Lorenzo
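One possible reading of case (1) is sketched below on made-up data: scattered measurements over an elliptical slab are interpolated onto a grid with interp() from the akima package (a sibling of the interpp() function mentioned earlier in the thread). With its default settings interp() does not extrapolate, so grid cells outside the convex hull of the points come back as NA and the plotted image follows the outline of the data. The ellipse, the test function for z and the grid size are assumptions for illustration, and the plotting part only runs if akima is installed.

```r
set.seed(7)

# Made-up measurements scattered over an elliptical slab (semi-axes 2 and 1)
cand <- data.frame(x = runif(1600, -2, 2), y = runif(1600, -1, 1))
pts  <- cand[(cand$x / 2)^2 + cand$y^2 <= 1, ]
pts$z <- abs(cos(pts$y)) + 0.1 * pts$x      # stand-in for measured temperatures

if (requireNamespace("akima", quietly = TRUE)) {
  # Linear interpolation onto a 100 x 100 grid; no extrapolation by default,
  # so cells outside the convex hull of the points are NA
  g <- akima::interp(pts$x, pts$y, pts$z, nx = 100, ny = 100)
  image(g$x, g$y, g$z, asp = 1, col = heat.colors(64), xlab = "x", ylab = "y")
  contour(g$x, g$y, g$z, add = TRUE)
}
```

This only recovers convex outlines. For a slab with holes or concave parts, an explicit mask (setting the unwanted grid cells to NA before plotting) would still be needed.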
Re: [R] Contour Plot on a non Rectangular Grid
On 24.10.2010 14:14, Lorenzo Isella wrote:

On 10/24/2010 01:51 PM, (Ted Harding) wrote:

[ ... ]

For your option (1), the fundamental issue is interpolation. There are many methods for this, with different properties! An R Site Search on interpolation yields a lot of hits. One (which is fairly basic, but may suit your purposes) is the interpp() function in package akima:

http://finzi.psych.upenn.edu/R/library/akima/html/interpp.html

Hi,

And thanks for helping. I am anyway a bit puzzled, since case (1) is not only a matter of interpolation.
Probably the point I did not make clear (my fault) is that case (1) in my original email does not refer to an irregular grid on a rectangular domain; the set of (x,y) coordinates could represent, e.g., a flat metal slab along which I have temperature measurements. The slab could be, e.g., elliptical or any other funny shape. What also matters is that the final outcome should not look rectangular: by eye one should be able to tell the shape of the slab. Case (1) is a generalization of case (2) where I have an analytical expression neither for the surface nor for the scalar.

Cheers

What about the facilities in package rgl then?

Uwe Ligges

Lorenzo
Re: [R] Contour Plot on a non Rectangular Grid
On Oct 24, 2010, at 4:30 AM, Lorenzo Isella wrote:

Dear All, I would like to plot a scalar (e.g. a temperature) on a non-rectangular domain (or even better: I would simply like to be able to draw a contour plot on an arbitrary 2D domain). I wonder if there is any tool to achieve that with R. I did some online search in particular on the list archives, found several queries similar to this one but was not able to find any conclusive answer.

One implemented approach to this exists with the rms/Hmisc package combination. The perimeter function is used to define a region within which there are a sufficient number of cases, and the perimeter object is passed to the bplot function, which is a wrapper for a lattice contourplot call. There is no reason you couldn't emulate

I am interested in the following 2 options (1) just read a file of the form x1 y1 z1 x2 y2 z2 ... ... ... xn yn zn where the set of {xi} and {yi} are coordinates on an arbitrary domain and {zi} are the values of the scalar for the corresponding {x,y} coordinates. (2) Sometimes the domain where I want to draw a contour plot is nothing too fancy and the scalar itself is given by an analytical function. Consider e.g. the case of a circle of radius R=pi/2 centered about the origin and a function like z=f(x,y)=abs(cos(y))

That defines the contours but does not restrict the domain.

NB: in this case a satisfactory solution could be to plot z on a rectangular grid and then clip a circular region. To fix the ideas, the final result in this case (with a colorjet map) should look like this http://dl.dropbox.com/u/5685598/scalar_plot.pdf

And that color-encoded output would not be the output of a contourplot but is more like a levelplot or an image plot. Nonetheless, the perimeter and bplot combination can deliver a similar result if you supply either code or data as a suitable test case for analysis and display.
Re: [R] Contour Plot on a non Rectangular Grid
On 10/24/2010 02:55 PM, David Winsemius wrote:

[ ... ]

One implemented approach to this exists with the rms/Hmisc package combination. The perimeter function is used to define a region within which there are a sufficient number of cases, and the perimeter object is passed to the bplot function, which is a wrapper for a lattice contourplot call.

[ ... ]

And that color-encoded output would not be the output of a contourplot but is more like a levelplot or an image plot. Nonetheless, the perimeter and bplot combination can deliver a similar result if you supply either code or data as a suitable test case for analysis and display.
I agree that contour plot was a misleading name for what I had in mind. I'll try your suggestion and the one by Uwe about rgl, and post again if I have trouble. As to the domain of the function, at least in case (1), that should arise from the collected data points in (x,y) if the sampling is dense enough.

Cheers
Lorenzo
Re: [R] Contour Plot on a non Rectangular Grid
On Oct 24, 2010, at 6:12 AM, Lorenzo Isella wrote:

As to the domain of the function, at least in case (1), that should arise from the collected data points in (x,y) if the sampling is dense enough.

And that is precisely what you get from the perimeter function. The earlier Design package provided those facilities in base graphics. The paradigm for plotting regression objects changed a bit when Harrell shifted over to Lattice, but he has always provided worked examples that generalize nicely to real situations. There are also some nice examples of contourplots constrained to geographic regions in Wood's text on generalized additive models. I'm sure the spatial stats people have such facilities as well.

Cheers
Lorenzo
Re: [R] Long model formulae
Here is a dodge I often use. This is a mock-up example.

Very instructive (and helpful) ...

    bar <- data.frame(matrix(rnorm(1001), nrow = 1))
    names(bar)[1] <- "y"  ## say
    head(bar[,1:5])
    nbar <- names(bar)
    form <- as.formula(paste(nbar[1], "~", paste(nbar[-1], collapse = "+")))
    fitModel <- substitute(tm <- rpart(FORM, data = DATA), list(FORM = form, DATA = quote(bar)))
    fitModel  ## the screen quietly erupts...
    library(rpart)
    eval(fitModel)  ## to do the job.

The advantage of proceeding this way is that the object you create, tm, has a meaningful (but large!) formula in it and the name of the data frame from which the variables come. This makes it easy, e.g., to use manipulation tools on it.
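For readers who want to try the idea without rpart, here is a minimal sketch of the same trick. It is only an illustration under stated assumptions: the data frame `dat`, its dimensions, and the use of `lm()` in place of `rpart()` are all made up for the example, and `reformulate()` is used instead of pasting the formula together by hand.

```r
# Hypothetical mock-up data: 200 rows, a response y and 50 predictors
set.seed(1)
dat <- data.frame(matrix(rnorm(200 * 51), nrow = 200))
names(dat)[1] <- "y"

# Build the long formula from the column names
vars <- names(dat)
form <- reformulate(vars[-1], response = vars[1])

# Splice the formula and the *name* of the data frame into the call,
# then evaluate the finished call
fit_call <- substitute(lm(FORM, data = DATA),
                       list(FORM = form, DATA = quote(dat)))
fit <- eval(fit_call)

# The stored call now carries the full formula and the data frame name
length(all.vars(formula(fit)))
fit$call$data
```

Because the call stored in the fitted object refers to `dat` by name and holds the expanded formula, functions that re-evaluate the call, such as `update()`, have what they need.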
[R] Turning ppp into im in spatstat
Dear all,

I'm working with two point patterns (ppp) in spatstat. I turned one of them into a spatial covariate (im) object. After that, I used this im object to fit a Poisson model for the second point pattern, using the covariate layer from the first one. In R, the whole thing looks somewhat like this:

    my_first.im <- as.im(my_first.ppp)
    test.ppm <- ppm(my_second.ppp, ~my_first, covariates = list(my_first = my_first.im))

The fitting seems to be working, but when I try to simulate a point pattern with the model I get an error message that Google doesn't know:

    rmh(test.ppm)
    Extracting model information...Evaluating trend...done.
    Checking arguments..determining simulation windows...
    Error in rmh.default(X, start = start, control = control, ..., verbose = verbose) :
      Expanded simulation window does not contain model window

I checked my_first.im$xrange and my_first.im$yrange and found them to be congruent with the bbox of the ppp. It would be very nice if someone could give me a hint on why this error occurs and whether or not there is a possible workaround.

Thanks in advance,
Sebastian Schutte
Re: [R] Random Forest AUC
The OOB error estimate in RF is one really nifty feature that alleviates the need for additional cross-validation or resampling. I've done some empirical comparisons between OOB estimates and 10-fold CV estimates, and they are basically the same.

Andy

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Claudia Beleites
Sent: Saturday, October 23, 2010 3:39 PM
To: r-help@r-project.org
Subject: Re: [R] Random Forest AUC

Dear List,

Just curiosity (disclaimer: I never used random forests till now for more than a little playing around): Is there no out-of-bag estimate available? I mean, there are already ca. 1/e trees where a (one) given sample is out-of-bag, as Andy explained. If now the voting is done only over the oob trees, I should get a classical oob performance measure. Or is the oob estimate internally used up by some kind of optimization (what would that be, given that the trees are grown till the end?)?

Hoping that I do not spoil the pedagogic efforts of the list in teaching Ravishankar to do his homework reasoning himself...

Claudia

Am 23.10.2010 20:49, schrieb Changbin Du:

I think you should use 10-fold cross-validation to judge your performance on the validation parts. What you did will be overfitted for sure: you test on the same training set used for your model building.

On Sat, Oct 23, 2010 at 6:39 AM, mxkuhn mxk...@gmail.com wrote:

I think the issue is that you really can't use the training set to judge this (without resampling). For example, k nearest neighbors are not known to overfit, but a 1nn model will always perfectly predict the training data.

Max

On Oct 23, 2010, at 9:05 AM, Liaw, Andy andy_l...@merck.com wrote:

What Breiman meant is that as the model gets more complex (i.e., as the number of trees tends to infinity) the generalization error (test set error) does not increase. This does not hold for boosting, for example; i.e., you can't boost forever, which necessitates finding the optimal number of iterations. You don't need that with RF.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of vioravis
Sent: Saturday, October 23, 2010 12:15 AM
To: r-help@r-project.org
Subject: Re: [R] Random Forest AUC

Thanks Max and Andy. If the Random Forest is always giving an AUC of 1, isn't it overfitting? If not, how do you differentiate this from overfitting? I believe random forests are claimed to never overfit (from the following link).

http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#features

Ravishankar R
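Max's 1-nearest-neighbour remark can be checked in a few lines of base R. The sketch below is an illustration only: the data are simulated, the variable names are made up, and it uses a hand-rolled 1-NN rather than a random forest. The leave-one-out step plays a role loosely similar to the out-of-bag idea, in that each case is predicted without the help of itself.

```r
set.seed(42)
n <- 200

# Hypothetical data: two predictors, labels only weakly related to the first
x <- matrix(rnorm(n * 2), ncol = 2)
y <- as.integer(x[, 1] + rnorm(n, sd = 2) > 0)

# Pairwise Euclidean distances between all training points
d <- as.matrix(dist(x))

# Resubstitution: every point's nearest neighbour is itself (distance 0),
# so the training labels are reproduced perfectly
nn_train <- apply(d, 1, which.min)
acc_train <- mean(y[nn_train] == y)

# Leave-one-out: forbid a point from being its own neighbour
diag(d) <- Inf
nn_loo <- apply(d, 1, which.min)
acc_loo <- mean(y[nn_loo] == y)

c(train = acc_train, loo = acc_loo)
```

The training accuracy is perfect by construction and says nothing about how the classifier behaves on new cases; only the held-out figure is informative. The same reasoning applies to an AUC of 1 computed on the data a forest was grown on.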
Re: [R] Conditional looping over a set of variables in R
This won't be as quick as Bill's elegant solution, but it's a one-liner:

    apply(d, 1, function(x), match(1, x))

See ?match.

-Peter Ehlers

On 2010-10-22 10:36, David Herzberg wrote:

Bill, thanks so much for this. I'll get a chance to test it later today, and will post the outcome.

David S. Herzberg, Ph.D. Vice President, Research and Development Western Psychological Services 12031 Wilshire Blvd. Los Angeles, CA 90025-1251 Phone: (310)478-2061 x144 FAX: (310)478-7838 email: dav...@wpspublish.com

-----Original Message-----
From: William Dunlap [mailto:wdun...@tibco.com]
Sent: Friday, October 22, 2010 9:52 AM
To: David Herzberg; r-help@r-project.org
Subject: RE: [R] Conditional looping over a set of variables in R

You were a bit vague about the format of your data. I'm assuming all columns were numeric and the entries are one of 0, 1, and NA (missing value). I made a little function to generate random data of that format for testing purposes:

    makeData <- function (nrow = 1500, ncol = 140, pMissing = 0.1)
    {
        # pMissing is proportion of missing values
        m <- matrix(sample(c(1, 0), size = nrow * ncol, replace = TRUE), nrow, ncol)
        m[runif(nrow * ncol) < pMissing] <- NA
        data.frame(m)
    }

E.g.,

    set.seed(168)
    d <- makeData(15,3)
    d
       X1 X2 X3
    1   1  1  1
    2   0  0 NA
    3   0  1  0
    4   0  0 NA
    5   0  1  1
    6   0  0 NA
    7   1  0  0
    8   0  1  1
    9   0  0  1
    10  1  1 NA
    11  0  0  1
    12  0  0  0
    13 NA NA NA
    14  0  0  0
    15  1  0  0

I think the following function does what you want. The algorithm is pretty similar to what you showed.

    columnOfFirstOne <- function(data) {
        # col will be return value, one entry per row of data.
        # Fill it with NA's: NA in output will mean there were no 1's in row
        col <- rep(as.integer(NA), nrow(data))
        for (j in seq_len(ncol(data))) { # loop over columns
            # For each entry in 'col', if it has not been set yet
            # and this entry in the j'th column of data is 1 (and not missing)
            # then set to the column number.
            col[is.na(col) & !is.na(data[, j]) & data[, j] == 1] <- j
        }
        col # return this from function
    }

With the above data we get

    columnOfFirstOne(d)
     [1]  1 NA  2 NA  2 NA  1  2  3  1  3 NA NA NA  1

It seems quick enough for a dataset of your size

    dd <- makeData(nrow=1500, ncol=140)
    system.time(columnOfFirstOne(dd)) # time in seconds
       user  system elapsed
       0.08    0.00    0.08

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Herzberg
Sent: Friday, October 22, 2010 8:34 AM
To: r-help@r-project.org
Subject: [R] Conditional looping over a set of variables in R

Here's the problem I'm trying to solve in R: I have a data frame that consists of about 1500 cases (rows) of data from kids who took a test of listening comprehension. The columns are their scores (1 = correct, 0 = incorrect, . = missing) on 140 test items. The items are numbered sequentially and are ordered by increasing difficulty as you go from left to right across the columns. I want R to go through the data and find the first correct response for each case. Because of basal and ceiling rules, many cases have missing data on many items before the first correct response appears.

For each case, I want R to evaluate the item responses sequentially starting with item 1. If the score is 0 or missing, proceed to the next item and evaluate it. If the score is 1, stop the operation for that case, record the item number of that first correct response in a new variable, proceed to the next case, and restart the operation.

In SPSS, this operation would be carried out with LOOP, VECTOR, and DO IF, as follows (assuming the data set is already loaded):

* DECLARE A NEW VARIABLE TO HOLD THE ITEM NUMBER OF THE FIRST CORRECT RESPONSE, SET IT EQUAL TO 0.
numeric LCfirst1.
comp LCfirst1 = 0
* DECLARE A VECTOR TO HOLD THE 140 ITEM RESPONSE VARIABLES.
vector x=LC1a_score to LC140a_score.
* SET UP A LOOP THAT WILL RUN FROM 1 TO 140, AS LONG AS LCfirst1 = 0. #i IS AN INDEX VARIABLE THAT INCREASES BY 1 EACH TIME THE LOOP RUNS.
loop #i=1 to 140 if (LCfirst1 = 0).
* SET UP A CONDITIONAL TRANSFORMATION THAT IS EVALUATED FOR EACH ELEMENT OF THE VECTOR. THUS, WHEN #i = 1, THE EXPRESSION EVALUATES THE FIRST ELEMENT OF THE VECTOR (THAT IS, THE FIRST OF THE 140 ITEM RESPONSES). AS THE LOOP RUNS AND #i INCREASES, SUBSEQUENT VECTOR ELEMENTS ARE EVALUATED. THE do if STATEMENT RETAINS CONTROL AND KEEPS LOOPING THROUGH THE VECTOR UNTIL A '1' IS ENCOUNTERED.
+ do if x(#i) = 1.
* WHEN A '1' IS ENCOUNTERED, CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF THAT VECTOR ELEMENT TO '99'.
+ comp x(#i) = 99.
* AND THEN CONTROL PASSES TO THE NEXT STATEMENT, WHICH RECODES THE VALUE OF LCfirst1 TO THE CURRENT INDEX VALUE,
[R] best predictive model for mixed catagorical/continuous variables
Would anybody be able to advise on which package would offer the best approach for producing a model able to predict the probability of species occupation based upon a range of variables, some of them categorical (e.g. ten soil types, where the numbers assigned are not related to any qualitative/quantitative continuum, or vegetation type) and others continuous, such as field size or vegetation height?

I have tried using the tree package, but the models produced seem too simplistic and discard most variables, with the result that there is no predictive power. I would expect that there will be interactions between variables, e.g. if the vegetation is grassland then the vegetation height variable will mediate the interaction; if the vegetation is arable then crop type will be more significant.

Would it be possible to use GLM or GAM models for this type of predictive modelling? Any assistance would be greatly appreciated. It's several years since I last used R for this type of work and unfortunately I don't have the support network of a university to turn to for advice these days!
[R] covariance matrix
Hi all,

I generated a covariance matrix and visualized it as a 2D contour plot (x, y, covariance matrix). I would like to extract from the matrix the values (in x and y) that auto-correlate, which I will plot as a normal (x, y) plot (y being the values that auto-correlate to certain x and y values in my original matrix). Any suggestions?

Cheers,
Marcelo

--
Marcelo Andrade de Lima
UNIFESP - Universidade Federal de São Paulo
Departamento de Bioquímica, Disciplina de Biologia Molecular
Rua Três de Maio 100, 4 andar - Vila Clementino, 04044-020
Lab +55 11 55764438 R.1188 Cell +55 11 92725274
ml...@unifesp.br
Re: [R] Contour Plot on a non Rectangular Grid
Hi,

And thanks for helping. I am anyway a bit puzzled, since case (1) is not only a matter of interpolation. Probably the point I did not make clear (my fault) is that case (1) in my original email does not refer to an irregular grid on a rectangular domain; the set of (x,y) coordinates could stand for, e.g., a flat metal slab along which I have temperature measurements. The slab could be e.g. elliptical or any other funny shape. What also matters is that the final outcome should not look rectangular, but by eye one should be able to tell the shape of the slab. Case (1) is a generalization of case (2) where I do not have an analytical expression either for the surface or for the scalar.

Cheers

What about the facilities in package rgl then?

Uwe Ligges

Hello,

I feel I am drowning in a glass of water. Consider the snippet at the end of this email, where I generated a set of {x,y,s=f(x,y)} values, i.e. a set of 2D coordinates + a scalar on a circle. Now, I can get a scatterplot in 3D, but how to get a 2D surface plot/levelplot?

An idea could be to artificially set the z coordinate of the plot as a constant (instead of having it equal to s as in the scatterplot) and calculate the colormap with the values of s, along the lines of the volcano example + surface plot at http://bit.ly/9MRncd, but I am experiencing problems. However, should I really go through all this? There is nothing truly 3D in the plot that I have in mind; you can think of it as e.g. some temperature measurement along a tube cross section.

Any help is appreciated.
Cheers
Lorenzo

    library(scatterplot3d)
    library(rgl)

    R <- pi/2
    n <- 100
    x <- y <- seq(-R, R, length=n)
    xys <- c()
    temp <- seq(3)
    for (i in seq(n)){
      for (j in seq(n))
        # check I am inside the circle
        if ((sqrt(x[i]^2+y[j]^2)) <= R){
          temp[1] <- x[i]
          temp[2] <- y[j]
          temp[3] <- abs(cos(y[j]))
          xys <- rbind(xys,temp)
        }
    }

    scatterplot3d(xys[,1], xys[,2], xys[,3], highlight.3d=TRUE,
                  col.axis="blue", col.grid="lightblue",
                  main="scatterplot3d - 2", pch=20)
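For the analytical case (2), the "plot on a rectangular grid, then clip" idea mentioned earlier in the thread can be sketched with base graphics alone. This is only one possible approach, not the perimeter/bplot or rgl routes suggested by the respondents; the colour palette and grid size are arbitrary choices. Scattered measurements as in case (1) would first have to be interpolated onto a grid, which this sketch does not cover.

```r
R <- pi / 2
n <- 100
x <- y <- seq(-R, R, length.out = n)

# Evaluate z = |cos(y)| on the full rectangular grid
# (rows of z index x, columns index y, as image() expects)
z <- outer(x, y, function(x, y) abs(cos(y)))

# Clip: everything outside the circle of radius R becomes NA
inside <- outer(x, y, function(x, y) sqrt(x^2 + y^2) <= R)
z[!inside] <- NA

# NA cells are left blank, so the circular outline is visible by eye
image(x, y, z, asp = 1, col = heat.colors(64), xlab = "x", ylab = "y")
contour(x, y, z, add = TRUE)
```

Setting `asp = 1` keeps the circle from being stretched into an ellipse. The same masking step should work for any domain that can be described by a logical test on (x, y).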
[R] cvs fpr R
Hello everyone.

These days I am writing some code for a small project. I have started having problems with different versions of the files I keep (in case I need to go back to older files). I need some easy cvs platform (I do not know if cvs is the general name or a specific program) that is easy to use. I do not need anything special or specific. Could you please suggest one that is easy to use for newbies?

I would like to thank you in advance for your help.

P.S. I use R (CRAN)
Re: [R] preferred x-delimited data format for R?
On Wed, Oct 20, 2010 at 5:30 PM, Nutter, Benjamin nutt...@ccf.org wrote:

I run into that problem frequently. I can usually circumvent it by using the quote = "\"" argument. The default is quote = "\"'", which uses the double and single quote as quoting symbols. If you change it to "\"" it will read the single quotes like regular text.

This solves my problem. It seems that the apostrophe issue affects comma- (CSV) as well as tab-delimited data files, so it would probably be wise to start using quote = "\"" whenever I suspect the presence of apostrophes.

Regards
Liviu
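A small, self-contained illustration of the advice above. The input text and column names are invented for the example; only `read.table()` and its `quote` argument come from the thread.

```r
# Hypothetical tab-delimited input containing apostrophes
txt <- "name\tscore\nO'Brien\t1\nD'Arcy\t2\nSmith\t3"

# Treat only the double quote as a quoting character, so that
# apostrophes are read as ordinary text
dat <- read.table(text = txt, sep = "\t", header = TRUE,
                  quote = "\"", stringsAsFactors = FALSE)

dat$name
nrow(dat)
```

Note that `read.csv()` and `read.delim()` already use `quote = "\""` as their default, so the issue mostly arises when calling `read.table()` directly with its own default.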
[R] Subsetting a dataframe
Hi,

I have a dataframe with 43 columns and 1000 rows. Each entry in the dataframe can be either P or A. Here is a small chunk:

           c1  c2  ...  c43
    r100   P   A   ...  P
    r101   A   A   ...  A
    r102   P   P   ...  P

How does one subset this data frame to select those rows that have only P's in them?

Thanks in advance.
Anjan

--
anjan purkayastha, phd.
research associate
fas center for systems biology, harvard university
52 oxford street cambridge ma 02138
phone-703.740.6939
Re: [R] Subsetting a dataframe
Hi Anjan,

Please consider the following example:

    x <- c(2, rep(1, 10))
    all(x == 1)
    [1] FALSE

    d <- replicate(10, sample(x, replace = TRUE))
    d
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
     [1,]    1    1    1    1    1    1    2    1    1     1
     [2,]    1    1    1    2    1    2    1    2    1     1
     [3,]    1    1    1    1    1    1    1    1    1     1
     [4,]    1    1    1    1    1    1    1    1    2     1
     [5,]    1    1    1    1    2    1    1    1    1     1
     [6,]    2    1    1    1    1    1    1    1    1     1
     [7,]    1    1    1    1    1    1    1    1    2     1
     [8,]    1    1    2    1    1    1    1    1    1     1
     [9,]    1    2    2    1    2    1    1    1    2     1
    [10,]    1    1    1    1    1    1    1    1    1     1
    [11,]    1    1    1    1    1    1    1    1    1     1

    d[apply(d, 1, function(v) all(v==1)), ]
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    [1,]    1    1    1    1    1    1    1    1    1     1
    [2,]    1    1    1    1    1    1    1    1    1     1
    [3,]    1    1    1    1    1    1    1    1    1     1

HTH,
Jorge
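The same idea applied directly to P/A data might look like the sketch below. The data frame is a made-up five-row stand-in for the real 1000 x 43 one, and `rowSums()` is used here as an alternative to the `apply()`/`all()` approach shown above.

```r
# Hypothetical small version of the data: entries are "P" or "A"
df <- data.frame(c1 = c("P", "A", "P", "P", "A"),
                 c2 = c("A", "A", "P", "P", "P"),
                 c3 = c("P", "A", "P", "P", "P"),
                 c4 = c("P", "A", "P", "A", "P"),
                 stringsAsFactors = FALSE)
rownames(df) <- paste0("r", 100:104)

# TRUE for rows in which every entry equals "P"
all_p <- rowSums(df == "P") == ncol(df)

df[all_p, ]
```

If the data can contain missing values, `rowSums()` returns NA for those rows, so they would need to be handled explicitly before subsetting.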
Re: [R] cvs fpr R
Hi,

On Sun, Oct 24, 2010 at 1:15 PM, Alaios ala...@yahoo.com wrote:

Hello everyone. These days I am writing some code for a small project. I have started having problems with different versions of the files I keep (in case I need to go back to older files). I need some easy cvs platform (I do not know if cvs is the general name or a specific program) that is easy to use. I do not need anything special or specific. Could you please suggest one that is easy to use for newbies? I would like to thank you in advance for your help. P.S. I use R (CRAN)

You're looking for some revision control system:

http://en.wikipedia.org/wiki/Revision_control

It's not specific to R, CRAN, or anything else. I'd recommend using git or mercurial:

http://git-scm.com/
http://mercurial.selenic.com/

simply because you don't have to set up any server component to get it to work (with subversion or CVS, you do need to set up a server component), and all of the revision history is just kept in the same directory as your project. You can, of course, push your changes out to another cpu/server if you please for extra backup.

CVS and subversion are other options; if you find them easier to use, then feel free to use those. There are plenty of tutorials online for each to help you get started.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
Re: [R] Subsetting a dataframe
Thanks all for your help.

Anjan
Re: [R] Conditional looping over a set of variables in R
Whoops, got an extra comma in there somehow; should be:

    apply(d, 1, function(x) match(1, x))

-Peter Ehlers
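To see that the corrected one-liner and the column-wise loop agree, here is a short check on a tiny made-up data set. The data frame and the names `first_by_row` and `first_by_col` are invented for the illustration; the loop follows the approach of `columnOfFirstOne` quoted earlier in the thread.

```r
# Hypothetical scores: 1 = correct, 0 = incorrect, NA = missing
d <- data.frame(X1 = c(1, 0, 0, NA, 0),
                X2 = c(1, 0, 1, NA, NA),
                X3 = c(1, NA, 0, NA, 1))

# Row-wise: position of the first 1 in each row, NA when there is none
first_by_row <- unname(apply(d, 1, function(x) match(1, x)))

# Column-wise: fill in the column number for rows not yet set
first_by_col <- rep(NA_integer_, nrow(d))
for (j in seq_len(ncol(d))) {
  hit <- is.na(first_by_col) & !is.na(d[[j]]) & d[[j]] == 1
  first_by_col[hit] <- j
}

first_by_row
first_by_col
```

`match(1, x)` skips NA entries on its own, which is why the row-wise version needs no explicit missing-value handling. The column-wise loop does fewer R-level iterations (one per item rather than one per case), which is presumably why it is the quicker of the two on 1500 x 140 data.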
[R] more errors (behavior)
quick programming question. I am not making enough errors in my programs, so I want to trigger a few more. ;-)

[1] undefined variable behavior:

    d=data.frame( x=rnorm(1:10), y=rnorm(1:10))
    z
    Error: object 'z' not found
    d$z
    NULL

is this consistent? I thought that z is the same as .GlobalEnv$z, but apparently it is not. something here is smart enough to trigger an error. I like this error behavior. is it possible to set an R global option that triggers the same 'not found' error when an undefined element of a list or data frame is accessed? [just trying to check all my function arguments, and right now, I think I need to include for each argument 'stopifnot(is.null(argument))'. This clutters the code.]

[2] is it possible to turn off recycling for vector operations? (I may have asked this at some point already, but I can't find the answer.)

    a=c(2,3)
    b=c(4,5,6,7)
    a+b
    [1]  6  8  8 10

when I really want recycling, I would rather do it explicitly with rep.

regards,

/iaw
Ivo Welch (ivo.we...@brown.edu, ivo.we...@gmail.com)
Re: [R] more errors (behavior)
On 24-Oct-10 19:55:12, ivo welch wrote:

quick programming question. I am not making enough errors in my programs, so I want to trigger a few more. ;-)

[1] undefined variable behavior:

    d=data.frame( x=rnorm(1:10), y=rnorm(1:10))
    z
    Error: object 'z' not found
    d$z
    NULL

is this consistent? I thought that z is the same as .GlobalEnv$z, but apparently it is not. something here is smart enough to trigger an error. I like this error behavior. is it possible to set an R global option that triggers the same 'not found' error when an undefined element of a list or data frame is accessed? [just trying to check all my function arguments, and right now, I think I need to include for each argument 'stopifnot(is.null(argument))'. This clutters the code.]

I'm not expert enough to answer your query properly, but I see it as an example of the somewhat bewildering variety of ways in which indexing can be represented in R. With your definition:

.GlobalEnv$d returns exactly the same as if you had entered simply 'd'.

.GlobalEnv$z returns NULL, while simply 'z' returns

    Error: object 'z' not found

as you observed.

d$y returns a vector consisting of the values of y (printed horizontally), as also does d[[2]], while d[2] returns a column of the values of y.

    str(d[2])
    # 'data.frame': 10 obs. of 1 variable:
    #  $ y: num 0.331 0.57 -0.266 -0.694 -0.992 ...

d[[2]] returns a vector (horizontal) exactly like d$y.

Now for d$z etc:

    d$z
    # NULL
    d[3]
    # Error in `[.data.frame`(d, 3) : undefined columns selected
    d[[3]]
    # Error in .subset2(x, i, exact = exact) : subscript out of bounds

[2] is it possible to turn off recycling for vector operations? (I may have asked this at some point already, but I can't find the answer.)

    a=c(2,3)
    b=c(4,5,6,7)
    a+b
    [1]  6  8  8 10

when I really want recycling, I would rather do it explicitly with rep.

regards, /iaw

But, in such a case, what would you intend a+b to mean?

Ted.

E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 24-Oct-10 Time: 21:25:29
-- XFMail --
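Neither question gets a direct answer in the thread, so the following is only a sketch of one workaround: small wrapper functions that fail loudly. The names `strict_get` and `add_strict` are hypothetical helpers written for this illustration, not functions from base R or any package, and the sketch does not rely on any global option.

```r
# Hypothetical helper: like x[[name]], but an error instead of NULL
# when the element does not exist
strict_get <- function(x, name) {
  if (!name %in% names(x)) {
    stop("element '", name, "' not found", call. = FALSE)
  }
  x[[name]]
}

# Hypothetical helper: addition that refuses to recycle
add_strict <- function(a, b) {
  stopifnot(length(a) == length(b))
  a + b
}

d <- data.frame(x = 1:3, y = 4:6)
strict_get(d, "y")

# Both of these raise an error; tryCatch() just captures it for display
res_missing <- tryCatch(strict_get(d, "z"),
                        error = function(e) conditionMessage(e))
res_recycle <- tryCatch(add_strict(c(2, 3), c(4, 5, 6, 7)),
                        error = function(e) "length mismatch")

# Explicit recycling with rep() passes the length check
add_strict(c(2, 3, 2, 3), c(4, 5, 6, 7))
add_strict(rep(c(2, 3), 2), c(4, 5, 6, 7))
```

The design choice here is to make strictness opt-in at the call site rather than changing the behaviour of `$` or `+` globally, which keeps other code that depends on the default semantics working.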
[R] 140 packages in R Commander!!
Dear List I just downloaded and installed R 2.12.0 and then installed R Commander . First it got RCmdr and Car, and then suggested for other packages for utilizing the full functionality- I clicked yes! I got 140 packages installed!!! Cran Mirror was UCLA... Here is the list. Is this intentional- I can see some packages like snow and multicore which are desirable but quite optional.(see list below) Regards Ajay 'slam' 'fBasics' 'bitops' 'Rglpk' 'snowFT' 'rlecuyer' 'rsprng' 'nws' 'tweedie' 'gtools' 'gdata' 'caTools' 'Ecdat' 'ergm' 'latentnet' 'degreenet' 'shapes' 'snow' 'RColorBrewer' 'statmod' 'cubature' 'kinship' 'gam' 'tripack' 'akima' 'logspline' 'gplots' 'maxLik' 'miscTools' 'sem' 'rgdal' 'network' 'numDeriv' 'statnet' 'rgenoud' 'hexbin' 'ellipse' 'gclus' 'mlbench' 'randomForest' 'SparseM' 'Formula' 'ineq' 'mlogit' 'np' 'plm' 'pscl' 'quantreg' 'ROCR' 'sampleSelection' 'scatterplot3d' 'systemfit' 'truncreg' 'urca' 'oz' 'fUtilities' 'fEcofin' 'RUnit' 'quadprog' 'iterators' 'locfit' 'maps' 'rcom' 'rscproxy' 'sp' 'VGAM' 'MCMCpack' 'sna' 'gee' 'anchors' 'survey' 'ape' 'flexmix' 'rmeta' 'mlmRev' 'MEMSS' 'coda' 'party' 'ipred' 'modeltools' 'e1071' 'AER' 'bdsmatrix' 'DAAG' 'fCalendar' 'fSeries' 'fts' 'its' 'timeDate' 'timeSeries' 'tis' 'tseries' 'xts' 'foreach' 'TSA' 'RSQLite' 'tkrplot' 'sgeostat' 'mapproj' 'tcltk2' 'R2wd' 'png' 'tree' 'VIM' 'mitools' 'Zelig' 'HSAUR' 'mvtnorm' 'lme4' 'robustbase' 'mboost' 'coin' 'xtable' 'sandwich' 'coxme' 'zoo' 'strucchange' 'dynlm' 'biglm' 'chron' 'acepack' 'TeachingDemos' 'Design' 'mice' 'subselect' 'kernlab' 'vcd' 'rgl' 'relimp' 'multcomp' 'lmtest' 'leaps' 'Hmisc' 'effects' 'colorspace' 'aplpack' 'abind' 'RODBC' car Rcmdr Websites- http://decisionstats.com http://dudeofdata.com Linkedin- www.linkedin.com/in/ajayohri On Sun, Oct 24, 2010 at 9:27 PM, Marcelo Lima mlim...@gmail.com wrote: Hi all, I generated a covariance matrix and visualized as a 2D contour plot (x,y, covariance matrix), I would like to extract from the matrix the 
values (in x and y) that auto-correlate, which I will plot as a normal (x, y) plot (y being the values that auto-correlate with certain x and y values in my original matrix). Any suggestions? Cheers, Marcelo -- Marcelo Andrade de Lima UNIFESP - Universidade Federal de São Paulo Departamento de Bioquímica Disciplina de Biologia Molecular Rua Três de Maio 100, 4 andar - Vila Clementino, 04044-020 Lab +55 11 55764438 R.1188 Cell +55 11 92725274 ml...@unifesp.br
[R] How to simulate from an estimated density
Hi, dear fellows, I was wondering how I can simulate from an estimated density function. I used my training data set and already have estimated density values at some fixed points. I plan to simulate some data from this estimated density and compare them to my validation data set. Any help is really appreciated. Thanks. Jay
[R] Importing CSV File
I'm trying to import a CSV file into R and when it gets imported, the entries get numbered down the left side. How do I get rid of that?

Thanks, Jason

  read.csv(file="C:\\Program Files\\R\\Test Data\\sales.csv",head=TRUE)

         Month Sales
  1    January   422
  2   February   151
  3      March   451
  4      April   175
  5        May   131
  6       June   307
  7       July    47
  8     August    12
  9  September   488
  10   October   122
  11  November    54
  12  December   244
Re: [R] Importing CSV File
On 10/24/2010 04:57 PM, Jason Kwok wrote:
> I'm trying to import a CSV file into R and when it gets imported, the entries get numbered down the left side. How do I get rid of that?

When you imported the CSV file into R, an object of class data.frame was created, and since you did not assign it to a variable name (e.g., df1 <- read.csv(...)), the object got printed.

A data.frame object is going to have a row.names attribute by definition, which is what you're seeing. In ?data.frame, we see documentation for the row.names argument:

  If 'row.names' was supplied as 'NULL' or no suitable component was found the row names are the integer sequence starting at one (and such row names are considered to be 'automatic', and not preserved by 'as.matrix').

The method that prints out a data.frame is called print.data.frame, and it does have an argument to suppress printing of the row.names.

The question is, why do you not want row.names? Are they just distracting you when printed, or is there some reason not to carry them along in the object?

--Erik
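The print.data.frame argument mentioned in this reply is `row.names`. A minimal sketch, using a small made-up stand-in for the sales data rather than the poster's file:

```r
sales <- data.frame(Month = c("January", "February", "March"),
                    Sales = c(422, 151, 451))

print(sales)                     # row names 1, 2, 3 appear on the left
print(sales, row.names = FALSE)  # same data, row names suppressed in the printout

# The row names are still part of the object; only the display changed.
rownames(sales)
```

This only affects what is printed, which fits the case where the numbers are merely distracting.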
Re: [R] Importing CSV File
Thanks for the response Erik. In this case, I would like to keep the row name as the month. How would I do that?

Thanks, Jason

On Sun, Oct 24, 2010 at 6:20 PM, Erik Iverson er...@ccbr.umn.edu wrote:
> The question is, why do you not want row.names? Are they just distracting you when printed, or is there some reason not to carry them along in the object?
Re: [R] How to simulate from an estimated density
You usually simulate from a distribution by:

1. Building the cumulative distribution (use cumsum).
2. Simulating random numbers from the uniform distribution (use runif(number_of_simulations_needed)).
3. Finding, for each simulated number, the closest value in the cumulative distribution.
4. Taking the point at which that cumulative value occurs as your simulated value.

As to comparing them, look up a Q-Q plot.

Hope this helps (or is what you asked for),
Sachin
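The four steps in this reply can be sketched as follows. The training and validation data here are simulated stand-ins, and `density()` is used only to produce density values on a fixed, equally spaced grid; any non-negative estimates on such a grid should work the same way.

```r
set.seed(1)
train <- rnorm(500, mean = 10, sd = 2)   # stand-in for the training data

# Estimated density values at fixed grid points.
est  <- density(train, n = 512)
grid <- est$x
dens <- pmax(est$y, 0)                   # make sure no value is negative

# 1. Cumulative distribution on the grid, normalised to end at 1.
cdf <- cumsum(dens) / sum(dens)

# 2. Uniform random numbers.
n_sim <- 1000
u <- runif(n_sim)

# 3./4. For each u, take the first grid point whose CDF value exceeds u.
idx <- findInterval(u, cdf) + 1
idx <- pmin(idx, length(grid))           # guard against rounding at the upper end
sim <- grid[idx]

summary(sim)
# Compare with validation data through a Q-Q plot, e.g. qqplot(valid, sim)
```

The simulated values are restricted to the grid points, so a finer grid gives a smoother sample.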
Re: [R] Importing CSV File
  sales <- read.csv(file = "C:/Program Files/R/Test Data/sales.csv", header = TRUE, row.names = "Month")

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jason Kwok
Sent: Monday, 25 October 2010 8:27 AM
To: Erik Iverson
Cc: r-help@r-project.org
Subject: Re: [R] Importing CSV File

Thanks for the response Erik. In this case, I would like to keep the row name as the month. How would I do that?

Thanks, Jason
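To try the `row.names` argument without the original file, the CSV content can be supplied inline. This is a sketch with a shortened, made-up copy of the data; `csv_text` is a name introduced here.

```r
# Stand-in for sales.csv, written inline so the example is self-contained.
csv_text <- "Month,Sales
January,422
February,151
March,451"

sales <- read.csv(text = csv_text, header = TRUE, row.names = "Month")
sales
sales["February", "Sales"]   # rows can now be addressed by month
```

Note that `Month` is no longer a column of the data frame afterwards; it lives in the row names.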
Re: [R] Importing CSV File
On Mon, Oct 25, 2010 at 12:26 AM, Jason Kwok jayk...@gmail.com wrote:
> Thanks for the response Erik. In this case, I would like to keep the row name as the month. How would I do that?

You can do this in Rcmdr. First Data > Import > From text file (or select your data.frame as the active data set), then Data > Active data set > Set case names. Rcmdr will display the R code used to perform the two operations.

Regards
Liviu
Re: [R] Conditional looping over a set of variables in R
On Sun, Oct 24, 2010 at 2:54 PM, Peter Ehlers ehl...@ucalgary.ca wrote:
> Whoops, got an extra comma in there somehow; should be:
>
>   apply(d, 1, function(x) match(1, x))

A slight variation on this would be:

  apply(d, 1, match, x = 1)

--
Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
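Both forms can be checked on a toy data set. The matrix below is made up for illustration (1 = correct, 0 = incorrect, NA = missing), and `first1` and `first2` are names introduced here.

```r
# Toy data: rows are cases, columns are items ordered by difficulty.
d <- matrix(c(0, 1, 1,
              1, 0, 1,
              NA, 0, 1,
              0, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(NULL, c("item1", "item2", "item3")))

first1 <- apply(d, 1, function(x) match(1, x))
first2 <- apply(d, 1, match, x = 1)

first1   # 2 1 3 NA
```

`match()` returns the position of the first 1 in each row and NA when a case has no correct response, so no explicit loop over columns is needed.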
Re: [R] 140 packages in R Commander!!
Dear Ajay,

This is a consequence of installing the dependencies (including suggested packages, etc.) of the Rcmdr package, their dependencies, and so on recursively. The alternative would be for the Rcmdr package to specify its direct dependencies via depends rather than suggests, but then these dependencies would be loaded whenever the Rcmdr is loaded. If you have a better idea, I'm certainly open to it, since many, probably most, of the packages that get installed aren't really needed by the Rcmdr or by the packages on which it directly depends.

The whole business takes about 10 minutes on my not-all-that-fast Internet connection and occupies about 250 MB (considerably less than 10 US cents at today's hard-disk prices), which doesn't seem terrible to me.

Best, John

John Fox Senator William McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada web: socserv.mcmaster.ca/jfox

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ajay Ohri
Sent: October-24-10 12:47 PM
To: R-help@r-project.org
Subject: [R] 140 packages in R Commander!!
Re: [R] Contour Plot on a non Rectangular Grid
On Oct 24, 2010, at 9:30 AM, Lorenzo Isella wrote:
>>> Hi, And thanks for helping. I am anyway a bit puzzled, since case (1) is not only a matter of interpolation. Probably the point I did not make clear (my fault) is that case (1) in my original email does not refer to an irregular grid on a rectangular domain; the set of (x,y) coordinate could stand e.g. a flat metal slab along which I have temperature measurements. The slab could be e.g. elliptical or any other funny shape. What also matters is that the final outcome should not look rectangular, but by eye one should be able to tell the shape of the slab. Case (1) is a generalization of case (2) where I do not have either an analytical expression for the surface not for the scalar. Cheers
>>
>> What about the facilities in package rgl then? Uwe Ligges
>
> Hello, I feel I am drowning in a glass of water.

Not sure what we are supposed to make of this.

> Consider the following snippet at the end of the email, where I generated a set of {x,y,s=f(x,y)} values, i.e. a set of 2D coordinates + a scalar on a circle. Now, I can get a scatterplot in 3D, but how to get a 2D surface plot/levelplot?

You were advised to look at rms. Why have you dismissed this suggestion? Using your data setup below and packaging into a dataframe.

  require(rms)
  ddf <- datadist(xysf <- as.data.frame(xys))
  olsfit <- ols(V3 ~ rcs(V1,3) + rcs(V2,3), data=xysf)
  bounds <- perimeter(xysf$V1, xysf$V2)
  plot(xysf$V1, xysf$V2)  # demonstrates the extent of the data
  bplot(Predict(olsfit, V1,V2), perim=bounds)  # a levelplot is the default
  bplot(Predict(olsfit, V1,V2), perim=bounds, lfun=contourplot)
  bplot(Predict(olsfit, V1,V2), perim=bounds, lfun=contourplot, xlim=c(-2.5,2.5))  # to demonstrate that perimeter works
  # and as expected this shows very little variability d/t V1
  olsfit
  # note that
  anova(olsfit)

  Analysis of Variance          Response: V3

  Factor            d.f.  Partial SS    MS            F           P
  V1                   2    0.01618738  8.093691e-03       19.47  <.0001
   Nonlinear           1    0.01618738  1.618738e-02       38.93  <.0001
  V2                   2  470.67057254  2.353353e+02   566040.95  <.0001
   Nonlinear           1  470.67057254  4.706706e+02  1132081.91  <.0001
  TOTAL NONLINEAR      2  527.78127558  2.638906e+02   634723.80  <.0001
  REGRESSION           4  527.78127558  1.319453e+02   317361.90  <.0001
  ERROR             7663    3.18594315  4.157566e-04

  # most of the regression SS is in the V2 variable
  # Q.E.D.

--
David

> An idea could be to artificially set the z coordinate of the plot as a constant (instead of having it equal to s as in the scatterplot) and calculate the colormap with the values of s, along the lines of the volcano example + surface plot at http://bit.ly/9MRncd but I am experiencing problems. However, should I really go through all this? There is nothing truly 3D in the plot that I have in mind, you can think of it as e.g. some temperature measurement along a tube cross section. Any help is appreciated. Cheers Lorenzo
>
>   library(scatterplot3d)
>   library(rgl)
>   R <- pi/2
>   n <- 100
>   x <- y <- seq(-R,R, length=n)
>   xys <- c()
>   temp <- seq(3)
>   for (i in seq(n)){
>     for (j in seq(n))
>       #check I am inside the circle
>       if ((sqrt(x[i]^2+y[j]^2))<=R){
>         temp[1] <- x[i]
>         temp[2] <- y[j]
>         temp[3] <- abs(cos(y[j]))
>         xys <- rbind(xys,temp)
>       }
>   }
>   scatterplot3d(xys[,1], xys[,2], xys[,3], highlight.3d=TRUE, col.axis="blue", col.grid="lightblue", main="scatterplot3d - 2", pch=20)
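For the special case in the quoted snippet, where the points lie on a regular grid clipped to a circle, a base-graphics alternative to the rms approach is to keep the full grid and set the scalar to NA outside the domain. This is only a sketch under that assumption; it does not cover scattered measurements, which would need interpolation first. The names `s` and `inside` are introduced here.

```r
R <- pi / 2
n <- 100
x <- y <- seq(-R, R, length.out = n)

# Scalar on the full rectangular grid; rows follow x, columns follow y.
s <- outer(x, y, function(xx, yy) abs(cos(yy)))

# Blank out every grid point that falls outside the circle of radius R.
inside <- outer(x, y, function(xx, yy) sqrt(xx^2 + yy^2) <= R)
s[!inside] <- NA

# 2D views: cells with NA are left empty, so the disc shape stays visible.
image(x, y, s, asp = 1, xlab = "x", ylab = "y")
contour(x, y, s, add = TRUE)
```

Because `image()` and `contour()` skip NA cells, the outline of the plot follows the shape of the domain rather than the bounding rectangle.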
Re: [R] Help: Maximum likelihood estimation
Can you provide a reproducible code?

Ravi.

Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu

- Original Message -
From: roach roachy...@gmail.com
Date: Saturday, October 23, 2010 4:41 am
Subject: Re: [R] Help: Maximum likelihood estimation
To: r-help@r-project.org

> I'm not quite familiar with E-M algorithm, but I think what I did was the first step of the iteration. The method used in the original article is as follow: I gave lamda an initial value, and maximized the likelihood function. This is the complete chunk of my code after using alabama package. The first iteration had no problem, but after a few iterations, I again got warnings and the result is not good. Is it possible that it's because of some computational problems? because there are too many log and exp in the functions? Or is there anything I missed?
>
>   library(alabama)
>   # n=number of observation
>   w=seq(0.05,0.9,length.out=n)
>   # iteration
>   repeat{
>     lamda=mean(w)
>     ## -log likelihood function
>     log.L=function(parm){
>       alpha0=parm[1]
>       alpha1=parm[2]
>       alpha2=parm[3]
>       beta0=parm[4]
>       beta1=parm[5]
>       beta2=parm[6]
>       beta3=parm[7]
>       # here sigma is actually sigma inverse
>       sigma11=parm[8]
>       sigma12=parm[9]
>       sigma21=parm[10]
>       sigma22=parm[11]
>       u1=-alpha0-alpha1*logp-alpha2*lakes+logq
>       u21=-beta0-beta1*logq-beta2*s-beta3+logp
>       u22=-beta0-beta1*logq-beta2*s+logp
>       expon1=u1^2*sigma11+u1*u21*sigma12+u1*u21*sigma21+u21^2*sigma22
>       expon2=u1^2*sigma11+u1*u22*sigma12+u1*u22*sigma21+u22^2*sigma22
>       const=-log(2*pi)+.5*log(sigma11*sigma22-sigma12*sigma21)+log(abs(1-alpha1*beta1))
>       logf=const+log(lamda*exp(-0.5*expon1)+(1-lamda)*exp(-0.5*expon2))
>       log.L=-sum(logf)
>       return(log.L)
>     }
>     ## estimate with nonlinear constraint
>     hin=function(parm){
>       h=rep(NA,1)
>       h[1]=parm[8]*parm[11]-parm[9]*parm[10]
>       h[2]=parm[8]
>       h[3]=parm[11]
>       h
>     }
>     heq=function(parm){
>       h=rep(NA,1)
>       h[1]=parm[9]-parm[10]
>       h
>     }
>     max.like=constrOptim.nl(par=c(-0.5,-0.5,-0.5,-0.5,0.02,-0.02,0.02,1.9,-1.1,-1.1,1.9),fn=log.L, hin=hin,heq=heq)
>     max.like$par
>     ##
>     parm=max.like$par
>     alpha0=parm[1]
>     alpha1=parm[2]
>     alpha2=parm[3]
>     beta0=parm[4]
>     beta1=parm[5]
>     beta2=parm[6]
>     beta3=parm[7]
>     sigma11=parm[8]
>     sigma12=parm[9]
>     sigma21=parm[10]
>     sigma22=parm[11]
>     u1=-alpha0-alpha1*logp-alpha2*lakes+logq
>     u21=-beta0-beta1*logq-beta2*s-beta3+logp
>     u22=-beta0-beta1*logq-beta2*s+logp
>     expon1=u1^2*sigma11+u1*u21*sigma12+u1*u21*sigma21+u21^2*sigma22
>     expon2=u1^2*sigma11+u1*u22*sigma12+u1*u22*sigma21+u22^2*sigma22
>     h1_log=(-log(2*pi)+0.5*log(sigma11*sigma22-sigma12*sigma21))+(log(abs(1-alpha1*beta1))-0.5*expon1)
>     h2_log=(-log(2*pi)+0.5*log(sigma11*sigma22-sigma12*sigma21))+(log(abs(1-alpha1*beta1))-0.5*expon2)
>     w1=w
>     w=1/(1+(1-lamda)/lamda*exp(h2_log-h1_log))
>     if(cor(w,w1)>0.999) break
>   }
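On the question of whether the many log and exp calls could cause trouble: one place where they can is the term log(lamda*exp(-0.5*expon1)+(1-lamda)*exp(-0.5*expon2)), because both exponentials underflow to zero when the exponents are large. Whether this is what produces the warnings in the quoted code cannot be told without the data, so the following is only a sketch of the usual log-sum-exp rearrangement. `log_mix()` is a hypothetical helper name, and `e1`, `e2` stand in for `expon1` and `expon2`.

```r
# Stable evaluation of log(lamda * exp(-0.5 * e1) + (1 - lamda) * exp(-0.5 * e2)).
log_mix <- function(lamda, e1, e2) {
  a <- log(lamda) - 0.5 * e1
  b <- log(1 - lamda) - 0.5 * e2
  m <- pmax(a, b)                       # factor out the larger term
  m + log(exp(a - m) + exp(b - m))      # the remaining exponents are <= 0
}

e1 <- c(1, 50, 2000)
e2 <- c(2, 60, 2500)

naive  <- log(0.3 * exp(-0.5 * e1) + 0.7 * exp(-0.5 * e2))
stable <- log_mix(0.3, e1, e2)

naive    # last element is -Inf because both exp() terms underflow to 0
stable   # finite for all three elements
```

Where the naive form is finite the two versions agree, so the rearranged form could replace that line of the likelihood without changing its value.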
[R] Unable to allocate arrays of size 2GB in 64 bit Windows 7 R
I seem unable to allocate arrays of size around 2GB in 64-bit R on Windows 7. There is a lot of main memory available. The memory.limit is set to the max memory available, and there is more than 10GB of that available when R returns an 'unable to allocate memory' error. Is this a limitation of R even in 64-bit Windows 7? Or is there a way to get around this? Thanks.
[R] re-vertical conversion of data entries
Dear R user,

Can you please help me. How do I convert part of a cluster analysis output under the heading "Clustering vector" as shown below, showing the clusters to which each respondent belongs:

     [1] 1 1 2 2 1 2 1 2 1 1 2 2 1 2 2 2 2 1 1 1 1 2 2 1 2 2 1 2 2 2 2 2 2 2 2 1 2
    [38] 2 1 1 2 2 2 2 2 1 2 1 2 2 2 2 1 2 1 2 2 1 2 2 2 2 2 2 1 2 1 2 2 2 1 1 2 2
    [75] 2 1 2 2 2 2 2 2 2 1 1 2 1 2 2 2 2 2 1 1 1 1 1 2 2 2 2 2 2 2 1 2 2 2 1 2 2
  . . .
  [8696] 2 1 1 2 1 1 1 1 2 2 1 1 1 1 1 1 1 1 2 1 1 2 1 1 1 1 2 2 2 2 2 2 2 1 2 2 2
  [8733] 2 2 1 1 2 1 2 2 1 2 2 1 1 2 1 2 2 1 2 2 2 2 1 2 2 2 1 2 1 2 2 2 1 2 1 1

to a single vertical column? Thanks.

I used the following code to arrive at the above output:

  pam(dm,2,diss=TRUE, medoids=NULL, cluster.only=FALSE,do.swap=TRUE, keep.data=FALSE, trace.lev=0)

Penny
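One possible way to get such a vector into a single column is to put it in a data frame, which prints one value per row. The sketch below uses a short made-up vector; with a fitted pam object (called `fit` here purely for illustration) the vector would be `fit$clustering`.

```r
# Stand-in for the clustering vector; with a pam() fit it would be fit$clustering.
clustering <- c(1, 1, 2, 2, 1, 2, 1, 2, 1, 1)

# One row per respondent, one column holding the cluster number.
clusters <- data.frame(respondent = seq_along(clustering),
                       cluster = clustering)
clusters

# Written to a file, the same table is one value per line, for example:
# write.table(clusters, "clusters.txt", row.names = FALSE, quote = FALSE)
```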
[R] Question on passing the subset argument to an lm wrapper
Hello,

How would you go about handling the following situation? This is on R 2.12.0 on Ubuntu 32-bit. I have a wrapper function to lm. I want to pass in a subset argument. First, I just thought I'd use "...".

  ## make example reproducible
  set.seed(123)
  df1 <- data.frame(age = rnorm(100, 50, 10), bmi = rnorm(100, 30, sd = 2))

  ## create a wrapper using ...
  testlm <- function(formula, ...) {
    lm(formula, data = df1, ...)
  }

  testlm(bmi ~ age, subset = age > 50)
  Error in eval(expr, envir, enclos) : ..1 used in an incorrect context, no ... to look in

I found some other examples of this error message, but couldn't piece together how it fits in with this example. Next, I tried specifying a subset argument.

  testlm2 <- function(formula, subset) {
    lm(formula, data = df1, subset = subset)
  }

  testlm2(bmi ~ age, subset = age > 50)
  Error in xj[i] : invalid subscript type 'closure'

I also don't understand this one. Any pointers on if I'm just missing the easy solution to do what I want? Any explanations as to the above behavior (I know it has to do with model.frame, but not sure how) would also be greatly appreciated!

Thanks!
--Erik
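A common way around both errors is to let the wrapper rebuild its own call as a call to lm(), so that `subset` reaches lm() unevaluated. The sketch below changes the wrapper's interface slightly (it takes `data` as an argument, which is an assumption made here) and is one possible approach rather than the only one. As far as I can tell, the second error arises because the name `subset` is looked up where the formula was created, finds the base function `subset()`, and that function (a closure) is then used as a row index.

```r
set.seed(123)
df1 <- data.frame(age = rnorm(100, 50, 10), bmi = rnorm(100, 30, sd = 2))

# Wrapper that turns its own call into a call to lm(), so that `subset`
# is passed on unevaluated and lm() can look up `age` inside `data`.
testlm <- function(formula, data, ...) {
  cl <- match.call()
  cl[[1L]] <- quote(stats::lm)
  eval(cl, parent.frame())
}

fit    <- testlm(bmi ~ age, data = df1, subset = age > 50)
direct <- lm(bmi ~ age, data = df1, subset = age > 50)
all.equal(coef(fit), coef(direct))   # TRUE
```

Evaluating the rebuilt call in `parent.frame()` means that `df1` and the subset expression are resolved as if lm() had been called directly by the user.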
[R] R-Fortran question (multiple subroutines)
Dear R-helpers,

apologies if this is somewhere in a manual, I have not been able to find anything relevant. I run Windows Vista. I have some Fortran code in a subroutine, and have no problem calling this from R with .Fortran, compiling the code either with 'R CMD SHLIB' or independently with gfortran. But is it possible to have more than one subroutine in my source file, one depending on the other? Or is this not supported, or is there a trick? Of course, I could rewrite my code, but there are lots of subroutines...

I.e., my code looks something like this:

      subroutine f(x,y,z)
      call g(x,y,z)
      end

      subroutine g(x,y,z)
      z = x*y
      end

Calling this from R shows that subroutine g is not called. The code compiled as executable works fine.

thanks, Remko

- Remko Duursma Research Lecturer Centre for Plants and the Environment University of Western Sydney Hawkesbury Campus Richmond NSW 2753 Mobile: +61 (0)422 096908 www.remkoduursma.com
[R] Text wrapping in R
I am about to give an introduction to R to some clinical data managers used to SAS. There is already a lot of material in printed form and on the web that paves the way. What I haven't found so far are text wrapping capabilities in setting tables in raw text as in SAS PROC REPORT. At the moment I would direct them at producing HTML output from R and pipe the result through lynx. Coming from SAS they may not be prepared to walk the Unix way of choosing the best tool for the right job. Have I overlooked a package that does something similar to SAS PROC REPORT?

--
Johannes Hüsing
mailto:johan...@huesing.name
http://derwisch.wikidot.com

There is something fascinating about science. One gets such wholesale returns of conjecture from such a trifling investment of fact. (Mark Twain, Life on the Mississippi)
Re: [R] Text wrapping in R
2010/10/25 Johannes Huesing johan...@huesing.name
> Have I overlooked a package that does something similar to SAS PROC REPORT?

Do you know Sweave, could this be a tool of choice?

Uwe

--
Uwe Ziegenhagen http://www.uweziegenhagen.de
Re: [R] Text wrapping in R
I would demonstrate one of the many LaTeX table functions. Off hand, packages xtable, Hmisc, and quantreg all have functions that convert R objects to LaTeX tables. If they're unwilling to work in LaTeX, you can use something like LaTeXiT or Laeqed to create PDFs or PNGs of the tables for insertion into whatever report tool they use. Note that the latter will require a change to the preamble to not constantly be in math mode.

Mac: http://www.chachatelier.fr/programmation/latexit_en.php
Windows: http://www.thrysoee.dk/laeqed/

Hope that helps, Jeff.

On Mon, Oct 25, 2010 at 12:59 AM, Johannes Huesing johan...@huesing.name wrote:
> What I haven't found so far are text wrapping capabilities in setting tables in raw text as in SAS PROC REPORT. [...] Have I overlooked a package that does something similar to SAS PROC REPORT?
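For the plain-text case raised in the original question, base R's `strwrap()` can wrap a long text column before the table is printed. The following is a rough sketch, not a replacement for PROC REPORT: `print_wrapped()` is a hypothetical helper written here, it handles only a label column plus one text column, and the example records are made up.

```r
# Hypothetical helper: build a two-column listing, wrapping the second column.
print_wrapped <- function(label, text, width = 30) {
  lab_width <- max(nchar(label))
  out <- character(0)
  for (i in seq_along(label)) {
    lines <- strwrap(text[i], width = width)
    # The label goes on the first line only; continuation lines get blanks.
    labs <- c(label[i], rep("", length(lines) - 1))
    out <- c(out, paste(format(labs, width = lab_width), lines))
  }
  out
}

tab <- print_wrapped(
  label = c("AE001", "AE002"),
  text  = c("Mild headache reported on day three, resolved without treatment.",
            "Nausea.")
)
writeLines(tab)
```

The helper returns the lines as a character vector, so they can be written to a file with `writeLines()` as well as shown on the console.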