[R] random point in a circle centred in a geographical position
Dear List members I am trying to find a way to generate a random point in a circle centred in a geographical location. So far I have used the following formula (see code below): random_x = original_x + radius*cos(angle) random_y = original_y + radius*sin(angle) where radius is a random number between 0 and the radius of the circle and angle is between 0 and 360 degrees The code bellow works fine however some people argue that this method will produce uneven locations… I really do not understand why. Another option will be to use the “rejection method”: generate points inside a box and reject points that fall outside the circle. However I think this is much complicate… or at least I don’t know how to programme it. So, 1. Is there any package that generates random points in a circle? (I couldn’t find any) 2. Do you think the code bellow produces uneven locations? 3. Any suggestions to programme the “rejection method”? Maybe someone has already used it... Cheers albert #Code random location in a circle # Generate UTM coordinates (original location) x_utm-c(289800) y_utm-c(4603900) coord-as.data.frame(cbind(x_utm,y_utm)) # Generate a random radius with a maximum distance of 1000 meters from the original location. radius-runif(1,0,1000) # Generate a random angle angle-runif(1,0,360) #Generate a random x coordinate x_rdm-coord$x_utm[1]+radius*cos(angle) #Generate a random y coordinate y_rdm-coord$y_utm[1]+radius*sin(angle) result-as.data.frame(cbind(x_rdm,y_rdm, x_utm, y_utm)) # We can calculate the distance between the original and the random location result$dist_m-sqrt(((coord$x_utm-result$x_rdm)^2+(coord$y_utm-result$y_rdm)^2)) result __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] random point in a circle centred in a geographical posit
Hi Albert On 07-Oct-06 Albert Picado wrote: Dear List members I am trying to find a way to generate a random point in a circle centred in a geographical location. So far I have used the following formula (see code below): random_x = original_x + radius*cos(angle) random_y = original_y + radius*sin(angle) where radius is a random number between 0 and the radius of the circle and angle is between 0 and 360 degrees The code bellow works fine however some people argue that this method will produce uneven locations--I really do not understand why. Think about 5000 points chosen by the method you describe, and divide the circle into 10 circular regions at r = 0.1, 0.2, ... 1.0 at equal increments of r. Since you have chosen r uniformly distributed (in your description above and in your code below) you will have (about) 500 points in each region. So that is 500 between r=0 and r=0.1, in an area pi*(0.1^2); and then 500 between r=0.1 and r=0.2, in an area pi*(0.2^2) - pi*(0.1^2), and so on. The second area is pi*(0.04 - 0.01) = 0.03*pi, which is 3 times the first area, 0.01*pi. And so on. So the average density of points in the first is 3 times the average density of points in the second. And so on. Therefore your method will give points whose density per unit area is more concentrated the nearer you are to the centre of the circle. You can check this visually with the following code: r - runif(5000) t - 2*pi*runif(5000) x - r*cos(t) ; y - r*sin(t) plot(x,y,pch=+,col=blue,xlim=c(-1,1),ylim=c(-1,1)) The above argument should suggest the correct way to proceed. You need equal increments in probability that radius r to correspond to equal increments in the area of a circle with radius r. This means that the probability that radius r should be proportional to r^2. Hence make the random radius r to be such that r^2 is uniformly distributed. Now try the following modification of the above code: r - sqrt(runif(5000)) t - 2*pi*runif(5000) x - r*cos(t) ; y - r*sin(t) plot(x,y,pch=+,col=blue,xlim=c(-1,1),ylim=c(-1,1)) and you will see that (to within random scatter) the result is uniformly distributed over the circle. (By the way, don't use degrees from 0 to 360 in sin() and cos() -- these use radians from 0 to 2*pi). Another option will be to use the rejection method: generate points inside a box and reject points that fall outside the circle. However I think this is much complicated--or at least I don't know how to programme it. It is more complicated. Uniform x and y are easy, but then you have to check that x^2 + y^2 = 1 and reject it if not; and also keep a count of the number of points you have accepted and check whether this has attained the number of points you need, so that you can continue sampling until you have enough points (which you cannot predict at the start). It is also (though in this case not very) inefficient. A unit circle has radius pi = 3.14... and the enclosing square has area 4, so you will reject nearly 25% of your points. So, 1. Is there any package that generates random points in a circle? (I couldnt find any) Use the above code as a basis. For n points in a circle of radius 1000, multiply sqrt(runif(n)) by 1000. 2. Do you think the code bellow produces uneven locations? Yes (see above) 3. Any suggestions to programme the rejection method? Maybe someone has already used it... I wouldn't bother ... ! But if you are really interested in how it can be done, please write again. Cheers albert #Code random location in a circle # Generate UTM coordinates (original location) x_utm-c(289800) y_utm-c(4603900) coord-as.data.frame(cbind(x_utm,y_utm)) # Generate a random radius with a maximum distance of 1000 meters from # the original location. radius-runif(1,0,1000) # Generate a random angle angle-runif(1,0,360) #Generate a random x coordinate x_rdm-coord$x_utm[1]+radius*cos(angle) #Generate a random y coordinate y_rdm-coord$y_utm[1]+radius*sin(angle) result-as.data.frame(cbind(x_rdm,y_rdm, x_utm, y_utm)) # We can calculate the distance between the original and the random # location result$dist_m-sqrt(((coord$x_utm-result$x_rdm)^2+ (coord$y_utm-result$y_rdm)^2)) result Best wishes, Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 07-Oct-06 Time: 09:53:43 -- XFMail -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Sweave, latex Warning
When I am using Sweave to produce reports I get a strange Warning from latex: LaTeX Warning: You have requested package `/usr/local/lib/R/share/texmf/Sweave', but the package provides `Sweave'. It seems to be completely harmless, but I wonder if it is a symptom of installation problems, R or latex. How can I get rid of it? This is on linux (ubuntu), with R-2.4.0. -- Göran Broström __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sweave, latex Warning
Göran Broström goran.brostrom at gmail.com writes: When I am using Sweave to produce reports I get a strange Warning from latex: LaTeX Warning: You have requested package `/usr/local/lib/R/share/texmf/Sweave' but the package provides `Sweave'. It seems to be completely harmless, but I wonder if it is a symptom of installation problems, R or latex. How can I get rid of it? This is on linux (ubuntu), with R-2.4.0. What do you actually use in Rnw file? Is it \usepackage{/usr/local/lib/R/share/texmf/Sweave} or \usepackage{Sweave} I think this was discussed in r-devel list a while ago. Gregor __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] (no subject)
On Sat, 7 Oct 2006 [EMAIL PROTECTED] wrote: Hi Albert On 07-Oct-06 Albert Picado wrote: Dear List members I am trying to find a way to generate a random point in a circle centred in a geographical location. So far I have used the following formula (see code below): random_x = original_x + radius*cos(angle) random_y = original_y + radius*sin(angle) where radius is a random number between 0 and the radius of the circle and angle is between 0 and 360 degrees The code bellow works fine however some people argue that this method will produce uneven locations--I really do not understand why. Think about 5000 points chosen by the method you describe, and divide the circle into 10 circular regions at r = 0.1, 0.2, ... 1.0 at equal increments of r. Since you have chosen r uniformly distributed (in your description above and in your code below) you will have (about) 500 points in each region. So that is 500 between r=0 and r=0.1, in an area pi*(0.1^2); and then 500 between r=0.1 and r=0.2, in an area pi*(0.2^2) - pi*(0.1^2), and so on. The second area is pi*(0.04 - 0.01) = 0.03*pi, which is 3 times the first area, 0.01*pi. And so on. So the average density of points in the first is 3 times the average density of points in the second. And so on. Therefore your method will give points whose density per unit area is more concentrated the nearer you are to the centre of the circle. You can check this visually with the following code: r - runif(5000) t - 2*pi*runif(5000) x - r*cos(t) ; y - r*sin(t) plot(x,y,pch=+,col=blue,xlim=c(-1,1),ylim=c(-1,1)) The above argument should suggest the correct way to proceed. You need equal increments in probability that radius r to correspond to equal increments in the area of a circle with radius r. This means that the probability that radius r should be proportional to r^2. Hence make the random radius r to be such that r^2 is uniformly distributed. Now try the following modification of the above code: r - sqrt(runif(5000)) t - 2*pi*runif(5000) x - r*cos(t) ; y - r*sin(t) plot(x,y,pch=+,col=blue,xlim=c(-1,1),ylim=c(-1,1)) and you will see that (to within random scatter) the result is uniformly distributed over the circle. (By the way, don't use degrees from 0 to 360 in sin() and cos() -- these use radians from 0 to 2*pi). Another option will be to use the rejection method: generate points inside a box and reject points that fall outside the circle. However I think this is much complicated--or at least I don't know how to programme it. It is more complicated. Uniform x and y are easy, but then you have to check that x^2 + y^2 = 1 and reject it if not; and also keep a count of the number of points you have accepted and check whether this has attained the number of points you need, so that you can continue sampling until you have enough points (which you cannot predict at the start). It is also (though in this case not very) inefficient. A unit circle has radius pi = 3.14... and the enclosing square has area 4, so you will reject nearly 25% of your points. Since these were geographical points, I'd risk using the sp package for this, but first a little function to make a polygon-circle based on Gabor Csardi's answer to a similar question in April: circle - function(x, y, r, length=100, zapsmall=TRUE) { t - seq(0, 2*pi, length=length) coords - t(rbind(x+sin(t)*r, y+cos(t)*r)) if (zapsmall) coords - zapsmall(coords) coords } Then: library(sp) crds - circle(0, 0, 1) Poly_0.0.1 - Polygon(crds) set.seed(061007) pts - sp:::sample.Polygon(Poly_0.0.1, n=100, type=random) plot(crds, type=l, asp=1) dim(coordinates(pts)) points(coordinates(pts)) which uses rejection. The distances follow as: spDistsN1(coordinates(pts), c(0,0)) Roger So, 1. Is there any package that generates random points in a circle? (I couldnt find any) Use the above code as a basis. For n points in a circle of radius 1000, multiply sqrt(runif(n)) by 1000. 2. Do you think the code bellow produces uneven locations? Yes (see above) 3. Any suggestions to programme the rejection method? Maybe someone has already used it... I wouldn't bother ... ! But if you are really interested in how it can be done, please write again. Cheers albert #Code random location in a circle # Generate UTM coordinates (original location) x_utm-c(289800) y_utm-c(4603900) coord-as.data.frame(cbind(x_utm,y_utm)) # Generate a random radius with a maximum distance of 1000 meters from # the original location. radius-runif(1,0,1000) # Generate a random angle angle-runif(1,0,360) #Generate a random x coordinate x_rdm-coord$x_utm[1]+radius*cos(angle) #Generate a random y coordinate y_rdm-coord$y_utm[1]+radius*sin(angle) result-as.data.frame(cbind(x_rdm,y_rdm, x_utm, y_utm)) # We can calculate the distance between the original and the random # location
[R] ifelse(logical, function1, function2) does not work
Why this kind of assignment does not work? n - 1 f - ifelse(n == 1, sin, cos) f(pi) this must be rewritten as: n - 1 f - cos if (n == 1) f - sin f(pi) [oops. 1.224606e-16 instead of zero. Damn floating point errors :-/] Alberto Monteiro __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [R-pkg] New packages pmg, gWidgets, gWidgetsRGtk2
I answered Gabor off line, but realize that this may cause some others confusion. The RGtk2 package requires the GTK libraries to be installed first. This varies from system to system. Under Windows it consists of downloading some files from sourceforget (see below). After which RGtk2 installs like any other R package. Gabor brought up an additional point, in my example for gWidgets I have an argument to gtable() copied twice. The full example should read win = gwindow(Select a CRAN mirror) size(win) - c(600,400) tbl = gtable(utils:::getCRANmirrors(), chosencol=4, container=win, filter.column=2, handler = function(h,...) { URL = svalue(tbl) repos - getOption(repos) repos[CRAN] - gsub(/$, , URL[1]) options(repos = repos) dispose(win) }) Finally, for some reason I'm trying to track down, under Windows, pmg is crashing when a cairoDevice graphics device is being closed. This affects the plotnotebook and the Lattice Explorer. I have an update to gWidgetsRGtk2 that may fix it, but I haven't had a chance to fully test it. It can be installed with install.packages(gWidgetsRGtk2,repos=http://www.math.csi.cuny.edu/pmg;) --John Gabor Grothendieck [EMAIL PROTECTED] writes: Hi, I need installation instructions. library(pmg) seems not to be enough. Thanks. library(pmg) Loading pmg() Loading required package: gWidgets Loading required package: gWidgetsRGtk2 Loading required package: RGtk2 Error in dyn.load(x, as.logical(local), as.logical(now)) : unable to load shared library 'C:/PROGRA~1/R/R-24~1.0/library/RGtk2/libs/RGtk2.dll': LoadLibrary failure: The specified procedure could not be found. Error: package 'RGtk2' could not be loaded Error: .onAttach failed in 'attachNamespace' Loading required package: pmggWidgetsRGtk PMG needs gWidgetsError in assign(x, value, envir = ns, inherits = FALSE) : could not find function gwindow In addition: Warning message: there is no package called 'pmggWidgetsRGtk' in: library(package, lib.loc = lib.loc, character.only = TRUE, logical = TRUE, Error: .onLoad failed in 'loadNamespace' for 'pmg' Error: package/namespace load failed for 'pmg' On 10/6/06, j verzani [EMAIL PROTECTED] wrote: I'd like to announce three new packages on CRAN: pmg, gWidgets, and gWidgetsRGtk2. --John Verzani 1 PMG *=*=*= The pmg package for R provides a relatively simple graphical user interface for R in a manner similar to the more mature RCmdr package. Basically this means a menu-driven interface to dialogs that collect arguments for R functions. This GUI was written with an eye towards simplifying the learning curve of R for introductory statistics students. The pmg package uses the GTK toolkit via the RGtk2 (1) package by Michael Lawrence. Some features are - The GUI works under Windows, Mac OS X, Linux/X11 - The GUI takes advantage of GTK's drag-and-drop capabilities. - Several dialogs for analyses performed in an introductory statistics course can be used simply by dragging variables around. - With the cairoDevice package, the GUI provides a notebook interface to a graphics device that allows UNIX users to easily manage multiple graphics devices. 2 gWidgets *=*=*=*=*=* The RGtk2 package is interfaced with the new gWidgets package, which may be of independent interest to those who would like to add GUI elements to their work. The gWidgets package provides a toolkit-independent API for interactive GUIs based on the iWidgets API of Simon Urbanek, with additional suggestions given by Philippe Grosjean and Michael Lawrence. The gWidgetsRGtk2 package provides the link between gWidgets and RGtk2, allowing the GTK toolkit to be used through gWidgets. It is possible for others to write a link between gWidgets and some other GUI toolkit. The gWidgets API is nowhere near as rich as the RGtk2 interface to the toolkit, but is much easier to learn. It is well-suited for simple applications or for rapid prototyping of more complicated applications. An accompanying vignette shows many examples of its use. For a simple, but not trivial, illustration, this code, following chooseCRANmirror(), shows how to use gWidgets to make a dialog for selecting a CRAN mirror. win = gwindow(Select a CRAN mirror) size(win) - c(600,400) tbl = gtable(utils:::getCRANmirrors(), chosencol=4, filter.column=2, container=win, filter.column=2, handler = function(h,...) { URL = svalue(tbl) repos - getOption(repos) repos[CRAN] - gsub(/$, , URL[1]) options(repos = repos) dispose(win) }) Two widgets are created: a base window to be a container for the widget that displays a table, which has a few extra arguments shown to indicate its flexibility. The widget to display a table allows a user to double click on a row and expect some action to happen. In this case, the handler
Re: [R] ifelse(logical, function1, function2) does not work
Try n - 1 f - if (n == 1) sin else cos f(pi) On 10/7/06, Alberto Vieira Ferreira Monteiro [EMAIL PROTECTED] wrote: Why this kind of assignment does not work? n - 1 f - ifelse(n == 1, sin, cos) f(pi) this must be rewritten as: n - 1 f - cos if (n == 1) f - sin f(pi) [oops. 1.224606e-16 instead of zero. Damn floating point errors :-/] Alberto Monteiro __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ifelse(logical, function1, function2) does not work
Alberto Vieira Ferreira Monteiro [EMAIL PROTECTED] writes: Why this kind of assignment does not work? n - 1 f - ifelse(n == 1, sin, cos) f(pi) It's not supposed to. 'ifelse' returns a value with the same shape as 'test' which is filled with elements selected from either 'yes' or 'no' depending on whether the element of 'test' is 'TRUE' or 'FALSE'. which makes very little sense if yes and no are functions. this must be rewritten as: n - 1 f - cos if (n == 1) f - sin f(pi) No, it must not. n - 1 f - if (n==1) sin else cos f(pi) or even (if (n==1) sin else cos)(pi) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sweave, latex Warning
On 10/7/06, Gregor Gorjanc [EMAIL PROTECTED] wrote: Göran Broström goran.brostrom at gmail.com writes: When I am using Sweave to produce reports I get a strange Warning from latex: LaTeX Warning: You have requested package `/usr/local/lib/R/share/texmf/Sweave' but the package provides `Sweave'. It seems to be completely harmless, but I wonder if it is a symptom of installation problems, R or latex. How can I get rid of it? This is on linux (ubuntu), with R-2.4.0. What do you actually use in Rnw file? Is it \usepackage{/usr/local/lib/R/share/texmf/Sweave} or \usepackage{Sweave} Neither; Sweave puts the former into the tex file. I'll search the r-devel archives. Thanks, Göran I think this was discussed in r-devel list a while ago. Gregor __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Göran Broström __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Estimate in Wilcox_test vs wilcox.exact
Jue.Wang2 at sanofi-aventis.com writes: Does any one know why wilcox.exact sometimes doesn't agree with wilcox_test on the estimate of the difference of medians in two levels ? Example removed See: http://finzi.psych.upenn.edu/R/Rhelp02a/archive/85893.html Dieter __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ifelse(logical, function1, function2) does not work
Peter Dalgaard writes: Alberto Vieira Ferreira Monteiro [EMAIL PROTECTED] writes: Why this kind of assignment does not work? n - 1 f - ifelse(n == 1, sin, cos) f(pi) It's not supposed to. 'ifelse' returns a value with the same shape as 'test' which is filled with elements selected from either 'yes' or 'no' depending on whether the element of 'test' is 'TRUE' or 'FALSE'. which makes very little sense if yes and no are functions. I think that's a debatable assertion. Why might I not want a vector, or rather a list, of functions? (Mr. Monteiro's real sin is a very common one --- using ifelse() unnecessarily/inappropriately, i.e. when the ``test'' argument is a scalar.) But to get back to trying to apply ifelse() with functions for the ``yes'' and ``no'' arguments: What's happening is really a feature of the ***rep()*** function, and the way that it treats objects of different natures. If you do rep(1,5) you get a vector of 5 1's. If you do rep(sin,5) you get an error message. Note however that if you do x - list(a=17,b=42) junk - rep(x,5) then junk is a list of length 5, each entry of which is a copy of x. I.e. rep() works as expected (?) with lists. And if you make the assignments y - list(a=42,b=17) n - 1 and then execute ifelse(n==1,x,y) you get a copy of x. So rep does its thing with numbers (and vectors) and lists, but not with *functions*. What else might one try to rep? What about matrices? Works, but the matrix is coerced to a vector first. One might have naively hoped that rep(M,5) would yield a list of length 5, each entry of which was a copy of M. But it doesn't; it gives a vector of length 5*nrow(M)*ncol(M) consisting of the data of M strung out in column order, 5 times over. The function rep() seems to have (S3) methods associated with it --- rep.Date, rep.factor, etc., but no generic function. I.e. rep() does not seem to dispatch methods. Nor is there a rep.default(). I wrote a ``method'' for the matrix class rep.matrix - function(x,times) { if(!inherits(x,matrix)) stop(Argument x is not a matrix.\n) ans - list() ans[1:times] - list(x) ans } That seemed to give the result I expected/wanted; rep(M,5) did indeed give a list of length 5, each entry of which was a copy of M. However a similar ``method'' for functions did not work; rep(sin,5) gave the same old ``object is not subsettable'' error. But if I called rep.function() *explicitly* I got what I wanted. I.e. rep.function(sin,5) gave a list of length 5 each entry of which was ``.Primitive(sin)''. Questions: How do methods for rep() get dispatched when there is no generic rep()? How come the matrix method that I wrote got dispatched, but the function method didn't? cheers, Rolf Turner [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ifelse(logical, function1, function2) does not work
I have noticed that dispatch on functions seems not to work in another case too. We define + on functions (I have ignored the niceties of sorting out the environments as we don't really need it for this example) but when we try to use it, it fails even though in the second example if we run it explicitly it succeeds: +.function - function(x, y) function(z) x(z) + y(z) sin+cos Error in sin + cos : non-numeric argument to binary operator +.function(sin, cos) function(z) x(z) + y(z) environment: 0x01c4ae7c By the way, rep(list(sin), 5) and rep(list(matrix(1:4, 2)), 5) will create lists of functions and matrices although I agree that the way it works is pretty strange. On 10/7/06, Rolf Turner [EMAIL PROTECTED] wrote: Peter Dalgaard writes: Alberto Vieira Ferreira Monteiro [EMAIL PROTECTED] writes: Why this kind of assignment does not work? n - 1 f - ifelse(n == 1, sin, cos) f(pi) It's not supposed to. 'ifelse' returns a value with the same shape as 'test' which is filled with elements selected from either 'yes' or 'no' depending on whether the element of 'test' is 'TRUE' or 'FALSE'. which makes very little sense if yes and no are functions. I think that's a debatable assertion. Why might I not want a vector, or rather a list, of functions? (Mr. Monteiro's real sin is a very common one --- using ifelse() unnecessarily/inappropriately, i.e. when the ``test'' argument is a scalar.) But to get back to trying to apply ifelse() with functions for the ``yes'' and ``no'' arguments: What's happening is really a feature of the ***rep()*** function, and the way that it treats objects of different natures. If you do rep(1,5) you get a vector of 5 1's. If you do rep(sin,5) you get an error message. Note however that if you do x - list(a=17,b=42) junk - rep(x,5) then junk is a list of length 5, each entry of which is a copy of x. I.e. rep() works as expected (?) with lists. And if you make the assignments y - list(a=42,b=17) n - 1 and then execute ifelse(n==1,x,y) you get a copy of x. So rep does its thing with numbers (and vectors) and lists, but not with *functions*. What else might one try to rep? What about matrices? Works, but the matrix is coerced to a vector first. One might have naively hoped that rep(M,5) would yield a list of length 5, each entry of which was a copy of M. But it doesn't; it gives a vector of length 5*nrow(M)*ncol(M) consisting of the data of M strung out in column order, 5 times over. The function rep() seems to have (S3) methods associated with it --- rep.Date, rep.factor, etc., but no generic function. I.e. rep() does not seem to dispatch methods. Nor is there a rep.default(). I wrote a ``method'' for the matrix class rep.matrix - function(x,times) { if(!inherits(x,matrix)) stop(Argument x is not a matrix.\n) ans - list() ans[1:times] - list(x) ans } That seemed to give the result I expected/wanted; rep(M,5) did indeed give a list of length 5, each entry of which was a copy of M. However a similar ``method'' for functions did not work; rep(sin,5) gave the same old ``object is not subsettable'' error. But if I called rep.function() *explicitly* I got what I wanted. I.e. rep.function(sin,5) gave a list of length 5 each entry of which was ``.Primitive(sin)''. Questions: How do methods for rep() get dispatched when there is no generic rep()? How come the matrix method that I wrote got dispatched, but the function method didn't? cheers, Rolf Turner [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Row comparisons to a new matrix?
---BeginMessage--- Thanks Gabor, Your version is handy to use, because you can change the function as you like. However it isn't any faster and if you know some way to make the result matrix more quickly, I'm interested to learn it. My test material (musical improvisations) consists of samples with 2x2 or even bigger result matrices. Atte - Original Message - From: Gabor Grothendieck [EMAIL PROTECTED] Date: Saturday, October 7, 2006 0:28 am Subject: Re: [R] Row comparisons to a new matrix? There is a generalized inner product here: http://tolstoy.newcastle.edu.au/R/help/05/04/3709.html On 10/6/06, Atte Tenkanen [EMAIL PROTECTED] wrote: Hi, Can somebody tell me, which is the fastest way to make comparisons between all rows in a matrix (here A) and put the results to the new symmetric matrix? I have here used cosine distance as an example, but the comparison function can be any other, euclidean dist etc. A=rbind(c(2,3),c(4,5),c(-1,2),c(5,6)) M=matrix(nrow=length(A[,1]),ncol=length(A[,1])) for(i in 1:length(A[,1])) { for(j in 1:length(A[,1])) { M[i,j]=cosine(A[i,],A[j,]) } } Atte Tenkanen University of Turku, Finland __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. ---End Message--- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Row comparisons to a new matrix?
Thanks Gabor, Your version is handy to use, because you can change the function as you like. However it isn't any faster and if you know some way to make the result matrix more quickly, I'm interested to learn it. My test material (musical improvisations) consists of samples with 2x2 or even bigger result matrices. Atte - Original Message - From: Gabor Grothendieck [EMAIL PROTECTED] Date: Saturday, October 7, 2006 0:28 am Subject: Re: [R] Row comparisons to a new matrix? There is a generalized inner product here: http://tolstoy.newcastle.edu.au/R/help/05/04/3709.html On 10/6/06, Atte Tenkanen [EMAIL PROTECTED] wrote: Hi, Can somebody tell me, which is the fastest way to make comparisons between all rows in a matrix (here A) and put the results to the new symmetric matrix? I have here used cosine distance as an example, but the comparison function can be any other, euclidean dist etc. A=rbind(c(2,3),c(4,5),c(-1,2),c(5,6)) M=matrix(nrow=length(A[,1]),ncol=length(A[,1])) for(i in 1:length(A[,1])) { for(j in 1:length(A[,1])) { M[i,j]=cosine(A[i,],A[j,]) } } Atte Tenkanen University of Turku, Finland __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] error compiling packages with r-2.4.0 on ppc mac osx 10.4.7
Hi All, As the subject says, I am working on a powerPC (G4) mac with OSX version 10.4.7. I upgraded from r-2.2.1 to r-2.4.0 today using the universal binary dmg file downloaded from CRAN. This is the complete, not mini binary. The interactive environment works great, but I am having difficulty installing packages that require C code to be compiled. This includes my own packages and I have also tried the Matrix package. It looks like compilation is not the problem, linking is. In particular, when linking object files, every package install dies with this message: /usr/bin/libtool: unknown option character `m' in: -macosx_version_min I use fink and libtool is installed at version 1.5.10. Clearly this looks like a configuration problem with my machine, but I am a bit confused on where to go next. Any pointers would be greatly appreciated. cheers, Allan Strand __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [R-pkgs] New versions of Matrix and lme4 packages for R-2.4.0
On Tue, 03-Oct-2006 at 09:56AM -0500, Douglas Bates wrote: | Versions 0.9975-1 of the Matrix and lme4 packages will soon be available on | CRAN for use with R version 2.4.0 or later. How quickly things change!! When I got to look at CRAN, the latest versions were 0.9975-1 of lme4 and 0.9975-2 of Matrix which I tried. platform x86_64-unknown-linux-gnu arch x86_64 os linux-gnu system x86_64, linux-gnu status major 2 minor 4.0 year 2006 month 10 day03 svn rev39566 language R version.string R version 2.4.0 (2006-10-03) Matrix installs, but with a number of warnings like these: Warning in matchSignature(signature, fdef, where) : in the method signature for function coerce no definition for class: matrix.csr Warning in matchSignature(signature, fdef, where) : in the method signature for function coerce no definition for class: matrix.csr When I try to install lme4, it fails with messages like these: *Installing *source* package 'lme4' ... ** libs gcc -I/usr/local/R-2.4.0/include -I/usr/local/R-2.4.0/include -I/usr/local/include-fpic -g -O2 -std=gnu99 -c init.c -o init.o In file included from init.c:1: lme4_utils.h:4:20: error: Matrix.h: No such file or directory In file included from init.c:1: lme4_utils.h:14: error: syntax error before ��c�� My guess is that the warnings from the Matrix installation were not irrelevant, and that the type of error would indicate to the cognoscenti[1] what could be missing in my installation. If the second part of my guess is not correct, I'll gladly supply more of the messages if the above is insufficient. Of course, I'm open to correction on the first part also. TIA 1. cognoscenti:- persons who have superior knowledge and understanding of a particular field, (http://dictionary.reference.com/search?q=cognoscenti) -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_Middle minds discuss events (:_~*~_:)Small minds discuss people (_)-(_) . Anon ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Installing Lindsey's packages
Dear r-helpers, I downloaded http://popgen.unimaas.nl/~jlindsey/rcode/rmutil.tar (it was originally .tgz, but got unzipped by my browser). Can anyone give me detailed instructions on installing this and Lindsey's other packages on R version 2.4.0 (2006-10-03)---(powerpc- apple-darwin8.7.0, locale: C)? _ Professor Michael Kubovy University of Virginia Department of Psychology USPS: P.O.Box 400400Charlottesville, VA 22904-4400 Parcels:Room 102Gilmer Hall McCormick RoadCharlottesville, VA 22903 Office:B011+1-434-982-4729 Lab:B019+1-434-982-4751 Fax:+1-434-982-4766 WWW:http://www.people.virginia.edu/~mk9y/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sum of Bernoullis with varying probabilities
Hi again. I had suspected that doing the calculation by a convolution method might be both straightforward and efficient in R. I've now located convolve() (in 'base'!!), and have a solution using this function. Details below. Original statement of problem, and method originally proposed, repeated below for reference. On 10/6/06, Ted Harding [EMAIL PROTECTED] wrote: Hi Folks, Given a series of n independent Bernoulli trials with outcomes Yi (i=1...n) and Prob[Yi = 1] = Pi, I want P = Prob[sum(Yi) = r] (r = 0,1,...,n) I can certainly find a way to do it: Let p be the vector c(P1,P2,...,Pn). The cases r=0 and r=n are trivial (and also are exceptions for the following routine). For a given value of r in (1:(n-1)), library(combinat) Set - (1:n) Subsets - combn(Set,r) P - 0 for(i in (1:dim(Subsets)[2])){ ix - numeric(n) ix[Subsets[,i]] - 1 P - P + prod((p^ix) * ((1-p)^(1-ix))) } The convolution method implemented in convolve computes, for two series X[1],X[2],...,X[n] and Y[1],Y[2],...,Y[m] the series (k = 1, 2, ... , n+m-1): Z[k] = sum( X[k-m+i]*Y[i] ) over valid values of i, i.e. products of terms of X matched with terms of a shifted Y, using the open method (see ?convolve). This is not quite what is wanted for the type of convolution needed to compute the distribution of the sum of two integer random variables, since one needs to reverse one of the series being convolved. This is the basis of the following: Given a vector p of probabilities P1, P2, P3, for Y=1 in successive trials, I need to convolve the successive Bernoulli distributions (1-P1,P1), (1-P2,P2), ... , (1-Pn,Pn). Hence: ps-cbind(1-p,p); u-ps[1,]; u1-u for(i in (2:16)){ u-convolve(u,rev(ps[i,]),type=o) } In the case I was looking at, I had already obtained the vector P of probabilities P(Sum(Y)=0), P(sum(Y)=1), ... by the method previously described (repeated above); and here n=16. Having obtained u by the above routine, I can now compare the results, u with P: cbind((0:16),u,P) r uP 0 1.353353e-01 1.353353e-01 1 3.007903e-01 3.007903e-01 2 2.976007e-01 2.976007e-01 3 1.747074e-01 1.747074e-01 4 6.826971e-02 6.826971e-02 5 1.884969e-02 1.884969e-02 6 3.804371e-03 3.804371e-03 7 5.720398e-04 5.720398e-04 8 6.463945e-05 6.463945e-05 9 5.489454e-06 5.489454e-06 10 3.473997e-07 3.473997e-07 11 1.607822e-08 1.607822e-08 12 5.262533e-10 5.262532e-10 13 1.148626e-11 1.148618e-11 14 1.494650e-13 1.493761e-13 15 9.008700e-16 8.764298e-16 16 -1.896716e-17 1.313261e-19 so, apart from ignorable differences in the very small probabilities for r11, they agree. I have to admit, though, that I had to tread carefully (and experiment with Binomials) to make sure of exactly how to introduce the reversal (at rev(ps[i,])). And the convolution method is *very* much faster than the selection of all subsets method! However, I wonder whether the subsets method may be the more accurate of the two (not that it really matters). Best wishes to all, Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 07-Oct-06 Time: 22:03:27 -- XFMail -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] merge and polylist
Greetings: I would like to kindly ask for a little help. The rough code is: # dat=data.frame(read.delim(file=all.txt, header = TRUE, sep = \t, quote=\, dec=.,na.strings = NA)) nc=read.shape(astae.shp, dbf.data=TRUE, verbose=TRUE) mappolys=Map2poly(nc) submap - subset(mappolys, nc$att.data$NAME!=Honolulu, HI) nc$att.data=subset(nc$att.data, nc$att.data$NAME!=Honolulu, HI) nc$att.data[,1]=as.numeric(paste(nc$att.data$MSACMSA)) #attributes(nc$att.data) nc$att.data=merge(nc$att.data, dat, by.x=AREA, by.y=cod, all.x=TRUE, sort=FALSE) #attributes(nc$att.data) tmp=file(tmp) write.polylistShape(submap, nc$att.data, tmp) #_ All works fine, but merge() changes the rownames and the link between the polygons and the corresponding rows is lost. I tried numerous other solutions (such as to paste back the old rownames), to no avail. After a few days, here I am. Please, if you have a moment, send a tip. Thanks, mihai -- Mihai Nica 170 East Griffith Street G5 Jackson, MS 39201 [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] xlsReadWrite version 1.1.1 available
An updated version 1.1.1 of xlsReadWrite has been submitted to cran. Currently the cran version is still 1.0, but you can download the updated version already here: http://treetron.googlepages.com. xlsReadWrite is a package which allows you to natively read and write Excelfiles (windows only). Changes from xlsReadWrite 1.0.0 to 1.1.1 - added colClasses argument (rowname, integer, double, factor, character and NA to determine type of data.frame column) - NA values are no longer written into the Excelsheet. - vector will be written as a column instead of a a row - FIXED IMPORTANT BUG which could happen in rare cases in a RKRecord cell and led to numbers not divided by 100, e.g. wrong by this factor. (thanks to Eric Rexstad for reporting!) - fixed bug with colNames: character vector in colNames didn't have an effect, I always used the Excel header row - compiled with range checks enabled (this should have been done from the beginning; from now on programming errors surface properly as range check errors). - minor changes in help text, README, testscripts improved... - LAST BUT NOT LEAST: clarified license text by granting an explicit right to freely distribute and use this package incl. the dll. (my own source is still licensed as GPLv2 (same as before)). Removed R-copyright notices in my source (had been there as a kind of credit but can be misunderstood)). -- Regards, Hans-Peter __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] unexpected behavior of boxplot(x, notch=TRUE, log=y)
bogdan romocea br44114 at gmail.com writes: A function I've been using for a while returned a surprising [to me, given the data] error recently: Error in plot.window(xlim, ylim, log, asp, ...) : Logarithmic axis must have positive limits After some digging I realized what was going on: x - c(10460.97, 10808.67, 29499.98, 1, 35818.62, 48535.59, 1, 1, 42512.1, 1627.39, 1, 7571.06, 21479.69, 25, 1, 16143.85, 12736.96, 1, 7603.63, 1, 33155.24, 1, 1, 50, 3361.78, 1, 37781.84, 1, 1, 1, 46492.05, 22334.88, 1, 1) summary(x) boxplot(x,notch=TRUE,log=y) #unexpected boxplot(x) #ok boxplot(x,log=y) #ok boxplot(x,notch=TRUE) #aha Mick Crawley (author of several books on ecological data analysis in R) submitted a related issue as bug #7690, which I was mildly surprised to see filed as not reproducible (I didn't have problems reproducing it at the time ... I posted my then-patch to R-devel at the time https://stat.ethz.ch/pipermail/r-devel/2006-January/036257.html ) The problem typically occurs for very small data sets, when the notches can be bigger than the hinges. As I said then, I can imagine debate about what should be done in this case -- you could just say don't do that, since the notches are based on an asymptotic argument ... the diff below just truncates the notches to the hinges, but produces a warning saying that the notches have been truncated. The interaction with log=y is new to me, though, and my old patch didn't catch it. Here's my reproducible version: set.seed(1001) npts - 7 X - rnorm(2*npts,rep(c(3,4.5),each=npts),sd=1) f - factor(rep(1:2,each=npts)) par(mfrow=c(1,2)) boxplot(X~f,notch=TRUE) A possible fix is to truncate the notches (and issue a warning) when this happens, in src/library/grDevices/R/calc.R: *** calc.R 2006-10-07 17:44:49.0 -0400 --- newcalc.R 2006-10-07 19:25:38.0 -0400 *** *** 16,21 --- 16,26 if(any(out[nna])) stats[c(1, 5)] - range(x[!out], na.rm = TRUE) } conf - if(do.conf) stats[3] + c(-1.58, 1.58) * iqr / sqrt(n) + if (do.conf) { + if (conf[1]stats[2] || conf[2]stats[4]) warning(confidence limits hinges: notches truncated) + conf[1] - max(conf[1],stats[2]) + conf[2] - min(conf[2],stats[4]) + } list(stats = stats, n = n, conf = conf, out = if(do.out) x[out nna] else numeric(0)) } __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] can lm() automatically take in the independent variables without knowing the names in advance
Hello! I am trying to use lm to do a simple regression but on a batch of different files. Each file has different column names. I know the first column is the dependent variable and all the rest are explanatory variables. The column names are in the first line of the file. But it seems lm() requires I know the variable names in advance? Is there a way to dynamically read in variable names and send to lm()? Many many thanks!!! Urania [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Sum of Bernoullis with varying probabilities
One can get a one-line solution by taking the product of the FFTs. For example, let p - 1:4/8 be the probabilities. Then the solution is: fft(exp(rowSums(log(mvfft(t(cbind(1-p,p,0,0,0)), inverse = TRUE)/5 On 10/7/06, Ted Harding [EMAIL PROTECTED] wrote: Hi again. I had suspected that doing the calculation by a convolution method might be both straightforward and efficient in R. I've now located convolve() (in 'base'!!), and have a solution using this function. Details below. Original statement of problem, and method originally proposed, repeated below for reference. On 10/6/06, Ted Harding [EMAIL PROTECTED] wrote: Hi Folks, Given a series of n independent Bernoulli trials with outcomes Yi (i=1...n) and Prob[Yi = 1] = Pi, I want P = Prob[sum(Yi) = r] (r = 0,1,...,n) I can certainly find a way to do it: Let p be the vector c(P1,P2,...,Pn). The cases r=0 and r=n are trivial (and also are exceptions for the following routine). For a given value of r in (1:(n-1)), library(combinat) Set - (1:n) Subsets - combn(Set,r) P - 0 for(i in (1:dim(Subsets)[2])){ ix - numeric(n) ix[Subsets[,i]] - 1 P - P + prod((p^ix) * ((1-p)^(1-ix))) } The convolution method implemented in convolve computes, for two series X[1],X[2],...,X[n] and Y[1],Y[2],...,Y[m] the series (k = 1, 2, ... , n+m-1): Z[k] = sum( X[k-m+i]*Y[i] ) over valid values of i, i.e. products of terms of X matched with terms of a shifted Y, using the open method (see ?convolve). This is not quite what is wanted for the type of convolution needed to compute the distribution of the sum of two integer random variables, since one needs to reverse one of the series being convolved. This is the basis of the following: Given a vector p of probabilities P1, P2, P3, for Y=1 in successive trials, I need to convolve the successive Bernoulli distributions (1-P1,P1), (1-P2,P2), ... , (1-Pn,Pn). Hence: ps-cbind(1-p,p); u-ps[1,]; u1-u for(i in (2:16)){ u-convolve(u,rev(ps[i,]),type=o) } In the case I was looking at, I had already obtained the vector P of probabilities P(Sum(Y)=0), P(sum(Y)=1), ... by the method previously described (repeated above); and here n=16. Having obtained u by the above routine, I can now compare the results, u with P: cbind((0:16),u,P) r uP 0 1.353353e-01 1.353353e-01 1 3.007903e-01 3.007903e-01 2 2.976007e-01 2.976007e-01 3 1.747074e-01 1.747074e-01 4 6.826971e-02 6.826971e-02 5 1.884969e-02 1.884969e-02 6 3.804371e-03 3.804371e-03 7 5.720398e-04 5.720398e-04 8 6.463945e-05 6.463945e-05 9 5.489454e-06 5.489454e-06 10 3.473997e-07 3.473997e-07 11 1.607822e-08 1.607822e-08 12 5.262533e-10 5.262532e-10 13 1.148626e-11 1.148618e-11 14 1.494650e-13 1.493761e-13 15 9.008700e-16 8.764298e-16 16 -1.896716e-17 1.313261e-19 so, apart from ignorable differences in the very small probabilities for r11, they agree. I have to admit, though, that I had to tread carefully (and experiment with Binomials) to make sure of exactly how to introduce the reversal (at rev(ps[i,])). And the convolution method is *very* much faster than the selection of all subsets method! However, I wonder whether the subsets method may be the more accurate of the two (not that it really matters). Best wishes to all, Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 07-Oct-06 Time: 22:03:27 -- XFMail -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] can lm() automatically take in the independent variableswithout knowing the names in advance
someone may know how to do what you asked but a get around is to not read the header in the file and assign your own variable names to each column. This way you have control and you don't worry about doing anything dynamically. This is pretty easy if you are using read.table and probably also possible with other types of in read in file functions. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of HelponR Sent: Saturday, October 07, 2006 8:18 PM To: r-help Subject: [R] can lm() automatically take in the independent variableswithout knowing the names in advance Hello! I am trying to use lm to do a simple regression but on a batch of different files. Each file has different column names. I know the first column is the dependent variable and all the rest are explanatory variables. The column names are in the first line of the file. But it seems lm() requires I know the variable names in advance? Is there a way to dynamically read in variable names and send to lm()? Many many thanks!!! Urania [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. This is not an offer (or solicitation of an offer) to buy/se...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] can lm() automatically take in the independent variables without knowing the names in advance
Try this: run.lm - function(DF, response = names(DF)[1], fo = y~.) { fo[[2]] - as.name(response) eval(substitute(lm(fo, DF))) } # test run.lm(iris) run.lm(iris, Sepal.Width) Another possibility is to rename the first column: On 10/7/06, HelponR [EMAIL PROTECTED] wrote: Hello! I am trying to use lm to do a simple regression but on a batch of different files. Each file has different column names. I know the first column is the dependent variable and all the rest are explanatory variables. The column names are in the first line of the file. But it seems lm() requires I know the variable names in advance? Is there a way to dynamically read in variable names and send to lm()? Many many thanks!!! Urania [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] can lm() automatically take in the independent variables without knowing the names in advance
This works beautifully! You know I did not know how many columns in advance either. So this method is very good. Many thanks! On 10/7/06, Gabor Grothendieck [EMAIL PROTECTED] wrote: Try this: run.lm - function(DF, response = names(DF)[1], fo = y~.) { fo[[2]] - as.name(response) eval(substitute(lm(fo, DF))) } # test run.lm(iris) run.lm(iris, Sepal.Width) Another possibility is to rename the first column: On 10/7/06, HelponR [EMAIL PROTECTED] wrote: Hello! I am trying to use lm to do a simple regression but on a batch of different files. Each file has different column names. I know the first column is the dependent variable and all the rest are explanatory variables. The column names are in the first line of the file. But it seems lm() requires I know the variable names in advance? Is there a way to dynamically read in variable names and send to lm()? Many many thanks!!! Urania [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Select range of dates
Hello, This is likely fairly silly question, and I apologize to whomever takes the time to respond. I am a relatively new user of R, on Windows XP, version 2.3.1. Say I have a data table that looks like the following: x Date Location Amount Blue Green 1 01/01/2001 Central1817 TRUE FALSE 2 01/02/2001 Central 20358 FALSE TRUE 3 05/08/2001 Central 16245 FALSE TRUE 4 02/02/2002 Western 112 TRUE FALSE 5 21/03/2002 Western 98756 TRUE FALSE 6 01/04/2002 Western 1598414 FALSE TRUE 7 07/01/2001 Western1255 FALSE TRUE 8 20/10/2003 Central 16289 TRUE FALSE 9 21/10/2003 Eastern 1 FALSE TRUE 10 22/10/2003 Eastern 98737 FALSE TRUE 11 23/10/2003 Eastern 198756 TRUE FALSE 12 24/10/2003 Eastern 98756 FALSE TRUE 13 25/10/2003 Eastern 65895 TRUE FALSE 14 26/10/2003 Eastern 2142266 FALSE TRUE 15 27/10/2003North 98756 TRUE FALSE 16 28/10/2003North 548236 FALSE TRUE and I want to do some summaries by Fiscal year (or FY quarter). Reading manuals and such, I cobbled this less than satisfactory bit together to start to build a factor representing a fiscal year split: y-as.Date(x$Date,%d/%m/%Y) y-as.matrix(y) y$FY0203-ifelse((y=(as.Date(2002-03-21)))(y=(as.Date (2003-04-01))),TRUE,FALSE) the values seem correct, but aside from ugly, the data is not in the proper format - with some more effort I might fix that... But my question is: is there a more simple function available to select a range of dates? Or, at least, a more elegant approach? [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Select range of dates
Here are three alternative ways to get the fiscal year as a numeric value assuming: dd - as.Date(x$Date,%d/%m/%Y) # add one to year if month is past March as.numeric(format(dd, %Y)) + (format(dd, %m) 03) # same but using POSIXlt # (Even though there are no time zones involved I have seen # situations where the time zone nevertheless crept in # where one would least expect it so even though I believe # this is correct I would nevertheless triple check # all your results if you use this one) with(as.POSIXlt(dd), 1900 + year + (mon 2)) # this makes use of the zoo yearqtr class # which represents dates as year + 0, .25, .5 or .75 for # the four quarters library(zoo) as.numeric(ceiling(as.yearqtr(dd))) On 10/7/06, Ian Broom [EMAIL PROTECTED] wrote: Hello, This is likely fairly silly question, and I apologize to whomever takes the time to respond. I am a relatively new user of R, on Windows XP, version 2.3.1. Say I have a data table that looks like the following: x Date Location Amount Blue Green 1 01/01/2001 Central1817 TRUE FALSE 2 01/02/2001 Central 20358 FALSE TRUE 3 05/08/2001 Central 16245 FALSE TRUE 4 02/02/2002 Western 112 TRUE FALSE 5 21/03/2002 Western 98756 TRUE FALSE 6 01/04/2002 Western 1598414 FALSE TRUE 7 07/01/2001 Western1255 FALSE TRUE 8 20/10/2003 Central 16289 TRUE FALSE 9 21/10/2003 Eastern 1 FALSE TRUE 10 22/10/2003 Eastern 98737 FALSE TRUE 11 23/10/2003 Eastern 198756 TRUE FALSE 12 24/10/2003 Eastern 98756 FALSE TRUE 13 25/10/2003 Eastern 65895 TRUE FALSE 14 26/10/2003 Eastern 2142266 FALSE TRUE 15 27/10/2003North 98756 TRUE FALSE 16 28/10/2003North 548236 FALSE TRUE and I want to do some summaries by Fiscal year (or FY quarter). Reading manuals and such, I cobbled this less than satisfactory bit together to start to build a factor representing a fiscal year split: y-as.Date(x$Date,%d/%m/%Y) y-as.matrix(y) y$FY0203-ifelse((y=(as.Date(2002-03-21)))(y=(as.Date (2003-04-01))),TRUE,FALSE) the values seem correct, but aside from ugly, the data is not in the proper format - with some more effort I might fix that... But my question is: is there a more simple function available to select a range of dates? Or, at least, a more elegant approach? [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.