[R] random point in a circle centred in a geographical position

2006-10-07 Thread Albert Picado
Dear List members

I am trying to find a way to generate a random point in a circle centred in a 
geographical location.

So far I have used the following formula (see code below):
random_x = original_x + radius*cos(angle)
random_y = original_y + radius*sin(angle)
where radius is a random number between 0 and the radius of the circle and
angle is between 0 and 360 degrees

The code bellow works fine however some people argue that this method will 
produce uneven locations… I really do not understand why.

Another option will be to use the “rejection method”: generate points inside a 
box and reject points that fall outside the circle. However I think this is 
much complicate… or at least I don’t know how to programme it.

So,
1. Is there any package that generates random points in a circle? (I couldn’t 
find any)
2. Do you think the code bellow produces uneven locations?
3. Any suggestions to programme the “rejection method”? Maybe someone has 
already used it...

Cheers

albert

#Code random location in a circle
# Generate UTM coordinates (original location)
 x_utm-c(289800)
 y_utm-c(4603900)
 coord-as.data.frame(cbind(x_utm,y_utm))
# Generate a random radius with a maximum distance of 1000 meters from the 
original location.
radius-runif(1,0,1000)
# Generate a random angle
angle-runif(1,0,360)
#Generate a random x coordinate
x_rdm-coord$x_utm[1]+radius*cos(angle)
#Generate a random y coordinate
y_rdm-coord$y_utm[1]+radius*sin(angle)
result-as.data.frame(cbind(x_rdm,y_rdm, x_utm, y_utm))
# We can calculate the distance between the original and the random location
 result$dist_m-sqrt(((coord$x_utm-result$x_rdm)^2+(coord$y_utm-result$y_rdm)^2))
result

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] random point in a circle centred in a geographical posit

2006-10-07 Thread Ted Harding
Hi Albert

On 07-Oct-06 Albert Picado wrote:
 Dear List members
 
 I am trying to find a way to generate a random point in a circle
 centred in a geographical location.
 
 So far I have used the following formula (see code below):
 random_x = original_x + radius*cos(angle)
 random_y = original_y + radius*sin(angle)
 where radius is a random number between 0 and the radius of the
 circle and angle is between 0 and 360 degrees
 
 The code bellow works fine however some people argue that this method
 will produce uneven locations--I really do not understand why.

Think about 5000 points chosen by the method you describe, and
divide the circle into 10 circular regions at r = 0.1, 0.2, ... 1.0
at equal increments of r.

Since you have chosen r uniformly distributed (in your description
above and in your code below) you will have (about) 500 points in
each region. So that is 500 between r=0 and r=0.1, in an area
pi*(0.1^2); and then 500 between r=0.1 and r=0.2, in an area
pi*(0.2^2) - pi*(0.1^2), and so on.

The second area is pi*(0.04 - 0.01) = 0.03*pi, which is 3 times
the first area, 0.01*pi. And so on. So the average density of
points in the first is 3 times the average density of points in
the second. And so on.

Therefore your method will give points whose density per unit
area is more concentrated the nearer you are to the centre of
the circle.

You can check this visually with the following code:

  r - runif(5000)
  t - 2*pi*runif(5000)
  x - r*cos(t) ; y - r*sin(t)
  plot(x,y,pch=+,col=blue,xlim=c(-1,1),ylim=c(-1,1))

The above argument should suggest the correct way to proceed.
You need equal increments in probability that radius  r
to correspond to equal increments in the area of a circle with
radius r. This means that the probability that radius  r
should be proportional to r^2. Hence make the random radius r
to be such that r^2 is uniformly distributed.

Now try the following modification of the above code:

  r - sqrt(runif(5000))
  t - 2*pi*runif(5000)
  x - r*cos(t) ; y - r*sin(t)
  plot(x,y,pch=+,col=blue,xlim=c(-1,1),ylim=c(-1,1))

and you will see that (to within random scatter) the result is
uniformly distributed over the circle. (By the way, don't use
degrees from 0 to 360 in sin() and cos() -- these use radians
from 0 to 2*pi).

 Another option will be to use the “rejection method: generate
 points inside a box and reject points that fall outside the circle.
 However I think this is much complicated--or at least I don't know
 how to programme it.

It is more complicated. Uniform x and y are easy, but then you have
to check that x^2 + y^2 = 1 and reject it if not; and also keep
a count of the number of points you have accepted and check whether
this has attained the number of points you need, so that you can
continue sampling until you have enough points (which you cannot
predict at the start).

It is also (though in this case not very) inefficient. A unit
circle has radius pi = 3.14... and the enclosing square has
area 4, so you will reject nearly 25% of your points.

 So,
 1. Is there any package that generates random points in a circle? (I
 couldn’t find any)

Use the above code as a basis. For n points in a circle of radius
1000, multiply sqrt(runif(n)) by 1000.

 2. Do you think the code bellow produces uneven locations?

Yes (see above)

 3. Any suggestions to programme the rejection method? Maybe someone
 has already used it...

I wouldn't bother ... ! But if you are really interested in how
it can be done, please write again.

 Cheers
 
 albert
 
#Code random location in a circle
# Generate UTM coordinates (original location)
 x_utm-c(289800)
 y_utm-c(4603900)
 coord-as.data.frame(cbind(x_utm,y_utm))
# Generate a random radius with a maximum distance of 1000 meters from
# the original location.
radius-runif(1,0,1000)
# Generate a random angle
angle-runif(1,0,360)
#Generate a random x coordinate
x_rdm-coord$x_utm[1]+radius*cos(angle)
#Generate a random y coordinate
y_rdm-coord$y_utm[1]+radius*sin(angle)
result-as.data.frame(cbind(x_rdm,y_rdm, x_utm, y_utm))
# We can calculate the distance between the original and the random
# location
 result$dist_m-sqrt(((coord$x_utm-result$x_rdm)^2+
 (coord$y_utm-result$y_rdm)^2))
result

Best wishes,
Ted.


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 07-Oct-06   Time: 09:53:43
-- XFMail --

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sweave, latex Warning

2006-10-07 Thread Göran Broström
When I am using Sweave to produce reports I get a strange Warning from latex:

LaTeX Warning: You have requested package `/usr/local/lib/R/share/texmf/Sweave',
   but the package provides `Sweave'.

It seems to be completely harmless, but I wonder if it is a symptom of
installation problems, R or latex. How can I get rid of it?

This is on linux (ubuntu), with R-2.4.0.

-- 
Göran Broström

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave, latex Warning

2006-10-07 Thread Gregor Gorjanc
Göran Broström goran.brostrom at gmail.com writes:

 
 When I am using Sweave to produce reports I get a strange Warning from latex:
 
 LaTeX Warning: You have requested package 
 `/usr/local/lib/R/share/texmf/Sweave'
but the package provides `Sweave'.
 
 It seems to be completely harmless, but I wonder if it is a symptom of
 installation problems, R or latex. How can I get rid of it?
 
 This is on linux (ubuntu), with R-2.4.0.

What do you actually use in Rnw file? Is it 

\usepackage{/usr/local/lib/R/share/texmf/Sweave}

or

\usepackage{Sweave}

I think this was discussed in r-devel list 
a while ago.

Gregor

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] (no subject)

2006-10-07 Thread Roger Bivand
On Sat, 7 Oct 2006 [EMAIL PROTECTED] wrote:

 Hi Albert
 
 On 07-Oct-06 Albert Picado wrote:
  Dear List members
  
  I am trying to find a way to generate a random point in a circle
  centred in a geographical location.
  
  So far I have used the following formula (see code below):
  random_x = original_x + radius*cos(angle)
  random_y = original_y + radius*sin(angle)
  where radius is a random number between 0 and the radius of the
  circle and angle is between 0 and 360 degrees
  
  The code bellow works fine however some people argue that this method
  will produce uneven locations--I really do not understand why.
 
 Think about 5000 points chosen by the method you describe, and
 divide the circle into 10 circular regions at r = 0.1, 0.2, ... 1.0
 at equal increments of r.
 
 Since you have chosen r uniformly distributed (in your description
 above and in your code below) you will have (about) 500 points in
 each region. So that is 500 between r=0 and r=0.1, in an area
 pi*(0.1^2); and then 500 between r=0.1 and r=0.2, in an area
 pi*(0.2^2) - pi*(0.1^2), and so on.
 
 The second area is pi*(0.04 - 0.01) = 0.03*pi, which is 3 times
 the first area, 0.01*pi. And so on. So the average density of
 points in the first is 3 times the average density of points in
 the second. And so on.
 
 Therefore your method will give points whose density per unit
 area is more concentrated the nearer you are to the centre of
 the circle.
 
 You can check this visually with the following code:
 
   r - runif(5000)
   t - 2*pi*runif(5000)
   x - r*cos(t) ; y - r*sin(t)
   plot(x,y,pch=+,col=blue,xlim=c(-1,1),ylim=c(-1,1))
 
 The above argument should suggest the correct way to proceed.
 You need equal increments in probability that radius  r
 to correspond to equal increments in the area of a circle with
 radius r. This means that the probability that radius  r
 should be proportional to r^2. Hence make the random radius r
 to be such that r^2 is uniformly distributed.
 
 Now try the following modification of the above code:
 
   r - sqrt(runif(5000))
   t - 2*pi*runif(5000)
   x - r*cos(t) ; y - r*sin(t)
   plot(x,y,pch=+,col=blue,xlim=c(-1,1),ylim=c(-1,1))
 
 and you will see that (to within random scatter) the result is
 uniformly distributed over the circle. (By the way, don't use
 degrees from 0 to 360 in sin() and cos() -- these use radians
 from 0 to 2*pi).
 
  Another option will be to use the “rejection method: generate
  points inside a box and reject points that fall outside the circle.
  However I think this is much complicated--or at least I don't know
  how to programme it.
 
 It is more complicated. Uniform x and y are easy, but then you have
 to check that x^2 + y^2 = 1 and reject it if not; and also keep
 a count of the number of points you have accepted and check whether
 this has attained the number of points you need, so that you can
 continue sampling until you have enough points (which you cannot
 predict at the start).
 
 It is also (though in this case not very) inefficient. A unit
 circle has radius pi = 3.14... and the enclosing square has
 area 4, so you will reject nearly 25% of your points.
 

Since these were geographical points, I'd risk using the sp package for
this, but first a little function to make a polygon-circle based on Gabor
Csardi's answer to a similar question in April:

circle - function(x, y, r, length=100, zapsmall=TRUE) {
  t - seq(0, 2*pi, length=length)
  coords - t(rbind(x+sin(t)*r, y+cos(t)*r)) 
  if (zapsmall) coords - zapsmall(coords)
  coords
}

Then:

library(sp)
crds - circle(0, 0, 1)
Poly_0.0.1 - Polygon(crds)
set.seed(061007)
pts - sp:::sample.Polygon(Poly_0.0.1, n=100, type=random)
plot(crds, type=l, asp=1)
dim(coordinates(pts))
points(coordinates(pts))

which uses rejection. The distances follow as:

spDistsN1(coordinates(pts), c(0,0))

Roger


  So,
  1. Is there any package that generates random points in a circle? (I
  couldn’t find any)
 
 Use the above code as a basis. For n points in a circle of radius
 1000, multiply sqrt(runif(n)) by 1000.
 
  2. Do you think the code bellow produces uneven locations?
 
 Yes (see above)
 
  3. Any suggestions to programme the rejection method? Maybe someone
  has already used it...
 
 I wouldn't bother ... ! But if you are really interested in how
 it can be done, please write again.
 
  Cheers
  
  albert
  
 #Code random location in a circle
 # Generate UTM coordinates (original location)
  x_utm-c(289800)
  y_utm-c(4603900)
  coord-as.data.frame(cbind(x_utm,y_utm))
 # Generate a random radius with a maximum distance of 1000 meters from
 # the original location.
 radius-runif(1,0,1000)
 # Generate a random angle
 angle-runif(1,0,360)
 #Generate a random x coordinate
 x_rdm-coord$x_utm[1]+radius*cos(angle)
 #Generate a random y coordinate
 y_rdm-coord$y_utm[1]+radius*sin(angle)
 result-as.data.frame(cbind(x_rdm,y_rdm, x_utm, y_utm))
 # We can calculate the distance between the original and the random
 # location
  

[R] ifelse(logical, function1, function2) does not work

2006-10-07 Thread Alberto Vieira Ferreira Monteiro
Why this kind of assignment does not work?

  n - 1
  f - ifelse(n == 1, sin, cos)
  f(pi)

this must be rewritten as:

  n - 1
  f - cos
  if (n == 1) f - sin
  f(pi)

[oops. 1.224606e-16 instead of zero. Damn floating point errors :-/]

Alberto Monteiro

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-pkg] New packages pmg, gWidgets, gWidgetsRGtk2

2006-10-07 Thread verzani
I answered Gabor off line, but realize that this may cause some others
confusion. The RGtk2 package requires the GTK libraries to be
installed first. This varies from system to system. Under Windows it
consists of downloading some files from sourceforget (see
below). After which RGtk2 installs like any other R package. 

Gabor brought up an additional point, in my example for gWidgets I have
an argument to gtable() copied twice. The full example should read

win = gwindow(Select a CRAN mirror)
size(win) - c(600,400)
tbl = gtable(utils:::getCRANmirrors(), chosencol=4, 
  container=win,
  filter.column=2,
  handler = function(h,...) {
URL = svalue(tbl)
repos - getOption(repos)
repos[CRAN] - gsub(/$, , URL[1])
options(repos = repos)
dispose(win)
  })


Finally, for some reason I'm trying to track down, under Windows, pmg
is crashing when a cairoDevice graphics device is being closed. This
affects the plotnotebook and the Lattice Explorer. I have an update to
gWidgetsRGtk2 that may fix it, but I haven't had a chance to fully
test it. It can be installed with

install.packages(gWidgetsRGtk2,repos=http://www.math.csi.cuny.edu/pmg;)

--John

Gabor Grothendieck [EMAIL PROTECTED] writes:

 Hi, I need installation instructions. library(pmg) seems not to be enough.
 Thanks.

 library(pmg)
 Loading pmg()
 Loading required package: gWidgets
 Loading required package: gWidgetsRGtk2
 Loading required package: RGtk2
 Error in dyn.load(x, as.logical(local), as.logical(now)) :
 unable to load shared library
 'C:/PROGRA~1/R/R-24~1.0/library/RGtk2/libs/RGtk2.dll':
   LoadLibrary failure:  The specified procedure could not be found.

 Error: package 'RGtk2' could not be loaded
 Error: .onAttach failed in 'attachNamespace'
 Loading required package: pmggWidgetsRGtk
 PMG needs gWidgetsError in assign(x, value, envir = ns, inherits = FALSE) :
 could not find function gwindow
 In addition: Warning message:
 there is no package called 'pmggWidgetsRGtk' in: library(package,
 lib.loc = lib.loc, character.only = TRUE, logical = TRUE,
 Error: .onLoad failed in 'loadNamespace' for 'pmg'
 Error: package/namespace load failed for 'pmg'

 On 10/6/06, j verzani [EMAIL PROTECTED] wrote:
 I'd like to announce three new packages on CRAN: pmg, gWidgets, and
 gWidgetsRGtk2.

 --John Verzani

 1  PMG
 *=*=*=


  The pmg package for R provides a relatively simple graphical user
 interface for R in a manner similar to the more mature RCmdr package.
 Basically this means a menu-driven interface to dialogs that collect
 arguments for R functions. This GUI was written with an eye towards
 simplifying the learning curve of R for introductory statistics
 students.
  The pmg package uses the GTK toolkit via the RGtk2 (1) package by
 Michael Lawrence. Some features are

  - The GUI works under Windows, Mac OS X, Linux/X11
  - The GUI takes advantage of GTK's drag-and-drop capabilities.
  - Several dialogs for analyses performed in an introductory  statistics
   course can be used simply by dragging variables  around.
  - With the cairoDevice package, the GUI provides a  notebook interface
   to a graphics device that allows UNIX users to  easily manage
   multiple graphics devices.



 2  gWidgets
 *=*=*=*=*=*


  The RGtk2 package is interfaced with the new gWidgets package, which
 may be of independent interest to those who would like to add GUI
 elements to their work.  The gWidgets package provides a
 toolkit-independent API for interactive GUIs based on the iWidgets API
 of Simon Urbanek, with additional suggestions given by Philippe Grosjean
 and Michael Lawrence. The gWidgetsRGtk2 package provides the link
 between gWidgets and RGtk2, allowing the GTK toolkit to be used through
 gWidgets. It is possible for others to write a link between gWidgets and
 some other GUI toolkit.
  The gWidgets API is nowhere near as rich as the RGtk2 interface to the
 toolkit, but is much easier to learn. It is well-suited for simple
 applications or for rapid prototyping of more complicated applications.
 An accompanying vignette shows many examples of its use. For a simple,
 but not trivial, illustration, this code, following chooseCRANmirror(),
 shows how to use gWidgets to make a dialog for selecting a CRAN mirror.
 
win = gwindow(Select a CRAN mirror)
size(win) - c(600,400)
tbl = gtable(utils:::getCRANmirrors(), chosencol=4, filter.column=2,
  container=win,
  filter.column=2,
  handler = function(h,...) {
URL = svalue(tbl)
repos - getOption(repos)
repos[CRAN] - gsub(/$, , URL[1])
options(repos = repos)
dispose(win)
  })
 
  Two widgets are created: a base window to be a container for the
 widget that displays a table, which has a few extra arguments shown to
 indicate its flexibility. The widget to display a table allows a user to
 double click on a row and expect some action to happen. In this case,
 the handler 

Re: [R] ifelse(logical, function1, function2) does not work

2006-10-07 Thread Gabor Grothendieck
Try

n - 1
f - if (n == 1) sin else cos
f(pi)


On 10/7/06, Alberto Vieira Ferreira Monteiro [EMAIL PROTECTED] wrote:
 Why this kind of assignment does not work?

  n - 1
  f - ifelse(n == 1, sin, cos)
  f(pi)

 this must be rewritten as:

  n - 1
  f - cos
  if (n == 1) f - sin
  f(pi)

 [oops. 1.224606e-16 instead of zero. Damn floating point errors :-/]

 Alberto Monteiro

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ifelse(logical, function1, function2) does not work

2006-10-07 Thread Peter Dalgaard
Alberto Vieira Ferreira Monteiro [EMAIL PROTECTED] writes:

 Why this kind of assignment does not work?
 
   n - 1
   f - ifelse(n == 1, sin, cos)
   f(pi)

It's not supposed to. 

 'ifelse' returns a value with the same shape as 'test' which is
 filled with elements selected from either 'yes' or 'no' depending
 on whether the element of 'test' is 'TRUE' or 'FALSE'.

which makes very little sense if yes and no are functions.
 
 this must be rewritten as:
 
   n - 1
   f - cos
   if (n == 1) f - sin
   f(pi)

No, it must not.

n - 1
f - if (n==1) sin else cos
f(pi)

or even 

(if (n==1) sin else cos)(pi)

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave, latex Warning

2006-10-07 Thread Göran Broström
On 10/7/06, Gregor Gorjanc [EMAIL PROTECTED] wrote:
 Göran Broström goran.brostrom at gmail.com writes:

 
  When I am using Sweave to produce reports I get a strange Warning from 
  latex:
 
  LaTeX Warning: You have requested package 
  `/usr/local/lib/R/share/texmf/Sweave'
 but the package provides `Sweave'.
 
  It seems to be completely harmless, but I wonder if it is a symptom of
  installation problems, R or latex. How can I get rid of it?
 
  This is on linux (ubuntu), with R-2.4.0.

 What do you actually use in Rnw file? Is it

 \usepackage{/usr/local/lib/R/share/texmf/Sweave}

 or

 \usepackage{Sweave}

Neither; Sweave puts the former into the tex file. I'll search the
r-devel archives.

Thanks, Göran

 I think this was discussed in r-devel list
 a while ago.

 Gregor

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Göran Broström

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Estimate in Wilcox_test vs wilcox.exact

2006-10-07 Thread Dieter Menne
 Jue.Wang2 at sanofi-aventis.com writes:

 
 Does any one know why wilcox.exact sometimes doesn't agree with wilcox_test on
the estimate of the
 difference of medians in two levels ?
 

Example removed

See: 
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/85893.html


Dieter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ifelse(logical, function1, function2) does not work

2006-10-07 Thread Rolf Turner
Peter Dalgaard writes:

 Alberto Vieira Ferreira Monteiro [EMAIL PROTECTED] writes:
 
  Why this kind of assignment does not work?
  
n - 1
f - ifelse(n == 1, sin, cos)
f(pi)
 
 It's not supposed to. 
 
  'ifelse' returns a value with the same shape as 'test' which is
  filled with elements selected from either 'yes' or 'no' depending
  on whether the element of 'test' is 'TRUE' or 'FALSE'.
 
 which makes very little sense if yes and no are functions.

I think that's a debatable assertion.  Why might I
not want a vector, or rather a list, of functions?

(Mr. Monteiro's real sin is a very common one ---
using ifelse() unnecessarily/inappropriately, i.e.
when the ``test'' argument is a scalar.)

But to get back to trying to apply ifelse() with functions
for the ``yes'' and ``no'' arguments:

What's happening is really a feature of the ***rep()***
function, and the way that it treats objects of different
natures.  If you do rep(1,5) you get a vector of 5 1's.  If
you do rep(sin,5) you get an error message.

Note however that if you do

 x - list(a=17,b=42)
 junk - rep(x,5)

then junk is a list of length 5, each entry of which is
a copy of x.  I.e. rep() works as expected (?) with lists.

And if you make the assignments

 y - list(a=42,b=17)
 n - 1

and then execute

 ifelse(n==1,x,y)

you get a copy of x.  So rep does its thing with numbers (and
vectors) and lists, but not with *functions*.

What else might one try to rep?  What about matrices?  Works,
but the matrix is coerced to a vector first.  One might have
naively hoped that rep(M,5) would yield a list of length 5,
each entry of which was a copy of M.  But it doesn't; it
gives a vector of length 5*nrow(M)*ncol(M) consisting of
the data of M strung out in column order, 5 times over.

The function rep() seems to have (S3) methods associated
with it --- rep.Date, rep.factor, etc., but no generic
function.  I.e. rep() does not seem to dispatch methods.
Nor is there a rep.default().

I wrote a ``method'' for the matrix class

rep.matrix - function(x,times) {
if(!inherits(x,matrix))
stop(Argument x is not a matrix.\n)
ans - list()
ans[1:times] - list(x)
ans
}

That seemed to give the result I expected/wanted; rep(M,5)
did indeed give a list of length 5, each entry of which was a
copy of M.

However a similar ``method'' for functions did not work;
rep(sin,5) gave the same old ``object is not subsettable''
error.

But if I called rep.function() *explicitly* I got what I
wanted.  I.e.

 rep.function(sin,5)

gave a list of length 5 each entry of which was
``.Primitive(sin)''.

Questions:  How do methods for rep() get dispatched
when there is no generic rep()?

How come the matrix method that I wrote
got dispatched, but the function method didn't?

cheers,

Rolf Turner
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ifelse(logical, function1, function2) does not work

2006-10-07 Thread Gabor Grothendieck
I have noticed that dispatch on functions seems not to work
in another case too.  We define + on functions (I have ignored
the niceties of sorting out the environments as we don't really
need it for this example) but when we try to use it, it fails even
though in the second example if we run it explicitly it succeeds:

 +.function - function(x, y) function(z) x(z) + y(z)
 sin+cos
Error in sin + cos : non-numeric argument to binary operator
 +.function(sin, cos)
function(z) x(z) + y(z)
environment: 0x01c4ae7c

By the way, rep(list(sin), 5) and rep(list(matrix(1:4, 2)), 5) will
create lists of functions and matrices although I agree that the
way it works is pretty strange.

On 10/7/06, Rolf Turner [EMAIL PROTECTED] wrote:
 Peter Dalgaard writes:

  Alberto Vieira Ferreira Monteiro [EMAIL PROTECTED] writes:
 
   Why this kind of assignment does not work?
  
 n - 1
 f - ifelse(n == 1, sin, cos)
 f(pi)
 
  It's not supposed to.
 
   'ifelse' returns a value with the same shape as 'test' which is
   filled with elements selected from either 'yes' or 'no' depending
   on whether the element of 'test' is 'TRUE' or 'FALSE'.
 
  which makes very little sense if yes and no are functions.

I think that's a debatable assertion.  Why might I
not want a vector, or rather a list, of functions?

(Mr. Monteiro's real sin is a very common one ---
using ifelse() unnecessarily/inappropriately, i.e.
when the ``test'' argument is a scalar.)

But to get back to trying to apply ifelse() with functions
for the ``yes'' and ``no'' arguments:

What's happening is really a feature of the ***rep()***
function, and the way that it treats objects of different
natures.  If you do rep(1,5) you get a vector of 5 1's.  If
you do rep(sin,5) you get an error message.

Note however that if you do

 x - list(a=17,b=42)
 junk - rep(x,5)

then junk is a list of length 5, each entry of which is
a copy of x.  I.e. rep() works as expected (?) with lists.

And if you make the assignments

 y - list(a=42,b=17)
 n - 1

and then execute

 ifelse(n==1,x,y)

you get a copy of x.  So rep does its thing with numbers (and
vectors) and lists, but not with *functions*.

What else might one try to rep?  What about matrices?  Works,
but the matrix is coerced to a vector first.  One might have
naively hoped that rep(M,5) would yield a list of length 5,
each entry of which was a copy of M.  But it doesn't; it
gives a vector of length 5*nrow(M)*ncol(M) consisting of
the data of M strung out in column order, 5 times over.

The function rep() seems to have (S3) methods associated
with it --- rep.Date, rep.factor, etc., but no generic
function.  I.e. rep() does not seem to dispatch methods.
Nor is there a rep.default().

I wrote a ``method'' for the matrix class

rep.matrix - function(x,times) {
if(!inherits(x,matrix))
stop(Argument x is not a matrix.\n)
ans - list()
ans[1:times] - list(x)
ans
}

That seemed to give the result I expected/wanted; rep(M,5)
did indeed give a list of length 5, each entry of which was a
copy of M.

However a similar ``method'' for functions did not work;
rep(sin,5) gave the same old ``object is not subsettable''
error.

But if I called rep.function() *explicitly* I got what I
wanted.  I.e.

 rep.function(sin,5)

gave a list of length 5 each entry of which was
``.Primitive(sin)''.

Questions:  How do methods for rep() get dispatched
when there is no generic rep()?

How come the matrix method that I wrote
got dispatched, but the function method didn't?

cheers,

Rolf Turner
[EMAIL PROTECTED]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Row comparisons to a new matrix?

2006-10-07 Thread Atte Tenkanen
---BeginMessage---
Thanks Gabor,

Your version is handy to use, because you can change the function as you like. 
However it isn't any faster and if you know some way to make the result matrix 
more quickly, I'm interested to learn it. My test material (musical 
improvisations) consists of samples with 2x2 or even bigger result 
matrices.

Atte

- Original Message -
From: Gabor Grothendieck [EMAIL PROTECTED]
Date: Saturday, October 7, 2006 0:28 am
Subject: Re: [R] Row comparisons to a new matrix?

 There is a generalized inner product here:
 
 http://tolstoy.newcastle.edu.au/R/help/05/04/3709.html
 
 On 10/6/06, Atte Tenkanen [EMAIL PROTECTED] wrote:
  Hi,
  Can somebody tell me, which is the fastest way to make 
 comparisons between all rows in a matrix (here A) and put the 
 results to the new symmetric matrix? I have here used cosine 
 distance as an example, but the comparison function can be any 
 other, euclidean dist etc.
 
  A=rbind(c(2,3),c(4,5),c(-1,2),c(5,6))
 
  M=matrix(nrow=length(A[,1]),ncol=length(A[,1]))
 
  for(i in 1:length(A[,1]))
  {
 for(j in 1:length(A[,1]))
 {
 M[i,j]=cosine(A[i,],A[j,])
 }
  }
 
  Atte Tenkanen
  University of Turku, Finland
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html and provide commented, minimal, self-contained, 
 reproducible code.
 
 
---End Message---
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Row comparisons to a new matrix?

2006-10-07 Thread Atte Tenkanen
Thanks Gabor,

Your version is handy to use, because you can change the function as you like. 
However it isn't any faster and if you know some way to make the result matrix 
more quickly, I'm interested to learn it. My test material (musical 
improvisations) consists of samples with 2x2 or even bigger result 
matrices.

Atte

- Original Message -
From: Gabor Grothendieck [EMAIL PROTECTED]
Date: Saturday, October 7, 2006 0:28 am
Subject: Re: [R] Row comparisons to a new matrix?

 There is a generalized inner product here:

 http://tolstoy.newcastle.edu.au/R/help/05/04/3709.html

 On 10/6/06, Atte Tenkanen [EMAIL PROTECTED] wrote:
  Hi,
  Can somebody tell me, which is the fastest way to make
 comparisons between all rows in a matrix (here A) and put the
 results to the new symmetric matrix? I have here used cosine
 distance as an example, but the comparison function can be any
 other, euclidean dist etc.
 
  A=rbind(c(2,3),c(4,5),c(-1,2),c(5,6))
 
  M=matrix(nrow=length(A[,1]),ncol=length(A[,1]))
 
  for(i in 1:length(A[,1]))
  {
 for(j in 1:length(A[,1]))
 {
 M[i,j]=cosine(A[i,],A[j,])
 }
  }
 
  Atte Tenkanen
  University of Turku, Finland
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html and provide commented, minimal, self-contained,
 reproducible code.
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] error compiling packages with r-2.4.0 on ppc mac osx 10.4.7

2006-10-07 Thread Strand, Allan Edgar

Hi All,

As the subject says, I am working on a powerPC (G4) mac with OSX version 
10.4.7.  
I upgraded from r-2.2.1 to r-2.4.0 today using the universal binary dmg file 
downloaded from CRAN. This is the complete, not mini binary.

The interactive environment works great, but I am having difficulty installing 
packages that require C code to be compiled. This includes my own packages and 
I have also tried the Matrix package. 

It looks like compilation is not the problem, linking is.  In particular, when 
linking object files, every package install dies with this message:

/usr/bin/libtool: unknown option character `m' in: -macosx_version_min

I use fink and libtool is installed at version  1.5.10.  Clearly this looks 
like a configuration problem with my machine, but I am a bit confused on where 
to go next.  Any pointers would be greatly appreciated.

cheers,
Allan Strand

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-pkgs] New versions of Matrix and lme4 packages for R-2.4.0

2006-10-07 Thread Patrick Connolly
On Tue, 03-Oct-2006 at 09:56AM -0500, Douglas Bates wrote:

| Versions 0.9975-1 of the Matrix and lme4 packages will soon be available on
| CRAN for use with R version 2.4.0 or later.

How quickly things change!!  When I got to look at CRAN, the latest
versions were 0.9975-1 of lme4 and 0.9975-2 of Matrix which I tried.  



platform   x86_64-unknown-linux-gnu
arch   x86_64
os linux-gnu
system x86_64, linux-gnu
status
major  2
minor  4.0
year   2006
month  10
day03
svn rev39566
language   R
version.string R version 2.4.0 (2006-10-03)


Matrix installs, but with a number of warnings like these:

Warning in matchSignature(signature, fdef, where) :
 in the method signature for function coerce no definition for class: 
matrix.csr
Warning in matchSignature(signature, fdef, where) :
 in the method signature for function coerce no definition for class: 
matrix.csr


When I try to install lme4, it fails with messages like these:

*Installing *source* package 'lme4' ...
** libs
gcc -I/usr/local/R-2.4.0/include -I/usr/local/R-2.4.0/include  
-I/usr/local/include-fpic  -g -O2 -std=gnu99 -c init.c -o init.o
In file included from init.c:1:
lme4_utils.h:4:20: error: Matrix.h: No such file or directory
In file included from init.c:1:
lme4_utils.h:14: error: syntax error before ��c��

My guess is that the warnings from the Matrix installation were not
irrelevant, and that the type of error would indicate to the
cognoscenti[1] what could be missing in my installation.  If the
second part of my guess is not correct, I'll gladly supply more of the
messages if the above is insufficient.  Of course, I'm open to
correction on the first part also.

TIA




1. cognoscenti:- persons who have superior knowledge and understanding
of a particular field, (http://dictionary.reference.com/search?q=cognoscenti)


-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~} Great minds discuss ideas
 _( Y )_Middle minds discuss events 
(:_~*~_:)Small minds discuss people  
 (_)-(_)   . Anon
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Installing Lindsey's packages

2006-10-07 Thread Michael Kubovy
Dear r-helpers,

I downloaded http://popgen.unimaas.nl/~jlindsey/rcode/rmutil.tar (it  
was originally .tgz, but got unzipped by my browser).

Can anyone give me detailed instructions on installing this and  
Lindsey's other packages on R version 2.4.0 (2006-10-03)---(powerpc- 
apple-darwin8.7.0, locale:  C)?
_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O.Box 400400Charlottesville, VA 22904-4400
Parcels:Room 102Gilmer Hall
 McCormick RoadCharlottesville, VA 22903
Office:B011+1-434-982-4729
Lab:B019+1-434-982-4751
Fax:+1-434-982-4766
WWW:http://www.people.virginia.edu/~mk9y/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sum of Bernoullis with varying probabilities

2006-10-07 Thread Ted Harding
Hi again.
I had suspected that doing the calculation by a convolution
method might be both straightforward and efficient in R.

I've now located convolve() (in 'base'!!), and have a solution
using this function. Details below. Original statement of
problem, and method originally proposed, repeated below for
reference.

On 10/6/06, Ted Harding [EMAIL PROTECTED] wrote:
 Hi Folks,

 Given a series of n independent Bernoulli trials with
 outcomes Yi (i=1...n) and Prob[Yi = 1] = Pi, I want

   P = Prob[sum(Yi) = r]  (r = 0,1,...,n)

 I can certainly find a way to do it:

 Let p be the vector c(P1,P2,...,Pn).
 The cases r=0 and r=n are trivial (and also are exceptions
 for the following routine).

 For a given value of r in (1:(n-1)),

   library(combinat)
   Set - (1:n)
   Subsets - combn(Set,r)
   P - 0
   for(i in (1:dim(Subsets)[2])){
 ix - numeric(n)
 ix[Subsets[,i]] - 1
 P - P + prod((p^ix) * ((1-p)^(1-ix)))
   }

The convolution method implemented in convolve computes, for
two series X[1],X[2],...,X[n] and Y[1],Y[2],...,Y[m] the
series (k = 1, 2, ... , n+m-1):

  Z[k] = sum( X[k-m+i]*Y[i] )

over valid values of i, i.e. products of terms of X matched
with terms of a shifted Y, using the open method (see
?convolve).

This is not quite what is wanted for the type of convolution
needed to compute the distribution of the sum of two integer
random variables, since one needs to reverse one of the series
being convolved. This is the basis of the following:

Given a vector p of probabilities P1, P2, P3, for Y=1 in
successive trials, I need to convolve the successive Bernoulli
distributions (1-P1,P1), (1-P2,P2), ... , (1-Pn,Pn). Hence:

  ps-cbind(1-p,p); u-ps[1,]; u1-u
  for(i in (2:16)){
u-convolve(u,rev(ps[i,]),type=o)
  }

In the case I was looking at, I had already obtained the vector
P of probabilities P(Sum(Y)=0), P(sum(Y)=1), ... by the method
previously described (repeated above); and here n=16.

Having obtained u by the above routine, I can now compare the
results, u with P:

cbind((0:16),u,P)
   r uP
   0  1.353353e-01 1.353353e-01
   1  3.007903e-01 3.007903e-01
   2  2.976007e-01 2.976007e-01
   3  1.747074e-01 1.747074e-01
   4  6.826971e-02 6.826971e-02
   5  1.884969e-02 1.884969e-02
   6  3.804371e-03 3.804371e-03
   7  5.720398e-04 5.720398e-04
   8  6.463945e-05 6.463945e-05
   9  5.489454e-06 5.489454e-06
  10  3.473997e-07 3.473997e-07
  11  1.607822e-08 1.607822e-08
  12  5.262533e-10 5.262532e-10
  13  1.148626e-11 1.148618e-11
  14  1.494650e-13 1.493761e-13
  15  9.008700e-16 8.764298e-16
  16 -1.896716e-17 1.313261e-19

so, apart from ignorable differences in the very small
probabilities for r11, they agree. I have to admit, though,
that I had to tread carefully (and experiment with Binomials)
to make sure of exactly how to introduce the reversal
(at rev(ps[i,])).

And the convolution method is *very* much faster than the
selection of all subsets method!

However, I wonder whether the subsets method may be the more
accurate of the two (not that it really matters).

Best wishes to all,
Ted.


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 07-Oct-06   Time: 22:03:27
-- XFMail --

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] merge and polylist

2006-10-07 Thread Mihai Nica
Greetings:

I would like to kindly ask for a little help. The rough code is:

#

dat=data.frame(read.delim(file=all.txt, header = TRUE, sep = \t,
quote=\, dec=.,na.strings = NA))

nc=read.shape(astae.shp, dbf.data=TRUE, verbose=TRUE)

mappolys=Map2poly(nc)

submap - subset(mappolys, nc$att.data$NAME!=Honolulu, HI)

nc$att.data=subset(nc$att.data, nc$att.data$NAME!=Honolulu, HI)

nc$att.data[,1]=as.numeric(paste(nc$att.data$MSACMSA))

#attributes(nc$att.data)

nc$att.data=merge(nc$att.data, dat, by.x=AREA, by.y=cod, all.x=TRUE,
sort=FALSE)

#attributes(nc$att.data)

tmp=file(tmp)

write.polylistShape(submap, nc$att.data, tmp)

#_

All works fine, but merge() changes the rownames and the link between the
polygons and the corresponding rows is lost. I tried numerous other
solutions (such as to paste back the old rownames), to no avail. After a
few days, here I am. Please, if you have a moment, send a tip.

 Thanks,

 mihai


-- 
Mihai Nica
170 East Griffith Street G5
Jackson, MS 39201

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] xlsReadWrite version 1.1.1 available

2006-10-07 Thread Hans-Peter
An updated version 1.1.1 of xlsReadWrite has been submitted to cran.
Currently the cran version is still 1.0, but you can download the
updated version already here: http://treetron.googlepages.com.

xlsReadWrite is a package which allows you to natively read and write
Excelfiles (windows only).

Changes from xlsReadWrite 1.0.0 to 1.1.1

- added colClasses argument (rowname, integer, double, factor,
   character and NA to determine type of data.frame column)

- NA values are no longer written into the Excelsheet.
- vector will be written as a column instead of a a row

- FIXED IMPORTANT BUG which could happen in rare cases in a RKRecord
   cell and led to numbers not divided by 100, e.g. wrong by this factor.
   (thanks to Eric Rexstad for reporting!)

- fixed bug with colNames: character vector in colNames didn't
   have an effect, I always used the Excel header row
- compiled with range checks enabled (this should have been
   done from the beginning; from now on programming errors
   surface properly as range check errors).
- minor changes in help text, README, testscripts improved...

- LAST BUT NOT LEAST: clarified license text by granting an explicit
   right to freely distribute and use this package incl. the dll.
   (my own source is still licensed as GPLv2 (same as before)).
Removed R-copyright notices in my source (had been there as
a kind of credit but can be misunderstood)).


-- 
Regards,
Hans-Peter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] unexpected behavior of boxplot(x, notch=TRUE, log=y)

2006-10-07 Thread Ben Bolker
bogdan romocea br44114 at gmail.com writes:

 
 A function I've been using for a while returned a surprising [to me,
 given the data] error recently:
Error in plot.window(xlim, ylim, log, asp, ...) :
Logarithmic axis must have positive limits
 
 After some digging I realized what was going on:
 x - c(10460.97, 10808.67, 29499.98, 1, 35818.62, 48535.59, 1, 1,
42512.1, 1627.39, 1, 7571.06, 21479.69, 25, 1, 16143.85, 12736.96,
1, 7603.63, 1, 33155.24, 1, 1, 50, 3361.78, 1, 37781.84, 1, 1,
1, 46492.05, 22334.88, 1, 1)
 summary(x)
 boxplot(x,notch=TRUE,log=y)  #unexpected
 boxplot(x)  #ok
 boxplot(x,log=y)  #ok
 boxplot(x,notch=TRUE)  #aha
 

  Mick Crawley (author of several books on ecological
data analysis in R) submitted a related issue as
bug #7690, which I was mildly surprised to see
filed as not reproducible (I didn't have problems reproducing
it at the time ... I posted my then-patch
to R-devel at the time
https://stat.ethz.ch/pipermail/r-devel/2006-January/036257.html )  
The problem typically occurs
for very small data sets, when the notches can
be bigger than the hinges.  

  As I said then,

  I can imagine debate about what should be done in this case --
 you could just say don't do that, since the notches are based
 on an asymptotic argument ... the diff below just truncates
 the notches to the hinges, but produces a warning saying that the 
 notches have been truncated.

The interaction with
log=y is new to me, though, and my old patch
didn't catch it.

   Here's my reproducible version:

set.seed(1001)
npts - 7
X - rnorm(2*npts,rep(c(3,4.5),each=npts),sd=1)
f - factor(rep(1:2,each=npts))
par(mfrow=c(1,2))
boxplot(X~f,notch=TRUE)

  A possible fix is to truncate the notches
(and issue a warning) when this happens,
in src/library/grDevices/R/calc.R:

*** calc.R  2006-10-07 17:44:49.0 -0400
--- newcalc.R   2006-10-07 19:25:38.0 -0400
***
*** 16,21 
--- 16,26 
if(any(out[nna])) stats[c(1, 5)] - range(x[!out], na.rm = TRUE)
  }
  conf - if(do.conf) stats[3] + c(-1.58, 1.58) * iqr / sqrt(n)
+ if (do.conf) {
+   if (conf[1]stats[2] || conf[2]stats[4]) warning(confidence limits 
hinges: notches truncated)
+   conf[1] - max(conf[1],stats[2])
+   conf[2] - min(conf[2],stats[4])
+ }
  list(stats = stats, n = n, conf = conf,
 out = if(do.out) x[out  nna] else numeric(0))
  }

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] can lm() automatically take in the independent variables without knowing the names in advance

2006-10-07 Thread HelponR
Hello!

I am trying to use lm to do a simple regression but on a batch of
different files.

Each file has different column names.

I know the first column is the dependent variable and all the rest are
explanatory variables.

The column names are in the first line of the file.

But it seems lm() requires I know the variable names in advance?

Is there a way to dynamically read in variable names and send to lm()?

Many many thanks!!!


Urania

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sum of Bernoullis with varying probabilities

2006-10-07 Thread Gabor Grothendieck
One can get a one-line solution by taking the product of the FFTs.
For example, let p - 1:4/8 be the probabilities.  Then the solution is:

fft(exp(rowSums(log(mvfft(t(cbind(1-p,p,0,0,0)), inverse = TRUE)/5

On 10/7/06, Ted Harding [EMAIL PROTECTED] wrote:
 Hi again.
 I had suspected that doing the calculation by a convolution
 method might be both straightforward and efficient in R.

 I've now located convolve() (in 'base'!!), and have a solution
 using this function. Details below. Original statement of
 problem, and method originally proposed, repeated below for
 reference.

 On 10/6/06, Ted Harding [EMAIL PROTECTED] wrote:
  Hi Folks,
 
  Given a series of n independent Bernoulli trials with
  outcomes Yi (i=1...n) and Prob[Yi = 1] = Pi, I want
 
P = Prob[sum(Yi) = r]  (r = 0,1,...,n)
 
  I can certainly find a way to do it:
 
  Let p be the vector c(P1,P2,...,Pn).
  The cases r=0 and r=n are trivial (and also are exceptions
  for the following routine).
 
  For a given value of r in (1:(n-1)),
 
library(combinat)
Set - (1:n)
Subsets - combn(Set,r)
P - 0
for(i in (1:dim(Subsets)[2])){
  ix - numeric(n)
  ix[Subsets[,i]] - 1
  P - P + prod((p^ix) * ((1-p)^(1-ix)))
}

 The convolution method implemented in convolve computes, for
 two series X[1],X[2],...,X[n] and Y[1],Y[2],...,Y[m] the
 series (k = 1, 2, ... , n+m-1):

  Z[k] = sum( X[k-m+i]*Y[i] )

 over valid values of i, i.e. products of terms of X matched
 with terms of a shifted Y, using the open method (see
 ?convolve).

 This is not quite what is wanted for the type of convolution
 needed to compute the distribution of the sum of two integer
 random variables, since one needs to reverse one of the series
 being convolved. This is the basis of the following:

 Given a vector p of probabilities P1, P2, P3, for Y=1 in
 successive trials, I need to convolve the successive Bernoulli
 distributions (1-P1,P1), (1-P2,P2), ... , (1-Pn,Pn). Hence:

  ps-cbind(1-p,p); u-ps[1,]; u1-u
  for(i in (2:16)){
u-convolve(u,rev(ps[i,]),type=o)
  }

 In the case I was looking at, I had already obtained the vector
 P of probabilities P(Sum(Y)=0), P(sum(Y)=1), ... by the method
 previously described (repeated above); and here n=16.

 Having obtained u by the above routine, I can now compare the
 results, u with P:

 cbind((0:16),u,P)
   r uP
   0  1.353353e-01 1.353353e-01
   1  3.007903e-01 3.007903e-01
   2  2.976007e-01 2.976007e-01
   3  1.747074e-01 1.747074e-01
   4  6.826971e-02 6.826971e-02
   5  1.884969e-02 1.884969e-02
   6  3.804371e-03 3.804371e-03
   7  5.720398e-04 5.720398e-04
   8  6.463945e-05 6.463945e-05
   9  5.489454e-06 5.489454e-06
  10  3.473997e-07 3.473997e-07
  11  1.607822e-08 1.607822e-08
  12  5.262533e-10 5.262532e-10
  13  1.148626e-11 1.148618e-11
  14  1.494650e-13 1.493761e-13
  15  9.008700e-16 8.764298e-16
  16 -1.896716e-17 1.313261e-19

 so, apart from ignorable differences in the very small
 probabilities for r11, they agree. I have to admit, though,
 that I had to tread carefully (and experiment with Binomials)
 to make sure of exactly how to introduce the reversal
 (at rev(ps[i,])).

 And the convolution method is *very* much faster than the
 selection of all subsets method!

 However, I wonder whether the subsets method may be the more
 accurate of the two (not that it really matters).

 Best wishes to all,
 Ted.

 
 E-Mail: (Ted Harding) [EMAIL PROTECTED]
 Fax-to-email: +44 (0)870 094 0861
 Date: 07-Oct-06   Time: 22:03:27
 -- XFMail --

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] can lm() automatically take in the independent variableswithout knowing the names in advance

2006-10-07 Thread Leeds, Mark \(IED\)
someone may know how to do what you asked but a get around is to not
read the header in the file and assign
your own variable names to each column.  This way you have control and
you don't worry about doing anything dynamically.

This is pretty easy if you are using read.table and probably also
possible with other types of in read in file functions.


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of HelponR
Sent: Saturday, October 07, 2006 8:18 PM
To: r-help
Subject: [R] can lm() automatically take in the independent
variableswithout knowing the names in advance

Hello!

I am trying to use lm to do a simple regression but on a batch of
different files.

Each file has different column names.

I know the first column is the dependent variable and all the rest are
explanatory variables.

The column names are in the first line of the file.

But it seems lm() requires I know the variable names in advance?

Is there a way to dynamically read in variable names and send to lm()?

Many many thanks!!!


Urania

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] can lm() automatically take in the independent variables without knowing the names in advance

2006-10-07 Thread Gabor Grothendieck
Try this:


run.lm - function(DF, response = names(DF)[1], fo = y~.) {
fo[[2]] - as.name(response)
eval(substitute(lm(fo, DF)))
}

# test
run.lm(iris)
run.lm(iris, Sepal.Width)


Another possibility is to rename the first column:

On 10/7/06, HelponR [EMAIL PROTECTED] wrote:
 Hello!

 I am trying to use lm to do a simple regression but on a batch of
 different files.

 Each file has different column names.

 I know the first column is the dependent variable and all the rest are
 explanatory variables.

 The column names are in the first line of the file.

 But it seems lm() requires I know the variable names in advance?

 Is there a way to dynamically read in variable names and send to lm()?

 Many many thanks!!!


 Urania

[[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] can lm() automatically take in the independent variables without knowing the names in advance

2006-10-07 Thread HelponR
This works beautifully! You know I did not know how many columns in advance
either. So this method is very good.

Many thanks!




On 10/7/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:

 Try this:


 run.lm - function(DF, response = names(DF)[1], fo = y~.) {
fo[[2]] - as.name(response)
eval(substitute(lm(fo, DF)))
 }

 # test
 run.lm(iris)
 run.lm(iris, Sepal.Width)


 Another possibility is to rename the first column:

 On 10/7/06, HelponR [EMAIL PROTECTED] wrote:
  Hello!
 
  I am trying to use lm to do a simple regression but on a batch of
  different files.
 
  Each file has different column names.
 
  I know the first column is the dependent variable and all the rest are
  explanatory variables.
 
  The column names are in the first line of the file.
 
  But it seems lm() requires I know the variable names in advance?
 
  Is there a way to dynamically read in variable names and send to lm()?
 
  Many many thanks!!!
 
 
  Urania
 
 [[alternative HTML version deleted]]
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Select range of dates

2006-10-07 Thread Ian Broom
Hello,

This is likely fairly silly question, and I apologize to whomever takes the
time to respond.

I am a relatively new user of R, on Windows XP, version 2.3.1.

Say I have a data table that looks like the following:

x
 Date Location  Amount  Blue Green
1  01/01/2001  Central1817  TRUE FALSE
2  01/02/2001  Central   20358 FALSE  TRUE
3  05/08/2001  Central   16245 FALSE  TRUE
4  02/02/2002  Western 112  TRUE FALSE
5  21/03/2002  Western   98756  TRUE FALSE
6  01/04/2002  Western 1598414 FALSE  TRUE
7  07/01/2001  Western1255 FALSE  TRUE
8  20/10/2003  Central   16289  TRUE FALSE
9  21/10/2003  Eastern   1 FALSE  TRUE
10 22/10/2003  Eastern   98737 FALSE  TRUE
11 23/10/2003  Eastern  198756  TRUE FALSE
12 24/10/2003  Eastern   98756 FALSE  TRUE
13 25/10/2003  Eastern   65895  TRUE FALSE
14 26/10/2003  Eastern 2142266 FALSE  TRUE
15 27/10/2003North   98756  TRUE FALSE
16 28/10/2003North  548236 FALSE  TRUE

and I want to do some summaries by Fiscal year (or FY quarter).
Reading manuals and such, I cobbled this less than satisfactory bit together
to start to build a factor representing a fiscal year split:

y-as.Date(x$Date,%d/%m/%Y)
y-as.matrix(y)
y$FY0203-ifelse((y=(as.Date(2002-03-21)))(y=(as.Date
(2003-04-01))),TRUE,FALSE)

the values seem correct, but aside from ugly, the data is not in the proper
format - with some more effort I might fix that... But my question is: is
there a more simple function available to select a range of dates? Or, at
least, a more elegant approach?

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Select range of dates

2006-10-07 Thread Gabor Grothendieck
Here are three alternative ways to get the fiscal year as a numeric
value assuming:
dd - as.Date(x$Date,%d/%m/%Y)

# add one to year if month is past March
as.numeric(format(dd, %Y)) + (format(dd, %m)  03)

# same but using POSIXlt
# (Even though there are no time zones involved I have seen
# situations where the time zone nevertheless crept in
# where one would least expect it so even though I believe
# this is correct I would nevertheless triple check
# all your results if you use this one)
with(as.POSIXlt(dd), 1900 + year + (mon  2))

# this makes use of the zoo yearqtr class
# which represents dates as year + 0, .25, .5 or .75 for
# the four quarters
library(zoo)
as.numeric(ceiling(as.yearqtr(dd)))



On 10/7/06, Ian Broom [EMAIL PROTECTED] wrote:
 Hello,

 This is likely fairly silly question, and I apologize to whomever takes the
 time to respond.

 I am a relatively new user of R, on Windows XP, version 2.3.1.

 Say I have a data table that looks like the following:

 x
 Date Location  Amount  Blue Green
 1  01/01/2001  Central1817  TRUE FALSE
 2  01/02/2001  Central   20358 FALSE  TRUE
 3  05/08/2001  Central   16245 FALSE  TRUE
 4  02/02/2002  Western 112  TRUE FALSE
 5  21/03/2002  Western   98756  TRUE FALSE
 6  01/04/2002  Western 1598414 FALSE  TRUE
 7  07/01/2001  Western1255 FALSE  TRUE
 8  20/10/2003  Central   16289  TRUE FALSE
 9  21/10/2003  Eastern   1 FALSE  TRUE
 10 22/10/2003  Eastern   98737 FALSE  TRUE
 11 23/10/2003  Eastern  198756  TRUE FALSE
 12 24/10/2003  Eastern   98756 FALSE  TRUE
 13 25/10/2003  Eastern   65895  TRUE FALSE
 14 26/10/2003  Eastern 2142266 FALSE  TRUE
 15 27/10/2003North   98756  TRUE FALSE
 16 28/10/2003North  548236 FALSE  TRUE

 and I want to do some summaries by Fiscal year (or FY quarter).
 Reading manuals and such, I cobbled this less than satisfactory bit together
 to start to build a factor representing a fiscal year split:

 y-as.Date(x$Date,%d/%m/%Y)
 y-as.matrix(y)
 y$FY0203-ifelse((y=(as.Date(2002-03-21)))(y=(as.Date
 (2003-04-01))),TRUE,FALSE)

 the values seem correct, but aside from ugly, the data is not in the proper
 format - with some more effort I might fix that... But my question is: is
 there a more simple function available to select a range of dates? Or, at
 least, a more elegant approach?

[[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.