Re: [R] Multicore computation in Windows network: How to set up Rmpi

2008-04-03 Thread Thomas Brandenburger

Samu,
Thanks you are a life saver.  Your efforts saved me a ton of time.  I went
through basically exactly the same process as you describe.  To summarize
the same issue you have I get the following phenomenon:
spawning appears to not happen the same every time.  How can you specify how
the nodes are being spawned?  Did you ever figure this out?  I see almost no
documentation anywhere on Windows deployments.

Thanks
Tom


Samu Mäntyniemi wrote:
 
 Hello!
 
 I finally got MPICH 1.06 + R 2.6.1 + Rmpi 0.5-5 working with multiple 
 computers. The key was to realize that the number of processes should be 
 one when launching Rgui using mpiexec and not the number of 
 master+slaves, as I had first wrongly understood.
 
 However, I seem to have a new problem which I have not been able to 
 figure out:
 
 After loading Rmpi, the first attempt to mpi.spawn.Rslaves() always 
 spawns the slaves on the local machine instead of on both machines. If I 
 close the slaves and spawn again, then one slave gets spawned on remote 
 machine. Each time I close and then spawn againg, the order of machines 
 is different, and eventually I get back to the situation where all 
 slaves are on the local machine. Continuing to do spawning and closing 
 seems to reveal a pattern. I can see similar behavior if I have more 
 than two machines, and it takes more spawn-close cycles to get all my 
 slave machines spawned on.
 
 Below is an example session with two machines. This pattern shows 
 everytime I start R and run this script. How to control the spawning so 
 that I get everything right at the first call of mpi.spawn.Rslaves()?
 
 Regards,
 
 Samu
 
 

-- 
View this message in context: 
http://www.nabble.com/Multicore-computation-in-Windows-network%3A-How-to-set-up-Rmpi-tp14436938p16465801.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] summary(object, test=c(Roy, Wilks, Pillai, ....) AND ellipse(object, center=....)

2008-04-03 Thread Ray Haraf
Dear All,
   
  I would be very appreciative of your help with the following
   
  1). I am running multivariate multiple regression through the manova() 
function (kindly suggested by Professor Venables) and getting two different 
answers for test=c(Wilks,Roy,Pillai) and tests=c(Wilks,Roy,'Pillai) 
as shown below. In the first case (test=c(list)) I got error message which 
probably means I can only call one test at a time. I thought I could get ride 
of this by adding s to test; in this case (tests=c(list)), I got Pillai test. 
Does this mean that Pillai would be the default test and summary(manova()) can 
only post one test at a time?
 
 summary(manova(cbind(y1, y2) ~ z1, data =
+ ex7.8),test=c(Wilks,Roy,Pillai))
Error in match.arg(test) : 'arg' must be of length 1
 summary(manova(cbind(y1, y2) ~ z1, data =
+ ex7.8),tests=c(Wilks,Roy,Pillai))
  Df  Pillai approx F num Df den Df Pr(F)  
z1 1  0.9375  15.  2  2 0.0625 .
Residuals  3
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

2). My next struggle is to construct prediction ellipse. Both ellipse() and 
ellipse.lm() are not giving me the solution to Sampling from multivariate 
multiple regression prediction regions posted by Iain Pardoe, Mon May 9 
18:43:46 2005. I am working on the same problem and
performed all the steps he suggested
 
 ex7.10 - 
+ data.frame(y1 = c(141.5, 168.9, 154.8, 146.5, 172.8, 160.1, 108.5), 
+ y2 = c(301.8, 396.1, 328.2, 307.4, 362.4, 369.5, 229.1), 
+ z1 = c(123.5, 146.1, 133.9, 128.5, 151.5, 136.2, 92), 
+ z2 = c(2.108, 9.213, 1.905, .815, 1.061, 8.603, 1.125)) 
 attach(ex7.10) 
 f.mlm - lm(cbind(y1,y2)~z1+z2) 
 y.hat - c(1, 130, 7.5) %*% coef(f.mlm) 
 round(y.hat, 2) 
 y1 y2 
[1,] 151.84 349.63 
 qf.z - t(c(1, 130, 7.5)) %*% 
+ solve(t(cbind(1,z1,z2)) %*% cbind(1,z1,z2)) %*% 
+ c(1, 130, 7.5) 
 round(qf.z, 5) 
[,1] 
[1,] 0.36995 
 n.sigma.hat - SSD(f.mlm)$SSD # same as t(resid(f.mlm)) %*% 
resid(f.mlm) 
 round(n.sigma.hat, 2) 
 y1 y2 
y1 5.80 5.22 
y2 5.22 12.57 
 F.quant - qf(.95,2,3) 
 round(F.quant, 2) 
[1] 9.55 

 
From here how could I calculate a 95% prediction ellipse for y=(y1,y2) at 
(z1,z2)=(130,7.5) using either ellipse or ellipse.lm? y1 would be the x-axis 
and y2, the y-axis. The center is different from (0,0) and I don't know what 
would be the appropriate x (the lm object). Should I
used predicted values or residuals? In both cases I have vectors which is 
different from the example given with ellipse.lm
 
3). Lastly but not the least, would be too ambitious to draw the axes (i.e, the 
eigenvalues) to  the ellipse?
 
Thanks and very kind regards,
Ray

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] summary(object, test=c(Roy, Wilks, Pillai, ....) AND ellipse(object, center=....)

2008-04-03 Thread Prof Brian Ripley
Reading the help often resolves questions, which is why we ask you to do 
so before posting.  From ?summary.manova:

Usage:

  ## S3 method for class 'manova':
  summary(object,
  test = c(Pillai, Wilks, Hotelling-Lawley, Roy),
  intercept = FALSE, ...)

Arguments:

   object: An object of class 'manova' or an 'aov' object with
   multiple responses.

 test: The name of the test statistic to be used.  Partial matching
   is used so the name can be abbreviated.

...

  The 'summary.manova' method uses a multivariate test statistic for
  the summary table.  Wilks' statistic is most popular in the
  literature, but the default Pillai-Bartlett statistic is
  recommended by Hand and Taylor (1987).

Note, not mention of 'tests' and explicitly 'a ... test'.  It is a 
standard convention that when several alternatives are given in the usage 
the first is the default, _and_ the text does say this explicitly.

(I might have followed up your next question had you given a URL link to 
the posting concerned.)


On Thu, 3 Apr 2008, Ray Haraf wrote:

  I would be very appreciative of your help with the following

  1). I am running multivariate multiple regression through the manova() 
 function (kindly suggested by Professor Venables) and getting two 
 different answers for test=c(Wilks,Roy,Pillai) and 
 tests=c(Wilks,Roy,'Pillai) as shown below. In the first case 
 (test=c(list)) I got error message which probably means I can only call 
 one test at a time. I thought I could get ride of this by adding s to 
 test; in this case (tests=c(list)), I got Pillai test. Does this mean 
 that Pillai would be the default test and summary(manova()) can only 
 post one test at a time?

 summary(manova(cbind(y1, y2) ~ z1, data =
 + ex7.8),test=c(Wilks,Roy,Pillai))
 Error in match.arg(test) : 'arg' must be of length 1
 summary(manova(cbind(y1, y2) ~ z1, data =
 + ex7.8),tests=c(Wilks,Roy,Pillai))
  Df  Pillai approx F num Df den Df Pr(F)
 z1 1  0.9375  15.  2  2 0.0625 .
 Residuals  3
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

 2). My next struggle is to construct prediction ellipse. Both ellipse() and 
 ellipse.lm() are not giving me the solution to Sampling from multivariate 
 multiple regression prediction regions posted by Iain Pardoe, Mon May 9 
 18:43:46 2005. I am working on the same problem and
 performed all the steps he suggested

 ex7.10 -
 + data.frame(y1 = c(141.5, 168.9, 154.8, 146.5, 172.8, 160.1, 108.5),
 + y2 = c(301.8, 396.1, 328.2, 307.4, 362.4, 369.5, 229.1),
 + z1 = c(123.5, 146.1, 133.9, 128.5, 151.5, 136.2, 92),
 + z2 = c(2.108, 9.213, 1.905, .815, 1.061, 8.603, 1.125))
 attach(ex7.10)
 f.mlm - lm(cbind(y1,y2)~z1+z2)
 y.hat - c(1, 130, 7.5) %*% coef(f.mlm)
 round(y.hat, 2)
 y1 y2
 [1,] 151.84 349.63
 qf.z - t(c(1, 130, 7.5)) %*%
 + solve(t(cbind(1,z1,z2)) %*% cbind(1,z1,z2)) %*%
 + c(1, 130, 7.5)
 round(qf.z, 5)
[,1]
 [1,] 0.36995
 n.sigma.hat - SSD(f.mlm)$SSD # same as t(resid(f.mlm)) %*%
 resid(f.mlm)
 round(n.sigma.hat, 2)
 y1 y2
 y1 5.80 5.22
 y2 5.22 12.57
 F.quant - qf(.95,2,3)
 round(F.quant, 2)
 [1] 9.55


 From here how could I calculate a 95% prediction ellipse for y=(y1,y2) at 
 (z1,z2)=(130,7.5) using either ellipse or ellipse.lm? y1 would be the x-axis 
 and y2, the y-axis. The center is different from (0,0) and I don't know what 
 would be the appropriate x (the lm object). Should I
 used predicted values or residuals? In both cases I have vectors which is 
 different from the example given with ellipse.lm

 3). Lastly but not the least, would be too ambitious to draw the axes (i.e, 
 the eigenvalues) to  the ellipse?

 Thanks and very kind regards,
 Ray

   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] need help with understanding stepAIC

2008-04-03 Thread ONKELINX, Thierry
Dear Balavelan,

First I would suggest to include the main effects of all variables in an
interaction. So you model with interaction should be y ~ a + b + c + d +
c:d. Which can be abbreviated to y ~ a + b + c*d.

Futhermore you should take a look at ?predict.glm and pay attention to
the newdata and type = reponse arguments.

HTH,

Thierry 




ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
[EMAIL PROTECTED] 
www.inbo.be 

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Oorspronkelijk bericht-
Van: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Namens Bala Thanigaivelan
Verzonden: donderdag 3 april 2008 4:22
Aan: r-help@r-project.org
Onderwerp: [R] need help with understanding stepAIC

Hi,



I am currently using R extensively to develop a statistical model and
currently learning its capabilities. 

I am trying to develop a logistic regression model using glm and stepAIC



I constructed my formula as y ~ a + b + c + d

Now I wish to study the interaction between variables. So I did y ~ a +
b + c:d + d



The software generated the coefficients for a, b , c:d, d



Now I want to compute the probability for various events using the
logistic regression model formula



  Prob = exp(y)/(1+exp(y))



How do I compute the value of y for the formula with interaction
variables ? what does c:d represent given a dataset ?



Regards,



Balavelan


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Thinking about using two y-scales on your plot?

2008-04-03 Thread Jim Lemon
Allen S. Rout wrote:
 ...
 I've got a series of graphs I generate in R to illustrate backup
 activity.  These are scaled, primarily, in bytes.  But different
 activities have different costs per byte, and I augment the bytes
 scale with dollars.
 
 http://docs.osg.ufl.edu/tsm/current/ext/UFEXCH-MBX01.AD.UFL.EDU-all.html
 
 So the blue series corresponds to tbe blue dollars scale (and bytes)
 and the green and red points correspond to the green scale (and
 bytes).
 
 Am I being naughty? 
 
Hi Allen,
If misleading someone is naughty, I guess we've all done that...
The difficulty you have is the visual impression that the storage 
charges are higher than the transfer charges. You are trying to squeeze 
three dimensions (daily amount of information, daily cost of transfer 
and daily cost of storage) into two. Because the values are actually 
plotted against the first dimension and the cost scales are tacked on to 
that, only in the case where both of the latter dimensions coincide 
would the visual metric correspond to the numbers. Then you wouldn't 
have a problem, of course. I would tend to make the visual comparison on 
the money dimension and add supplementary axes showing the amount of 
information for each aspect of the cost. Here's a rough example:

# estimate the first two weeks of data
ufexch-data.frame(transGb=
  c(450,700,200,550,550,545,550,550,195,180,555,555,550,550),
storGb=c(900,1000,1020,1025,1030,1035,1045,1045,
  1050,1050,1050,1055,1060,1065))
# roughly calculate dollars per day
ufexch$transdd-ufexch$transGb/2
ufexch$stordd-ufexch$storGb/
par(mar=c(5,4,4,4))
plot(c(ufexch$transdd,NA,0.2),xlim=c(1,14),ylim=c(0.2,400),col=green,
  pch=2,log=y,main=Cost of backups,ylab=Dollars/day,xlab=Date,
  axes=FALSE,type=b)
points(ufexch$stordd,col=blue,pch=1,type=b)
box()
axis(2)
axis(1,at=1:14,labels=paste(1:14,Feb,sep=/))
axis(4,at=c(100,300,500),labels=c(500,800,1000),col=green)
axis(4,at=c(0.5,0.9,1.4),labels=c(500,900,1400),col=blue)
mtext(Gigabytes,4,2)
legend(6,20,c(Storage,Transfer),pch=1:2,col=c(4,3))

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Thinking about using two y-scales on your plot?

2008-04-03 Thread Richard Cotton



thegeologician wrote:
 
 ... very often time-series plots of some values are 
 given rather to show the temporal correlation of these, than to show the 
 actual numerical values! The same applies for plots of some sample 
 values over distance (eg. element concentration over a sample or 
 investigation area). In this case one is more interested in whether some 
 values change simultaneously, than what the actual values at every point 
 are.
 
 In the mentioned plot (see link below), the temporal  evolution of the 
 mean temperature and of the precipitation over a year is the important 
 information. 
 

If temporal correlation is what you are interested in, then why not plot
that?  If you also care about the evolution of temperature and
precipitation, then these can be plotted on individual graphs, to give three
graphs in total, each with a common x-axis (time), and each showing one
variable of interest on its y-axis.  This way the problems of multiple
y-axes are avoided.


-
Regards,
Richie.

Mathematical Sciences Unit
HSL
-- 
View this message in context: 
http://www.nabble.com/Thinking-about-using-two-y-scales-on-your-plot--tp16290293p16467373.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Reloading java classes with rJava

2008-04-03 Thread Gregory Jefferis
Dear R people,

I have recently started using rJava to interact with Java code that I have
written and am finding it a very useful bridge. Thanks!

I have now run into a problem I can't figure out. If I edit and recompile my
java source code I would like to force rJava to reload the modified java
class.  However even doing:

.jinit(force.init=TRUE)

To restart the jvm doesn't seem to do this.  Is this possible without
restarting R?  Many thanks,

Greg.


 
rJava 0.5-1 from CRAN

platform   i386-apple-darwin8.10.1
arch   i386
os darwin8.10.1
system i386, darwin8.10.1
status 
major  2   
minor  6.2 
year   2008
month  02  
day08  
svn rev44383
language   R   
version.string R version 2.6.2 (2008-02-08)

-- 
Gregory Jefferis, PhD   and:
Research Fellow
Department of Zoology   St John's College
University of Cambridge Cambridge
Downing Street  CB2 1TP
Cambridge, CB2 3EJ 
United Kingdom

Lab Tel: +44 (0)1223 336683 Office: +44 (0)1223 339899
Lab Fax: +44 (0)1223 336676

[EMAIL PROTECTED]
http://www.zoo.cam.ac.uk/zoostaff/jefferis.html
http://www.neuroscience.cam.ac.uk/directory/profile.php?gsxej2
http://flybrain.stanford.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to ask for *fixed* number of distributions under parameterized Gaussian mixture model.

2008-04-03 Thread Hung-Hsuan Chen (Sean)
Dear R users:
I am wondering how to ask for *fixed* number of distributions under
parameterized Gaussian mixture model.

I know that em() and some related functions can predict the
parameterized Gaussian mixture model. However, there seems no
parameter to decide number of distributions to be mixed (if we known
the value in advance).

That is, assume I know the (mixed)data is from 3 different
distributions. The output, however, may indicate that number of
distributions that form the data is 4. How can I assign number of
distributions is 3 in advance?

Thanks a lot for your help.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] data.frame or list

2008-04-03 Thread baptiste Auguié
Dear R list,

I'm having difficulties in choosing between a list or a data.frame,  
or an array for the storage and manipulation of my data (example  
follows). I've been using the three for different purposes but I  
would rather like to know which is more adapted to what task.

Here is the data I'm currently working on: 200 observations, each  
observation being a vector of length 1000 depending on several  
factors (polarisation, pitch, dose, size)



 x - seq(1,100,length = 1000)
 observations - matrix( rnorm(200*1000), ncol = 200)

 # factors
 polarisation - rep(c(0,90), each = 100, length = 200)
 pitch - rep(1:5, length = 200)
 dose - rep(1:2, each =100, length = 200)
 size - rep(letters[1:6], each =100, length = 200)

 my.data - list(x = x,
   observations = as.data.frame(observations),
   polarisation = factor(polarisation),
   pitch = factor(pitch),
   dose = factor(dose),
   size = factor(size))



I would like to be able to manipulate the data in observations  
using the factors as opposed to column indices. For instance, I could  
plot all the observations corresponding to polarisation == 90 and  
pitch == 1, like in,

 with(my.data , matplot(x, subset((pitch == 1)  (polarisation ==  
 90), observations, type=l )))


which doesn't work, so I've had to use,

 with(my.data , matplot(x, observations[,(pitch == 1)   
 (polarisation == 90)], type=l ))


Is this a sensible way to store and manipulate this sort of data? Is  
there anything more appropriate I've overlooked using data.frames only?

Many thanks,

baptiste



_

Baptiste Auguié

Physics Department
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag
http://projects.ex.ac.uk/atto

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Overdispersion in count data

2008-04-03 Thread Michael Dewey
At 17:03 02/04/2008, Wade Wall wrote:
Hi all,

I have count data (number of flowering individuals plus total number of
individuals) across 24 sites and 3 treatments (time since last burn).
Following recommendations in the R Book, I used a glm with the model y~
burn, with y being two columns (flowering, not flowering) and burn the time
(category) since burn.  However, the residual deviance is roughly 10 times
the number of degrees of freedom, and using the quasibinomial distribution
doesn't change this.  Any suggestions as to why the quasibinomial
distribution doesn't change the residual deviance and how I should proceed.
I know that this level of residual deviance is unacceptable, but not sure is
transformations are in order.

You have received much helpful advice from Gavin and Achim and others 
but I wonder whether they are answering the quaestion in your title 
rather than in your post.

Are you doing something like
fit - glm(cbind(flower, notflower) ~ burn, family = binomial)

You might find it helpful to read the relevant section in MASS (see 
quasibinomial in the index) or in some other text.


Needless to say that I am at the outer limits of my statistical knowledge.

Thanks for any help,

Wade Wall

 [[alternative HTML version deleted]]

Michael Dewey
http://www.aghmed.fsnet.co.uk

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Lapack error in Design:::ols

2008-04-03 Thread Gad Abraham
Hi,

I'm trying to use Frank Harrell's Design:::ols function to do regression 
of y (numeric) on the interaction of two factors (x1 and x2), but Lapack 
throws an error:

  library(Design)
...
  load(url(http://www.csse.unimelb.edu.au/~gabraham/x;))
  ols(y ~ x1 * x2, data=x)
Error in chol2inv(fit$qr$qr) : 'size' cannot exceed nrow(x) = 20
  traceback()
6: .Call(La_chol2inv, x, size, PACKAGE = base)
5: chol2inv(fit$qr$qr)
4: ols(y ~ x1 * x2, data = x)
3: eval.with.vis(expr, envir, enclos)
2: eval.with.vis(ei, envir)
1: source(t.R, echo = TRUE)

Any ideas?

Thanks,
Gad

  sessionInfo()
R version 2.6.2 (2008-02-08)
x86_64-pc-linux-gnu

locale:
LC_CTYPE=en_AU.UTF-8;LC_NUMERIC=C;LC_TIME=en_AU.UTF-8;LC_COLLATE=en_AU.UTF-8;
LC_MONETARY=en_AU.UTF-8;LC_MESSAGES=en_AU.UTF-8;LC_PAPER=en_AU.UTF-8;LC_NAME=C;
LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_AU.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] splines   stats graphics  grDevices utils datasets  methods
[8] base

other attached packages:
[1] Design_2.1-1  survival_2.34 Hmisc_3.4-3

loaded via a namespace (and not attached):
[1] cluster_1.11.9  grid_2.6.2  lattice_0.17-4  rcompgen_0.1-17

-- 
Gad Abraham
Dept. CSSE and NICTA
The University of Melbourne
Parkville 3010, Victoria, Australia
email: [EMAIL PROTECTED]
web: http://www.csse.unimelb.edu.au/~gabraham

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why is text being rasterized with text()

2008-04-03 Thread Andrew Yee
Thank you for looking at my code.  You are absolutely right.  I'm
mortified that I didn't see the text() tucked inside the for loop.
Once I took it out of the loop, the text() works fine.

Thanks,
Andrew

On Thu, Apr 3, 2008 at 1:52 AM, Prof Brian Ripley [EMAIL PROTECTED] wrote:
 We haven't been told your OS or the graphics device or how you viewed PDF
 output.  But all text on screen devices is rasterized -- they are raster
 devices.  (That includes your pdf viewer.)

 You are writing the label 9 times in the same place.  I suspect what you are
 seeing is a viewer artifact of doing so -- the antialiasing isn't working
 for you.



 On Thu, 3 Apr 2008, Andrew Yee wrote:

 
 
 
  Here's a question:  I noticed that when I tried to create this simple
  graph of rectangles and text, R appears to generate text that is
  rasterized (this is seen both on the monitor and when the output is
  directed to a pdf file).  Any thoughts?
 
  value.seq - c(4,as.character(seq(from=4,to=10)),11)
  frame()
  par(usr=c(0,10,0,10) )
 
  for (r in 1:9) {
 
  rect(r,3,r+1,3.75, border = NA, col=heat.colors(9)[r])
 
  #this text does not appear to be rasterized
   text(r+0.5,2.75, value.seq[r], cex=0.5)
 
  #this text appears to be rasterized
 
   text(5.5,2.25,expression (log2), cex=0.5) }
 
  Thanks,
  Andrew
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 

 --
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Overdispersion in count data

2008-04-03 Thread Wade Wall
That is exactly how I am writing it.  Glm works fine, but as I stated the
residual deviance is much greater (10x) than the degrees of freedom.  I want
to take a look at using the negative binomial distribution, but I can't get
glm.nb to work. I get the message Error: (subscript) logical subscript too
long.  I have used traceback() and it seems to be in the glm.fitter
function, but as I say I am at the limit of my abilities here.

Wade

On Thu, Apr 3, 2008 at 7:23 AM, Michael Dewey [EMAIL PROTECTED]
wrote:

 At 17:03 02/04/2008, Wade Wall wrote:

  Hi all,
 
  I have count data (number of flowering individuals plus total number of
  individuals) across 24 sites and 3 treatments (time since last burn).
  Following recommendations in the R Book, I used a glm with the model y~
  burn, with y being two columns (flowering, not flowering) and burn the
  time
  (category) since burn.  However, the residual deviance is roughly 10
  times
  the number of degrees of freedom, and using the quasibinomial
  distribution
  doesn't change this.  Any suggestions as to why the quasibinomial
  distribution doesn't change the residual deviance and how I should
  proceed.
  I know that this level of residual deviance is unacceptable, but not
  sure is
  transformations are in order.
 

 You have received much helpful advice from Gavin and Achim and others but
 I wonder whether they are answering the quaestion in your title rather than
 in your post.

 Are you doing something like
 fit - glm(cbind(flower, notflower) ~ burn, family = binomial)

 You might find it helpful to read the relevant section in MASS (see
 quasibinomial in the index) or in some other text.


  Needless to say that I am at the outer limits of my statistical
  knowledge.
 
  Thanks for any help,
 
  Wade Wall
 
 [[alternative HTML version deleted]]
 

 Michael Dewey
 http://www.aghmed.fsnet.co.uk



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Lapack error in Design:::ols

2008-04-03 Thread Prof Brian Ripley
This is not an LAPACK error, but an R function chol2inv 
complaining that it is being misused.

The problem is that you have a vastly overparametrized model (using 
recover())

Browse[1] dim(X)
[1]  20 143

and ols() does not detect that, whereas lm() would.  I'd suggest ols() 
is to blame here.


On Thu, 3 Apr 2008, Gad Abraham wrote:

 Hi,

 I'm trying to use Frank Harrell's Design:::ols function to do regression
 of y (numeric) on the interaction of two factors (x1 and x2), but Lapack
 throws an error:

  library(Design)
 ...
  load(url(http://www.csse.unimelb.edu.au/~gabraham/x;))
  ols(y ~ x1 * x2, data=x)
 Error in chol2inv(fit$qr$qr) : 'size' cannot exceed nrow(x) = 20
  traceback()
 6: .Call(La_chol2inv, x, size, PACKAGE = base)
 5: chol2inv(fit$qr$qr)
 4: ols(y ~ x1 * x2, data = x)
 3: eval.with.vis(expr, envir, enclos)
 2: eval.with.vis(ei, envir)
 1: source(t.R, echo = TRUE)

 Any ideas?

 Thanks,
 Gad

  sessionInfo()
 R version 2.6.2 (2008-02-08)
 x86_64-pc-linux-gnu

 locale:
 LC_CTYPE=en_AU.UTF-8;LC_NUMERIC=C;LC_TIME=en_AU.UTF-8;LC_COLLATE=en_AU.UTF-8;
 LC_MONETARY=en_AU.UTF-8;LC_MESSAGES=en_AU.UTF-8;LC_PAPER=en_AU.UTF-8;LC_NAME=C;
 LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_AU.UTF-8;LC_IDENTIFICATION=C

 attached base packages:
 [1] splines   stats graphics  grDevices utils datasets  methods
 [8] base

 other attached packages:
 [1] Design_2.1-1  survival_2.34 Hmisc_3.4-3

 loaded via a namespace (and not attached):
 [1] cluster_1.11.9  grid_2.6.2  lattice_0.17-4  rcompgen_0.1-17

 --
 Gad Abraham
 Dept. CSSE and NICTA
 The University of Melbourne
 Parkville 3010, Victoria, Australia
 email: [EMAIL PROTECTED]
 web: http://www.csse.unimelb.edu.au/~gabraham

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Overdispersion in count data

2008-04-03 Thread Michael Dewey
At 12:54 03/04/2008, Wade Wall wrote:
That is exactly how I am writing it.  Glm works fine, but as I 
stated the residual deviance is much greater (10x) than the degrees 
of freedom.  I want to take a look at using the negative binomial 
distribution, but I can't get glm.nb to work. I get the message 
Error: (subscript) logical subscript too long.  I have used 
traceback() and it seems to be in the glm.fitter function, but as I 
say I am at the limit of my abilities here.

For Poisson models and for the negative binomial you have a single 
outcome, a count.
For the binomial you can have two columns of counts of successes and 
failures (there are other ways of arranging your data).

I think you might want to try the beta-binomial which is available I 
think in aod.

However I still think reading the relevant section of MASS first 
would be a good idea (or some equivalent text).


Wade

On Thu, Apr 3, 2008 at 7:23 AM, Michael Dewey 
mailto:[EMAIL PROTECTED][EMAIL PROTECTED] wrote:
At 17:03 02/04/2008, Wade Wall wrote:
Hi all,

I have count data (number of flowering individuals plus total number of
individuals) across 24 sites and 3 treatments (time since last burn).
Following recommendations in the R Book, I used a glm with the model y~
burn, with y being two columns (flowering, not flowering) and burn the time
(category) since burn.  However, the residual deviance is roughly 10 times
the number of degrees of freedom, and using the quasibinomial distribution
doesn't change this.  Any suggestions as to why the quasibinomial
distribution doesn't change the residual deviance and how I should proceed.
I know that this level of residual deviance is unacceptable, but not sure is
transformations are in order.


You have received much helpful advice from Gavin and Achim and 
others but I wonder whether they are answering the quaestion in your 
title rather than in your post.

Are you doing something like
fit - glm(cbind(flower, notflower) ~ burn, family = binomial)

You might find it helpful to read the relevant section in MASS (see 
quasibinomial in the index) or in some other text.


Needless to say that I am at the outer limits of my statistical knowledge.

Thanks for any help,

Wade Wall

[[alternative HTML version deleted]]


Michael Dewey
http://www.aghmed.fsnet.co.ukhttp://www.aghmed.fsnet.co.uk


Michael Dewey
http://www.aghmed.fsnet.co.uk

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Overdispersion in count data

2008-04-03 Thread Prof Brian Ripley
On Thu, 3 Apr 2008, Wade Wall wrote:

 That is exactly how I am writing it.  Glm works fine, but as I stated the
 residual deviance is much greater (10x) than the degrees of freedom.  I want
 to take a look at using the negative binomial distribution, but I can't get
 glm.nb to work. I get the message Error: (subscript) logical subscript too
 long.  I have used traceback() and it seems to be in the glm.fitter
 function, but as I say I am at the limit of my abilities here.

But we can only help you debug this if we have a reproducible example.


 Wade

 On Thu, Apr 3, 2008 at 7:23 AM, Michael Dewey [EMAIL PROTECTED]
 wrote:

 At 17:03 02/04/2008, Wade Wall wrote:

 Hi all,

 I have count data (number of flowering individuals plus total number of
 individuals) across 24 sites and 3 treatments (time since last burn).
 Following recommendations in the R Book, I used a glm with the model y~
 burn, with y being two columns (flowering, not flowering) and burn the
 time
 (category) since burn.  However, the residual deviance is roughly 10
 times
 the number of degrees of freedom, and using the quasibinomial
 distribution
 doesn't change this.  Any suggestions as to why the quasibinomial
 distribution doesn't change the residual deviance and how I should
 proceed.
 I know that this level of residual deviance is unacceptable, but not
 sure is
 transformations are in order.


 You have received much helpful advice from Gavin and Achim and others but
 I wonder whether they are answering the quaestion in your title rather than
 in your post.

 Are you doing something like
 fit - glm(cbind(flower, notflower) ~ burn, family = binomial)

 You might find it helpful to read the relevant section in MASS (see
 quasibinomial in the index) or in some other text.


  Needless to say that I am at the outer limits of my statistical
 knowledge.

 Thanks for any help,

 Wade Wall

[[alternative HTML version deleted]]


 Michael Dewey
 http://www.aghmed.fsnet.co.uk



   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Overdispersion in count data

2008-04-03 Thread Gavin Simpson
On Thu, 2008-04-03 at 01:24 +, David Winsemius wrote:
 Wade Wall [EMAIL PROTECTED] wrote in
 news:[EMAIL PROTECTED]: 
 
  Thanks for the recommendations, insights.  I tried using glm.nb, but
  it didn't seem to like my data.  I received the message (subscript)
  logical subscript too long.  I am using the same dataframe as my
  previous glm.  Do you know if I need to put the data in a different
  format? 
 
 I was wondering about your data layout. You said you had the flower/no-
 flower data in two different columns. That is not the way I usually 
 offer data to glm(). I would have imagined that log(burn_time) would 
 have been an offset. It might help if you at least offered the audience 
 a sample of ten rows, the results of str() for the data.frame, and the 
 call to the glm function.
 

David, Wade,

You can supply a two-column matrix to glm() with families
(quasi)binomial as the two columns represent successes and failures
respectively; see ?glm and Details section.

However this is not possible with a (quasi)poisson family or with
glm.nb, which is one reason why Wade might have been getting the error.
My fault here - I didn't grep exactly what Wade had written. Even if
this isn't causing the error, Wade won't be fitting the model he thought
he was with a two column response in glm.nb.

In a similar vein to David's suggestion, could one not use offset but on
the total number of flowers sampled in each location, e.g.
offset(log(totalPlants)) in the formula where totalPlants is the
variable containing the total number of plants encountered. You need to
include this in the formula for glm.nb as that function does not have an
argument 'offset'. Wade's y is then just a vector of counts of flowering
individuals. In this way one would account for differences in the number
of flowers encountered between sites.

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Overdispersion in count data

2008-04-03 Thread Wade Wall
Alright, I feel stupid now.  That was the problem.  For glm you can use both
successes and failures, while with the negative binomial it is simply a
count.  That is why I was getting the subscript too long message.  I
understand generalized linear models, but I haven't worked with negative
binomial distribution models, so I will read up on it before I try to use
it.  RTFM, right?

I will read about using betabinomial for overdispersed data.

Thanks for your help,

Wade

On Thu, Apr 3, 2008 at 8:30 AM, Michael Dewey [EMAIL PROTECTED]
wrote:

 At 12:54 03/04/2008, Wade Wall wrote:

  That is exactly how I am writing it.  Glm works fine, but as I stated
  the residual deviance is much greater (10x) than the degrees of freedom.  I
  want to take a look at using the negative binomial distribution, but I can't
  get glm.nb to work. I get the message Error: (subscript) logical subscript
  too long.  I have used traceback() and it seems to be in the glm.fitter
  function, but as I say I am at the limit of my abilities here.
 

 For Poisson models and for the negative binomial you have a single
 outcome, a count.
 For the binomial you can have two columns of counts of successes and
 failures (there are other ways of arranging your data).

 I think you might want to try the beta-binomial which is available I think
 in aod.

 However I still think reading the relevant section of MASS first would be
 a good idea (or some equivalent text).


  Wade
 
  On Thu, Apr 3, 2008 at 7:23 AM, Michael Dewey mailto:
  [EMAIL PROTECTED][EMAIL PROTECTED] wrote:
  At 17:03 02/04/2008, Wade Wall wrote:
  Hi all,
 
  I have count data (number of flowering individuals plus total number of
  individuals) across 24 sites and 3 treatments (time since last burn).
  Following recommendations in the R Book, I used a glm with the model y~
  burn, with y being two columns (flowering, not flowering) and burn the
  time
  (category) since burn.  However, the residual deviance is roughly 10
  times
  the number of degrees of freedom, and using the quasibinomial
  distribution
  doesn't change this.  Any suggestions as to why the quasibinomial
  distribution doesn't change the residual deviance and how I should
  proceed.
  I know that this level of residual deviance is unacceptable, but not
  sure is
  transformations are in order.
 
 
  You have received much helpful advice from Gavin and Achim and others
  but I wonder whether they are answering the quaestion in your title rather
  than in your post.
 
  Are you doing something like
  fit - glm(cbind(flower, notflower) ~ burn, family = binomial)
 
  You might find it helpful to read the relevant section in MASS (see
  quasibinomial in the index) or in some other text.
 
 
  Needless to say that I am at the outer limits of my statistical
  knowledge.
 
  Thanks for any help,
 
  Wade Wall
 
[[alternative HTML version deleted]]
 
 
  Michael Dewey
  http://www.aghmed.fsnet.co.ukhttp://www.aghmed.fsnet.co.uk
 
 
 Michael Dewey
 http://www.aghmed.fsnet.co.uk



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sqldf file specification, non-ASCII

2008-04-03 Thread Peter Jepsen
Dear R-Listers,

I am a Windows user (R 2.6.2) using the development version of sqldf to
try to read a 3GB file originally stored in .sas7bdat-format. I convert
it to comma-delimited ASCII format with StatTransfer before trying to
import just the rows I need into R. The problem is that I get this
error:

 f - file(hugedata.csv)
 DF - sqldf(select * from f where C_OPR like 'KKA2%',
file.format=list(header=T, row.names=F))
Error in try({ : 
  RS-DBI driver: (RS_sqlite_import: hugedata.csv line 1562740 expected
52 columns of data but found 19)
Error in sqliteExecStatement(con, statement, bind.data) : 
  RS-DBI driver: (error in statement: no such table: f)

Now, I know that my SAS-using colleagues are able to use this file with
SAS, so I was wondering whether StatTransfer'ing it to the SAS XPORT
format which can be read with the 'read.xport' function in the 'foreign'
package would be a better approach. The problem is, I don't know
how/whether I can do that at all with sqldf. I tried various ways like
f - file(read.xport(hugedata.xport))
but I consistently got an error message from the sqldf command. I don't
recall the exact error message, unfortunately, but can anybody tell me
whether it is at all possible to read in files in non-ASCII format
without having to put them in R memory?

Thank you for your assistance.
Peter.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] merge with rownames?

2008-04-03 Thread Chuck Cleland
On 4/3/2008 10:47 AM, Mark Kimpel wrote:
 Can merge be tricked into merging via rownames as opposed to via contents of
 a particular column? I have two data.frames with overlapping, but out of
 order, rownames, but no column contents in common and would like to merge
 without cbinding the rownames to the data.frames.
 Mark

   The details section of the help page for merge suggests you can use 
either merge(..., by=0) or merge(..., by=row.names).

?merge

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sqldf file specification, non-ASCII

2008-04-03 Thread Duncan Murdoch
On 4/3/2008 10:22 AM, Peter Jepsen wrote:
 Dear R-Listers,
 
 I am a Windows user (R 2.6.2) using the development version of sqldf to
 try to read a 3GB file originally stored in .sas7bdat-format. I convert
 it to comma-delimited ASCII format with StatTransfer before trying to
 import just the rows I need into R. The problem is that I get this
 error:
 
 f - file(hugedata.csv)
 DF - sqldf(select * from f where C_OPR like 'KKA2%',
 file.format=list(header=T, row.names=F))
 Error in try({ : 
   RS-DBI driver: (RS_sqlite_import: hugedata.csv line 1562740 expected
 52 columns of data but found 19)
 Error in sqliteExecStatement(con, statement, bind.data) : 
   RS-DBI driver: (error in statement: no such table: f)

That error message looks pretty clear:  there's a problem on line 
1562740.  Can you look at that line and spot what the problem is?  If 
you don't have a text editor that can handle big files, you should be 
able to do it with something like this:

f - file(hugedata.csv, r)

skip - 1562739
while (skip  1) {
   junk - readLines(f, 1)
   skip - skip - 1
}
junk - readLines(f, skip)
readLines(f, 1)



 
 Now, I know that my SAS-using colleagues are able to use this file with
 SAS, so I was wondering whether StatTransfer'ing it to the SAS XPORT
 format which can be read with the 'read.xport' function in the 'foreign'
 package would be a better approach. 

R can usually read CSV files without a problem.  You've likely got a 
problem in your file on that line; you just need to figure out what it 
is, and fix it.  (It's possible the sqldf function has a bug, but I'd 
suspect the file, first.)

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] merge with rownames?

2008-04-03 Thread Mark Kimpel
Can merge be tricked into merging via rownames as opposed to via contents of
a particular column? I have two data.frames with overlapping, but out of
order, rownames, but no column contents in common and would like to merge
without cbinding the rownames to the data.frames.
Mark

-- 
Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 663-0513 Home (no voice mail please)

**

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] help with R semantics

2008-04-03 Thread Charles Annis, P.E.
Greetings:

 

I'm running R2.6.2 on a WinXP DELL box with 2 gig RAM.

 

I have created a new glm link function to be used with family = binomial.
The function works (although any suggested improvements would be welcome),

 

logit.FC - function(POD.floor = 0, POD.ceiling =1)

{   if (POD.floor  0 | POD.floor  1) stop (POD.floor must be between zero
and one.)

if (POD.ceiling  0 | POD.ceiling  1) stop (POD.ceiling must be
between zero and one.)

if (POD.ceiling - POD.floor  difference.criterion) stop
(paste(POD.ceiling-POD.floor difference must be greater than
,difference.criterion, to discourage answer-shopping., sep=))

linkfun - function(mu) {

mu - qlogis( (mu - POD.floor)/(POD.ceiling - POD.floor) )

}

linkinv - function(eta) {

eta - POD.floor + (POD.ceiling - POD.floor)*plogis(eta)

}

mu.eta - function(eta) {

(POD.ceiling - POD.floor)*dlogis(eta)# derivitaive of mu with respect to
eta

}

valideta - function(eta) TRUE

link - paste(logit.FC(, POD.floor, ,, POD.ceiling, ), sep = )

structure(list(linkfun = linkfun, linkinv = linkinv, mu.eta = mu.eta, 

valideta = valideta, name = link), class = link-glm)

}

 

as is evidenced by the call, binomial(logit.FC(0,1))

 

My semantics problem is that I have the name logit.FC(0,1), constructed
elsewhere using paste(), and I need to remove the quotations in order to use
it, since binomial(logit.FC(0,1)) will fail.

 

I know I am missing something obvious and would appreciate any help.

 

Thanks.

 

Charles Annis, P.E.

 mailto:[EMAIL PROTECTED]
[EMAIL PROTECTED]
phone: 561-352-9699
eFax:  614-455-3265
 http://www.StatisticalEngineering.com
http://www.StatisticalEngineering.com

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] L-BFGS-B needs finite values of 'fn'

2008-04-03 Thread Ravi Varadhan
Paul,

After looking at your objective function and the penalty, I realized that
since the contribution of each term to the objective function decreases
geometrically, the later terms contribute relatively less to the overall
maximum.  Hence the numerical estimation of those terms is much less precise
compared to the dominant terms.  This explains why the ratios of x[k]/x[k-1]
is not close to b, for small values of b.  

Now let us try a larger value of b, say, b=0.8.

Here are the results from optim, with BFGS.

  k - 1
  b - 0.8
 
  f - function(x, pen, k, b) {
+  n - length(x)
+  r - sum((b^(0:(n-1)))*log(x)) - pen*(sum(x)-k)^2
+  return(r)
+  }
 
  gr - function(x, pen, k, b) {
+  n - length(x)
+  r - (b^(0:(n-1)))*(1/x) - 2*pen*(sum(x)-k)
+  return(r)
+  }
 
 
 nvar - 10
 p0 - runif(nvar, 0, 20)
 sols - optim(p0, f, gr, method=BFGS, control=list(fnscale=-1), k=k,
b=b, pen=1)
 sum(sols$par)
[1] 1
 sols$par[2:nvar] / sols$par[1:(nvar-1)]
[1] 0.8038902 0.8021216 0.7974323 0.7999681 0.7983918 0.8006497 0.8013940
[8] 0.8044702 0.7879365
 


As you can see, things are much better!

Hope this helps,
Ravi.

---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Ravi Varadhan
Sent: Wednesday, April 02, 2008 2:03 PM
To: 'Paul Smith'; 'R-help'
Subject: Re: [R] L-BFGS-B needs finite values of 'fn'

Yes, that is very important.  If you look at the ratios x[k]/x[k-1], they
are very close to 0.3 for the first few components, and then they start
slowly diverging (ratio becomes smaller than 0.3) from that. 

So, optim is indeed finding a correct solution to the problem that you
actually posed.  You could increase your penalty to get a solution that is
closer to the analytical solution you are expecting.

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Paul Smith
Sent: Wednesday, April 02, 2008 1:57 PM
To: R-help
Subject: Re: [R] L-BFGS-B needs finite values of 'fn'

But let me add the following: the part

- 200*(sum(x)-k)^2

of my function is a penalty. In truth, I want to maximize

sum((b^(0:(n-1)))*log(x))

s.t.

sum(x) = k.

Paul


On Wed, Apr 2, 2008 at 6:48 PM, Paul Smith [EMAIL PROTECTED] wrote:
 Thanks, Ravi. The analytical solution, (x_1,x_2,...,x_10), should
  satisfy this equality:

  x_t / x_(t-1) = 0.3.

  Unfortunately, the procedure that you suggest does not lead to a
  solution that satisfies such an equality.

  Paul





  On Wed, Apr 2, 2008 at 5:12 PM, Ravi Varadhan [EMAIL PROTECTED] wrote:
   Paul,
  
Have you tried using BFGS without bounds?
  
sols - optim(rep(20,nvar), f, gr, method=BFGS,
control=list(fnscale=-1))
  
This converges to a solution, although I don't know if the converged
solution is what you want.
  
Ravi.
  
  

---
  
Ravi Varadhan, Ph.D.
  
Assistant Professor, The Center on Aging and Health
  
Division of Geriatric Medicine and Gerontology
  
Johns Hopkins University
  
Ph: (410) 502-2619
  
Fax: (410) 614-9625
  
Email: [EMAIL PROTECTED]
  
Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
  
  
  
  


  
  
  
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On
Behalf Of Paul Smith
  
   Sent: Monday, March 31, 2008 2:25 PM
To: R-help
  
  
   Subject: Re: [R] L-BFGS-B needs finite values of 'fn'
  
On Mon, Mar 31, 2008 at 2:57 PM, Zaihra T [EMAIL PROTECTED] wrote:
 try something like  this before wrapping up your function else i
guess
u'll
 have to stick to Prof Brian Ripley  suggestion his suggestions are
usually
 best bet .

  f - function(x) {

  n - length(x)

 r - sum((b^(0:(n-1)))*log(x)) - 200*(sum(x)-k)^2
  if(!is.finite(r))

 r-1e+20 return(r)

 }

 have a nice day.





Re: [R] help with R semantics

2008-04-03 Thread Ken Knoblauch
Hi,
Charles Annis, P.E. Charles.Annis at StatisticalEngineering.com writes:
 logit.FC - function(POD.floor = 0, POD.ceiling =1)
 {   if (POD.floor  0 | POD.floor  1) stop (POD.floor must be between zero
 and one.) 
 if (POD.ceiling  0 | POD.ceiling  1) stop (POD.ceiling must be
 between zero and one.)
 if (POD.ceiling - POD.floor  difference.criterion) stop
 (paste(POD.ceiling-POD.floor difference must be greater than
 ,difference.criterion, to discourage answer-shopping., sep=))
 linkfun - function(mu) {
 mu - qlogis( (mu - POD.floor)/(POD.ceiling - POD.floor) )
 }
 linkinv - function(eta) {
 eta - POD.floor + (POD.ceiling - POD.floor)*plogis(eta)
 }
 mu.eta - function(eta) {
 (POD.ceiling - POD.floor)*dlogis(eta)# derivitaive of mu with respect to
 eta
 }
 valideta - function(eta) TRUE
 link - paste(logit.FC(, POD.floor, ,, POD.ceiling, ), sep = )
 structure(list(linkfun = linkfun, linkinv = linkinv, mu.eta = mu.eta, 
 valideta = valideta, name = link), class = link-glm)
 }
 
 Thanks.
 
 Charles Annis, P.E.
 
  mailto:Charles.Annis at StatisticalEngineering.com
 Charles.Annis at StatisticalEngineering.com
 phone: 561-352-9699
 eFax:  614-455-3265
  http://www.StatisticalEngineering.com
 http://www.StatisticalEngineering.com
 
   [[alternative HTML version deleted]]
 
I defined the following link in the psyphy package that might
be along similar lines
 probit.lambda
function (m = 2, lambda = 0) 
{
m - as.integer(m)
if (m  2) 
stop(m must be an integer  1)
linkfun - function(mu) {
mu - pmax(mu, 1/m + .Machine$double.eps)
mu - pmin(mu, 1 - lambda)
qnorm((m * mu - 1)/(m * (1 - lambda) - 1))
}
linkinv - function(eta) {
1/m + ((m - 1)/m - lambda) * pnorm(eta)
}
mu.eta - function(eta) ((m - 1)/m - lambda) * dnorm(eta)
valideta - function(eta) TRUE
link - paste(probit.lambda(, m, ,, lambda, ), sep = )
structure(list(linkfun = linkfun, linkinv = linkinv, mu.eta = mu.eta, 
valideta = valideta, name = link), class = link-glm)
}
environment: namespace:psyphy

HTH,

ken

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Design package lrm summary and factors

2008-04-03 Thread sj
Hello, I have question regarding the lrm function and estimating the odds
ratio between different levels of a factored variable.
The following code example illustrates the problem I am having. I have a
data set with an outcome variable (0,1) and an input variable (A,B,C). I
would like to estimate the effect of C vs B, but when I perform the summary
I only get A vs B and A vs C, even though I specify that I want to see C vs
B.  Am  I  missing something, any help is appreciated.



outcome-sample(rep(c(0,1),1),1000)
input - sample(rep(c(A,B,C),1),1000)
dset -data.frame(cbind(outcome,input))
dd - datadist(dset)
options(datadist='dd')

mod - lrm(outcome ~ input,data=dset,x=TRUE,y=TRUE)
summary(mod,est.all=FALSE,input=c(C,B))

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Question on spacing around plot and box in lattice

2008-04-03 Thread Saptarshi Guha
Hello,
How can i increase the padding between the axis and the data  
region(box just containing the figure) in xyplot?

An example:
new - function(x){ if(x0){ return(x^2) }else{ return(x) } }
x - seq(-1,1,length.out=100)
y - sapply(x,new)
sc=list()
sc$alternating=0
sc$tck=0

xyplot(y~x,type='l',
   aspect=0.05,
   scales=sc,
   col='black',
   xlab='',
   ylab=''
)
When plotted on quartz(Mac) device, there is very little padding  
between the bottom of the curve and the lower horizontal axis and  
between the maximum values of the curve and the upper horizontal axis.
How can i increase the padding  - top,bottom,left and right?
Thanks in advance for your time.
Regards
Saptarshi

Saptarshi Guha | [EMAIL PROTECTED] | http://www.stat.purdue.edu/~sguha


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-help Digest, Vol 62, Issue 2

2008-04-03 Thread jarod_v6
Dear all.

Thanks in advance for any help.

I need to analisy a long list of data time/r\fluoscence

I want to make one plot with standard deviation and the average data... Could 
you suggest me how to make that..
pippo.csv

0   0
4.013   62.96
6.053   112.3
24.14   106.2
...

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sqldf file specification, non-ASCII

2008-04-03 Thread Peter Jepsen
Thank you for your help, Duncan and Gabor. Yes, I found an early line
feed in line 1562740, so I have corrected that error. The thing is, it
takes me many, many hours to save the file, so I would like to confirm
that there are no more errors further down the file. The ffe tool sounds
like a perfect tool for this job, but it doesn't seem to be available
for Windows. Is anybody out there aware of a similar Windows tool?

Thank you again for your help.
Peter.

-Original Message-
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
Sent: 3. april 2008 17:08
To: Peter Jepsen
Subject: Re: [R] sqldf file specification, non-ASCII

One other thing you could try would be to run it through
ffe (fast file extractor) which is a free utility that you can
find via google.  Use the ffe's loose argument.  It can find
bad lines and since its not dependent on R would give
you and independent check.  Regards.

On Thu, Apr 3, 2008 at 10:36 AM, Gabor Grothendieck
[EMAIL PROTECTED] wrote:
 Hi, Can you try it with the first 100 lines, say, of the data and
 also try reading it with read.csv to double check your arguments
 (note that sql args are similar but not entirely identical to
read.csv)
 and if it still gives this error send me that 100 line file and I will
 look at it tonight or tomorrow.  Regards.


 On Thu, Apr 3, 2008 at 10:22 AM, Peter Jepsen [EMAIL PROTECTED] wrote:
  Dear R-Listers,
 
  I am a Windows user (R 2.6.2) using the development version of sqldf
to
  try to read a 3GB file originally stored in .sas7bdat-format. I
convert
  it to comma-delimited ASCII format with StatTransfer before trying
to
  import just the rows I need into R. The problem is that I get this
  error:
 
   f - file(hugedata.csv)
   DF - sqldf(select * from f where C_OPR like 'KKA2%',
  file.format=list(header=T, row.names=F))
  Error in try({ :
   RS-DBI driver: (RS_sqlite_import: hugedata.csv line 1562740
expected
  52 columns of data but found 19)
  Error in sqliteExecStatement(con, statement, bind.data) :
   RS-DBI driver: (error in statement: no such table: f)
 
  Now, I know that my SAS-using colleagues are able to use this file
with
  SAS, so I was wondering whether StatTransfer'ing it to the SAS XPORT
  format which can be read with the 'read.xport' function in the
'foreign'
  package would be a better approach. The problem is, I don't know
  how/whether I can do that at all with sqldf. I tried various ways
like
  f - file(read.xport(hugedata.xport))
  but I consistently got an error message from the sqldf command. I
don't
  recall the exact error message, unfortunately, but can anybody tell
me
  whether it is at all possible to read in files in non-ASCII format
  without having to put them in R memory?
 
  Thank you for your assistance.
  Peter.
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sqldf file specification, non-ASCII

2008-04-03 Thread Gabor Grothendieck
The Windows version is on sourceforge.

On Thu, Apr 3, 2008 at 1:29 PM, Peter Jepsen [EMAIL PROTECTED] wrote:
 Thank you for your help, Duncan and Gabor. Yes, I found an early line
 feed in line 1562740, so I have corrected that error. The thing is, it
 takes me many, many hours to save the file, so I would like to confirm
 that there are no more errors further down the file. The ffe tool sounds
 like a perfect tool for this job, but it doesn't seem to be available
 for Windows. Is anybody out there aware of a similar Windows tool?

 Thank you again for your help.
 Peter.


 -Original Message-
 From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
 Sent: 3. april 2008 17:08
 To: Peter Jepsen
 Subject: Re: [R] sqldf file specification, non-ASCII

 One other thing you could try would be to run it through
 ffe (fast file extractor) which is a free utility that you can
 find via google.  Use the ffe's loose argument.  It can find
 bad lines and since its not dependent on R would give
 you and independent check.  Regards.

 On Thu, Apr 3, 2008 at 10:36 AM, Gabor Grothendieck
 [EMAIL PROTECTED] wrote:
  Hi, Can you try it with the first 100 lines, say, of the data and
  also try reading it with read.csv to double check your arguments
  (note that sql args are similar but not entirely identical to
 read.csv)
  and if it still gives this error send me that 100 line file and I will
  look at it tonight or tomorrow.  Regards.
 
 
  On Thu, Apr 3, 2008 at 10:22 AM, Peter Jepsen [EMAIL PROTECTED] wrote:
   Dear R-Listers,
  
   I am a Windows user (R 2.6.2) using the development version of sqldf
 to
   try to read a 3GB file originally stored in .sas7bdat-format. I
 convert
   it to comma-delimited ASCII format with StatTransfer before trying
 to
   import just the rows I need into R. The problem is that I get this
   error:
  
f - file(hugedata.csv)
DF - sqldf(select * from f where C_OPR like 'KKA2%',
   file.format=list(header=T, row.names=F))
   Error in try({ :
RS-DBI driver: (RS_sqlite_import: hugedata.csv line 1562740
 expected
   52 columns of data but found 19)
   Error in sqliteExecStatement(con, statement, bind.data) :
RS-DBI driver: (error in statement: no such table: f)
  
   Now, I know that my SAS-using colleagues are able to use this file
 with
   SAS, so I was wondering whether StatTransfer'ing it to the SAS XPORT
   format which can be read with the 'read.xport' function in the
 'foreign'
   package would be a better approach. The problem is, I don't know
   how/whether I can do that at all with sqldf. I tried various ways
 like
   f - file(read.xport(hugedata.xport))
   but I consistently got an error message from the sqldf command. I
 don't
   recall the exact error message, unfortunately, but can anybody tell
 me
   whether it is at all possible to read in files in non-ASCII format
   without having to put them in R memory?
  
   Thank you for your assistance.
   Peter.
  
   __
   R-help@r-project.org mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained, reproducible code.
  
 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] prettyR 25% quartile, 75% quartile

2008-04-03 Thread stephen sefick
I am using the describe function in prettyR.  I would like to add the
25% 75% quartiles to the summary table

how do I do this

I have tried

describe(x.f, num.desc=c(mean, median, sd, min, max,
skewness, quantile(x.f, na.rm=T, probs=seq(0.25, 0.75)),
valid.n))

help

-- 
Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonlinear equation

2008-04-03 Thread Berend Hasselman


Robert Mcfadden wrote:
 
 Dear R Users,
 I'm trying to find function that allow me to solve one nonlinear equation.
 Functions that I found are good for optimization problems.
 
 

How about 

?uniroot


-- 
View this message in context: 
http://www.nabble.com/Nonlinear-equation-tp16452868p16468039.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] iterative loop with user input?

2008-04-03 Thread Christopher Marcum
Hello R-Users,

I would like to use an iterative loop to collect user input from within a
function. I'm sure that this would be some combination of for,break,
and next but have not been able to get the syntax down.

I would like to print some text to the screen at each step in the loop,
ask the user for input which is saved in an object, and then advance the
loop.

Here is an example:

#anchor is a file with unique ids
anchor-rep(1:30)
anchor-paste(anchor,uid,sep=)

#codelist is where I would like to store user input
codelist-NULL

for(i in 1:30) {
#tell the user which ID is being coded
print(paste(You are coding unique ID,anchor[i],sep=:   ))
#Read a line from a text file:
print(readLines(file=file_with_30_lines.txt,warn=F)[i])
#Ask the user for input
codelist[i]-readline(paste(Select one of the following: \n \t please
enter 1, 2, or 3: \n Enter Your Response HERE:   ,sep=))
}

Ideally, loop should work from inside a function.

Any tips? Thanks in advance for your time and patience.

Best,
Chris Marcum
Sociology/Cal(it)^2
University of California-Irvine

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] What to do with this data?

2008-04-03 Thread Lucke, Joseph F
First compute side-by-side boxplots for the two data sets.  You will see
that the PG group has one (189), maybe 2 (also, 52) extreme values
whereas the PG group has none.  The PG group will have a smaller median
than the PB group.  Means, st devs, and se's are legitimate statistics
but do not have the usual (normal theory) interpretation, at least until
you can account for or eliminate the extreme values. 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of mika03
Sent: Thursday, April 03, 2008 2:16 PM
To: r-help@r-project.org
Subject: [R] What to do with this data?



Hello,

This is not necessarily a question about R, but more about how we should
display our data in general. (Will we then use R to do that, once we
know what to do ;-) I received good replies about such things in the
past on this mailing list so I give it a go.

Here's what we did:
We showed a fairly large number of subjects search engine queries and
different possible search engine responses. We assumed that users would
like some our responses better than others and wanted to check this.
Subjects could rate a query/response pair on a scale from 0 (very bad
response) to 10 (very good response).

Here are all the judgments we received for one particular class of
response to queries which we thought users would like:

Predicted-Good-0, 4
Predicted-Good-1, 1
Predicted-Good-2, 11
Predicted-Good-3, 8
Predicted-Good-4, 25
Predicted-Good-5, 12
Predicted-Good-6, 21
Predicted-Good-7, 25
Predicted-Good-8, 30
Predicted-Good-9, 52
Predicted-Good-10, 189

And here are all the judgments we received for one particular class of
response to queries which we thought users would NOT like:

Predicted-Bad-0, 34
Predicted-Bad-1, 23
Predicted-Bad-2, 45
Predicted-Bad-3, 60
Predicted-Bad-4, 42
Predicted-Bad-5, 50
Predicted-Bad-6, 21
Predicted-Bad-7, 20
Predicted-Bad-8, 25
Predicted-Bad-9, 19
Predicted-Bad-10, 39 

Here's a small table listing number of observations, mean, standard
deviation and standard error:

Type, N, Mean, StDev, StErr
Predicted-Good, 378, 8.21693121693122, 2.47110906286224,
0.12710013550711 Predicted-Bad, 378, 4.5978835978836, 3.02059872953413,
0.155362834286119

The question we have are:

a) It doesn't seem like our data follows a standard distribution.
Therefore is it okay to calculate mean, standard deviation and standard
error at all?

b) We initially created a figure plotting the mean and a bar around it
indicating standard deviation. Then somebody who knows more about
statistics told us we should display the mean and error bars around it
to depict a 95% Confidence Interval, mean +/- 1.96*SE. But if we are
doing this, aren't we forgetting to mention vital parts of our data,
that is that we indeed get better means for Good responses, but that
the individual data points are all over the place (especially for
Predicted-Bad)? We would capture this by showing standard deviation. 

c) And finally: What would be the best way to present this data anyway?


Thanks a lot!



--
View this message in context:
http://www.nabble.com/What-to-do-with-this-data--tp16467948p16467948.htm
l
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bandwidth estimation using bw.SJ

2008-04-03 Thread D.Pleydell


Majnu John wrote:
 
 When I use bw.SJ (based on Sheather  Jones, 1991) in R to estimate
 the bandwidth for a highly skewed data, I get the following message:
 sample is too sparse to find TD. I played around with the parameters
 such as no. of bins (nb), lower, upper (range over which to minimize) to
 no avail. My sample size is 50,000.
 
 Can anyone tell me what this means and of some way to get around
 this. Any help is very much appreciated.
 

I also get the same error message. It seems to be a function of sample size.
The following example runs fine with fewer random numbers.

rdat - rnorm(11)
hist(rdat,breaks=100)
rdpi -  density(rdat, width=SJ-dpi, n=ceiling(diff(range(rdat))*100))
Erreur dans bw.SJ(x, method = dpi) : sample is too sparse to find TD

Clearly density() doesn't like being turned up to 11. 
I didn't find a solution in R, but the follow package for python seems to
handle sample sizes at least two orders of magnitude greater than density()
in  R :-p
http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/python/Statistics/manual/index.xhtml

Good luck
Dave



-- 
View this message in context: 
http://www.nabble.com/bandwidth-estimation-using-bw.SJ-tp16284293p16470097.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] L-BFGS-B needs finite values of 'fn'

2008-04-03 Thread Paul Smith
Thanks, Ravi, for your insight. My initial experiments were conducted
with b = 0.7, and my procedure (similar to yours) worked reasonably
well. Then I wanted to check the quality of my procedure with lower
b's, and the weak result motivated my initial post here.

I have meanwhile got a reasonably accurate solution with R through the
DEoptim package. The number of iterations was 5 and the code the
following:

library(DEoptim)

k - 1
b - 0.3

f - function(x) {

 n - length(x)

 if (sum(x)  k)
   r - Inf
 else r - -sum((b^(0:(n-1)))*log(x))

 return(r)

}

nvar - 10
(sols - DEoptim(f, lower=rep(0.1,nvar), upper=rep(k,nvar),
control=list(itermax=5)))
summary(sols)

The result was:

* summary of DEoptim object *
best member   :  6999.892 2099.949 630.0255 189.0694 56.85699 16.95149
5.12811 1.527 0.46193 0.13798
best value:  -11.91101
after :  50001 iterations
FUN evaluated :  2500050 times
*

I am still struggling to find a fast way of getting the solution to my
problem at least as accurate as the one returned by DEoptim. With so
many iterations, DEoptim takes longer than 15 minutes to complete the
job.

Paul



On Thu, Apr 3, 2008 at 4:38 PM, Ravi Varadhan [EMAIL PROTECTED] wrote:
 Paul,

  After looking at your objective function and the penalty, I realized that
  since the contribution of each term to the objective function decreases
  geometrically, the later terms contribute relatively less to the overall
  maximum.  Hence the numerical estimation of those terms is much less precise
  compared to the dominant terms.  This explains why the ratios of x[k]/x[k-1]
  is not close to b, for small values of b.

  Now let us try a larger value of b, say, b=0.8.

  Here are the results from optim, with BFGS.

k - 1
b - 0.8
  
f - function(x, pen, k, b) {
  +  n - length(x)
  +  r - sum((b^(0:(n-1)))*log(x)) - pen*(sum(x)-k)^2
  +  return(r)
  +  }
  
gr - function(x, pen, k, b) {
  +  n - length(x)
  +  r - (b^(0:(n-1)))*(1/x) - 2*pen*(sum(x)-k)
  +  return(r)
  +  }
  
  
   nvar - 10
   p0 - runif(nvar, 0, 20)
   sols - optim(p0, f, gr, method=BFGS, control=list(fnscale=-1), k=k,
  b=b, pen=1)
   sum(sols$par)
  [1] 1
   sols$par[2:nvar] / sols$par[1:(nvar-1)]
  [1] 0.8038902 0.8021216 0.7974323 0.7999681 0.7983918 0.8006497 0.8013940
  [8] 0.8044702 0.7879365
  
  

  As you can see, things are much better!

  Hope this helps,

 Ravi.
  
  ---

  Ravi Varadhan, Ph.D.

  Assistant Professor, The Center on Aging and Health

  Division of Geriatric Medicine and Gerontology

  Johns Hopkins University

  Ph: (410) 502-2619

  Fax: (410) 614-9625

  Email: [EMAIL PROTECTED]

  Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html



  
  

  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On


 Behalf Of Ravi Varadhan
  Sent: Wednesday, April 02, 2008 2:03 PM
  To: 'Paul Smith'; 'R-help'
  Subject: Re: [R] L-BFGS-B needs finite values of 'fn'

  Yes, that is very important.  If you look at the ratios x[k]/x[k-1], they
  are very close to 0.3 for the first few components, and then they start
  slowly diverging (ratio becomes smaller than 0.3) from that.

  So, optim is indeed finding a correct solution to the problem that you
  actually posed.  You could increase your penalty to get a solution that is
  closer to the analytical solution you are expecting.

  Ravi.

  
  ---

  Ravi Varadhan, Ph.D.

  Assistant Professor, The Center on Aging and Health

  Division of Geriatric Medicine and Gerontology

  Johns Hopkins University

  Ph: (410) 502-2619

  Fax: (410) 614-9625

  Email: [EMAIL PROTECTED]

  Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html



  
  


  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
  Behalf Of Paul Smith
  Sent: Wednesday, April 02, 2008 1:57 PM
  To: R-help
  Subject: Re: [R] L-BFGS-B needs finite values of 'fn'

  But let me add the following: the part

  - 200*(sum(x)-k)^2

  of my function is a penalty. In truth, I want to maximize

  sum((b^(0:(n-1)))*log(x))

  s.t.

  sum(x) = k.

  Paul


  On Wed, Apr 2, 2008 at 6:48 PM, Paul Smith [EMAIL PROTECTED] wrote:
   Thanks, Ravi. The analytical solution, (x_1,x_2,...,x_10), should
satisfy this equality:
  
x_t / x_(t-1) = 0.3.
  
Unfortunately, the procedure that you suggest does not lead to a
solution that satisfies such an equality.
  
Paul
  
  
  
  
  
On Wed, Apr 2, 2008 at 5:12 PM, Ravi Varadhan [EMAIL PROTECTED] wrote:
 Paul,

  Have you tried using BFGS 

[R] coding for categorical variables with unequal observations

2008-04-03 Thread Tanya Yatsunenko
Hi,
I am doing multiple regression, and have several X variables that are 
categorical.
I read that I can use dummy or contrast codes for that, but are there 
any special rules when there're unequal #observations in each groups (4 
females vs 7 males in a gender variable)?
Also, can R generate these codes for me?
THanks.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] convert data frame values

2008-04-03 Thread Adrian Johnson
Hello:
How can I interchange symbols for numeric values in a data frame.

 test
  f  s  t
1 a  1 -1
2 b  1 -3
3 c -1  1

say I have test d.f . I want to make flip number that are positive to
negative and negative to positive only for numerics in column 's'

my desired result:

 new.test
  f  s  t
1 a  -1 -1
2 b  -1 -3
3 c  1  1


thanks
Adrian

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] coding for categorical variables with unequal observations

2008-04-03 Thread Nordlund, Dan (DSHS/RDA)
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Tanya Yatsunenko
 Sent: Thursday, April 03, 2008 1:55 PM
 To: r-help@r-project.org
 Subject: [R] coding for categorical variables with unequal 
 observations
 
 Hi,
 I am doing multiple regression, and have several X variables that are 
 categorical.
 I read that I can use dummy or contrast codes for that, but are there 
 any special rules when there're unequal #observations in each 
 groups (4 
 females vs 7 males in a gender variable)?
 Also, can R generate these codes for me?
 THanks.
 

You don't need to do anything special, and yes you can just let SAS do it for 
you.  For most of the regression PROCs you can put your categorical variables 
in a CLASS statement.  Depending on which procedure you are using, you may be 
able to specify whether you want effects or dummy coding, and which level of 
the categorical variable should be the comparison level.  It is also possible 
to use PROC GLMMOD to create your design variables to be fed into other PROCs.  
Other approaches are possible as well.

If you provide more detail on what analyses you plan to undertake, someone may 
be able to provide more specific advice.

Hope this is helpful,

Dan

Daniel J. Nordlund
Research and Data Analysis
Washington State Department of Social and Health Services
Olympia, WA  98504-5204
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] coding for categorical variables with unequal observations

2008-04-03 Thread Nordlund, Dan (DSHS/RDA)
Ignore the previous post.  I completely spaced out on which mailing list I was 
reading and thought this was a SAS question.  My apologies, I'll just crawl 
back into my hole. :-)

Obviously not helpful,

Dan

Daniel J. Nordlund
Research and Data Analysis
Washington State Department of Social and Health Services
Olympia, WA  98504-5204
 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Nordlund, 
 Dan (DSHS/RDA)
 Sent: Thursday, April 03, 2008 2:45 PM
 To: [EMAIL PROTECTED]; r-help@r-project.org
 Subject: Re: [R] coding for categorical variables with 
 unequal observations
 
  -Original Message-
  From: [EMAIL PROTECTED] 
  [mailto:[EMAIL PROTECTED] On Behalf Of Tanya Yatsunenko
  Sent: Thursday, April 03, 2008 1:55 PM
  To: r-help@r-project.org
  Subject: [R] coding for categorical variables with unequal 
  observations
  
  Hi,
  I am doing multiple regression, and have several X 
 variables that are 
  categorical.
  I read that I can use dummy or contrast codes for that, but 
 are there 
  any special rules when there're unequal #observations in each 
  groups (4 
  females vs 7 males in a gender variable)?
  Also, can R generate these codes for me?
  THanks.
  
 
 You don't need to do anything special, and yes you can just 
 let SAS do it for you.  For most of the regression PROCs you 
 can put your categorical variables in a CLASS statement.  
 Depending on which procedure you are using, you may be able 
 to specify whether you want effects or dummy coding, and 
 which level of the categorical variable should be the 
 comparison level.  It is also possible to use PROC GLMMOD 
 to create your design variables to be fed into other PROCs.  
 Other approaches are possible as well.
 
 If you provide more detail on what analyses you plan to 
 undertake, someone may be able to provide more specific advice.
 
 Hope this is helpful,
 
 Dan
 
 Daniel J. Nordlund
 Research and Data Analysis
 Washington State Department of Social and Health Services
 Olympia, WA  98504-5204
  
  
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] iterative loop with user input?

2008-04-03 Thread jim holtman
Is this what you want to do:

 x.f - function(){
+ x - list()
+ for (i in 1:10){
+ z - readline(paste(Parameter , i, : , sep=''))
+ # just store the input in a list with the same key
+ x[[z]] - z
+ }
+ x
+ }
 x.f()
Parameter 1: 12
Parameter 2: 2
Parameter 3: asdf
Parameter 4: 33
Parameter 5: scs
Parameter 6: sdaf2
Parameter 7: 234
Parameter 8: gfd
Parameter 9: 123
Parameter 10: aa
$`12`
[1] 12

$`2`
[1] 2

$asdf
[1] asdf



On Thu, Apr 3, 2008 at 1:18 PM, Christopher Marcum [EMAIL PROTECTED] wrote:
 Hello R-Users,

 I would like to use an iterative loop to collect user input from within a
 function. I'm sure that this would be some combination of for,break,
 and next but have not been able to get the syntax down.

 I would like to print some text to the screen at each step in the loop,
 ask the user for input which is saved in an object, and then advance the
 loop.

 Here is an example:

 #anchor is a file with unique ids
 anchor-rep(1:30)
 anchor-paste(anchor,uid,sep=)

 #codelist is where I would like to store user input
 codelist-NULL

 for(i in 1:30) {
 #tell the user which ID is being coded
 print(paste(You are coding unique ID,anchor[i],sep=:   ))
 #Read a line from a text file:
 print(readLines(file=file_with_30_lines.txt,warn=F)[i])
 #Ask the user for input
 codelist[i]-readline(paste(Select one of the following: \n \t please
 enter 1, 2, or 3: \n Enter Your Response HERE:   ,sep=))
 }

 Ideally, loop should work from inside a function.

 Any tips? Thanks in advance for your time and patience.

 Best,
 Chris Marcum
 Sociology/Cal(it)^2
 University of California-Irvine

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] convert data frame values

2008-04-03 Thread Peter Alspach
Adrian

If test[,2] is numeric then

test[,2] - -test[,2]

should be all you need.  If it isn't numeric you'll need to convert it
first; e.g.,

test[,2] - -as.numeric(as.character(test[,2]))

which can, of course, be converted back to the original class.

HTH ...

Peter Alspach


 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Adrian Johnson
 Sent: Friday, 4 April 2008 10:58 a.m.
 To: r-help
 Subject: [R] convert data frame values
 
 Hello:
 How can I interchange symbols for numeric values in a data frame.
 
  test
   f  s  t
 1 a  1 -1
 2 b  1 -3
 3 c -1  1
 
 say I have test d.f . I want to make flip number that are 
 positive to negative and negative to positive only for 
 numerics in column 's'
 
 my desired result:
 
  new.test
   f  s  t
 1 a  -1 -1
 2 b  -1 -3
 3 c  1  1
 
 
 thanks
 Adrian
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

The contents of this e-mail are privileged and/or confidential to the named
 recipient and are not to be used by any other person and/or organisation.
 If you have received this e-mail in error, please notify the sender and delete
 all material pertaining to this e-mail.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to ask for *fixed* number of distributions under parameterized Gaussian mixture model.

2008-04-03 Thread Wensui Liu
Hi, Chen,
I don't know how you are doing it.
however, per my limited knowledge, it is easy with flexmix package.



On Thu, Apr 3, 2008 at 6:26 AM, Hung-Hsuan Chen (Sean)
[EMAIL PROTECTED] wrote:
 Dear R users:
  I am wondering how to ask for *fixed* number of distributions under
  parameterized Gaussian mixture model.

  I know that em() and some related functions can predict the
  parameterized Gaussian mixture model. However, there seems no
  parameter to decide number of distributions to be mixed (if we known
  the value in advance).

  That is, assume I know the (mixed)data is from 3 different
  distributions. The output, however, may indicate that number of
  distributions that form the data is 4. How can I assign number of
  distributions is 3 in advance?

  Thanks a lot for your help.

  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.




-- 
===
WenSui Liu
ChoicePoint Precision Marketing
Phone: 678-893-9457
Email : [EMAIL PROTECTED]
Blog : statcompute.spaces.live.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] newbie subset question

2008-04-03 Thread Farley, Robert
I want to look at all the records making up a row and column of a
crosstab, but I'm not getting it right.  I'm trying to use subset() but
my selection ((prod_act==other) || (attr_act==other)) gives my no
records.  See the second table below.  Getting just the row does work,
as seen in the third table.

 

Why is this failing me? 

 

 

 .Table - xtabs(~prod_act+attr_act, data=Dataset) ; .Table ;
remove(.Table)

attr_act

prod_act dk/rf  home other school  shop soc/rec university  work

  dk/rf 52 0 0  0 0   0  0 0

  home   0 12380  2826552  12041054   2709 15842

  other  0 0   703  162  48 42   227

  school 0 0 7187 4  11  0 8

  shop   0 029  1   466  17  730

  soc/rec0 039  122 191  840

  university 0 067  040  38247   119

  work   0 0   215  495  76107  1581

 D2 - subset(Dataset, ((prod_act==other) || (attr_act==other))  ,
select=direction:attr_mta_taz )

 .Table - xtabs(~prod_act+attr_act, data=D2)  ; .Table ;
remove(.Table)

attr_act

prod_act dk/rf home other school shop soc/rec university work

  dk/rf  00 0  00   0  00

  home   00 0  00   0  00

  other  00 0  00   0  00

  school 00 0  00   0  00

  shop   00 0  00   0  00

  soc/rec00 0  00   0  00

  university 00 0  00   0  00

  work   00 0  00   0  00

 D3 - subset(Dataset, (prod_act==other)   ,
select=direction:attr_mta_taz )

 .Table - xtabs(~prod_act+attr_act, data=D3)  ; .Table ;
remove(.Table)

attr_act

prod_act dk/rf home other school shop soc/rec university work

  dk/rf  00 0  00   0  00

  home   00 0  00   0  00

  other  00   703  1   62  48 42  227

  school 00 0  00   0  00

  shop   00 0  00   0  00

  soc/rec00 0  00   0  00

  university 00 0  00   0  00

  work   00 0  00   0  00

 

 

 

Robert Farley

Metro

1 Gateway Plaza

Mail Stop 99-23-7

Los Angeles, CA 90012-2952

Voice: (213)922-2532

Fax:(213)922-2868

www.Metro.net 

 

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] coding for categorical variables with unequal observations

2008-04-03 Thread Tanya Yatsunenko
Also, since I just started to use R, I have trouble generating and
understanding some of the codes, especially choosing the correct ones.
Thanks!
tanya

On Thu, Apr 3, 2008 at 3:54 PM, Tanya Yatsunenko [EMAIL PROTECTED] wrote:

 Hi,
 I am doing multiple regression, and have several X variables that are
 categorical.
 I read that I can use dummy or contrast codes for that, but are there any
 special rules when there're unequal #observations in each groups (4 females
 vs 7 males in a gender variable)?
 Also, can R generate these codes for me?
 THanks.




-- 
Tanya

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newbie subset question

2008-04-03 Thread jim holtman
Use | instead of ||.

On Thu, Apr 3, 2008 at 6:27 PM, Farley, Robert [EMAIL PROTECTED] wrote:
 I want to look at all the records making up a row and column of a
 crosstab, but I'm not getting it right.  I'm trying to use subset() but
 my selection ((prod_act==other) || (attr_act==other)) gives my no
 records.  See the second table below.  Getting just the row does work,
 as seen in the third table.



 Why is this failing me?





  .Table - xtabs(~prod_act+attr_act, data=Dataset) ; .Table ;
 remove(.Table)

attr_act

 prod_act dk/rf  home other school  shop soc/rec university  work

  dk/rf 52 0 0  0 0   0  0 0

  home   0 12380  2826552  12041054   2709 15842

  other  0 0   703  162  48 42   227

  school 0 0 7187 4  11  0 8

  shop   0 029  1   466  17  730

  soc/rec0 039  122 191  840

  university 0 067  040  38247   119

  work   0 0   215  495  76107  1581

  D2 - subset(Dataset, ((prod_act==other) || (attr_act==other))  ,
 select=direction:attr_mta_taz )

  .Table - xtabs(~prod_act+attr_act, data=D2)  ; .Table ;
 remove(.Table)

attr_act

 prod_act dk/rf home other school shop soc/rec university work

  dk/rf  00 0  00   0  00

  home   00 0  00   0  00

  other  00 0  00   0  00

  school 00 0  00   0  00

  shop   00 0  00   0  00

  soc/rec00 0  00   0  00

  university 00 0  00   0  00

  work   00 0  00   0  00

  D3 - subset(Dataset, (prod_act==other)   ,
 select=direction:attr_mta_taz )

  .Table - xtabs(~prod_act+attr_act, data=D3)  ; .Table ;
 remove(.Table)

attr_act

 prod_act dk/rf home other school shop soc/rec university work

  dk/rf  00 0  00   0  00

  home   00 0  00   0  00

  other  00   703  1   62  48 42  227

  school 00 0  00   0  00

  shop   00 0  00   0  00

  soc/rec00 0  00   0  00

  university 00 0  00   0  00

  work   00 0  00   0  00

 





 Robert Farley

 Metro

 1 Gateway Plaza

 Mail Stop 99-23-7

 Los Angeles, CA 90012-2952

 Voice: (213)922-2532

 Fax:(213)922-2868

 www.Metro.net






[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] ggplot2 - version 0.6

2008-04-03 Thread hadley wickham
ggplot2 

ggplot2 is a plotting system for R, based on the grammar of graphics,
which tries to take the good parts of base and lattice graphics and
avoid bad parts. It takes care of many of the fiddly details
that make plotting a hassle (like drawing legends) as well as
providing a powerful model of graphics that makes it easy to produce
complex multi-layered graphics.

Find out more at http://had.co.nz/ggplot2, and check out the over 400
examples of ggplot in use.

ggplot 0.6


The two big changes in this release are improved documentation and legends:

 * all major ggplot2 components now have their own built in
documentation, so that (e.g.) ?stat_smooth or ?geom_point now give you
useful information

 * the legend code is now considerably more sophisticated and will
attempt to merge together legends for the same variable

 * also, legends are drawn based on the geoms used (instead of the
scales used, as previously) so should match the plot much better (e.g.
for geom_smooth, geom_boxplot, geom_vline, geom_abline,
geom_pointrange).

These features are new, so there are likely to be a few bugs that I
haven't discovered.  Please me know if you do find any.

Other additions and corrections

  * coord_equal: should now work correctly in all situations
  * coord_polar: add start and direction parameters, giving more
control over the layout of the polar coords
  * coord_polar: added my favourite pie chart example
  * coord_trans now deals with groups appropriately, at the cost of
decreased speed
  * geom_abline, geom_vline, geom_hline: should now behave better in a
wider variety of settings
  * geom_boxplot: deals with continuous x-axis and grouping much better
  * geom_boxplot: now has it's own legend which actually looks like a boxplot
  * geom_boxplot: reports if missing values removed
  * geom_crossbar: the middle line is now display thicker than the
other lines, controlled by the parameter fatten (thanks to Heike
Hofmann for the suggestion)
  * geom_density: fix scale adjustment bug in geom_density
  * geom_line, geom_text: all size measurements (now lines and text as
well) are measured in mm, lines/paths default to paths 0.5mm wide
  * geom_rug: new to add marginal rug plots
  * geom_smooth: added example showing how to use geom_smooth with
your own models
  * geom_smooth: fixed bug where if se=FALSE x axis always includes 0
  * geom_vline, geom_hline: yet another rewrite which should make them
more powerful and less error prone.
  * ggsave reports width and height of saved image
  * position_stack: fixed bug when data was empty
  * qplot: allow qplot to use computed aesthetics too
  * scale_continuous: tweaks to minor breaks to make appearance better
on wider range of coordinate systems
  * scale_discrete: all discrete scales now have labels argument which
you can use to override the factor levels
  * scale_discrete: now works correctly with character vectors
  * scale_size: changed default range to [0.5, 3] to better reflect
new sizing decisions
  * scale_size: legends resize to avoid overlaps
  * scale_x_continuous, scale_y_continuous: new convenience functions
xlim and ylim (and zlim) that make it even easier to adjust the limits
of the x, y, and z axes
  * stat_bin, geom_area: fixed bug in combination of stat_bin and
geom_area that made it difficult to draw frequency polygons
  * stat_bin: fixed bug which resulted in increased counts when the x
axis was a categorical variable with a single level (thanks to Bob
Muenchen for pointing this out!)
  * stat_bin: no longer incorrectly warns that binwidth is unspecified
when breaks are set
  * stat_bin: now takes origin argument to manually specify origin of
first bin (default is round_any(min(range), bin_width, floor))
  * stat_boxplot, stat_contour, stat_density2d, stat_qq, stat_density:
na.rm parameter added to the following statistics (thanks to Leena
Choi for suggesting this)
  * stat_function: new, makes it easy to superimpose a function on the plot
  * stat_qq: axes flipped to agree with base R
  * stat_qq: now uses sample aesthetic to select variable for summary
  * stat_quantile: updated to work with latest version of quantreg
  * stat_spoke: new, to make it possible to use geom_segment
parameterised by angle and radius (thanks to Jiho for the suggestion)
  * stat_summary: better documentation
  * stat_summary: convenient auto wrapping of simple summary functions

Miscellaneous changes:

  * it's now easy to change the default scales (and their arguments)
with the set_default_scale function, see ?set_default_scale for more
details (thanks to Bob Muenchen for the suggestion)
  * new order aesthetic which controls the order in which elements are plotted
  * min and max are now scaled the same way as y
  * functions are silently dropped (e.g. aes(colour=col))
  * scales do not try and map variables that don't exist (fixes some
rather obscure bugs)
  * 

[R] markov switching model

2008-04-03 Thread harsu Gie
excuse me, i need your help. introduce my name is  harsugi, i have just 
graduated from my study at Gadjah mada
University, Indonesia. I have several Question about using R.
1. I  try to solve  markov switcing model with R, but i dont know if there any 
packages discuss about it, can you tell me if any?
2. when i read some papers, i saw that most of them use markov chain montecarlo 
methods. i don't know how to implement this procedure in R. Could you tell me 
how? and what package must i use?

please send me some responses/ answers to this e-mail. i will be very glad to 
receive and read your answers.


thank you 

Harsugi

   
-
[[elided Yahoo spam]]
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] looking for a CDF of bivariate noncentral Chisquare

2008-04-03 Thread Rossita Mohamad Yunus
Hi,

 

I would like to know if there is a program written in R to get the CDF
(cumulative distribution function) of a bivariate non-central chi-square
distribution.

 

Hope someone will reply.

 

Thank you,

Rossita M Yunus

[EMAIL PROTECTED]

 


This email (including any attached files) is confidentia...{{dropped:19}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] coding for categorical variables with unequal observations

2008-04-03 Thread Prof Brian Ripley
I think you need to do some background reading.  R does this automatically 
for you, and offers many options for how it is done.  (That you have asked 
suggests you have missed that.)

The account by Bill Venables in chapter 6 of MASS (the book, see the R 
FAQ) is regarded as the most comprehensive available.

On Thu, 3 Apr 2008, Tanya Yatsunenko wrote:

 Also, since I just started to use R, I have trouble generating and
 understanding some of the codes, especially choosing the correct ones.
 Thanks!
 tanya

 On Thu, Apr 3, 2008 at 3:54 PM, Tanya Yatsunenko [EMAIL PROTECTED] wrote:

 Hi,
 I am doing multiple regression, and have several X variables that are
 categorical.
 I read that I can use dummy or contrast codes for that, but are there any
 special rules when there're unequal #observations in each groups (4 females
 vs 7 males in a gender variable)?
 Also, can R generate these codes for me?
 THanks.




 --
 Tanya

   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.