Re: [R] Dataset on Baltimore home energy costs

2006-10-21 Thread Richard Graham
 elect: Total cost of electricity (including delivery and commodity
 charges)

With Maryland's BGE/PSC/State Legislature rate control, does anybody
_really_ know what the total cost is?  That may be what you paid then,
but was that the total cost?

The cost data may be difficult to compare with non rate controlled
utility cost data.

Richard Graham

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] binom.test [Broadcast]

2006-10-21 Thread Liaw, Andy
To quote one of the previous answers you've got: The formula you're
using is the TV.  The one binom.test() uses is the ballpark.  Take your
pick.

Andy 

From: Ethan Johnsons
 
 Thank you for the info.  It helps.
 
 After all, it would be:
 
  0.1304348-1.96*(sqrt((0.1304348*(1-0.1304348))/46))
 [1] 0.03310968
  0.1304348+1.96*(sqrt((0.1304348*(1-0.1304348))/46))
 [1] 0.2277599
 
 Does R have a function for the calculation above?
 
 ej
 
 
 On 10/20/06, Francisco J. Zagmutt [EMAIL PROTECTED] wrote:
  Ethan,
 
  You need to explain why you think this is not the right 
 function to 
  use. R is doing exactly what you are asking it to do.  Now 
 is up to 
  you to choose the methodology you feel is correct.
  For a good discussion on your particular issue I recommend you the 
  following
  reference:
 
  A. Agresti and B. A. Coull, Approximate is better than exact for 
  interval estimation of binomial proportions, The American 
 Statistician, vol. 52, no.
  2, pp. 119-126, 1998.
 
  Once you figure out the right function to use see if the 
 function is
  available in R.   If not readily available, and if after 
 searching through
  R's documentation and the forum archives you still can't 
 find a way to 
  perform the calculation, then is time to get back to this forum.
 
  Regards,
 
  Francisco
 
 
  Dr. Francisco J. Zagmutt
  College of Veterinary Medicine and Biomedical Sciences 
 Colorado State 
  University
 
 
 
 
  From: Ethan Johnsons [EMAIL PROTECTED]
  To: r-help@stat.math.ethz.ch
  Subject: [R] binom.test
  Date: Fri, 20 Oct 2006 17:18:02 -0400
  
  A quick question, please.
  
  46 e coli lab samples are tested,  6 of them returned positive.
  
  So, the best point estimate for p is  6/46 = 0.1304348.
  
  For a 95% CI for p,  I thought binom.test would give me the correct
  result, but it seems it is not the right function to use.   What is
  the R function for this?
  
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 


--
Notice:  This e-mail message, together with any attachments,...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plot.mca in MASS

2006-10-21 Thread Marco LO
Hello All,
   
  Is it possible to plot each factor in a different color on the multiple 
correspondence plot?
   
  What is the implication and statistical meaning behind the coordinates for 
the rows on the plot?  Could I just plot a symbol instead of a number for the 
coordinates?
   
  Thanks!
  marco
   

 __



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plot.POSIXct plot.POSIXlt

2006-10-21 Thread Patrick Giraudoux
Hi,

Just to signal that when I want to plot POSIXct variable on x using 
format within plot(), I get what I want on the plot but with a number of 
warnings:

  plot(y~x,format=%y-%m)
Warning messages:
1: format is not a graphical parameter in: plot.window(xlim, ylim, 
log, asp, ...)
2: format is not a graphical parameter in: plot.xy(xy, type, pch, lty, 
col, bg, cex, lwd, ...)
3: format is not a graphical parameter in: axis(side, at, labels, 
tick, line, pos, outer, font, lty, lwd, 
4: format is not a graphical parameter in: box(which = which, lty = 
lty, ...)
5: format is not a graphical parameter in: title(main, sub, xlab, 
ylab, line, outer, ...)

I suppose that format may not be at the right place in plot() or/and not 
handled by the functions called from there, however the documentation 
(?plot.POSIXct) seems to allow passing this argument:

 ...: Further arguments to be passed from or to other methods,
  typically graphical parameters or arguments of
  'plot.default'.  For the 'plot' methods, also 'format'.

Any comment?

Patrick

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] one is not one

2006-10-21 Thread Patrick Giraudoux
Folks,

I have got a strange behaviour when testing this:

sum(x) != 1

let us set

x-c(70,134,1,5,0)

and transform it in a vector of probabilities

x-x/sum(x)

One expect  sum(x) should be equal to 1, which is apparently the case

 sum(x)
[1] 1

However, when I try to test it I get:

 if(sum(x) !=1) print(lost) else (OK)
[1] lost

Which means that actually sum(x) is NOT considered equal to 1...

Any idea about what is going wrong?

Patrick

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] one is not one

2006-10-21 Thread Andrew Robinson
Patrick,

the problem arises because computers can't exactly represent real
numbers.  Try this for your test:

if(!all.equal(sum(x),1)) print(lost) else (OK)

see FAQ

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

I hope that this helps

Andrew


On Sat, Oct 21, 2006 at 11:04:13AM +0200, Patrick Giraudoux wrote:
 Folks,
 
 I have got a strange behaviour when testing this:
 
 sum(x) != 1
 
 let us set
 
 x-c(70,134,1,5,0)
 
 and transform it in a vector of probabilities
 
 x-x/sum(x)
 
 One expect  sum(x) should be equal to 1, which is apparently the case
 
  sum(x)
 [1] 1
 
 However, when I try to test it I get:
 
  if(sum(x) !=1) print(lost) else (OK)
 [1] lost
 
 Which means that actually sum(x) is NOT considered equal to 1...
 
 Any idea about what is going wrong?
 
 Patrick
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] one is not one

2006-10-21 Thread Roger Bivand
On Sat, 21 Oct 2006, Patrick Giraudoux wrote:

 Folks,
 
 I have got a strange behaviour when testing this:
 
 sum(x) != 1

FAQ 7.31 Why doesn't R think these numbers are equal?:

print(sum(x), digits=20)
identical(sum(x), 1)
all.equal(sum(x), 1)


 
 let us set
 
 x-c(70,134,1,5,0)
 
 and transform it in a vector of probabilities
 
 x-x/sum(x)
 
 One expect  sum(x) should be equal to 1, which is apparently the case
 
  sum(x)
 [1] 1
 
 However, when I try to test it I get:
 
  if(sum(x) !=1) print(lost) else (OK)
 [1] lost
 
 Which means that actually sum(x) is NOT considered equal to 1...
 
 Any idea about what is going wrong?
 
 Patrick
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cardinality constraint

2006-10-21 Thread Patrick Burns
To solve the problem you have in general, you will need
something more powerful than 'constrOptim'.  This is called
a mixed integer problem, and arises in several fields including
financial portfolio optimization.

If you really only have 3 variables, then there are only 3
combinations of 2 variables.  So just solve the problem with
each set and pick the best one.

Patrick Burns
[EMAIL PROTECTED]
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and A Guide for the Unwilling S User)

Neuro LeSuperHéros wrote:

Hello,

How do I implement a cardinality constraint with constrOptim?

I want to minimize (least square) a%*%x = 4
subject to
x12
x21
x34
count(x1, x2, x3)= 2 (cardinality constraint)

Is there a way to specify binary integer variables with constrOptim?

Here's my code so far:

a -matrix(1:3,1,3)
fr -  function(x) {
(a%*%x-4)^2
 }
constrOptim(c(1,0.5,3),fr,grad=NULL,ui=-diag(3), ci=c(-2,-1,-4))

I need the optimization to give me one variable that is zero to satisfy the 
maximum cardinality of 2.

Thanks

_
Voyez vos amis en faisant un appel vidèo dans Windows Live Messenger

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Constructing predictions from HPDinterval() after lmer()

2006-10-21 Thread Michael Kubovy
Dear r-helpers,

Following up on http://finzi.psych.upenn.edu/R/Rhelp02a/archive/ 
81159.html where Douglas Bates gives a helpful application of lmer()  
to data(sleepstudy, package = 'lme4'), I need a bit more help in  
order to plot the correct confidence intervals of a designed  
experiment such as:

  data(ratdrink, package = 'faraway')

I follow the steps Douglas took in his example:

  summary( rd.er - lmer(wt ~ weeks*treat + (1 | subject), data =  
ratdrink) )

Linear mixed-effects model fit by REML
Formula: wt ~ weeks * treat + (1 | subject)
Data: ratdrink
AIC   BIC logLik MLdeviance REMLdeviance
962.4 982.8 -474.2  964.3948.4
Random effects:
Groups   NameVariance Std.Dev.
subject  (Intercept) 71.206   8.4384
Residual 51.222   7.1570
number of obs: 135, groups: subject, 27

Fixed effects:
   Estimate Std. Error t value
(Intercept)52.8800 3.1928   16.56
weeks  26.4800 0.7157   37.00
treatthiouracil 4.7800 4.51531.06
treatthyroxine -0.7943 4.9756   -0.16
weeks:treatthiouracil  -9.3700 1.0121   -9.26
weeks:treatthyroxine0.6629 1.11530.59

Correlation of Fixed Effects:
 (Intr) weeks  trtthr trtthy wks:trtthr
weeks   -0.448
treatthircl -0.707  0.317
treatthyrxn -0.642  0.288  0.454
wks:trtthrc  0.317 -0.707 -0.448 -0.203
wks:trtthyr  0.288 -0.642 -0.203 -0.448  0.454

  rd.mc - mcmcsamp(rd.er, 5)
  library(coda)
  HPDinterval(rd.mc)

lower  upper
(Intercept)46.420404  59.406398
weeks  25.070131  27.930363
treatthiouracil-4.420942  14.009291
treatthyroxine-10.758369   9.435761
weeks:treatthiouracil -11.404620  -7.337025
weeks:treatthyroxine   -1.603858   2.842704
log(sigma^2)3.683085   4.226153
log(sbjc.(In))  3.633853   4.955867
deviance  965.750351 980.825613
attr(,Probability)
[1] 0.95


***To make sure my request is clear***
What I need is the analog of what is produced by all.effects() in the  
effects package:

  summary(rd - lm(wt ~ weeks*treat, data = ratdrink))

Call:
lm(formula = wt ~ weeks * treat, data = ratdrink)

Residuals:
 Min  1Q  Median  3Q Max
-23.514  -6.660   0.230   6.914  28.343

Coefficients:
   Estimate Std. Error t value Pr(|t|)
(Intercept)52.8800 2.6547  19.919   2e-16 ***
weeks  26.4800 1.0838  24.433   2e-16 ***
treatthiouracil 4.7800 3.7544   1.2730.205
treatthyroxine -0.7943 4.1371  -0.1920.848
weeks:treatthiouracil  -9.3700 1.5327  -6.113 1.08e-08 ***
weeks:treatthyroxine0.6629 1.6890   0.3920.695
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10.84 on 129 degrees of freedom
Multiple R-Squared: 0.9121, Adjusted R-squared: 0.9087
F-statistic: 267.8 on 5 and 129 DF,  p-value:  2.2e-16

  rd.eff - all.effects(rd)
  rd.ci - data.frame(y = rd.eff[[1]]$fit, Lower = rd.eff[[1]] 
$lower, Upper = rd.eff[[1]]$upper)

_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O.Box 400400Charlottesville, VA 22904-4400
Parcels:Room 102Gilmer Hall
 McCormick RoadCharlottesville, VA 22903
Office:B011+1-434-982-4729
Lab:B019+1-434-982-4751
Fax:+1-434-982-4766
WWW:http://www.people.virginia.edu/~mk9y/

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] one is not one

2006-10-21 Thread Philipp Pagel
On Sat, Oct 21, 2006 at 11:04:13AM +0200, Patrick Giraudoux wrote:
 Which means that actually sum(x) is NOT considered equal to 1...
 
 Any idea about what is going wrong?

Others have already pointed out the problem and I would like to add a
reference to a classical paper on this topic:

David Goldberg
What Every Computer Scientist Should Know About Floating-Point Arithmetic
Computing Surveys, 1991

Reprints can be found in many places online - e.g. here:

http://www.physics.ohio-state.edu/~dws/grouplinks/floating_point_math.pdf

cu
Philipp

-- 
Dr. Philipp PagelTel.  +49-8161-71 2131
Dept. of Genome Oriented Bioinformatics  Fax.  +49-8161-71 2186
Technical University of Munich
Science Center Weihenstephan
85350 Freising, Germany

 and

Institute for Bioinformatics / MIPS  Tel.  +49-89-3187 3675
GSF - National Research Center   Fax.  +49-89-3187 3585
  for Environment and Health
Ingolstädter Landstrasse 1
85764 Neuherberg, Germany
http://mips.gsf.de/staff/pagel

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problems running IsoMDS using vegdist with pres-abs data and two sites with zero distance

2006-10-21 Thread Leonard Sandin
Hi
I have just (finally) started to poke around in R and wanted to analyse 
a stream fish dataset with 28 sites and 18 species. When trying to 
follow the Vegan manual to run nmds from distance measures calculated by 
the vegdist function it turns out that I have two sites (streams) with 
the exactly the same four species (I have used pres-abs data in this 
case). When I try to run isoMDS I get an error message saying that:
Error in isoMDS(vare.dis) : zero or negative distance between objects 14 
and 15
i.e. the two sites with the same fish species. I can run the decorana 
function with sensible results so the dataset seems to be ok.
My next step would be to compare the results of the fish ordination with 
some other datasets from the same sites (i.e. through procrustes 
rotation) - so I wouldn´t like to remove one of the twin sites.
Any suggestion on how to circumvent this or am I simply using the wrong 
method?

Best wishes, Leonard

-- 
Dr. Leonard Sandin
Department of Environmental Assessment / Miljöanalys
Swedish University of Agricultural Sciences / SLU
P.O. Box 7050
SE-750 07 Uppsala
Sweden
Work phone +46-(0)18-673813
fax+46-(0)18-673156
http://www.ma.slu.se
--

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ReadLines question

2006-10-21 Thread Jonathan Greenberg
I'm getting the following error:

 headerinfo=readLines(met_station_file,n=8)
 headerinfo
[1] Plot Title: tahoe met validation ,,,
[2]Error: invalid multibyte string

met_station_file's first 8 lines are as follows:

Plot Title: tahoe met validation ,,,
#,Time, GMT-07:00,Temp, ƒF,Coupler Attached,Host Connected,Coupler
Detached,Stopped,End Of File
34,10/1/2005 0:00,49.937,
35,10/1/2005 0:30,47.266,
36,10/1/2005 1:00,47.446,
37,10/1/2005 1:30,47.982,
38,10/1/2005 2:00,48.517,
39,10/1/2005 2:30,49.228,

Why am I getting this error?  Are those quotation marks causing the hiccup?
If so, how do I get around this programmatically?

--j

-- 
Jonathan A. Greenberg, PhD
NRC Research Associate
NASA Ames Research Center
MS 242-4
Moffett Field, CA 94035-1000
Office: 650-604-5896
Cell: 415-794-5043
AIM: jgrn307
MSN: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ReadLines question

2006-10-21 Thread Roger Bivand
On Sat, 21 Oct 2006, Jonathan Greenberg wrote:

 I'm getting the following error:
 
  headerinfo=readLines(met_station_file,n=8)
  headerinfo
 [1] Plot Title: tahoe met validation ,,,
 [2]Error: invalid multibyte string


 
 met_station_file's first 8 lines are as follows:
 
 Plot Title: tahoe met validation ,,,
 #,Time, GMT-07:00,Temp, ƒF,Coupler Attached,Host Connected,Coupler
 ^^^

or whatever this looks like to you (was ^Ã for me in LC_CTYPE=en_GB) is a 
multibyte string. Is there a mismatch between the encoding (see ?locales) 
of the file and the machine into which you are reading? 

 Detached,Stopped,End Of File
 34,10/1/2005 0:00,49.937,
 35,10/1/2005 0:30,47.266,
 36,10/1/2005 1:00,47.446,
 37,10/1/2005 1:30,47.982,
 38,10/1/2005 2:00,48.517,
 39,10/1/2005 2:30,49.228,
 
 Why am I getting this error?  Are those quotation marks causing the hiccup?
 If so, how do I get around this programmatically?
 
 --j
 
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problems running IsoMDS using vegdist with pres-abs data and two sites with zero distance

2006-10-21 Thread Gavin Simpson
On Sat, 2006-10-21 at 13:19 +0200, Leonard Sandin wrote:
 Hi
 I have just (finally) started to poke around in R and wanted to analyse 
 a stream fish dataset with 28 sites and 18 species. When trying to 
 follow the Vegan manual to run nmds from distance measures calculated by 
 the vegdist function it turns out that I have two sites (streams) with 
 the exactly the same four species (I have used pres-abs data in this 
 case). When I try to run isoMDS I get an error message saying that:
 Error in isoMDS(vare.dis) : zero or negative distance between objects 14 
 and 15
 i.e. the two sites with the same fish species. I can run the decorana 
 function with sensible results so the dataset seems to be ok.
 My next step would be to compare the results of the fish ordination with 
 some other datasets from the same sites (i.e. through procrustes 
 rotation) - so I wouldn´t like to remove one of the twin sites.
 Any suggestion on how to circumvent this or am I simply using the wrong 
 method?
 
 Best wishes, Leonard
 

Hi Leonard,

I assume you are running isoMDS (package MASS) directly? If so, take a
look at metaMDS function in vegan which has an argument zerodist which
if you set it to add adds a small distance so you don't get this
error.

HTH

G

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC  ENSIS, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/cv/
 London, UK. WC1E 6BT. [w] http://www.ucl.ac.uk/~ucfagls/
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dataset on Baltimore home energy costs

2006-10-21 Thread Zembower, Kevin
Sorry, dumb typo. Should be 18-Dec-2005. -Kevin



From: Richard M. Heiberger [mailto:[EMAIL PROTECTED]
Sent: Fri 10/20/2006 5:43 PM
To: Zembower, Kevin
Subject: Re: [R] Dataset on Baltimore home energy costs



Thanks for the data.  Can you clarify the date here.

18-Dec-2006: Brought home son; spouse and son home during the day,
setback thermostat no longer used. Interesting question: What's the
cost of adding a child?


It's still October, so I am not at all sure which year this should be.

Rich



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Filling in a series

2006-10-21 Thread Dennis Fisher
Colleagues

After reading in some clinical data, I discovered that the subject ID  
column contains entries only for the first record for each  
individual; subsequent rows are recorded as NA.  For example:
 1
 NA
 NA
 NA
 NA
 2
 NA
 NA
 NA
 NA
 3
 NA
 NA
 ...

I can think of various approaches to replace the NA values with  
appropriate entries.  I could loop through each row - if the value is  
NA, I replace it with the entry from the row above.  Or, I could find  
the positions of the non-NA values with match, then replace groups of  
entries (e.g., positions 2-5) with appropriate entries, again with a  
loop.

But, I expect that R allows some more clever approach to the  
problem.  Any thoughts?

Dennis

Dennis Fisher MD
P  (The P Less Than Company)
Phone: 1-866-PLessThan (1-866-753-7784)
Fax: 1-415-564-2220
www.PLessThan.com


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Filling in a series

2006-10-21 Thread Gabor Grothendieck
The zoo package has na.locf (last occurrernce carried foward)
for this purpose:

 x
 [1]  1 NA NA NA NA  2 NA NA NA NA  3 NA NA
 library(zoo)
 na.locf(x)
 [1] 1 1 1 1 1 2 2 2 2 2 3 3 3

On 10/21/06, Dennis Fisher [EMAIL PROTECTED] wrote:
 Colleagues

 After reading in some clinical data, I discovered that the subject ID
 column contains entries only for the first record for each
 individual; subsequent rows are recorded as NA.  For example:
  1
  NA
  NA
  NA
  NA
  2
  NA
  NA
  NA
  NA
  3
  NA
  NA
  ...

 I can think of various approaches to replace the NA values with
 appropriate entries.  I could loop through each row - if the value is
 NA, I replace it with the entry from the row above.  Or, I could find
 the positions of the non-NA values with match, then replace groups of
 entries (e.g., positions 2-5) with appropriate entries, again with a
 loop.

 But, I expect that R allows some more clever approach to the
 problem.  Any thoughts?

 Dennis

 Dennis Fisher MD
 P  (The P Less Than Company)
 Phone: 1-866-PLessThan (1-866-753-7784)
 Fax: 1-415-564-2220
 www.PLessThan.com


[[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Filling in a series

2006-10-21 Thread Gabor Grothendieck
Actually I think its last _observation_ carried forward.

On 10/21/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:
 The zoo package has na.locf (last occurrernce carried foward)
 for this purpose:

  x
  [1]  1 NA NA NA NA  2 NA NA NA NA  3 NA NA
  library(zoo)
  na.locf(x)
  [1] 1 1 1 1 1 2 2 2 2 2 3 3 3

 On 10/21/06, Dennis Fisher [EMAIL PROTECTED] wrote:
  Colleagues
 
  After reading in some clinical data, I discovered that the subject ID
  column contains entries only for the first record for each
  individual; subsequent rows are recorded as NA.  For example:
   1
   NA
   NA
   NA
   NA
   2
   NA
   NA
   NA
   NA
   3
   NA
   NA
   ...
 
  I can think of various approaches to replace the NA values with
  appropriate entries.  I could loop through each row - if the value is
  NA, I replace it with the entry from the row above.  Or, I could find
  the positions of the non-NA values with match, then replace groups of
  entries (e.g., positions 2-5) with appropriate entries, again with a
  loop.
 
  But, I expect that R allows some more clever approach to the
  problem.  Any thoughts?
 
  Dennis
 
  Dennis Fisher MD
  P  (The P Less Than Company)
  Phone: 1-866-PLessThan (1-866-753-7784)
  Fax: 1-415-564-2220
  www.PLessThan.com
 
 
 [[alternative HTML version deleted]]
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] newbie question

2006-10-21 Thread Larry White
Sorry - this must be obvious, but i haven't been able to find the
answer in the guides i've searched.  The examples seem to assume you
always want to look at all the data.

I want to be able to filter data in a dataframe before analyzing it.
For example, I'd like to plot(a,b) but only include values where b 
1000.

I'd also like to be able to do similar filtering before doing other
statistical functions.

Thanks for your help.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newbie question

2006-10-21 Thread Paul Hiemstra
Dear Larry,

Data can be filtered in the following manner:

a = c(1,2,3,4)
b = c(1,2,3,4)
b = b[a3]
b = b[b3]
a
b
[1] 4

Or if the data is in a data frame:

b = data.frame(seq(1:10),seq(1:10)
names(b) = c(a,b)
b
b = b[b$a  5,]
b

The trailing comma at the end is important.

Hopes this helps,

Paul





Larry White schreef:
 Sorry - this must be obvious, but i haven't been able to find the
 answer in the guides i've searched.  The examples seem to assume you
 always want to look at all the data.

 I want to be able to filter data in a dataframe before analyzing it.
 For example, I'd like to plot(a,b) but only include values where b 
 1000.

 I'd also like to be able to do similar filtering before doing other
 statistical functions.

 Thanks for your help.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] logistic regression with a sample missing subjects with a value of an independent variable

2006-10-21 Thread Gabriele Stocco
Dear R-help,
I am trying to make logistic regression analysis using the R function
glm, with the parameter family set to binomial, in order to use a
logistic regression model.
I have 70 samples. The dependent variables has two levels (0 and 1) and
one of the independent variables has too two levels (0 and 1). 
The variables associate in the way shown in the table:

Dependent   0   1
Independent 0  55  10

1   0   5

This gives a strong association evaluated by the fisher test (p-value =
0.0002481), but with the logistic regression it gives a p-value of 0.99
with very high values of estimate and standard error (respectively and
-19.27 and 1769.26).

Is there any way (other function, different setting of a parameter) to
perform logistic regression analysis with these data with R?

Thank you.

Gabriele Stocco
University of Trieste

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] problem with mode of marginal distriubtion of rdirichlet{gtools}

2006-10-21 Thread hong qin
Hi all,

I have a problem using rdirichlet{gtools}.
For Dir( a1, a2, ..., a_n), its mode can be found at  $( a_i -1)/ (
\sum_{i}a_i - n)$;
The means are   $a_i / (\sum_{i} a_i ) $;

I tried to study the above properties using rdirichlet from gtools. The code
are:

##
library(gtools)
alpha = c(1,3,9)  #totoal=13
mean.expect = c(1/13, 3/13,  9/13)
mode.expect = c(0,2/10,  8/10) # this is for the overall mode.

mode1 = numeric(3);
mode2 = numeric(3);

theta = data.frame( rdirichlet( 1, alpha) )
m1 = mean(theta)
for( i in 1:3) {
 h = hist(theta[,i], breaks=20);
 mode1[i]= h$mid[ h$density == max(h$density) ]
}

theta = data.frame( rdirichlet( 1, alpha) )
m2 = mean(theta)

for( i in 1:3) {
 h = hist(theta[,i], breaks=20);
 mode2[i]= h$mid[ h$density == max(h$density) ]
}

rbind(m1,m2,mean.expect) #the means are consistent
rbind(mode1, mode2, mode.expect);#the marginal modes are quite
different from the global mode.


An example output is:
 rbind(m1,m2,mean.expect)   #the means are consistent
X1X2X3
m1  0.07609384 0.2301840 0.6937222
m2  0.07716923 0.2300336 0.6927971
mean.expect 0.07692308 0.2307692 0.6923077
 rbind(mode1, mode2, mode.expect);
 [,1]  [,2]  [,3]
mode1   0.025 0.175 0.725
mode2   0.010 0.175 0.725
mode.expect 0.000 0.200 0.800

So, what is the problem with the mode?

Many thanks,

HQ

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newbie question

2006-10-21 Thread Chuck Cleland
Larry White wrote:
 Sorry - this must be obvious, but i haven't been able to find the
 answer in the guides i've searched.  The examples seem to assume you
 always want to look at all the data.
 
 I want to be able to filter data in a dataframe before analyzing it.
 For example, I'd like to plot(a,b) but only include values where b 
 1000.
 
 I'd also like to be able to do similar filtering before doing other
 statistical functions.

  In addition to what Paul Hiemstra mentioned, you should look at the
help page for subset().  For example:

df - data.frame(a = runif(20), b = 1:20)
plot(a ~ b, data = subset(df, b  5))

 Thanks for your help.
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] logistic regression with a sample missing subjects with a value of an independent variable

2006-10-21 Thread Prof Brian Ripley
On Sat, 21 Oct 2006, Gabriele Stocco wrote:

 Dear R-help,
 I am trying to make logistic regression analysis using the R function
 glm, with the parameter family set to binomial, in order to use a
 logistic regression model.
 I have 70 samples. The dependent variables has two levels (0 and 1) and
 one of the independent variables has too two levels (0 and 1).
 The variables associate in the way shown in the table:

Dependent   0   1
 Independent 0  55  10

1   0   5

 This gives a strong association evaluated by the fisher test (p-value =
 0.0002481), but with the logistic regression it gives a p-value of 0.99
 with very high values of estimate and standard error (respectively and
 -19.27 and 1769.26).

Please see the comment at the bottom of this message, as your claims are 
not supported by any code.

 Is there any way (other function, different setting of a parameter) to
 perform logistic regression analysis with these data with R?

fit - glm(matrix(c(55,0,10,5), 2, 2) ~ factor(c(0,1)), binomial())
fit0 - glm(matrix(c(55,0,10,5), 2, 2) ~ 1,  binomial())
anova(fit0, fit, test=Chisq)

   Resid. Df Resid. Dev Df Deviance P(|Chi|)
1 1 16.929
2 0  2.208e-10  1   16.929 3.880e-05

is a reasonable way to do this.  Beware the Hauck-Donner phenomenon (see 
e.g. MASS, the book) for t-tests of coefficients, although I do not get 
the values you quote.  Since the expected values are low, you should not 
take the p value too seriously.

 Thank you.

 Gabriele Stocco
 University of Trieste

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Box's M test

2006-10-21 Thread GRAHAM LEASK
Dear List
   
  I am looking for a script that will calculate the Box M test to test the 
homogeneity of the variance/covariance matrix between two matrices.
   
  If anyone could send me the script I would appreciate it.
   
  I am aware of the scepticism about this test, where due to extreme 
sensitivity a p value of 0.01 is recommended. Despite this however Box's M test 
is the established method for identification of stable strategic time periods 
within the strategic management literature and I would like the opportunity to 
use this method within either R or S plus.
   
  Any help would be gratefully received.
   
  Kind regards
   
   
  Graham

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Box M test

2006-10-21 Thread GRAHAM LEASK
  Dear List
   
  I am looking for a script that will calculate the Box M test to test the 
homogeneity of the variance/covariance matrix between two matrices.
   
  If anyone could send me the script I would appreciate it.
   
  I am aware of the scepticism about this test, where due to extreme 
sensitivity a p value of 0.01 is recommended. Despite this however Box's M test 
is the established method for identification of stable strategic time periods 
within the strategic management literature and I would like the opportunity to 
use this method within either R or S plus.
   
  Any help would be gratefully received.
   
  Kind regards
   
   
  Graham



Kind regards


Dr Graham Leask
Economics and Strategy Group
Aston Business School
Aston University
Aston Triangle
Birmingham
B4 7ET

Tel: Direct line 0121 204 3150
email [EMAIL PROTECTED]
[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how do I find the row index number, or row name, of a given value in a vector?

2006-10-21 Thread tom soyer
Hi,

I noticed that max(x) returns the maximum value of a vector, but the
function doesn't give the user the option of retrieving the row index number
instead. If I used max(x) to find the maximum value of vector x, then is
there a function I can use to find the index number, or row name, of the
maximum value?

Thanks,

Tom

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] newbie question

2006-10-21 Thread Alberto Vieira Ferreira Monteiro
Larry White wrote:

 Sorry - this must be obvious,

Yes, it is :-)

 I want to be able to filter data in a dataframe before analyzing it.
 For example, I'd like to plot(a,b) but only include values where b 
 1000.

If a and b are vectors, then b  1000 is another vector of logical
values.

You can use logical vectors to select pieces of other vectors, like
this:

x - c(1, 2, 3)
y - c(TRUE, FALSE, FALSE)
x[y]
# returns only the first element, 1

So, this plot can be done this way:

filter - (b  1000) # filter is a logical vector
a1 - a[filter]
b1 - b[filter]
# Now, a1 and b1 are, resp, a and b filtered
plot(a1, b1)

Alberto Monteiro

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how do I find the row index number, or row name, of a given value in a vector?

2006-10-21 Thread John Kane

--- tom soyer [EMAIL PROTECTED] wrote:

 Hi,
 
 I noticed that max(x) returns the maximum value of a
 vector, but the
 function doesn't give the user the option of
 retrieving the row index number
 instead. If I used max(x) to find the maximum value
 of vector x, then is
 there a function I can use to find the index number,
 or row name, of the
 maximum value?
 
 Thanks,
 
 Tom
?which 

Is this what you want to do?  
xx - c(1,2,3,4,9,6,7,5)
bb - max(xx)
bb
which(xx==bb)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how do I find the row index number, or row name, of a given value in a vector?

2006-10-21 Thread Gabor Grothendieck
?which.max

On 10/21/06, tom soyer [EMAIL PROTECTED] wrote:
 Hi,

 I noticed that max(x) returns the maximum value of a vector, but the
 function doesn't give the user the option of retrieving the row index number
 instead. If I used max(x) to find the maximum value of vector x, then is
 there a function I can use to find the index number, or row name, of the
 maximum value?

 Thanks,

 Tom

[[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how do I find the row index number, or row name, of a given value in a vector?

2006-10-21 Thread tom soyer
Thanks John! I got it.

On 10/21/06, John Kane [EMAIL PROTECTED] wrote:


 --- tom soyer [EMAIL PROTECTED] wrote:

  Hi,
 
  I noticed that max(x) returns the maximum value of a
  vector, but the
  function doesn't give the user the option of
  retrieving the row index number
  instead. If I used max(x) to find the maximum value
  of vector x, then is
  there a function I can use to find the index number,
  or row name, of the
  maximum value?
 
  Thanks,
 
  Tom
 ?which

 Is this what you want to do?
 xx - c(1,2,3,4,9,6,7,5)
 bb - max(xx)
 bb
 which(xx==bb)

 __
 Do You Yahoo!?
 Tired of spam?  Yahoo! Mail has the best spam protection around
 http://mail.yahoo.com


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] I really don't understand functions in R :-)

2006-10-21 Thread Marcin Jaworski
Patrick Burns ([EMAIL PROTECTED]) wrote:

You may be interested in S Poetry.

Hi,
I do find your book very helpful. Thank you.

Best regards,
Marcin Jaworski

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [Fwd: [AGDG-LIST:405] R Computing Contest]

2006-10-21 Thread Gregor Gorjanc


 Original Message 
Subject: [AGDG-LIST:405] R Computing Contest
Date: Sat, 21 Oct 2006 12:08:13 -0400
From: Larry Schaeffer [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED]
To: Animal Geneticist's Discussion [EMAIL PROTECTED]

For those that are interested only:

R Computer Programming Challenge

Given:  y = Factor A + Factor B + b1(Covariate1) + b2(Covariate2) + ...
+ bp(Covariate p) + animal + e
where y is a single trait - vector of observations,
Factors A and B are fixed with ma and mb levels respectively,
Covariates 1 to p are fixed regressions and p is a general number but
less than 20.
Heritability is a variable and should be requested by the program.

You are given a list of the animal IDs of animals with records, and
their sires and dams.  Sire and dam
IDs are smaller than the IDs of their progeny, and all IDs go from 1 to
n where n can be up to 10 million
animals.

Tasks:
1) Estimate the EBVs of all animals using BLUP and mixed model equations
and
   list the top 10 largest EBVs with associated animal IDs,
   first ten levels of Factor A with solutions,
   first ten levels of Factor B with solutions,
   and all estimated regression coefficients.

Submit your R program to Bill Szkotnicki ([EMAIL PROTECTED]) by January
5, 2007.  Submissions
will be dated by their arrival date.  Provide any special functions or
details that might be needed to run your program.

All R programs will be run on the same test data sets, computer (Linux),
and timed.

Winner: Earliest submitted R program that runs the fastest and gives the
correct answers.

Prize: Free registration for the Kennedy Conference in 2008 in Guelph.

-- 
Lep pozdrav / With regards,
Gregor Gorjanc
--
University of Ljubljana PhD student
Biotechnical Faculty
Zootechnical Department URI: http://www.bfro.uni-lj.si/MR/ggorjan
Groblje 3   mail: gregor.gorjanc at bfro.uni-lj.si

SI-1230 Domzale tel: +386 (0)1 72 17 861
Slovenia, Europefax: +386 (0)1 72 17 888

--
One must learn by doing the thing; for though you think you know it,
 you have no certainty until you try. Sophocles ~ 450 B.C.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ReadLines question

2006-10-21 Thread Jonathan Greenberg
That looks to me like an infinity sign (I have no idea why that is part of
the header of this file, but it is there).  How do I modify the encoding to
read this in? 

--j


On 10/21/06 4:33 AM, Roger Bivand [EMAIL PROTECTED] wrote:

 On Sat, 21 Oct 2006, Jonathan Greenberg wrote:
 
 I'm getting the following error:
 
 headerinfo=readLines(met_station_file,n=8)
 headerinfo
 [1] Plot Title: tahoe met validation ,,,
 [2]Error: invalid multibyte string
 
 
 
 met_station_file's first 8 lines are as follows:
 
 Plot Title: tahoe met validation ,,,
 #,Time, GMT-07:00,Temp, ƒF,Coupler Attached,Host Connected,Coupler
  ^^^
 
 or whatever this looks like to you (was ^Ã for me in LC_CTYPE=en_GB) is a
 multibyte string. Is there a mismatch between the encoding (see ?locales)
 of the file and the machine into which you are reading?
 
 Detached,Stopped,End Of File
 34,10/1/2005 0:00,49.937,
 35,10/1/2005 0:30,47.266,
 36,10/1/2005 1:00,47.446,
 37,10/1/2005 1:30,47.982,
 38,10/1/2005 2:00,48.517,
 39,10/1/2005 2:30,49.228,
 
 Why am I getting this error?  Are those quotation marks causing the hiccup?
 If so, how do I get around this programmatically?
 
 --j
 
 


-- 
Jonathan A. Greenberg, PhD
NRC Research Associate
NASA Ames Research Center
MS 242-4
Moffett Field, CA 94035-1000
Office: 650-604-5896
Cell: 415-794-5043
AIM: jgrn307
MSN: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Box M test [Broadcast]

2006-10-21 Thread Liaw, Andy
See
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/0.html 

Andy

From: GRAHAM LEASK
 
   Dear List

   I am looking for a script that will calculate the Box M 
 test to test the homogeneity of the variance/covariance 
 matrix between two matrices.

   If anyone could send me the script I would appreciate it.

   I am aware of the scepticism about this test, where due to 
 extreme sensitivity a p value of 0.01 is recommended. Despite 
 this however Box's M test is the established method for 
 identification of stable strategic time periods within the 
 strategic management literature and I would like the 
 opportunity to use this method within either R or S plus.

   Any help would be gratefully received.

   Kind regards


   Graham
 
 
 
 Kind regards
 
 
 Dr Graham Leask
 Economics and Strategy Group
 Aston Business School
 Aston University
 Aston Triangle
 Birmingham
 B4 7ET
 
 Tel: Direct line 0121 204 3150
 email [EMAIL PROTECTED]
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 


--
Notice:  This e-mail message, together with any attachments,...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ReadLines question

2006-10-21 Thread Roger Bivand
On Sat, 21 Oct 2006, Jonathan Greenberg wrote:

 That looks to me like an infinity sign (I have no idea why that is part of
 the header of this file, but it is there).  How do I modify the encoding to
 read this in? 

The problem is the degree sign. Under linux:

$ file tmp/Marlette_lake_snotel.csv
tmp/Marlette_lake_snotel.csv: ISO-8859 text, with CRLF, CR line terminators

so probably the conversion to multibyte is happening on your reading 
platform. Reading the file into 2.4.0 on Windows with a Norwegian 1252 
setting (Sys.getlocale()), I see the degree sign.

If you know the column names anyway, jump over the header and 
insert them yourself. Alternatively filter the non-ASCII character out 
before reading, it looks predictably like a degree sign. In any case, the 
character is not very practical in a column name.

Roger

 
 --j
 
 
 On 10/21/06 4:33 AM, Roger Bivand [EMAIL PROTECTED] wrote:
 
  On Sat, 21 Oct 2006, Jonathan Greenberg wrote:
  
  I'm getting the following error:
  
  headerinfo=readLines(met_station_file,n=8)
  headerinfo
  [1] Plot Title: tahoe met validation ,,,
  [2]Error: invalid multibyte string
  
  
  
  met_station_file's first 8 lines are as follows:
  
  Plot Title: tahoe met validation ,,,
  #,Time, GMT-07:00,Temp, ƒF,Coupler Attached,Host Connected,Coupler
   ^^^
  
  or whatever this looks like to you (was ^Ã for me in LC_CTYPE=en_GB) is a
  multibyte string. Is there a mismatch between the encoding (see ?locales)
  of the file and the machine into which you are reading?
  
  Detached,Stopped,End Of File
  34,10/1/2005 0:00,49.937,
  35,10/1/2005 0:30,47.266,
  36,10/1/2005 1:00,47.446,
  37,10/1/2005 1:30,47.982,
  38,10/1/2005 2:00,48.517,
  39,10/1/2005 2:30,49.228,
  
  Why am I getting this error?  Are those quotation marks causing the hiccup?
  If so, how do I get around this programmatically?
  
  --j
  
  
 
 
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ReadLines question

2006-10-21 Thread Peter Dalgaard
Roger Bivand [EMAIL PROTECTED] writes:

 On Sat, 21 Oct 2006, Jonathan Greenberg wrote:
 
  That looks to me like an infinity sign (I have no idea why that is part of
  the header of this file, but it is there).  How do I modify the encoding to
  read this in? 
 
 The problem is the degree sign. Under linux:
 
 $ file tmp/Marlette_lake_snotel.csv
 tmp/Marlette_lake_snotel.csv: ISO-8859 text, with CRLF, CR line terminators
 
 so probably the conversion to multibyte is happening on your reading 
 platform. Reading the file into 2.4.0 on Windows with a Norwegian 1252 
 setting (Sys.getlocale()), I see the degree sign.

Nono, there is no conversion. R is _expecting_ a multibyte sequence
(utf8 most likely) and finding something that isn't part of one.  

The fix should be something close to

read.csv(file(tmp/Marlette_lake_snotel.csv), encoding=iso-8859-1) 

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Possible bugs in 'seek' and 'readBin'

2006-10-21 Thread Zepu Zhang
I found that 
  seek(..., origin = 'current', ...)
and
  readBin(..., what = 'integer', ...)
or 'int'
do not work correctly.

Did anyone have the same experience?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] pie

2006-10-21 Thread Dietrich Tissen
Hi,

I would like to draw a pie chart. I've already tried out the standard 
pie-function in the GRAPH-package. My question: is there any 'better' 
function or package to draw a pie chart. For example I would like to 
draw a 3D pie chart.

Dietrich Tissen

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] tcltk: multiple listboxes, selection

2006-10-21 Thread JeeBee
Dear list,

I have multiple (BWidget) listboxes in the same toplevel window.
The problem is, if I select (by left clicking) on one of those
listbox elements, the current selection in the *other* listboxes is
cleared!
Anybody knows how I can prevent this?

Here's my code (sorry not complete):
(E.g. If I select an X value, I'd lose the Y value I selected before)

gui.create.tab.general - function(tt) {
  # Network topology
  frame.1 - tkframe(tt)
  combo_topology - tkwidget(frame.1, ComboBox, values = infras,
editable = FALSE)
  tcl(combo_topology, setvalue, first)
  tkgrid(tklabel(frame.1, text=Network topology), combo_topology)
  tkgrid(frame.1, sticky = w)
  # X values
  frame.x - tkwidget(tt, labelframe, text = X value)
  tl.x - tklistbox(frame.x, height = length(xvals) / 2,
width = 50, selectmode = browse, background = white)
  for(i in seq(from=1, to=length(xvals), by=2)) {
tkinsert(tl.x, end, xvals[[i]])
  }
  tkselection.set(tl.x, 0)
  tkgrid(tl.x, sticky=news)
  # Y values
  frame.y - tkwidget(tt, labelframe, text = Y value(s))
  tl.y - tklistbox(frame.y, height = 20, #length(yvals) / yvals.field.count,
yscrollcommand=function(...) tkset(scr.y,...),
width = 50, selectmode = extended, background = white)
  scr.y - tkscrollbar(frame.y, repeatinterval = 5,
command = function(...) tkyview(tl.y, ...))
  for(i in seq(from=1, to=length(yvals), by=yvals.field.count)) {
tkinsert(tl.y, end, yvals[[i]])
  }
  tkselection.set(tl.y, 0)
  tkgrid(tl.y, scr.y, sticky=news)
  # Packing
  tkgrid(frame.x, frame.y, sticky=news)
 #side = left, fill = both, expand = TRUE)
}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pie

2006-10-21 Thread Marc Schwartz
On Sat, 2006-10-21 at 22:46 +0200, Dietrich Tissen wrote:
 Hi,
 
 I would like to draw a pie chart. I've already tried out the standard 
 pie-function in the GRAPH-package. My question: is there any 'better' 
 function or package to draw a pie chart. For example I would like to 
 draw a 3D pie chart.
 
 Dietrich Tissen

Take a look at the plotrix package on CRAN.

That being said, just say NO to 3D pie charts:

When a graphic is taken over by decorative forms of computer debris,
when the data measures and structures become Design Elements, when the
overall design purveys Graphical Style rather than quantitative
information, then the graphic may be called a duck in honor of the
duck-form store... 

   — Edward Tufte


The primary source of extraneous lines in charting graphics today are
the 3-D options offered by conventional spreadsheet graphics. These 3-D
options serve no useful purpose; they add only ink to the chart, and
more often than not make it more difficult to estimate the values
represented. Even worse are the spreadsheet options that allow one to
rotate the perspective. For those who would take bad graphical display
to even higher levels, the Excel spreadsheet program offers the option
of doughnut, radar, cylinder, cone, bubble charts.

   — Gary Klass



Consider a ?dotchart or even ?barplot instead.


For more information:

http://lilt.ilstu.edu/gmklass/pos138/datadisplay/badchart.htm

http://lilt.ilstu.edu/gmklass/pos138/datadisplay/sections/goodcharts.htm

http://www.edwardtufte.com/tufte/books_vdqi

and the two books by Cleveland:

http://stat.bell-labs.com/wsc/index.html


HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Latex editor recommendations

2006-10-21 Thread Jens Scheidtmann
Richard M. Heiberger [EMAIL PROTECTED] writes:

 I recommend emacs
 http://ftp.gnu.org/pub/gnu/emacs

 It has modes for TeX and LaTeX and automatically chooses the
 right one.

 As a side benefit once you have emacs you can then
 run R through ESS, the package that provides the modes for handling
 statistical languages
 http://ess.r-project.org/

In addition you get support for Sweave (mixing R and LaTeX).

Recommended packages for Editing LaTeX / R are:

AucTeX + reftex 
ESS

Jens

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using the ROracle package with Oracle 10 ?

2006-10-21 Thread Jens Scheidtmann
Ram Dessau [EMAIL PROTECTED] writes:

[...]
 Window binary for ROracle 10 ?

 Is there anyone who could help with a version of ROracle for windows
 which is compatible with Oracle 10 or more general just uses the
 default-home The available (not officially released) version only
 supports Oracle 9

Any used oracle 9 driver should be able to connect to a 10g instance
and work reasonably well with it.

Did you try connecting to a 10g instance?

Do you have a specific problem or thing not working?

Do you have errors ora-600 or ora-03113?

Jens

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ROracle error in Windows. Memory could not be read.

2006-10-21 Thread Jens Scheidtmann
Adrian Dragulescu [EMAIL PROTECTED] writes:

 I've seen from earlier posts that other people had problems installing
 ROracle under Windows.  I run R-2.3.1.

 I got the Windows binaries for ROracle from
 http://stat.bell-labs.com/RS-DBI/download/index.html

 Here is my session:
 require(ROracle)
 Loading required package: ROracle
 Loading required package: DBI
 [1] TRUE
 drv - dbDriver(Oracle)
 drv
 OraDriver:(1648)
 con - dbConnect(drv, user=USER, password=PASS, dbname=TEST)

 A window pops up, with the message:  The instruction at 0x6260c621
 referenced memory at 0x4a280ae2.  The memory could not be read.  And R
 freezes.

Switch on debugging in the oracle client (sqlnet.ora).
Maybe you can get a glimpse on what's wrong.

Please provide version information (Oracle Client libs, Oracle DB).

Is there any network traffic going to or coming from the DB?



 Something is working, because I get to establish an Oracle driver.

What do you mean by that?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ReadLines question

2006-10-21 Thread Roger Bivand
On 21 Oct 2006, Peter Dalgaard wrote:

 Roger Bivand [EMAIL PROTECTED] writes:
 
  On Sat, 21 Oct 2006, Jonathan Greenberg wrote:
  
   That looks to me like an infinity sign (I have no idea why that is part of
   the header of this file, but it is there).  How do I modify the encoding 
   to
   read this in? 
  
  The problem is the degree sign. Under linux:
  
  $ file tmp/Marlette_lake_snotel.csv
  tmp/Marlette_lake_snotel.csv: ISO-8859 text, with CRLF, CR line terminators
  
  so probably the conversion to multibyte is happening on your reading 
  platform. Reading the file into 2.4.0 on Windows with a Norwegian 1252 
  setting (Sys.getlocale()), I see the degree sign.
 
 Nono, there is no conversion. R is _expecting_ a multibyte sequence
 (utf8 most likely) and finding something that isn't part of one.  
 
 The fix should be something close to
 
 read.csv(file(tmp/Marlette_lake_snotel.csv), encoding=iso-8859-1) 
 

Yes, thanks:

read.csv(file(tmp/Marlette_lake_snotel.csv, encoding=iso-8859-1), 
 skip=1, check.names=FALSE) 

gives usable results on Linux/en_GB and Win/Norwegian (Bokmål) 1252.


 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problem with mode of marginal distriubtion of rdiri chlet{gtools}

2006-10-21 Thread Ben Bolker
hong qin alongway at gmail.com writes:

 
 Hi all,
 
 I have a problem using rdirichlet{gtools}.
 For Dir( a1, a2, ..., a_n), its mode can be found at  $( a_i -1)/ (
 \sum_{i}a_i - n)$;
 The means are   $a_i / (\sum_{i} a_i ) $;
 

  I believe the problem is that the marginal
modes are not equal to the full posterior modes.
(Think about the two-parameter normal distribution
for example; for a sample {x_i}, the mode
of the posterior variance is near sum((x_i-\bar x)^2)/n
while the marginal mode is near sum((x_i-\bar x)^2)/(n-1) ...)

 [haven't checked this at all carefully.]

  Ben Bolker

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] object(s) are masked from package - what does it mean?

2006-10-21 Thread tom soyer
Hi,

Sometime when I attach a dataset, R gives me the following
message/warning:The following object(s) are masked from package:datasets:
column_name. Does anyone know what it means? Since it seems that the
dataset was attached and I could manipulate the data from the dataset
without problems, I am wondering what was R trying to tell me.

Thanks,

Tom

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] glm function question

2006-10-21 Thread Chris Linton
I am creating a model attempting to predict the probability someone will
reoffend after being caught for a crime.  There are seven total inputs and I
planned on using a logistic regression.  I started with a null deviance of
182.91 and ended up with a residual deviance of 83.40 after accounting for
different interactions and such.  However, I realized after that my code is
different from that in my book.  And I can't figure out what I need to put
in it's place.  Here's my code:

library(foreign)

library(car)

foo = read.table(C:/Documents and
Settings/Chris/Desktop/4330/criminals.dat, header=TRUE)


reoff = foo[ ,1]

race = foo[ ,2]

age = foo[ ,3]

gender = foo[ ,4]

educ = foo[ ,5]

subst = foo[ ,6]

prior = foo[ ,7]

violence = foo[ ,8]

fit1h = glm(reoff ~ factor(subst) + factor(violence) + prior +
factor(violence):factor(subst) + factor(violence):factor(educ) +
factor(violence):factor(age) + factor(violence):factor(prior))

summary(fit1h)


If you noticed, there's no part of my code that looks like:

family=binomial(link=logit))


If I code like my book has done, it would look like:

fit1i = glm(reoff ~ factor(subst) + factor(violence) + prior +
factor(violence):factor(subst) + factor(violence):factor(educ) +
factor(violence):factor(age) + factor(violence):factor(prior),
family=binomial(link=logit))

summary(fit1i)




However, when I do this, my null deviance is 1104 and my residual deviance
is 23460.  THIS IS A HUGE DIFFERENCE IN MODEL FIT!  I'm not sure if I have
to redo my model or if my book was simply doing the
family=binomial(link=logit) for a specific problem/reason.

So, to my question:
Do I need to include family=binomial(link=logit) in my code?  Do I need
to include any type of family?


Thanks for your help,

-chris

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] glm function question

2006-10-21 Thread Marc Schwartz
On Sat, 2006-10-21 at 20:02 -0400, Chris Linton wrote:
 I am creating a model attempting to predict the probability someone will
 reoffend after being caught for a crime.  There are seven total inputs and I
 planned on using a logistic regression.  I started with a null deviance of
 182.91 and ended up with a residual deviance of 83.40 after accounting for
 different interactions and such.  However, I realized after that my code is
 different from that in my book.  And I can't figure out what I need to put
 in it's place.  Here's my code:
 
 library(foreign)
 
 library(car)
 
 foo = read.table(C:/Documents and
 Settings/Chris/Desktop/4330/criminals.dat, header=TRUE)
 
 
 reoff = foo[ ,1]
 
 race = foo[ ,2]
 
 age = foo[ ,3]
 
 gender = foo[ ,4]
 
 educ = foo[ ,5]
 
 subst = foo[ ,6]
 
 prior = foo[ ,7]
 
 violence = foo[ ,8]
 
 fit1h = glm(reoff ~ factor(subst) + factor(violence) + prior +
 factor(violence):factor(subst) + factor(violence):factor(educ) +
 factor(violence):factor(age) + factor(violence):factor(prior))
 
 summary(fit1h)
 
 
 If you noticed, there's no part of my code that looks like:
 
 family=binomial(link=logit))
 
 
 If I code like my book has done, it would look like:
 
 fit1i = glm(reoff ~ factor(subst) + factor(violence) + prior +
 factor(violence):factor(subst) + factor(violence):factor(educ) +
 factor(violence):factor(age) + factor(violence):factor(prior),
 family=binomial(link=logit))
 
 summary(fit1i)
 
 
 
 
 However, when I do this, my null deviance is 1104 and my residual deviance
 is 23460.  THIS IS A HUGE DIFFERENCE IN MODEL FIT!  I'm not sure if I have
 to redo my model or if my book was simply doing the
 family=binomial(link=logit) for a specific problem/reason.
 
 So, to my question:
 Do I need to include family=binomial(link=logit) in my code?  

Yes, though you could do with just 'family = binomial' since logit is
the default link function.

 Do I need
 to include any type of family?

If you don't want to use the default Gaussian family, then yes.

Whatever book it is you are working from (which you fail to identify)
ought to clearly explain the background on the use of the distribution
families in GLM's. There is a reason the author has included these
instructions and you need to pay attention to them.

If you look carefully at the output of summary(fit1h), you will likely
see:

  (Dispersion parameter for gaussian family taken to be )

and you will also notice that the tests being applied (3rd and 4th
columns in the coefficient summary table) are t tests and not z tests.

These should be a big hint that you are not working with the proper
family and are therefore not fitting a logistic regression model, which
is presumably the intent of this section of the book.

See ?glm and pay careful attention to the function defaults.

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] object(s) are masked from package - what does it mean?

2006-10-21 Thread Gregor Gorjanc
tom soyer tom.soyer at gmail.com writes:

 
 Hi,
 
 Sometime when I attach a dataset, R gives me the following
 message/warning:The following object(s) are masked from package:datasets:
 column_name. Does anyone know what it means? Since it seems that the
 dataset was attached and I could manipulate the data from the dataset
 without problems, I am wondering what was R trying to tell me.

As far as I understand it: Imagine there is a package A with object tmp and you
load it. When you load another package B with object of the same name i.e. tmp
then you will access tmp from package B and not A. You might work though with
namespaces, :: and :::

Gregor

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with installing Hmisc and Design: gfrotran: error

2006-10-21 Thread Stephan Lindner
cannot exec 'f951': No such file or directory
Reply-To: 

Hi all,

I'm using R on an Suse Linux system. Since Hmisc and Design need both
fortran, I installed gfortran (through rpm). However, I still get an
error message, namely:

gfrotran: error cannot exec 'f951': No such file or directory

And the installation is cancelled. I tried solving the problem through
googling it, and looking at the fortran and Hmisc page, but couldn't
find any valuable information. It would be wonderful if anyone could
help me.

Cheers,

   Stephan




-- 
---
Stephan Lindner, Dipl.Vw., MA
PhD Candidate
University of Michigan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to convert multiple dummy variables to 1 factor variable?

2006-10-21 Thread Wensui Liu
Dear Listers,

I am wondering how to convert multiple dummy variables to 1 factor variable.

Thanks.

wensui

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] glm function question

2006-10-21 Thread Gregor Gorjanc
Chris Linton connect.chris at gmail.com writes:

 
 I am creating a model attempting to predict the probability someone will
 reoffend after being caught for a crime.  There are seven total inputs and I
 planned on using a logistic regression.  I started with a null deviance of
 182.91 and ended up with a residual deviance of 83.40 after accounting for
 different interactions and such.  However, I realized after that my code is
 different from that in my book.  And I can't figure out what I need to put
 in it's place.  Here's my code:
 
...
 fit1h = glm(reoff ~ factor(subst) + factor(violence) + prior +
 factor(violence):factor(subst) + factor(violence):factor(educ) +
 factor(violence):factor(age) + factor(violence):factor(prior))
 
 summary(fit1h)
 
 If you noticed, there's no part of my code that looks like:
 
 family=binomial(link=logit))
 
...
 
 However, when I do this, my null deviance is 1104 and my residual deviance
 is 23460.  THIS IS A HUGE DIFFERENCE IN MODEL FIT!  I'm not sure if I have
 to redo my model or if my book was simply doing the
 family=binomial(link=logit) for a specific problem/reason.

You state that you model the *probability* that ... Then family=gaussian, which
is the default data generation model in glm is not appropriate. Yes, you need to
use family=binomial(link=logit) or family=binomial(link=probit), but you
also need to take care in proper specification of your y in the glm call.

Gregor

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] object(s) are masked from package - what does it mean?

2006-10-21 Thread Marc Schwartz
On Sat, 2006-10-21 at 18:43 -0500, tom soyer wrote:
 Hi,
 
 Sometime when I attach a dataset, R gives me the following
 message/warning:The following object(s) are masked from package:datasets:
 column_name. Does anyone know what it means? Since it seems that the
 dataset was attached and I could manipulate the data from the dataset
 without problems, I am wondering what was R trying to tell me.
 
 Thanks,
 
 Tom

This has to do with R's searchpath.

When you attach() a dataset, it loads the dataset into R's current
searchpath. The same occurs with packages when you use library(). This
can be seen with:

 search()
 [1] .GlobalEnvpackage:Designpackage:survival
 [4] package:splines   package:Hmisc package:chron
 [7] package:xtablepackage:gplotspackage:gtools
[10] package:gmodels   package:gdata package:methods
[13] package:stats package:graphics  package:grDevices
[16] package:utils package:datasets  Autoloads
[19] package:base

or with a bit more detail:

 searchpaths()
 [1] .GlobalEnv
 [2] /usr/local/lib/R/library/Design
 [3] /usr/local/lib/R/library/survival
 [4] /usr/local/lib/R/library/splines
 [5] /usr/local/lib/R/library/Hmisc
 [6] /usr/local/lib/R/library/chron
 [7] /usr/local/lib/R/library/xtable
 [8] /usr/local/lib/R/library/gplots
 [9] /usr/local/lib/R/library/gtools
[10] /usr/local/lib/R/library/gmodels
[11] /usr/local/lib/R/library/gdata
[12] /usr/local/lib/R/library/methods
[13] /usr/local/lib/R/library/stats
[14] /usr/local/lib/R/library/graphics
[15] /usr/local/lib/R/library/grDevices
[16] /usr/local/lib/R/library/utils
[17] /usr/local/lib/R/library/datasets
[18] Autoloads
[19] /usr/local/lib/R/library/base


You can see that the 'datasets' package is number 17 in the searchpath.

For example, if I now attach() the iris dataset (a widely used example
dataset in R):

 attach(iris)
 search()
 [1] .GlobalEnviris  package:Design
 [4] package:survival  package:splines   package:Hmisc
 [7] package:chron package:xtablepackage:gplots
[10] package:gtoolspackage:gmodels   package:gdata
[13] package:methods   package:stats package:graphics
[16] package:grDevices package:utils package:datasets
[19] Autoloads package:base


You can see that 'iris' is now number 2 in the searchpath. This means
that I can now access columns in the iris dataset without needing to use
the typical DataframeName$ColumnName syntax:

 Species
  [1] setosa setosa setosa setosa setosa setosa
  [7] setosa setosa setosa setosa setosa setosa
  ...

However:

 detach(iris)
 Species
Error: object Species not found
 iris$Species
  [1] setosa setosa setosa setosa setosa setosa
  [7] setosa setosa setosa setosa setosa setosa
  ...


In your case, the warning is telling you that you have attached a data
frame that presumably contains a column, whose name is 'column_name'. If
you use:

   column_name

in your R session, the object with that name in your data frame will be
seen before another object with the same name that is lower in the
searchpath, since your object is higher in the searchpath. Thus, your
object is 'masking' the other.

I did not see another object called column_name in my version of the
datasets package, so perhaps you have done some manipulation on those
objects during your R session.

As a general comment, as has been discussed previously in other threads,
this is a potentially significant danger in the habit of using attach()
and should generally be avoided.

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Multilevel model (lme) question

2006-10-21 Thread Lukas Rode
Dear list,

I'm trying to fit a multilevel (mixed-effects) model using the lme function
(package nlme) in R 2.4.0. As a mixed-effects newbie I'm neither sure about
the modeling nor the correct R syntax.

My data is structured as follows: For each subject, a quantity Y is measured
at a number (= 2) of time points. Moreover, at time point 0 (baseline), a
quantity X is measured for each subject (I am interested to see whether X,
or log(X), is a predictor of Y). I saw in some examples that time-invariant
predictors should be repeated for all rows of a subject, which is why I
copied the baseline value of X also to time points  0. The resulting data
frame looks like this:

Grouped Data: Y ~ TIME | Subject
Y TIMESubject X.Baseline
9   0.0 11084
7   3.7 11084
11 0.0 27150
8   9.2 27150

Intra-subject trajectories of Y very close to linear. I'd like to check
whether slope (and maybe also offset) of this line are (in part) predicted
by X.baseline.
Thus, I think the multilevel model specification should be as follows (i =
subject, j=measurement):
y_ij = \beta_i  + b_i * TIME_ij + \epsilon_ij,
with
b_i = \zeta_i0 + \zeta_i1 * X.Baseline
Is this correct?

Now, I am completely unsure how to translate this into the syntax needed
by lme.
Is there any standard procedure on how to get from e.g. the LairdWare'82
matrix model notation to the lme input?
And, in my case, is the correct model as follows, or am I wrong?
my.model - lme( Y ~ TIME, my.data.frame, random=~X.Baseline | Subject )
[in case this is correct, it is by accident, since I made it up from some
examples I found without really understanding the translation from the model
formulation]

I'd greatly appreciate some advice or help.
Best,
  Lukas

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to convert multiple dummy variables to 1 factor variable?

2006-10-21 Thread Marc Schwartz
On Sat, 2006-10-21 at 21:04 -0400, Wensui Liu wrote:
 Dear Listers,
 
 I am wondering how to convert multiple dummy variables to 1 factor variable.
 
 Thanks.
 
 wensui

I was thinking of a function that is essentially the reverse of
model.matrix() which is used by R modeling functions. I did not see one,
though it is possible that I missed it.

However, I suppose that something along the lines of the following would
work.

Say we have a matrix as follows, where the columns represent the
presence or absence of the factor levels, as one would see in a model
matrix. There should be a single '1' in each row as each row corresponds
to a single observation.

 mat
 Level1 Level2 Level3 Level4 Level5
[1,]  0  1  0  0  0
[2,]  1  0  0  0  0
[3,]  0  0  0  1  0
[4,]  0  0  1  0  0
[5,]  0  0  0  0  1


# Create a new factor based upon the index of each 1 in each row
# Use the matrix column names as the labels for each level
NewFactor - factor(apply(mat, 1, function(x) which(x == 1)), 
labels = colnames(mat))

 NewFactor
[1] Level2 Level1 Level4 Level3 Level5
Levels: Level1 Level2 Level3 Level4 Level5


HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] cross tabs with percents?

2006-10-21 Thread Larry White
-- apologies if this is a dup - i got a bounce saying the message was
unprocessed.

Is there a straightforward way to get a table with percents in the
cells rather than counts? I've looked at table, ftable, xtabs, and
ctab, which did the conversion but returned the results in a single
row without labels.

any suggestions are appreciated.
thank you.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cross tabs with percents?

2006-10-21 Thread ronggui
Maybe _CrossTable_ in gmodels package is what you want.


2006/10/22, Larry White [EMAIL PROTECTED]:

 -- apologies if this is a dup - i got a bounce saying the message was
 unprocessed.

 Is there a straightforward way to get a table with percents in the
 cells rather than counts? I've looked at table, ftable, xtabs, and
 ctab, which did the conversion but returned the results in a single
 row without labels.

 any suggestions are appreciated.
 thank you.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
»ÆÈÙ¹ó
Department of Sociology
Fudan University

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] object(s) are masked from package - what does it mean?

2006-10-21 Thread tom soyer
Thanks Marc and Gregor for the detailed explanation. You are right, the
masking is potentially dangerous. Since R is object oriented, I am surprised
that this is an issue. Does this mean that encapsulation does not exist in
R?

On 10/21/06, Marc Schwartz [EMAIL PROTECTED] wrote:

 On Sat, 2006-10-21 at 18:43 -0500, tom soyer wrote:
  Hi,
 
  Sometime when I attach a dataset, R gives me the following
  message/warning:The following object(s) are masked from
 package:datasets:
  column_name. Does anyone know what it means? Since it seems that the
  dataset was attached and I could manipulate the data from the dataset
  without problems, I am wondering what was R trying to tell me.
 
  Thanks,
 
  Tom

 This has to do with R's searchpath.

 When you attach() a dataset, it loads the dataset into R's current
 searchpath. The same occurs with packages when you use library(). This
 can be seen with:

  search()
 [1] .GlobalEnvpackage:Designpackage:survival
 [4] package:splines   package:Hmisc package:chron
 [7] package:xtablepackage:gplotspackage:gtools
 [10] package:gmodels   package:gdata package:methods
 [13] package:stats package:graphics  package:grDevices
 [16] package:utils package:datasets  Autoloads
 [19] package:base

 or with a bit more detail:

  searchpaths()
 [1] .GlobalEnv
 [2] /usr/local/lib/R/library/Design
 [3] /usr/local/lib/R/library/survival
 [4] /usr/local/lib/R/library/splines
 [5] /usr/local/lib/R/library/Hmisc
 [6] /usr/local/lib/R/library/chron
 [7] /usr/local/lib/R/library/xtable
 [8] /usr/local/lib/R/library/gplots
 [9] /usr/local/lib/R/library/gtools
 [10] /usr/local/lib/R/library/gmodels
 [11] /usr/local/lib/R/library/gdata
 [12] /usr/local/lib/R/library/methods
 [13] /usr/local/lib/R/library/stats
 [14] /usr/local/lib/R/library/graphics
 [15] /usr/local/lib/R/library/grDevices
 [16] /usr/local/lib/R/library/utils
 [17] /usr/local/lib/R/library/datasets
 [18] Autoloads
 [19] /usr/local/lib/R/library/base


 You can see that the 'datasets' package is number 17 in the searchpath.

 For example, if I now attach() the iris dataset (a widely used example
 dataset in R):

  attach(iris)
  search()
 [1] .GlobalEnviris  package:Design
 [4] package:survival  package:splines   package:Hmisc
 [7] package:chron package:xtablepackage:gplots
 [10] package:gtoolspackage:gmodels   package:gdata
 [13] package:methods   package:stats package:graphics
 [16] package:grDevices package:utils package:datasets
 [19] Autoloads package:base


 You can see that 'iris' is now number 2 in the searchpath. This means
 that I can now access columns in the iris dataset without needing to use
 the typical DataframeName$ColumnName syntax:

  Species
 [1] setosa setosa setosa setosa setosa setosa
 [7] setosa setosa setosa setosa setosa setosa
 ...

 However:

  detach(iris)
  Species
 Error: object Species not found
  iris$Species
 [1] setosa setosa setosa setosa setosa setosa
 [7] setosa setosa setosa setosa setosa setosa
 ...


 In your case, the warning is telling you that you have attached a data
 frame that presumably contains a column, whose name is 'column_name'. If
 you use:

  column_name

 in your R session, the object with that name in your data frame will be
 seen before another object with the same name that is lower in the
 searchpath, since your object is higher in the searchpath. Thus, your
 object is 'masking' the other.

 I did not see another object called column_name in my version of the
 datasets package, so perhaps you have done some manipulation on those
 objects during your R session.

 As a general comment, as has been discussed previously in other threads,
 this is a potentially significant danger in the habit of using attach()
 and should generally be avoided.

 HTH,

 Marc Schwartz




[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cross tabs with percents?

2006-10-21 Thread tom soyer
try prop.table()

On 10/21/06, ronggui [EMAIL PROTECTED] wrote:

 Maybe _CrossTable_ in gmodels package is what you want.


 2006/10/22, Larry White [EMAIL PROTECTED]:
 
  -- apologies if this is a dup - i got a bounce saying the message was
  unprocessed.
 
  Is there a straightforward way to get a table with percents in the
  cells rather than counts? I've looked at table, ftable, xtabs, and
  ctab, which did the conversion but returned the results in a single
  row without labels.
 
  any suggestions are appreciated.
  thank you.
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 



 --
 »ÆÈÙ¹ó
 Department of Sociology
 Fudan University

[[alternative HTML version deleted]]


 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Question:shardsplot (package:klaR)

2006-10-21 Thread ebi
Dear all,


 I have a question on the shardsplot package:klaR(see the below Example).

Plese tell me the meanings of  logstand - t((t(logcount) / sdlogcount) *
c(1,2,6,5,5,3)), much more.

Why does this example use c(1,2,6,5,5,3) ?

 Examples:



 # Compute clusters and an Eight Directions Arranged Map for the

 # country data. Plotting the result.

 data(countries)

 logcount - log(countries[,2:7])

 sdlogcount - apply(logcount, 2, sd)

 logstand - t((t(logcount) / sdlogcount) * c(1,2,6,5,5,3))

 cclasses - cutree(hclust(dist(logstand)), k = 6)

 countryEDAM - EDAM(logstand, classes = cclasses, sa = FALSE,

 iter.max = 10, random = FALSE)

 plot(countryEDAM, vertices = FALSE, label = TRUE, stck = FALSE)




Sincerely yours,
Mikio

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cross tabs with percents?

2006-10-21 Thread Jim Lemon
Larry White wrote:
 -- apologies if this is a dup - i got a bounce saying the message was
 unprocessed.
 
 Is there a straightforward way to get a table with percents in the
 cells rather than counts? I've looked at table, ftable, xtabs, and
 ctab, which did the conversion but returned the results in a single
 row without labels.
 
xtab in the prettyR package.

Jim

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] least median squares

2006-10-21 Thread Pedro Mardones
Does anyone can provide a code to implement least median squares
regression in R (not using the lqs function or calling C functions)?
Reason: teaching/learning purposes
Thanks
PM

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] least median squares

2006-10-21 Thread Gabor Grothendieck
Try this using builtin data set BOD:


medsq - function(p, DF) median((p[1] + p[2] * DF[[1]] - DF[[2]])^2)
init - coef(lm(demand ~ Time, BOD))
optim(init, medsq, DF = BOD)

library(MASS)
lqs(demand ~ Time, BOD, method = lms)



On 10/22/06, Pedro Mardones [EMAIL PROTECTED] wrote:
 Does anyone can provide a code to implement least median squares
 regression in R (not using the lqs function or calling C functions)?
 Reason: teaching/learning purposes
 Thanks
 PM

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] currency or stock trading strategy

2006-10-21 Thread Darren Weber
Thanks to everyone for many useful tips to more information.I
think this thread is worth reading, if only for that reason alone
(despite some disparaging comments about my research abilitites).  I
appreciate the advice from those who have more experience with these
finance tools than me (that is not hard!), as your advice could save
me many, many hours of googling and reading materials, trying to form
my own impressions and evaluations of the strengths and weaknesses in
various packages, with the aim of finding something easy to work with
at a relatively high level of function and performance (without the
price tag!). So, thanks!

I wonder if I might pose a small coding problem to the list, with the
intention that list members might offer some code in this thread to
illustrate the utilities available?  I would hope to see some
alternatives and some further commentary on preferred methods.  Take
for instance, the charts plotted on this page:

http://invest.kleinnet.com/bmw1/intro.html

I think it should be easy to keep a daily catalogue of all these
charts, and the associated metrics, for all stocks that have quote
data readily available.  What is the most efficient, succint methods
for doing this with open-source software?

Best, Darren



On 9/19/06, BBands [EMAIL PROTECTED] wrote:
 On 9/17/06, Darren Weber [EMAIL PROTECTED] wrote:
  Hi,
 
  are there any good charting and analysis tools for use with
  currencies, stocks, etc. in R?  I have some tools to download currency
  data from the NYFRB using python and XML.  Can we get and parse an XML
  download using R?  Can we have interaction in R plots?  Does anyone
  use R for back-testing trading strategies?  Are there any forums for
  discussion of using R for this specific purpose (apart from this
  general list)?  Is anyone aware of any general open-source
  developments for these purposes (I don't see any from GNU or google
  searches)?
 
  Take care, Darren

 I hardly know where to start either. I use R, Python, rpy, gnuplot-py
 and gnuplot together and am very happy with the result. I find that
 each piece has it own strengths and together they provide a very
 powerful solution. I am currently working on integrating TA-lib into
 this mix via a project started by Andrea Malagoli--see the
 r-sig-finance archives.

 http://rpy.sourceforge.net/
 http://www.gnuplot.info/
 http://gnuplot-py.sourceforge.net/
 http://ta-lib.org/

  jab
 --
 John Bollinger, CFA, CMT
 www.BollingerBands.com

 If you advance far enough, you arrive at the beginning.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.