Re: [R] improvement of Ancova analysis

2008-05-04 Thread Stephan Kolassa

Hi Tobias,

If you want to do inferential statistics with groups differing 
systematically on the covariate, you will need to be extra careful in 
your interpretation. See, e.g., Miller, G. A. & Chapman, J. P., 
Misunderstanding Analysis of Covariance, Journal of Abnormal Psychology, 
2001, 110, 40-48, and a lot of other similar things.


That said, with your wide variation in pes you may want to consider 
restricted cubic splines (natural splines) in Frank Harrell's Hmisc 
and Design packages. At least, it would be interesting to test whether 
the influence of pes really is linear, which can be done easily with 
splines. See also Harrell, F. E. Regression Modeling Strategies, 
Springer, 2001.
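
A minimal sketch of that linearity test, assuming your data frame is dat1 
as below (ols() and rcs() are in the Design package; its successor is rms):

library(Design)                        # Frank Harrell's Design package
dd <- datadist(dat1); options(datadist = "dd")
fit <- ols(log(wt) ~ rcs(log(pes), 4) + origin + gender, data = dat1)
anova(fit)   # the 'Nonlinear' row tests whether the pes effect is linear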


Good luck with your small furry creatures!
Stephan


Tobias Erik Reiners wrote:

Dear Helpers,

I just started working with R and I'm a bit overloaded with information.

My data is from marsupials reintroduced in an area. I have weight (wt) and 
hind foot length (pes) as continuous variables, and origin and gender as 
categorical. Condition is just the residuals I took from the model.


> names(dat1)
[1] "wt"        "pes"       "origin"    "gender"    "condition"

my model after model simplification so far:
model1 <- lm(log(wt) ~ log(pes) + origin + gender + gender:log(pes))
--> six intercepts and two slopes

the problem is I have some things I can't account for in my analysis:
1. Very different sample sizes for each of the treatments

> tapply(log(wt), origin, length)
captive    site    wild
    119     149      19

2. Substantial differences in the range of values taken by the covariate 
(leg length) between treatments

> tapply(pes, origin, var)
 captive     site     wild
82.43601 71.20000 60.42544

> tapply(pes, origin, mean)
 captive     site     wild
147.3261 144.8698 148.2895

4. Outliers
5. Poorly behaved residuals

Thanks for the answers; I am open-minded to any different kind of analysis.

Tobi







[R] Ancova_non-normality of errors

2008-05-04 Thread Tobias Erik Reiners

Hello Helpers,

I have some problems with fitting the model for my data...
My literature (Crawley's textbook) calls this non-normality of errors:
I get a banana-shaped Q-Q plot, with the opening of the banana pointing
downwards.


Structure of data:
 origin   wt   pes gender
1  wild 5.35 147.0   male
2  wild 5.90 148.0   male
3  wild 6.00 156.0   male
4  wild 7.50 157.0   male
5  wild 5.90 148.0   male
6  wild 5.95 148.0   male
7  wild 8.55 160.5   male
8  wild 5.90 148.0   male
9  wild 8.45 161.0   male
10 wild 4.90 147.0   male
11 wild 6.80 153.0   male
12 wild 5.75 146.0   male
13 wild 8.60 160.0   male
14  captive 6.85 159.0   male
15  captive 7.00 160.0   male
16  captive 6.80 155.0   male
..
...
283   site 4.10 130.4 female
284   site 3.55 131.1 female
285   site 4.20 135.7 female
286   site 3.45 128.0 female
287   site 3.65 125.3 female

The goal of my analysis is to work out what effect the categorical 
factors (origin, gender) have on the relation between log(wt) ~ log(pes) 
(--> condition, fat resources).
Does the source (origin) of the translocated animals have an effect on 
performance (condition) in the new area?

I already have a best-fit model and it looks quite good (or not? see below).

Two slopes (gender difference) and six intercepts (3 origin levels * 2 gender levels):

lm(formula = log(wt) ~ log(pes) + origin + gender + gender:log(pes))

Residuals:
 Min   1Q   Median   3Q  Max
-0.54181 -0.07671  0.01520  0.09474  0.28818

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)
(Intercept)          -7.39879    1.97605  -3.744 0.000219 ***
log(pes)              1.78020    0.40118   4.437 1.31e-05 ***
originsite            0.06572    0.01935   3.397 0.000781 ***
originwild            0.07655    0.03552   2.155 0.032011 *
gendermale           -9.32418    2.37476  -3.926 0.000109 ***
log(pes):gendermale   1.90393    0.47933   3.972 9.06e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1433 on 281 degrees of freedom
Multiple R-Squared: 0.7227, Adjusted R-squared: 0.7177
F-statistic: 146.4 on 5 and 281 DF,  p-value: < 2.2e-16

When I plot this model I get a banana shape in the normal Q-Q plot (with 
the open side pointing downwards), indicating non-normality of my 
data... how to handle this?


--> Do I have unbalanced data?
     captive site wild
n-->     119  149   19

My problem is that I can see my data is not as good as the model summary 
suggests.

Should I include another term in my model formula?

I think I have to differentiate more, but I don't know how (contrasts? 
TukeyHSD? Akaike Information Criterion? or lme()?). Too many different 
ways out there.


Cheers,
Tobi



[R] (no subject)

2008-05-04 Thread lmbewjs
Dear R users, 
 I have a question about R installation under Cygwin. Which version of R should 
I download, Linux or Windows? If Linux, which release should I download? Thanks 
a lot!


Jiansheng Wu 
PhD Candidate of State Key Laboratory of Bioelectronics Southeast University, 
Nanjing, 210096, China
Tel 86-25-83790881
Email: [EMAIL PROTECTED]



Re: [R] Installing R under Cygwin [was (no subject)]

2008-05-04 Thread Prof Brian Ripley

On Sun, 4 May 2008, lmbewjs wrote:


Dear R users,


I have a question about R installation under Cygwin. Which version of R 
should I download, Linux or Windows? If Linux, which release should I 
download? Thanks a lot!


Neither.  You need to download the source tarball (R-2.7.0.tar.gz on the 
CRAN front page), and build from the sources on Cygwin, following the 
instructions in the 'R Installation and Administration' manual.


Cygwin is not a supported platform, but R does more or less work under it.

You could install a binary Windows version, but it will not understand 
Cygwin file paths.



Jiansheng Wu
PhD Candidate of State Key Laboratory of Bioelectronics Southeast University,
Nanjing, 210096, China
Tel 86-25-83790881
Email: [EMAIL PROTECTED]




PLEASE do -- use a helpful subject line, no HTML mail, read the manuals 
before posting 



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK            Fax:  +44 1865 272595



Re: [R] Is my understanding of rlnorm correct?

2008-05-04 Thread phil colbourn
Thank you for your reply.

I think I have a poor understanding of this distribution but, if I
understand your answer albeit roughly, then to get a mean of 100 I need to
select a mu and derive the sd using sqrt(2*(log(100)-mu)).

That helps a lot.
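
A quick numerical check of that recipe (a sketch; mu = 3 is an arbitrary choice):

mu <- 3
sd <- sqrt(2 * (log(100) - mu))
mean(rlnorm(1e5, meanlog = mu, sdlog = sd))  # should land near 100, up to sampling error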

My application is in modeling/simulating failure/repair processes, which I
have read are typically log-normal. I should now be able to get the result
I expect.

Thanks again.

On Sun, May 4, 2008 at 3:11 PM, Berwin A Turlach [EMAIL PROTECTED]
wrote:

 G'day Phil,

 On Sun, 4 May 2008 14:05:09 +1000
 phil colbourn [EMAIL PROTECTED] wrote:

  rlnorm takes two 'shaping' parameters: meanlog and sdlog.
 
  meanlog would appear from the documentation to be the log of the
  mean, e.g. if the desired mean is 1 then meanlog=0.

 These two parameters are the mean and the sd on the log scale of the
 variate, i.e. if you take the logarithm of the produced numbers then
 those values will have the given mean and sd.

 If X has an N(mu, sd^2) distribution, then Y=exp(X) has a log-normal
 distribution with parameters mu and sd.

 R> set.seed(1)
 R> y <- rlnorm(10000, mean=3, sd=2)
 R> summary(log(y))
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  -4.343   1.653   2.968   2.987   4.355  10.620
 R> mean(log(y))
 [1] 2.986926
 R> sd(log(y))
 [1] 2.024713


  I noticed on wikipedia lognormal page that the median is exp(mu) and
  that the mean is exp(mu + sigma^2/2)
 
  http://en.wikipedia.org/wiki/Log-normal_distribution

 Where mu and sigma are the mean and standard deviation of the normal
 variate which is exponentiated to obtain a log-normal variate.  And
 this holds for the above example (up to sampling variation):

 R> mean(y)
 [1] 143.1624
 R> exp(3+2^2/2)
 [1] 148.4132

  So, does this mean that if I want a mean of 100 that the meanlog
  value needs to be log(100) - log(sd)^2/2?

 A mean of 100 for the log-normal variate?  In this case any set of mu
 and sd for which exp(mu+sd^2/2)=100 (or mu+sd^2/2=log(100)) would do
 the trick:

 R> mu <- 2
 R> sd <- sqrt(2*(log(100)-mu))
 R> summary(rlnorm(10000, mean=mu, sd=sd))
      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
 4.010e-04 1.551e+00 7.075e+00 1.006e+02 3.344e+01 3.666e+04
 R> mu <- 4
 R> sd <- sqrt(2*(log(100)-mu))
 R> summary(rlnorm(10000, mean=mu, sd=sd))
      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
    0.9965   25.9400   56.0200  101.2000  115.5000 3030.
 R> mu <- 1
 R> sd <- sqrt(2*(log(100)-mu))
 R> summary(rlnorm(10000, mean=mu, sd=sd))
      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
 9.408e-05 4.218e-01 2.797e+00 8.845e+01 1.591e+01 7.538e+04

 Note that given the variation we would expect in the mean in the last
 example, the mean is actually close enough to the theoretical value
 of 100:

 R> sqrt((exp(sd^2)-1)*exp(2*mu + sd^2)/10000)
 [1] 36.77435

 HTH.

 Cheers,

Berwin

 === Full address =
 Berwin A TurlachTel.: +65 6515 4416 (secr)
 Dept of Statistics and Applied Probability+65 6515 6650 (self)
 Faculty of Science  FAX : +65 6872 3919
 National University of Singapore
 6 Science Drive 2, Blk S16, Level 7  e-mail: [EMAIL PROTECTED]
 Singapore 117546
 http://www.stat.nus.edu.sg/~statba




-- 
Phil

 "Someone else has solved it and posted it on the internet for free" 



Re: [R] improvement of Ancova analysis

2008-05-04 Thread hadley wickham
On Sat, May 3, 2008 at 9:00 PM, Tobias Erik Reiners
[EMAIL PROTECTED] wrote:
 Dear Helpers,

  I just started working with R and I'm a bit overloaded with information.

  My data is from marsupials reintroduced in an area. I have weight (wt) and
  hind foot length (pes) as continuous variables, and origin and gender as
  categorical. Condition is just the residuals I took from the model.


  > names(dat1)
  [1] "wt"        "pes"       "origin"    "gender"    "condition"

  my model after model simplification so far:
  model1 <- lm(log(wt) ~ log(pes) + origin + gender + gender:log(pes))
  --> six intercepts and two slopes

  the problem is I have some things I can't account for in my analysis:
  1. Very different sample sizes for each of the treatments

  > tapply(log(wt), origin, length)
  captive    site    wild
      119     149      19
  2. Substantial differences in the range of values taken by the covariate
  (leg length) between treatments

  > tapply(pes, origin, var)
   captive     site     wild
  82.43601 71.20000 60.42544

  > tapply(pes, origin, mean)
   captive     site     wild
  147.3261 144.8698 148.2895

  4. Outliers
  5. Poorly behaved residuals

  Thanks for the answers; I am open-minded to any different kind of analysis.

How about starting with some graphics?  e.g. with ggplot2 the
following would give you some clues as to whether your models are
appropriate or not:

qplot(pes, wt, data = dat1, colour = gender, facets = . ~ origin,
  log = "xy") + geom_smooth(method = "lm")
qplot(pes, wt, data = dat1, facets = gender ~ origin, log = "xy") +
  geom_smooth(method = "lm")

If you wanted to see the effect of a robust fit, as suggested by
Brian Ripley, replace "lm" with "rlm".
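
A sketch with the substitution spelled out (assuming dat1 as in the quoted
post; rlm() comes from MASS, which must be loaded first):

library(MASS)      # provides rlm()
library(ggplot2)
qplot(pes, wt, data = dat1, facets = gender ~ origin, log = "xy") +
  geom_smooth(method = "rlm")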

Hadley

-- 
http://had.co.nz/



Re: [R] Errors bar in barchart

2008-05-04 Thread Ronaldo Reis Junior
On Fri, 02 May 2008, Deepayan Sarkar wrote:
 On 5/2/08, Ronaldo Reis Junior [EMAIL PROTECTED] wrote:
  Hi,
 
   I use barplot2 to make a bar plot with error bars. In old times I
  needed to use a sequence of segments() commands to make this.
 
   Now I try to do the same but using lattice. Is it possible to use
  barplot2 in the barchart function?
 
   If not, what is the simplest way to put error bars in barchart? I tried to
  find an example in the Lattice book, but didn't find anything like this.

 No there isn't.

 I don't like the idea of error bars on bar charts, and I would suggest
 you use them with dot plots instead. There is a demo of this that you
 can run using

  demo(intervals, package = "lattice")

 -Deepayan

Thanks,

I get it. I don't like this idea either, but some people are living in the past 
(Jethro?)

Thanks
Ronaldo
-- 
If you wait long enough, it will go away... after having done its damage.
If it was bad, it will be back.
--
 Prof. Ronaldo Reis Júnior
|  .''`. UNIMONTES/Depto. Biologia Geral/Lab. de Biologia Computacional
| : :'  : Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia
| `. `'` CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil
|   `- Fone: (38) 3229-8187 | [EMAIL PROTECTED] | [EMAIL PROTECTED]
| http://www.ppgcb.unimontes.br/ | ICQ#: 5692561 | LinuxUser#: 205366



Re: [R] Ancova_non-normality of errors

2008-05-04 Thread John Fox
Dear Tobias,

Your observation that "when plot [the residuals from?] this model I get a
banana-shape in Normal Q-Q Plot (with open site [side?] pointing downwards)"
suggests that the residuals are negatively skewed, which in turn suggests
that using log(wt) as the response variable may have been ill-advised.
Perhaps simply using wt, or a weaker transformation such as sqrt(wt), would
produce better-behaved residuals.
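
One hedged way to let the data pick among these transformations is
MASS::boxcox (a sketch, assuming Tobias's data are in dat1):

library(MASS)
bc <- boxcox(wt ~ log(pes) + origin + gender + gender:log(pes), data = dat1)
bc$x[which.max(bc$y)]  # lambda near 0 supports log(wt); near 0.5, sqrt(wt); near 1, wt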

I hope this helps,
 John 

--
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox


 -Original Message-
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On
 Behalf Of Tobias Erik Reiners
 Sent: May-04-08 5:56 AM
 To: r-help@r-project.org
 Subject: [R] Ancova_non-normality of errors
 
 Hello Helpers,
 
 I have some problems with fitting the model for my data...
 My literature (Crawley's textbook) calls this non-normality of errors:
 I get a banana-shaped Q-Q plot, with the opening of the banana pointing
 downwards.
 
 Structure of data:
   origin   wt   pes gender
 1  wild 5.35 147.0   male
 2  wild 5.90 148.0   male
 3  wild 6.00 156.0   male
 4  wild 7.50 157.0   male
 5  wild 5.90 148.0   male
 6  wild 5.95 148.0   male
 7  wild 8.55 160.5   male
 8  wild 5.90 148.0   male
 9  wild 8.45 161.0   male
 10 wild 4.90 147.0   male
 11 wild 6.80 153.0   male
 12 wild 5.75 146.0   male
 13 wild 8.60 160.0   male
 14  captive 6.85 159.0   male
 15  captive 7.00 160.0   male
 16  captive 6.80 155.0   male
 ..
 ...
 283   site 4.10 130.4 female
 284   site 3.55 131.1 female
 285   site 4.20 135.7 female
 286   site 3.45 128.0 female
 287   site 3.65 125.3 female
 
 The goal of my analysis is to work out what effect the categorical
 factors (origin, gender) have on the relation between log(wt) ~ log(pes)
 (--> condition, fat resources).
 Does the source (origin) of the translocated animals have an effect on
 performance (condition) in the new area?
 I already have a best-fit model and it looks quite good (or not? see
 below).

 Two slopes (gender difference) and six intercepts (3 origin levels *
 2 gender levels):
 
 lm(formula = log(wt) ~ log(pes) + origin + gender + gender:log(pes))
 
 Residuals:
   Min   1Q   Median   3Q  Max
 -0.54181 -0.07671  0.01520  0.09474  0.28818
 
 Coefficients:
                      Estimate Std. Error t value Pr(>|t|)
 (Intercept)          -7.39879    1.97605  -3.744 0.000219 ***
 log(pes)              1.78020    0.40118   4.437 1.31e-05 ***
 originsite            0.06572    0.01935   3.397 0.000781 ***
 originwild            0.07655    0.03552   2.155 0.032011 *
 gendermale           -9.32418    2.37476  -3.926 0.000109 ***
 log(pes):gendermale   1.90393    0.47933   3.972 9.06e-05 ***
 ---
 Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 
 Residual standard error: 0.1433 on 281 degrees of freedom
 Multiple R-Squared: 0.7227, Adjusted R-squared: 0.7177
 F-statistic: 146.4 on 5 and 281 DF,  p-value: < 2.2e-16
 
 When I plot this model I get a banana shape in the normal Q-Q plot (with
 the open side pointing downwards), indicating non-normality of my
 data... how to handle this?

 --> Do I have unbalanced data?
      captive site wild
 n-->     119  149   19
 
 My problem is that I can see my data is not as good as the
 model summary suggests.
 Should I include another term in my model formula?

 I think I have to differentiate more, but I don't know how (contrasts?
 TukeyHSD? Akaike Information Criterion? or lme()?). Too many different
 ways out there.
 
 Cheers,
 Tobi
 



[R] adaptive optimization of mesh size

2008-05-04 Thread baptiste Auguié

DeaR list,


I'm running an external program that computes some electromagnetic  
response of a scattering body. The numerical scheme is based on a  
discretization with a characteristic mesh size y. The smaller y is,  
the better the result (but obviously the computation will take longer).


A convergence study showed the error between the computed values and 
the exact solution of the problem to be quadratic in y, with the 
standard error increasing as y^3. I wrote the interface to the 
program in R, as it is much more user-friendly and allows for 
post-processing analysis. Currently, it only runs with a user-defined 
discretization parameter. I would like to implement an adaptive 
scheme [1] and provide the following improvements,


1) obtain an estimate of the error by fitting the result against a 
series of mesh sizes with the quadratic model, and extrapolating to y = 
0 (quite straightforward);


2) adapt dynamically the set of mesh sizes to fulfil a final 
accuracy condition, starting from an initial value (a rule-of-thumb 
estimate is given by the problem values). The lower limit of y should 
also be constrained by the resources (again, an empirical rule 
dictates the computation time and memory usage).


I'm looking for advice on this second point (both on the technical  
aspect, and whether this is sound statistically):


- I can foresee that I should always start with a few y values before  
I can do any extrapolation, but how many of them? 3, 10? How could I  
know?


- once I have enough points (say, 10) to use the fitting procedure 
and get an estimate of the error, how should I decide the best 
location of the next y if the error is too large?


- in a practical implementation, I would use a while loop and append  
the successive values to a data.frame(y, value). However, this  
procedure will be run for different parameters (wavelengths,  
actually), so the set and number of y values may vary between one run  
and another. I think I'd be better off using a list with each new run  
having its own data.frame. Does this make sense?
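
A sketch of that bookkeeping, reusing the y and result vectors from the
code below (the wavelength labels are made up):

runs <- list()
runs[["wl_500"]] <- data.frame(y = y, value = result)  # one adaptive run
runs[["wl_600"]] <- data.frame(y = y, value = result)  # another run, possibly a different length
sapply(runs, nrow)  # each run keeps its own set of mesh sizes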



Below are a few lines of code to illustrate the problem,


program.result <- function(x, p) { # made-up function that mimics
                                   # the results of the real program
  y <- p[3]*x^2 + p[2]*x + p[1]
  y * (1 + rnorm(1, mean=0, sd = 0.1 * y^3))
}


p0 <- c(0.1, 0.1, 2) # set of parameters

## user-defined limits of the y parameter (log scale)
limits <- c(0.1, 0.8)
limits.log <- (10^limits)
y.log <- seq(limits.log[1], limits.log[2], l=10)

y <- log10(y.log)

result <- sapply(y, function(x) program.result(x, p0)) # results of the program


#### fitting and extrapolation procedure ####
library(gplots) # plot with CI
plotCI(y, result, y^3, xlim=c(0, 1), ylim=c(0, 2)) # the data with y^3 errors


my.data <- data.frame(y = y, value = result)

fm <- lm(value ~ poly(y, degree=2, raw=TRUE), data = my.data,
         weights = 1/y^3)

lines(y, predict(fm, data.frame(y=y)), col = 2)

extrap <- summary(fm)$coefficients[1,] # intercept and error on it
plotCI(0, extrap[1], 2 * extrap[2], col = 2, add=T)


### my naive take on adaptive runs... ##

objective <- 1e-3 # stop when the standard error of the
                  # extrapolated value is smaller than this

err <- extrap[2]
my.color <- 3
while (err > objective) {

  new.value <- min(y)/2 # I don't know how to choose this optimally
  y <- c(new.value, y)
  new.result <- program.result(new.value, p0)
  result <- c(new.result, result)
  points(new.value, new.result, col = my.color)
  my.data <- data.frame(y = y, value = result)
  fm <- lm(value ~ poly(y, degree=2, raw=TRUE), data = my.data,
           weights = 1/y^3)

  lines(y, predict(fm, data.frame(y=y)), col = my.color)

  extrap <- summary(fm)$coefficients[1,] # intercept and error on it
  err <- extrap[2]
  print(err)
  plotCI(0, extrap[1], 2 * err, col = 2, add=T)
  my.color <- my.color + 1

}
err




Many thanks in advance for your comments,

baptiste


[1]: Yurkin et al., Convergence of the discrete dipole approximation.
II. An extrapolation technique to increase the accuracy.
J. Opt. Soc. Am. A, Vol. 23, No. 10, October 2006
_

Baptiste Auguié

Physics Department
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag
http://projects.ex.ac.uk/atto



[R] help with mars output

2008-05-04 Thread Russell Brown
I need help in interpreting some output of polymars.

The returned model by my command is:

m$model
  pred1    knot1 pred2 knot2       coefs           SE
1     0       NA     0    NA  1.3163     0.0007806758
2     1       NA     0    NA -0.10904285 0.0006735827
3     1 1.193575     0    NA  1.25396217 0.0205092438
4     1 1.205400     0    NA -0.18020096 0.0261931173
5     1 1.230725     0    NA  0.04709508 0.0118559207
6     1 1.285400     0    NA -0.02593593 0.0097733958
7     1 1.317275     0    NA  0.03516571 0.0177043207


I'm trying to confirm the generated output of mars by manually entering the 
basis functions into my dataset. What I don't understand is that in the knot1 
column there are two rows containing NA.  In the polymars literature I was 
reading, this means that these basis functions are linear.

BF1  = MAX(0, A2 - 1.193575)
BF2  = MAX(0, 1.193575 - A2)
BF3  = MAX(0, A2 - 1.2054)
BF5  = MAX(0, A2 - 1.230725)
BF7  = MAX(0, A2 - 1.2854)
BF11 = MAX(0, A2 - 1.317275)

RESPONSE = 1.3163 - 0.10904285 * C2 + 1.25396217 * D2 - 0.18020096 *
 E2 + 0.04709508 * F2 - 0.02593593 * G2 + 0.03516571 * H2
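
In R terms, the same prediction formula can be written directly (a sketch,
and an interpretation rather than polymars documentation: the pred1 = 0 row
is the intercept, the pred1 = 1 row with knot1 = NA is the plain linear
term, and each knotted row is a hinge term pmax(0, x - knot), with x the
first predictor):

fhat <- function(x)
  1.3163 - 0.10904285 * x +
    1.25396217 * pmax(0, x - 1.193575) -
    0.18020096 * pmax(0, x - 1.205400) +
    0.04709508 * pmax(0, x - 1.230725) -
    0.02593593 * pmax(0, x - 1.285400) +
    0.03516571 * pmax(0, x - 1.317275)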

On the attached, the CLOSE column is my source data, the ESTIMATE is the column 
generated by polymars, the PREDICTION column is my formula.

Can I have some help with this, please?







[R] Categorizing Fonts using Statistical Methods

2008-05-04 Thread Leonard Mada

Dear list members,

Every modern OS comes with dozens of useless fonts, so that the 
current font drop-down list in most programs is overcrowded with fonts 
one never will use. Selecting a useful font becomes a nightmare.


In an attempt to ease the selection of useful fonts, I began looking 
into sorting fonts using some statistical techniques. I summed my ideas 
on the OpenOffice.org wiki:

http://wiki.services.openoffice.org/wiki/User_Experience/ToDo/Product/Font_Categories

Of course, there is NO guarantee that something useful will emerge, but 
at least someone has tried it.


I would like to try various statistical methods using R; unfortunately, 
I got rather stuck in my attempts.


I wish to compute:
  - the length of a standard string for the various fonts
  - the weight
  - some variance-type measures for the OX and OY-axis
  - DCT (possibly analysing separately the low/high-frequencies)
  - maybe some other measures
[I am open to suggestions]

1.) First and foremost, I need the list of fonts installed on my system.
[I am using Win2k]

Is there any way to get it automatically in R?

IF this is not possible, I could create one by hand, though this is 
cumbersome, but the 2nd problem is more severe.


2.) How do I create/get the 2D-pixel matrix?

I need of course the f(x,y)-image representation for a standard text.

The following seems a rather ugly hack and I do not actually have the 
exact text-box size.


 png(file = "mytestfontimage.png")
 plot.new()
 title("This is a font")
 dev.off()

strwidth() and strheight() seem to be able to look into the fonts. But 
how do I get the pixels?

And more importantly, can I get also the exact pixel-matrix?
[Though, it seems there are no pixel-units in strheight()/width(), but I 
might be wrong on this.]
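
A slightly more controlled variant of the hack (a sketch: the file name and
specimen string are made up, and reading the bitmap back into R would still
need an external conversion or an image package). On Windows, windowsFonts()
lists the mapped font families:

png("font_sample.png", width = 400, height = 100)
par(mar = c(0, 0, 0, 0))                      # no margins: text fills the device
plot.new()
text(0.5, 0.5, "Hamburgefonstiv", cex = 3)    # classic type-specimen string
dev.off()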


3.) The image-analysis capabilities of R are rather limited.
I couldn't find any reference to a DCT transform (or other techniques). 
A search yielded only the following thread:

http://tolstoy.newcastle.edu.au/R/help/06/01/19615.html

As I do not know (and do not have access to) Matlab, I am rather confined to 
R. Which is not bad, but I need a lot of help to accomplish this task.


Any help is highly appreciated.

Sincerely,

Leonard Mada



Re: [R] Speedups with Ra and jit

2008-05-04 Thread Luke Tierney

A couple of comments on this and the original thread.

As pointed out by several posters, in a vectorized language like R one
can usually create the fastest and cleanest code by using vectorized
operations.  This applies to R as well as Matlab.

That said, there are at times reasons for using code consisting of a
set of tight nested loops, such as

it may not be obvious how to rewrite the loops using
vectorization

the code may match a published algorithm and so verifying its
correctness and maintaining it in this form may be easier than
rewriting

the code may not be fast or pretty but it works and gets the right
answer

Given these points it would be nice if R's interpreter could do a
better job.  R's interpreter is currently rather slow on these sorts
of computations.  For example, a direct translation of the original
poster's grw_permute example runs about 5 times as fast in the
xlispstat interpreter (similar internals in many ways).  There are
some things we could do involving relatively little effort to
marginally improve R's interpreter, but significant improvements would
likely require a fairly significant rewrite (which may be worth
considering at some point but probably not soon).

An alternative is to pursue automated tools to transform this sort of
code into faster code.  This is in part the motivation for creating a
byte code compiler for R, and for the Ra/jit project.  The current
byte code compiler (http://www.stat.uiowa.edu/~luke/R/compiler/)
speeds up this computation by about a factor of 3.  This compiler does
not yet optimize variable lookup or indexing of vectors or matrices;
these should be added sometime this summer and should add another
factor of 3 or so for this example.

Steve has done some very interesting things in his Ra/jit project.
Some of the optimizations done there are either already done in the
byte code compiler or have been planned to be added for a while.
Others, in particular specialization for basic data types may be best
done at run time as jit does, and there may be room for merging these
ideas with the static byte code compiler.

The particular specialization approach used in jit means that some
code will produce different results or generate errors; i.e. a user
who requests jit compilation is implicitly agreeing not to try to do
certain things, such as change types or sizes of values stored in
variables used in the compilation.  In the long run I would prefer
either a mechanism where such assumptions are declared explicitly by
the user or to arrange for R to automatically switch back to less
optimized code when the assumptions of the optimization are violated.

I believe the main aspect of runtime specialization done in jit now
that may be hard to match in statically compiled byte code is that
a function defined as

f - function(x) {
jit()
s - 0
for (i in seq_along(x)) s - s + x[i]
s
}

will be optimized for integer data when run on integer x and on for
real data when run on real x, and in both cases allocation of
intermediate results is avoided.  How valuable this is in the long run
is not yet clear -- it would definitely be very helpful if machine
code was being generated that also allowed intermediate values to stay
in registers (which I believe is what psyco does), but that is messy
and hard to do across many platforms.  With fast allocation the
benefits of avoiding allocation alone may not be that substantial.
For example, the byte compiled xlispstat version of the grw_permute
example mentioned above runs about twice as fast as the Ra/jit one
without avoiding intermediate allocation.  This isn't conclusive of
course and it will be interesting to do some more careful tests and
see what directions those suggest.

Best,

luke


On Fri, 2 May 2008, [EMAIL PROTECTED] wrote:


The topic of Ra and jit has come up on this list recently

(see http://www.milbo.users.sonic.net/ra/index.html)

so I thought people might be interested in this little demo. For it I
used my machine, a 3-year old laptop with 2Gb memory running Windows
XP, and the good old convolution example, the same one as used on the
web page, (though the code on the web page has a slight glitch in it).

This is using Ra with R-2.7.0.


conv1 <- function(a, b) {
### with Ra and jit

  require(jit)
  jit(1)
  ab <- numeric(length(a)+length(b)-1)
  for(i in 1:length(a))
    for(j in 1:length(b))
      ab[i+j-1] <- ab[i+j-1] + a[i]*b[j]
  ab
}


conv2 <- function(a, b) {
### with just Ra

  ab <- numeric(length(a)+length(b)-1)
  for(i in 1:length(a))
    for(j in 1:length(b))
      ab[i+j-1] <- ab[i+j-1] + a[i]*b[j]
  ab
}


x <- 1:2000
y <- 1:500
system.time(tst1 <- conv1(x, y))

   user  system elapsed
   0.53    0.00    0.55

system.time(tst2 <- conv2(x, y))

   user  system elapsed
   9.49    0.00    9.56

all.equal(tst1, tst2)

[1] TRUE

9.56/0.55

[1] 17.38182




However for this example you can achieve 
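
(The digest cuts the message off here. For comparison, a vectorized version
using base R's convolve() -- an addition, not part of the original post --
reproduces conv1/conv2 in milliseconds:)

tst3 <- convolve(x, rev(y), type = "open")  # FFT-based open convolution
all.equal(tst1, tst3)                       # TRUE, up to FFT rounding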

[R] Text shrinking in pdf graphics

2008-05-04 Thread Ronnen Levinson



Re: [R] Cross Spectrum Analysis

2008-05-04 Thread stephen sefick
Is this a problem? Are there error messages? If so, could you provide
them? Try as.matrix(yourdata). One thing you could do is create a moving
average that reduces the signals to the lowest common denominator. Could
you provide reproducible code, with maybe a toy data set, so anybody could
have a look at what is going on.
Good luck

Stephen

On Sat, May 3, 2008 at 6:11 PM, Maura E Monville [EMAIL PROTECTED]
wrote:

 In the case of multivariate input, from the documentation it looks like I
 can compare more than two signals at a time.
 Each column of the input matrix seems to accommodate a signal.
 The problem is that my signals do NOT have the same number of samples
 (length).
 They were all collected at 30 Hz, so the sampling time interval is roughly
 0.033 [s].
 Some signals have about 5000 samples and other ones have more than 8000.
 The R routine spectrum expects the multivariate input to be a matrix ...
 Any idea how to overcome such an obstacle?
 Padding with zeros would alter (I think) the phenomenon being studied,
 that is, breathing patterns.
 Is there a way to feed the spectrum function with the signal spectrum
 (power density) instead of the time-domain signal?
 Since the sampling interval is equal for all the signals, so is the Nyquist
 frequency. I can easily get the power spectrum
 defined over the domain [0, Nyquist frequency], which does not have the
 problem of different lengths ... ???

 Thank you so much.

 Maura

 On Wed, Apr 30, 2008 at 8:56 AM, stephen sefick [EMAIL PROTECTED] wrote:

   $names
    [1] "freq"      "spec"      "coh"       "phase"     "kernel"    "df"
    [7] "bandwidth" "n.used"    "orig.n"    "series"    "snames"    "method"
   [13] "taper"     "pad"       "detrend"   "demean"
 
  $freq and $spec are used to plot the power spectrum: freq is the x-axis
  and spec is the y-axis.  $coh is the squared coherency between the two
  signals in your case, and I believe that this is also plotted against
  frequency.  This is your correlation strength.  Phase I haven't been able
  to figure out - I think that it is some sort of estimator for the phase
  shift.  To get either the phase or the coherency plot, add the plot.type
  argument to your plot command
 
  x <- spectrum(yourdata, log = "no")  # this will plot it without a log scale
  I find it useful to look at both the non-log plot and then the log-scale plot
  (just remove the log = "no")

  plot(x, plot.type = "marginal")   # this is the default type (the
                                    # power spectrum)
  plot(x, plot.type = "phase")
  plot(x, plot.type = "coherency")
 
  also just look at
 
  ?spectrum
  Shumway's is a good book - I think it is something like Time Series
  Analysis with Examples in R
 
  hope this helps
 
  stephen
 
 
  On Tue, Apr 29, 2008 at 8:54 PM, Maura E Monville 
  [EMAIL PROTECTED] wrote:
 
   I am reading some documentation about Cross Spectrum Analysis as a
   technique
   to compare spectra.
   My understanding is  that it estimates the correlation strength
   between
   quasi-periodic structures embedded in two signals. I believe it may be
   useful for my signals  analysis.
  
   I was referred to the R  functions that  implement this type of
   analysis. I
   tried all the examples which generated a series of fancy plots. But  I
   need
   to work on the numerical results.
  
   I have read that the following info is available through cross-spectra
   analysis: Cross-periodogram, Cross-density, Quadrature-density,
   Cross-amplitude, Squared Coherency, Gain, and Phase Shift.
   I went through a couple of the two-series (bivariate) cross-spectrum
   analysis examples with R.
   I also printed out the attributes of the analysis (see the following).
   I
   cannot quite match the above quantities with the attributes/features
   output
   of cross-spectra analysis with R.
   I would greatly appreciate some explanation (which is which) and seeing
   some more worked-out examples.
   more worked out examples.
  
   > attributes(mfdeaths.spc)
   $names
    [1] "freq"      "spec"      "coh"       "phase"     "kernel"    "df"
    [7] "bandwidth" "n.used"    "orig.n"    "series"    "snames"    "method"
   [13] "taper"     "pad"       "detrend"   "demean"

   $class
   [1] "spec"
  
  
   Thank you so much.
  
   Yours Faithfully,
   --
   Maura E.M
  
  
 
 
 
  --
  Let's not spend our time and resources thinking about things that are so
  little or so large that all they really do for us is puff us up and make us
  feel like gods. We are mammals, and have not exhausted the annoying little
  problems of being mammals.
 
  -K. Mullis




 --
 Maura E.M




-- 
Let's not spend our time and resources thinking about things that are so
little or so large that all they really do for us is puff us up and make us
feel like gods. We are mammals, 

[R] Validating a mixed-effects model

2008-05-04 Thread Armin Goralczyk
Hi

I constructed a mixed-effects model from longitudinal repeated
measurements of lab values in 22 patients separated into two groups,
with the groups as fixed effect, using lme. I thought about using the
jackknife procedure, i.e., removing any one subject and recalculating
the fixed effect, to assess the stability of the fixed effect and
thereby validate the model. I suppose this has been done in the
following study:

http://content.nejm.org/cgi/content/full/357/19/1903
(this may be restricted access, sorry)

Is such an approach feasible?
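
A hedged sketch of that leave-one-subject-out loop (the data frame and
variable names here -- dat, patient, labvalue, group -- are assumptions):

library(nlme)
subjects <- unique(dat$patient)
jack <- sapply(subjects, function(s) {
  fit <- lme(labvalue ~ group, random = ~ 1 | patient,
             data = subset(dat, patient != s))
  fixef(fit)[2]   # the group fixed effect from this leave-one-out fit
})
summary(jack)     # how stable is the effect across deletions?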

Also, in the article results are confirmed by comparing the mixed model
with a fitted least-squares regression. I understand that this can be
achieved with lmList, but only for models without an additional
fixed effect!?

Are there any other good approaches to validate a mixed-effects model
that will be accepted in medical peer review?

-- 
Armin Goralczyk, M.D.
--
Universitätsmedizin Göttingen
Abteilung Allgemein- und Viszeralchirurgie
Rudolf-Koch-Str. 40
39099 Göttingen
--
Dept. of General Surgery
University of Göttingen
Göttingen, Germany
--
http://www.gwdg.de/~agoralc


Re: [R] Categorizing Fonts using Statistical Methods

2008-05-04 Thread Johannes Hüsing
Leonard Mada [EMAIL PROTECTED] [Sun, May 04, 2008 at 07:26:04PM CEST]:
 Dear list members,
 
 Every modern OS comes with dozens of useless fonts, so that the 
 current font drop-down list in most programs is overcrowded with fonts 
 one never will use. Selecting a useful font becomes a nightmare.
 
 In an attempt to ease the selection of useful fonts, I began looking 
 into sorting fonts using some statistical techniques. I summed my ideas 
 on the OpenOffice.org wiki:
 http://wiki.services.openoffice.org/wiki/User_Experience/ToDo/Product/Font_Categories
 
 Of course, there is NO guarantee that something useful will emerge, but 
 at least someone has tried it.
 

Why is there nothing mentioned with respect to the classical font 
categorization,
Venetian, Aldine, Transitional, Modern, Slab Serif, ... ?


[...]
   - maybe some other measures

If you can obtain the *.afm information of the font, you have some useful 
parameters such 
as cap height, ascender height, descender height, oblique angle ...
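
A sketch of pulling those fields out of an .afm file, which is plain text
(the file path here is an assumption):

afm <- readLines("Times-Roman.afm")
grep("^(CapHeight|Ascender|Descender|ItalicAngle|XHeight)", afm, value = TRUE)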

-- 
Johannes Hüsing   There is something fascinating about science. 
  One gets such wholesale returns of conjecture 
mailto:[EMAIL PROTECTED]  from such a trifling investment of fact.  
  
http://derwisch.wikidot.com (Mark Twain, Life on the Mississippi)



[R] plotting pie-charts into a coordinate system

2008-05-04 Thread Georg Ehret
Dear R user group,
 I wish to plot small pie-charts at specific coordinates in, e.g., a
scatter-plot:

E.g.:
 plot(rnorm(100), rnorm(100))
 points(1, 1, col = "red", cex = 4)
--> I wish to put pie(c(2,3)) at the position of the red circle...

How can I do this efficiently?

Thanking you and wishing you a wonderful Sunday!
Georg.
**
Georg Ehret
Johns Hopkins
Baltimore
USA



Re: [R] plotting pie-charts into a coordinate system

2008-05-04 Thread hadley wickham
On Sun, May 4, 2008 at 2:30 PM, Georg Ehret [EMAIL PROTECTED] wrote:
 Dear R user group,
  I wish to plot small pie-charts at specific coordinates in, e.g., a
  scatter-plot:

  E.g.:
   plot(rnorm(100), rnorm(100))
   points(1, 1, col = "red", cex = 4)
  --> I wish to put pie(c(2,3)) at the position of the red circle...

  How can I do this efficiently?

For some discussion of the disadvantages of this type of display, and
some suggestions for alternatives, you might like to read
http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=00018S.

Another interesting page is
http://www.math.yorku.ca/SCS/Gallery/minard/minard.pdf, which
describes the first use of this technique, by Minard.

Hadley


-- 
http://had.co.nz/



Re: [R] Categorizing Fonts using Statistical Methods

2008-05-04 Thread Leonard Mada

Hello Johannes,


Johannes Hüsing wrote:

Leonard Mada lmada_at_gmx.net [Sun, May 04, 2008 at 07:26:04PM CEST]:
 Dear list members,

 Every modern OS comes with dozens of useless fonts, so that the
 current font drop-down list in most programs is overcrowded with fonts
 one never will use. Selecting a useful font becomes a nightmare.

 In an attempt to ease the selection of useful fonts, I began looking
 into sorting fonts using some statistical techniques. I summed my ideas
 on the OpenOffice.org wiki:
 
http://wiki.services.openoffice.org/wiki/User_Experience/ToDo/Product/Font_Categories


 Of course, there is NO guarantee that something useful will emerge, but
 at least someone has tried it.


Why is there nothing mentioned with respect to the classical font 
categorization, Venetian, Aldine, Transitional, Modern, Slab Serif, ... ?


I played with the idea over and over again, but then decided against it.

I had a look both at the Adobe site and at various other sites (e.g. 
http://graphicdesign.spokanefalls.edu/tutorials/process/type_basics/type_families.htm#oldstyle).
Unfortunately, fonts belonging to different families may look very similar, 
while fonts within one family are different enough to warrant a distinct 
classification. Especially this latter aspect makes me think that the 
font families are not that helpful, and - when choosing the appropriate 
font - I do NOT want to limit myself to one family. A different font 
family might look even better.


Also, I cannot remember a single time I have used a font based on its 
family. Rather, a font gets selected based on how it looks within a 
specific document (well, mostly it gets selected because the person 
knows it - but let's ignore this and adopt a more scientific approach).


Selecting some measures, like font width, height, weight, complexity, 
compactness, slant, [...] seems a sensible approach.



[...]
 - maybe some other measures

If you can obtain the *.afm information of the font, you have some 
useful parameters such as cap height, ascender height, descender 
height, oblique angle ...


I do have a rather limited understanding of the font files proper. If I 
am correct, .afm files are available only for PostScript fonts. Of 
course, on Windows, most fonts will be TrueType and OpenType. I have no 
idea if such information is available for these fonts.


My primary problem is, however, that the purpose of this analysis is to 
let end-users perform this same analysis on their computers on their own 
font sets. My plan was to do a proof-of-concept analysis in R, and later 
(when I have some better idea how to categorise fonts and everything 
works fine) to post such a feature request in specific programs.


At this point, this sorting of fonts is of unproven benefit and of 
unknown behaviour. So, I wouldn't want to waste developers' time on 
something that might prove useless (though I have high expectations that 
something useful will emerge - NOT sure however which of the specific 
measures will bring the most differentiating features).


I still hope to complete this task successfully.

Many thanks for your advice, I will take another look at afm-files.

Sincerely,

Leonard



Re: [R] plotting pie-charts into a coordinate system

2008-05-04 Thread Paul Murrell
Hi


Georg Ehret wrote:
 Dear R user group,
  I wish to plot small pie-charts at specific coordinates in, e.g., a
 scatter-plot:
 
 E.g.:
 plot(rnorm(100), rnorm(100))
 points(1, 1, col = "red", cex = 4)
 --> I wish to put pie(c(2,3)) at the position of the red circle...
 
 How can I do this efficiently?


For one approach see "Integrating grid Graphics Output
with Base Graphics Output" in R News 3/2
(http://www.r-project.org/doc/Rnews/Rnews_2003-2.pdf)
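
A different route, not from that article: plotrix::floating.pie draws a
pie at given user coordinates (a sketch, assuming the plotrix package is
acceptable):

library(plotrix)
plot(rnorm(100), rnorm(100))
floating.pie(xpos = 1, ypos = 1, x = c(2, 3), radius = 0.2,
             col = c("red", "blue"))  # pie(c(2,3)) centred at (1, 1)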

Paul


 Thanking you and wishing you a wonderful Sunday!
 Georg.
 **
 Georg Ehret
 Johns Hopkins
 Baltimore
 USA
 

-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/



[R] Residual resampling for non linear reg model

2008-05-04 Thread Mary Winter



I was attempting to use the residual resampling approach to generate 999
bootstrap samples of alpha and beta and find their confidence intervals.
However, I keep getting the error message:

Error in nls(resample.mp ~ cases/(alpha + (beta * cases)), start = init.values, :
  singular gradient

after R has only produced a few bootstraps. Could anyone suggest where I am
going wrong? Would greatly appreciate it!

Mary

manpower <- read.table("http://www.creem.st-and.ac.uk/len/classes/mt3607/data/manhours_surgical.dat",
                       header=TRUE)
# attach data
attach(manpower)
B <- 999
# number of data points
n <- dim(manpower)[1]
# alpha level to use for the confidence limits
alpha <- 0.05
# matrix that's going to contain the bootstrap coefficients
boot.coef <- matrix(NA, B+1, 2)
# fit the initial model
init.values <- c(alpha=20, beta=0)
model <- nls(manhours ~ cases/(alpha + (beta*cases)), start=init.values, trace=TRUE)
pred <- predict(model)
resid <- resid(model)
# do the bootstrap
for (i in 1:B) {
  # resample the residuals
  resample.index <- sample(1:n, n, replace=T)
  resample.mp <- pred + resid[resample.index]
  # refit the model
  init.values <- c(alpha=20, beta=0)
  new.model <- nls(resample.mp ~ cases/(alpha + (beta*cases)), start=init.values,
                   trace=TRUE)
  # extract the parameter estimates
  boot.coef[i,] <- coef(new.model)
}
# add the original parameter estimates
boot.coef[B+1,] <- coef(model)
# calculate confidence intervals
for (i in 1:2) {
  ci <- quantile(boot.coef[,i], probs=c(alpha/2, (1-alpha/2)))
  cat("residual bootstrap confidence intervals for parameter", i, "are", ci, "\n")
}



Re: [R] Residual resampling for non linear reg model

2008-05-04 Thread Mary Winter
Since sending this message I have now solved the problem - I needed to alter
the initial values of alpha and beta!






[R] calling a C program from R

2008-05-04 Thread Paulo Cardoso
Is it possible to call a program (an .exe written in C) from the R GUI? I'm
interested in interacting with the inputs for this program and analyzing its
outputs with R.

The program itself reads an input.dat file with a number of needed
parameters. Based on this input file it produces a number of files summarizing
the parameterized analysis. I'm already able to read these outputs
with R, and produce a number of outputs (graphs and tables). Nevertheless,
each run of the program is done manually for now. It would be interesting to
loop the call of the program, manipulating the parameters at each run.

Is there a way to interact with an external exe from R?

Thanks in advance.

Paulo




Re: [R] calling a C program from R

2008-05-04 Thread Duncan Murdoch

On 04/05/2008 7:39 PM, Paulo Cardoso wrote:

Is it possible to call a program (an .exe written in C) from the R GUI? I'm
interested in interacting with the inputs for this program and analyzing its
outputs with R.

The program itself reads an input.dat file with a number of needed
parameters. Based on this input file it produces a number of files summarizing
the parameterized analysis. I'm already able to read these outputs
with R, and produce a number of outputs (graphs and tables). Nevertheless,
each run of the program is done manually for now. It would be interesting to
loop the call of the program, manipulating the parameters at each run.

Is there a way to interact with an external exe from R?


There are several.  You probably want to use system() or shell().
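
A hedged sketch of the loop Paulo describes (the program and file names
are made up):

params <- c(0.1, 0.5, 1.0)
results <- vector("list", length(params))
for (i in seq_along(params)) {
  writeLines(sprintf("param = %g", params[i]), "input.dat")  # rewrite the input file
  system("myprogram.exe")                                    # run the external program
  results[[i]] <- read.table("output.dat", header = TRUE)    # collect this run's output
}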

Duncan Murdoch



Re: [R] Cross Spectrum Analysis

2008-05-04 Thread Maura E Monville
The R function spectrum expects a time series as input. I have attached a
compressed archive with two detrended and denoised signals (txt format)
whose spectra I would like to compare.
I start out trying to generate a multivariate time series.
Please notice the different signal lengths. Moreover, the R command cbind
forces a matrix with the number of rows equal to the longer of the two
signals, padding the shorter one by recycling it from its start values.
What happens in the frequency domain?
The signals are sampled at 30 Hz.

> s10146 <- read.table("10146-Clean-Signal.txt")
> s45533 <- read.table("45533-Clean-Signal.txt")
> v10146 <- as.vector(s10146[,1])
> length(v10146)
[1] 8133
> v45533 <- as.vector(s45533[,1])
> length(v45533)
[1] 6764
> xx <- cbind(v10146, v45533)
> dim(xx)
[1] 8133    2
> v45533[1:10]
 [1] -1.7721546 -1.7482835 -1.6964711 -1.6154405 -1.5045701 -1.3747449
 [7] -1.2332980 -1.0912172 -0.9585821 -0.8420886
> xx[6760:6770,]
          v10146     v45533
 [1,] -0.8585375 -0.6076069
 [2,] -0.8060065 -0.5288312
 [3,] -0.7541174 -0.4447711
 [4,] -0.7028816 -0.3592778
 [5,] -0.6524279 -0.2767786
 [6,] -0.6027233 -1.7721546   # start replicating shorter signal
 [7,] -0.5536868 -1.7482835
 [8,] -0.5052780 -1.6964711
 [9,] -0.4574095 -1.6154405
[10,] -0.4097922 -1.5045701
[11,] -0.3623641 -1.3747449

> twosig <- ts(xx, deltat=0.033, start=0)  # time series
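
One hedged alternative to letting cbind() recycle (a sketch; whether
truncating to the common length is appropriate for these breathing signals
is a judgment call):

n <- min(length(v10146), length(v45533))
xx <- cbind(v10146[1:n], v45533[1:n])        # equal lengths, no recycling
twosig <- ts(xx, deltat = 0.033, start = 0)
spectrum(twosig)                             # cross-spectrum on equal-length series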







On Sun, May 4, 2008 at 2:14 PM, stephen sefick [EMAIL PROTECTED] wrote:

 Is this a problem?  Are there error messages?  If so, could you provide
 them.  Try as.matrix(yourdata).  One thing you could do is create a moving
 average that reduces the signals to the lowest common denominator.  Could
 you provide reproducible code with maybe a toy data set so anybody could
 have a look at what is going on.
 Good luck

 Stephen





Re: [R] quantitative spectra analysis

2008-05-04 Thread stephen sefick
Look at the spectrums before you do the cbind - I would not suggest letting
R wrap the data to fill in a data frame.  I would suggest using something
whose behaviour in the frequency domain you know, like zero.  You are
probably introducing periodicities that are not real, and I would suggest
not going down this path.  As for finding commonalities amongst signals - it
all depends on what a commonality is in the signal of interest.  I am sure
if you refine your question an answer can be found.  I have used beamforming
to look for common peaks among dissolved oxygen signals at different miles
along the river - a physicist friend wrote the code in Matlab - so I could
provide that, but I haven't looked into trying to make it an R-specific
algorithm, and in reality I am not sure if my programming is at the point of
being able to do something like that.  Hope this helps

On Sun, May 4, 2008 at 8:48 PM, Maura E Monville [EMAIL PROTECTED]
wrote:

 The attached picture is what I get passing the time series where the
 shorter signal is wrapped around.

  > s10146 <- read.table("10146-Clean-Signal.txt")
  > s45533 <- read.table("45533-Clean-Signal.txt")
  > v10146 <- as.vector(s10146[,1])
  > length(v10146)
  [1] 8133
  > v45533 <- as.vector(s45533[,1])
  > length(v45533)
  [1] 6764
  > xx <- cbind(v10146, v45533)
  > dim(xx)
  [1] 8133    2
  > v45533[1:10]
   [1] -1.7721546 -1.7482835 -1.6964711 -1.6154405 -1.5045701 -1.3747449
   [7] -1.2332980 -1.0912172 -0.9585821 -0.8420886
  > xx[6760:6770,]
            v10146     v45533
   [1,] -0.8585375 -0.6076069
   [2,] -0.8060065 -0.5288312
   [3,] -0.7541174 -0.4447711
   [4,] -0.7028816 -0.3592778
   [5,] -0.6524279 -0.2767786
   [6,] -0.6027233 -1.7721546   # start replicating shorter signal
   [7,] -0.5536868 -1.7482835
   [8,] -0.5052780 -1.6964711
   [9,] -0.4574095 -1.6154405
  [10,] -0.4097922 -1.5045701
  [11,] -0.3623641 -1.3747449

  > twosig <- ts(xx, deltat=0.033, start=0)  # time series
  > spectrum(twosig)

 Is there any quantitative analysis, operating in the frequency domain,
 that can help me identify common pattern features in signals?

 Thank you very much.

 --
 Maura E.M




-- 
Let's not spend our time and resources thinking about things that are so
little or so large that all they really do for us is puff us up and make us
feel like gods. We are mammals, and have not exhausted the annoying little
problems of being mammals.

-K. Mullis


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error in downViewport.vpPath(vpPathDirect(name)

2008-05-04 Thread Paul Murrell
Hi


Andrewjohnclose wrote:
 Hi,

 I am having trouble plotting a series of dendrograms using lattice and grid
 code as found in Paul Murrell's book "R Graphics".

 This is the error message I receive:

 Error in downViewport.vpPath(vpPathDirect(name), strict, recording =
 recording) :
   Viewport 'plot1.panel.1.1.off.vp' was not found

 I have attached the code and also my data file. Should anyone have any
 suggestions, your help would be gratefully appreciated.


Your 'height' factor has values 1, 2, 2, 3 so when the second panel is
drawn (for 'height == 2'), there are two dendrograms to draw in the
panel.  Specifically, 'dend4b$lower[[subscripts]]' fails because
'subscripts' is c(2, 3).  You can see this with some crude debugging as
 in ...

dendpanel <- function(x, y, subscripts, ...) {
    pushViewport(viewport(y = space,
                          width = 0.90,
                          height = unit(0.90, "npc") - space,
                          just = "bottom"))
    par(plt = gridPLT(), new = TRUE, ps = 10)
    cat(subscripts, "\n")   # print the panel's subscripts for debugging
    plot(dend4b$lower[[subscripts]], axes = FALSE)
    popViewport()
}

... and you can get something to work if you adjust the code like this ...

dendpanel <- function(x, y, subscripts, ...) {
    pushViewport(viewport(y = space,
                          width = 0.90,
                          height = unit(0.90, "npc") - space,
                          just = "bottom"))
    par(plt = gridPLT(), new = TRUE, ps = 10)
    plot(dend4b$lower[subscripts][[1]], axes = FALSE)  # first dendrogram only
    popViewport()
}

... but that drops one of the dendrograms.  You need to set up x, y, and
height differently so that they correspond better to the dendrogram
structure that you have in 'dend4b'.

Hope you can take it from there ...

Paul


 Thank you
 
 Andrew
 http://www.nabble.com/file/p17017801/dend4c.txt dend4c.txt 
 http://www.nabble.com/file/p17017801/gL2.csv gL2.csv 

-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error in downViewport.vpPath(vpPathDirect(name)

2008-05-04 Thread Paul Murrell
Hi


Paul Murrell wrote:
 Hi
 
 
 Andrewjohnclose wrote:
 Hi,

 I am having trouble plotting a series of dendrograms using lattice and grid
 code as found in Paul Murrell's book "R Graphics".

 This is the error message I receive:

 Error in downViewport.vpPath(vpPathDirect(name), strict, recording =
 recording) : 
   Viewport 'plot1.panel.1.1.off.vp' was not found


BTW, you are getting this error because the panel function is failing: the
popViewport() never happens, lattice gets left inside the grid viewport that
you pushed, and so lattice cannot find the viewports it expects to be able
to see.
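
A toy illustration of that failure mode (not your code; just a sketch showing
why a guarded pop keeps the viewport stack clean even when the body fails):

library(grid)
grid.newpage()
pushViewport(viewport(name = "outer"))
panelBody <- function() {
    pushViewport(viewport(name = "inner"))
    on.exit(popViewport())   # runs even if the plotting code below fails
    stop("simulated failure in the panel function")
}
try(panelBody())
current.viewport()$name  # still "outer": the stack was restored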

Paul


 I have attached the code and also my data file. Should anyone have any
 suggestions then your help would be gratefully appreciated.
 
 
 Your 'height' factor has values 1, 2, 2, 3 so when the second panel is
 drawn (for 'height == 2'), there are two dendrograms to draw in the
 panel.  Specifically, 'dend4b$lower[[subscripts]]' fails because
 'subscripts' is c(2, 3).  You can see this with some crude debugging as
  in ...
 
 dendpanel <- function(x, y, subscripts, ...) {
     pushViewport(viewport(y = space,
                           width = 0.90,
                           height = unit(0.90, "npc") - space,
                           just = "bottom"))
     par(plt = gridPLT(), new = TRUE, ps = 10)
     cat(subscripts, "\n")   # print the panel's subscripts for debugging
     plot(dend4b$lower[[subscripts]], axes = FALSE)
     popViewport()
 }
 
 ... and you can get something to work if you adjust the code like this ...
 
 dendpanel <- function(x, y, subscripts, ...) {
     pushViewport(viewport(y = space,
                           width = 0.90,
                           height = unit(0.90, "npc") - space,
                           just = "bottom"))
     par(plt = gridPLT(), new = TRUE, ps = 10)
     plot(dend4b$lower[subscripts][[1]], axes = FALSE)  # first dendrogram only
     popViewport()
 }
 
 ... but that drops one of the dendrograms.  You need to set up x, y, and
 height differently so that they correspond better to the dendrogram
 structure that you have in 'dend4b'.
 
 Hope you can take it from there ...
 
 Paul
 
 
 Thank you

 Andrew
 http://www.nabble.com/file/p17017801/dend4c.txt dend4c.txt 
 http://www.nabble.com/file/p17017801/gL2.csv gL2.csv 
 

-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] splitting a vector on comma

2008-05-04 Thread Georg Ehret
Dear R Usergroup,
I have the following character vector (a single comma-separated string)
and I would like to split it on ",". How can I do this?

u
[1] "160798191,160802762,160813395,160816017,160817873,160824082,160825247,160826925,160834272,160836257,"

Thank you in advance!
With my best regards, Georg.

Georg Ehret
Baltimore
USA


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] splitting a vector on comma

2008-05-04 Thread David Scott


?strsplit
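
For example (a small sketch, with a shortened version of your string):

u <- "160798191,160802762,160813395,160816017"
parts <- strsplit(u, ",")[[1]]   # strsplit returns a list; take its first element
as.numeric(parts)                # convert the pieces to numbers if needed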


On Sun, 4 May 2008, Georg Ehret wrote:


Dear R Usergroup,
I have the following character vector (a single comma-separated string)
and I would like to split it on ",". How can I do this?

u
[1] "160798191,160802762,160813395,160816017,160817873,160824082,160825247,160826925,160834272,160836257,"

Thank you in advance!
With my best regards, Georg.

Georg Ehret
Baltimore
USA


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



_
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142, NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email:  [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.