Re: [R] Cross Spectrum Analysis
On Sep 2, 2010, at 1:00 AM, aftar wrote: Hi. Does anyone know how I could find the cross spectrum? The spectrum function only gives the spectrum for each individual series.

I read the manual very differently. It is telling me that you _do_ get cross-spectrum results in the coh matrix:

Value: coh: NULL for univariate series. For multivariate time series, a matrix containing the squared coherency between different series. Column i + (j - 1) * (j - 2)/2 of coh contains the squared coherency between columns i and j of x, where i < j.

And there is a link on that help page to a help page that includes the plot.spec.coherency function. This response from Prof Ripley (very easily found with an RSiteSearch search for "cross spectrum") has a further link to a supplement to VR4:

http://finzi.psych.upenn.edu/Rhelp10/2009-November/219526.html
http://www.stats.ox.ac.uk/pub/MASS4/VR4stat.pdf

... which discusses coherency and illustrates processing of the coh object. -- David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to access some elements of a S4 object?
David, thanks a lot! It works, and that's what I need. Good night. Sean

From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Wed 9/1/2010 9:50 PM To: Xiang Li Cc: r-help@r-project.org Subject: Re: [R] How to access some elements of a S4 object?

On Sep 1, 2010, at 10:04 PM, Xiang Li wrote: Hi, could you please tell me how to access the elements of an S4 object?

Slot alpha.name: [1] Cutoff
p...@alpha.values
[[1]]
[1] Inf 0.991096434 0.984667270 0.984599159 0.983494405 0.970641299 0.959417835 0.945548941 0.943432189

If that is the print() result of p...@alpha.values, then it's probably an ordinary list with a single vector element, so just try p...@alpha.values[[1]][2]. (The str function also works on S4 objects.) -- David Winsemius, MD West Hartford, CT
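The pattern above can be illustrated on a small made-up S4 class (the class and slot names below are hypothetical, standing in for the poster's object):

```r
## Minimal sketch: an S4 object whose slot holds a list with one vector.
setClass("perf", representation(alpha.name = "character",
                                alpha.values = "list"))
p <- new("perf", alpha.name = "Cutoff",
         alpha.values = list(c(Inf, 0.9911, 0.9847, 0.9846)))

slotNames(p)             # lists the available slots
str(p)                   # str() works on S4 objects too
p@alpha.values[[1]][2]   # "@" extracts the slot; then index the list/vector
```

The key point is that `@` gets you the slot, after which ordinary `[[` and `[` indexing applies to whatever the slot contains.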
[R] How to generate integers from uniform distribution with fixed mean
Hi, folks, runif(n, min, max) is the typical code for generating random variates from a uniform distribution. But what if we need to fix the mean at 20, and we want the values to be integers only? Thanks
[R] extract month data automatically
Hi, I would like to extract monthly data automatically using code. My sample data are given below. I know how to do it one month at a time:

jan_data1 <- Pooraka_data[Pooraka_data$Month==1, 4]
feb_data1 <- Pooraka_data[Pooraka_data$Month==2, 4]
mar_data1 <- Pooraka_data[Pooraka_data$Month==3, 4]
apr_data1 <- Pooraka_data[Pooraka_data$Month==4, 4]
...

then I try:

p0_mn <- function(dt) {
  p0 <- sum(dt==0)/length(dt)
  mn <- mean(dt)
  c(p0=p0, mn=mn)
}
rbind(p0_mn(jan_data1), p0_mn(feb_data1), p0_mn(mar_data1), p0_mn(apr_data1),
      p0_mn(may_data1), p0_mn(jun_data1), p0_mn(jul_data1), p0_mn(aug_data1),
      p0_mn(sep_data1), p0_mn(oct_data1), p0_mn(nov_data1), p0_mn(dec_data1))

head(Pooraka_data,20); tail(Pooraka_data,20)
   Year Month Day Amount
1  1901 1  1 0.0
2  1901 1  2 3.0
3  1901 1  3 0.0
4  1901 1  4 0.5
5  1901 1  5 0.0
6  1901 1  6 0.0
7  1901 1  7 0.3
8  1901 1  8 0.0
9  1901 1  9 0.0
10 1901 1 10 0.0
11 1901 1 11 0.5
12 1901 1 12 1.8
13 1901 1 13 0.0
14 1901 1 14 0.0
15 1901 1 15 2.5
16 1901 1 16 0.0
17 1901 1 17 0.0
18 1901 1 18 0.0
19 1901 1 19 0.0
20 1901 1 20 0.0
      Year Month Day Amount
32858 1990 12 17 0.0
32859 1990 12 18 0.0
32860 1990 12 19 0.8
32861 1990 12 20 0.0
32862 1990 12 21 0.0
32863 1990 12 22 8.0
32864 1990 12 23 0.0
32865 1990 12 24 0.0
32866 1990 12 25 0.0
32867 1990 12 26 0.0
32868 1990 12 27 0.4
32869 1990 12 28 0.0
32870 1990 12 29 0.0
32871 1990 12 30 0.0
32872 1990 12 31 0.0
32873 1991 1  1 0.0
32874 1991 1  2 0.0
32875 1991 1  3 0.0
32876 1991 1  4 0.0
32877 1991 1  5 5.4

thank you.
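One way to avoid the twelve separate objects entirely is split()/sapply(): group the Amount column by Month and apply the summary function to each group in one call. A sketch, assuming a data frame shaped like the Pooraka_data excerpt above (the data here are simulated):

```r
## Simulated stand-in for Pooraka_data: Year/Month/Day/Amount columns.
set.seed(1)
Pooraka_data <- data.frame(Year   = 1901,
                           Month  = rep(1:12, each = 10),
                           Day    = rep(1:10, 12),
                           Amount = round(pmax(0, runif(120, -1, 3)), 1))

p0_mn <- function(dt) c(p0 = mean(dt == 0), mn = mean(dt))

## split() groups Amount by Month; sapply() applies p0_mn to each group.
## t() turns the result into one row per month, as in the rbind() above.
t(sapply(split(Pooraka_data$Amount, Pooraka_data$Month), p0_mn))
```

The same idea extends to Year/Month combinations via `split(x, list(Year, Month))` or `aggregate()`.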
[R] R CMD check Package(Windows): Error in inDL(x, as.logical(local), as.logical(now), ...) :
Hi, I've built my own package on Windows, and when I run R CMD check Package-Name I get:

* install options are ' --no-html'
* installing *source* package 'AceTest' ...
** libs making DLL ... g++ ...etc. installing to PATH ... done
** R
** preparing package for lazy loading
** help
Warning: ./man/AceTest-package.Rd:34: All text must be in a section
Warning: ./man/AceTest-package.Rd:35: All text must be in a section
*** installing help indices
** building package indices ...
** testing if installed package can be loaded
Error in inDL(x, as.logical(local), as.logical(now), ...) :
  unable to load shared library 'H:/RTick/Project/Client/AceTest.Rcheck/AceTest/libs/AceTest.dll':
  LoadLibrary failure: The specified module could not be found.
ERROR: loading failed
* removing 'H:/RTick/Project/Client/AceTest.Rcheck/AceTest'

Can someone point out what went wrong?
[R] using R's svd from outside R
Hi, I have to compute the singular value decomposition of rather large matrices. My test matrix is 10558 by 4255, and it takes about three minutes in R to decompose on a 64-bit quad-core Linux machine. (R is running svd in parallel; all four cores are at their maximum load while doing this.) I tried several BLAS and LAPACK libraries as well as the GNU Scientific Library in my C++ program. Apart from being unable to have them do the SVD in parallel (although I thought I did everything to make them run in parallel), execution time always exceeds 25 minutes, which is still far more than the expected 12 minutes for non-parallel R code. I am now going to call R from within my program, but this is not very elegant. So my questions are: Does R use a special SVD routine, and is it possible to use it directly by linking in the relevant libraries? (Sorry, but I couldn't figure that out by looking at the source code.) If that is possible, can I have the code run in parallel mode? Thanks, Ralf
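For context: R's svd() is a thin wrapper around La.svd(), which typically calls LAPACK's divide-and-conquer routine dgesdd; the parallelism the poster observes comes from the threaded BLAS that the R binary is linked against, not from a special R-only routine. A small sketch of the interface:

```r
## svd() wraps La.svd(), which calls LAPACK (typically dgesdd).
## Whether it runs in parallel depends on the linked BLAS
## (e.g. threaded ATLAS / OpenBLAS / MKL), not on R itself.
set.seed(1)
m <- matrix(rnorm(200 * 80), 200, 80)
s <- svd(m)                                  # list with d, u, v
max(abs(m - s$u %*% diag(s$d) %*% t(s$v)))   # reconstruction error, near 0
```

So linking the C++ program against the same threaded BLAS/LAPACK that the R build uses, and calling dgesdd rather than dgesvd, should in principle reproduce R's timing; this is an inference from how svd() is implemented, not something stated in the original post.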
Re: [R] pairs with same xlim and ylim scale
Hi Dejian, you're right on this! Do you know how to pass those two arguments into lower.panel? Thanks! ...Tao

From: Dejian Zhao zha...@ioz.ac.cn To: r-help@r-project.org Sent: Tue, August 31, 2010 6:10:16 PM Subject: Re: [R] pairs with same xlim and ylim scale

I think you have successfully passed the xlim and ylim into the function pairs1. Compare the two graphs produced by the code you provided: you can see that the xlim and ylim in the second graph have been reset to the assigned values. It seems that the program halted while producing the second plot after adding xlim and ylim. According to the error message, the two added parameters were not used in lower.panel, i.e. in the customized function f.xy.

On 2010-9-1 2:26, Shi, Tao wrote: Hi list, I have a function which basically is a wrapper of pairs with some useful panel functions. However, I'm having trouble passing xlim and ylim into the function so that the x and y axes are on the same scale and the 45-degree lines are exactly diagonal. I've looked at some old posts, but they didn't help much. [[elided Yahoo spam]] Thanks! ...Tao

pairs1 <- function(x, ...) {
  f.xy <- function(x, y, ...) {
    points(x, y, ...)
    abline(0, 1, col = 2)
  }
  panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor) {
    usr <- par("usr"); on.exit(par(usr))
    par(usr = c(0, 1, 0, 1))
    r <- abs(cor(x, y, method = "pearson", use = "pairwise.complete.obs"))
    txt <- format(c(r, 0.123456789), digits = digits)[1]
    txt <- paste(prefix, txt, sep = "")
    if (missing(cex.cor)) cex <- 0.8/strwidth(txt)
    text(0.5, 0.5, txt, cex = cex * r)
  }
  panel.hist <- function(x, ...) {
    usr <- par("usr"); on.exit(par(usr))
    par(usr = c(usr[1:2], 0, 1.5))
    h <- hist(x, plot = FALSE)
    breaks <- h$breaks; nB <- length(breaks)
    y <- h$counts; y <- y/max(y)
    rect(breaks[-nB], 0, breaks[-1], y, col = "cyan", ...)
  }
  pairs(x, lower.panel = f.xy, upper.panel = panel.cor,
        diag.panel = panel.hist, ...)
}
x <- rnorm(100, sd = 0.2)
x <- cbind(x = x - 0.1, y = x + 0.1)
pairs1(x)
pairs1(x, xlim = c(-1, 1), ylim = c(-1, 1))
Error in lower.panel(...)
: unused argument(s) (xlim = c(-1, 1), ylim = c(-1, 1))
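One workaround (a sketch, not from the thread): pairs() forwards its "..." both to the per-panel plot frames and to every panel function, so the error goes away if each panel explicitly accepts xlim and ylim and simply ignores them; the plot frames still pick the limits up, giving the common scale:

```r
## pairs2 is a hypothetical simplified variant of the poster's pairs1.
pairs2 <- function(x, ...) {
  ## xlim/ylim are declared only so the panel absorbs them instead of
  ## passing them on to points(), which would error; they are unused here.
  f.xy <- function(x, y, xlim = NULL, ylim = NULL, ...) {
    points(x, y, ...)
    abline(0, 1, col = 2)   # 45-degree reference line
  }
  pairs(x, lower.panel = f.xy, upper.panel = f.xy, ...)
}

z <- rnorm(100, sd = 0.2)
z <- cbind(x = z - 0.1, y = z + 0.1)
pairs2(z, xlim = c(-1, 1), ylim = c(-1, 1))  # axes now share one scale
```

The same two absorbed arguments would need to be added to panel.cor and panel.hist in the original pairs1.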
Re: [R] How to generate integers from uniform distribution with fixed mean
On Thu, Sep 2, 2010 at 7:17 AM, Yi liuyi.fe...@gmail.com wrote: Hi, folks, runif(n, min, max) is the typical code for generating random variates from a uniform distribution. But what if we need to fix the mean at 20, and we want the values to be integers only?

It's not clear what you want. Uniformly random integers with expected mean 20 - but what range? Any range centred on 20 will work; for example you could use sample() with replacement. To see the distribution, use

table(sample(17:23, 10000, TRUE))

which gives a uniform distribution of integers from 17 to 23; the mean was 20.0057 for 10000 samples. Is that what you want? Barry
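Barry's suggestion can be checked directly: any symmetric integer range centred on 20 has expected value exactly 20, and the sample mean converges to it. A short sketch:

```r
## Uniform integers with expected value 20: sample any symmetric range
## centred on 20 with replacement, e.g. 17:23 or 15:25.
set.seed(42)
x <- sample(17:23, 10000, replace = TRUE)
table(x)   # roughly equal counts in each of the seven cells
mean(x)    # close to 20 (exactly 20 in expectation)
```

If instead the *realized* mean must equal 20 exactly, that is a constrained sampling problem and simple independent draws will not do it.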
Re: [R] Cross Spectrum Analysis
Hi, I'm sorry, but I don't think that coherence is the same as the cross spectrum. People use coherence since it is much easier to deal with. I know how to use R to plot and calculate the coherence and phase, but what I don't know is how to calculate the cross spectrum using R. Regards, Aftar
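For what it's worth, the smoothed cross-spectrum can be reconstructed from exactly the quantities spec.pgram() already returns, since squared coherency is |s_xy|^2/(s_xx * s_yy) and the phase spectrum is arg(s_xy). A sketch (the series here are simulated, and scaling conventions vary between texts):

```r
## Recover the complex cross-spectrum s_xy(f) from spec.pgram() output:
## |s_xy| = sqrt(coh * s_xx * s_yy), arg(s_xy) = phase.
set.seed(1)
x <- arima.sim(list(ar = 0.6), n = 512)
y <- 0.7 * x + rnorm(512, sd = 0.5)

## Smoothing (spans) is essential: raw-periodogram coherency is always 1.
sp <- spec.pgram(cbind(x, y), spans = c(5, 7), taper = 0.1, plot = FALSE)

cross.amp  <- sqrt(sp$coh[, 1] * sp$spec[, 1] * sp$spec[, 2])  # |s_xy(f)|
cross.spec <- cross.amp * exp(1i * sp$phase[, 1])              # complex s_xy
head(cross.spec)
```

Equivalently one could compute the raw cross-periodogram directly as fft(x) * Conj(fft(y)) (up to a normalization constant) and smooth it by hand.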
[R] R code changes to crap
Hi, I am using vim to edit my files. I do not know what I pressed, but my R code has been converted to this:

%PDF-1.4 %81â81ã81Ï81Ó\r 1 0 obj /CreationDate (D:20100902122215) /ModDate (D:20100902122215) /Title (R Graphics Output) /Producer (R 2.11.1) /Creator (R)

How can I convert it back to my actual code? I do not have a backup for it. Thanks in advance, Khushwant
[R] reshape to wide format takes extremely long
Hello, I have a data.frame with the following format:

head(clin2)
  Study   Subject Type  Obs Cycle Day       Date  Time
1 A001101   10108  ALB 44.0    98   1 2004-03-11 14:26
2 A001101   10108  ALP 95.0    98   1 2004-03-11 14:26
3 A001101   10108  ALT 61.0    98   1 2004-03-11 14:26
5 A001101   10108  AST 33.0    98   1 2004-03-11 14:26

I want to transform this data.frame so that I have Obs columns for each Type. The full dataset is 45000 rows long. For a short subset of 100 rows, reshaping takes 0.2 seconds and produces what I want. All columns are in either numeric or character format (incl. date/time).

reshape(clin2, v.names="Obs", timevar="Type", direction="wide",
        idvar=c("Study","Subject","Cycle","Day","Date","Time"))
   Study   Subject Cycle Day       Date  Time Obs.ALB Obs.ALP Obs.ALT Obs.AST
1  A001101   10108    98   1 2004-03-11 14:26      44      95      61      33
11 A001101   10108     1   1 2004-03-12 14:01      41      85      39      33
21 A001101   10108     1   8 2004-03-22 10:34      40      90      70      34
30 A001101   10108     1  15 2004-03-29 09:56      45      97      66      48
[...]

However, when using the same reshape command for the full data.frame of 45000 rows, it still wasn't finished when run overnight (8 GB RAM + 8 GB swap in use). The time to process this data.frame increases from 0.2 sec for a 100-row subset to 60 sec for a 1000-row subset. I would greatly appreciate any advice on why the time for reshaping increases exponentially with the number of rows, and on how I can do this in an elegant way. Thanks! Coen.
Re: [R] R code changes to crap
On Sep 2, 2010, at 09:16, khush wrote: Hi, I am using vim to edit my files. I do not know what I pressed, but my R code has been converted to this: %PDF-1.4 %81â81ã81Ï81Ó\r 1 0 obj /CreationDate (D:20100902122215) /ModDate (D:20100902122215) /Title (R Graphics Output) /Producer (R 2.11.1) /Creator (R) How can I convert it back to my actual code? I do not have a backup for it.

Well, that's a PDF file with some R graphics inside, so presumably you did pdf(file="foo.R") or copied the graphics output as PDF to foo.R using the user-friendly interface to shooting yourself in the foot. Depending on your (unstated) platform and configuration, vim may be leaving .bak files around, which you can use, or you may have the commands echoed into an output script file, from which you can extract them. Otherwise, I suspect that you are out of luck.

-- Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] general question on binomial test / sign test
I test the null that the coin is fair (p(succ) = p(fail) = 0.5) with one trial and get a p-value of 1. Actually I want to prove the alternative hypothesis that the estimate is different from 0.5, which certainly cannot be proven here. But in reverse, the p-value of 1 says that I can be 100% sure that the estimate of 0.5 is true (??) - that's the point that astonishes me. Thanks if anybody can clarify this for me, Kay

Quoting Greg Snow greg.s...@imail.org: Try thinking this one through from first principles. You are essentially saying that your null hypothesis is that you are flipping a fair coin and you want to do a 2-tailed test. You then flip the coin exactly once: what do you expect to happen? The p-value of 1 just means that what you saw was perfectly consistent with what is predicted to happen when flipping a single time. Does that help? If not, please explain what you mean a little better. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kay Cichini Sent: Wednesday, September 01, 2010 3:06 PM To: r-help@r-project.org Subject: [R] general question on binomial test / sign test

hello, i did several binomial tests and noticed for one sparse dataset that binom.test(1,1,0.5) gives a p-value of 1 for the null, which i can't quite grasp. that would say that a prob of 1/2 has p-value of 0?? - i must be wrong but can't figure out the right interpretation. best, kay - Kay Cichini Postgraduate student Institute of Botany Univ. of Innsbruck
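The point under discussion is easy to see numerically: with one flip, both possible outcomes are perfectly consistent with p = 0.5, so the two-sided p-value is 1, and a large p-value is only a lack of evidence against the null, never proof that it is true. A sketch:

```r
## binom.test(successes, trials, hypothesized p), two-sided by default.
binom.test(1, 1, 0.5)$p.value     # 1: one flip can never contradict p = 0.5
binom.test(8, 10, 0.5)$p.value    # ~0.109: suggestive but not conclusive
binom.test(80, 100, 0.5)$p.value  # very small: strong evidence against p = 0.5
```

Note how the same observed proportion (0.8) yields very different p-values as the sample size grows: evidence accumulates with trials, which is exactly what a single trial cannot provide.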
Re: [R] R code changes to crap
On 02-Sep-10 07:16:54, khush wrote: Hi, I am using vim to edit my files. I do not know what I pressed, but my R code has been converted to this: %PDF-1.4 %81â81ã81Ï81Ó\r 1 0 obj /CreationDate (D:20100902122215) /ModDate (D:20100902122215) /Title (R Graphics Output) /Producer (R 2.11.1) /Creator (R) How can I convert it back to my actual code? I do not have a backup for it. Thanks in advance, Khushwant

The content you have listed above is part of a header from a PDF file, apparently generated by R from a plot() command ("R Graphics Output"), presumably encapsulated between pdf(...) and dev.off() commands. It is most unlikely that such a file would contain any R code (i.e. a list of R commands). Are you sure you are editing the correct file? What are you trying to do? In any case, there are almost no circumstances in which it is meaningful to apply a text editor (like vim) to a PDF file!

Possibly, if you are using a Windows platform, you may have mixed up a text file with R commands (e.g. project1.R) with a PDF file generated by R using the same name (e.g. project1.pdf), because Windows (by default) does not show you the extensions (respectively .R and .pdf). Please clarify! Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 02-Sep-10 Time: 09:01:35 -- XFMail --
Re: [R] ggplot2 multiple group barchart
Dear Greg, first convert your data.frame to long format using melt(). Then use ddply() to calculate the averages. Once you get to that point it should be rather straightforward.

library(ggplot2)
v1 <- c(1,2,3,3,4)
v2 <- c(4,3,1,1,9)
v3 <- c(3,5,7,2,9)
gender <- c("m","f","m","f","f")
d.data <- data.frame(v1, v2, v3, gender)
molten <- melt(d.data)
Average <- ddply(molten, c("gender", "variable"),
                 function(z) c(Mean = mean(z$value)))
ggplot(data = Average, aes(x = variable, y = Mean, fill = gender)) +
  geom_bar(position = position_dodge()) + coord_flip() +
  geom_text(position = position_dodge(0.5), aes(label = round(Mean, 1)),
            vjust = 0.5, hjust = 4, colour = "white", size = 7)
ggplot(data = Average, aes(x = variable, y = Mean)) + geom_bar() +
  coord_flip() +
  geom_text(aes(label = round(Mean, 1)), vjust = 0.5, hjust = 4,
            colour = "white", size = 7) +
  facet_wrap(~ gender)

HTH, Thierry

ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest, team Biometrie & Kwaliteitszorg / Biometrics & Quality Assurance, Gaverstraat 4, 9500 Geraardsbergen, Belgium tel. +32 54/436 185 thierry.onkel...@inbo.be www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Waller Gregor (wall) Sent: Wednesday, September 1, 2010 17:16 To: r-help@r-project.org Subject: [R] ggplot2 multiple group barchart

Hi there, I've got a problem with ggplot2.
Here is my example:

library(ggplot2)
v1 <- c(1,2,3,3,4)
v2 <- c(4,3,1,1,9)
v3 <- c(3,5,7,2,9)
gender <- c("m","f","m","f","f")
d.data <- data.frame(v1, v2, v3, gender)
d.data
x <- names(d.data[1:3])
y <- mean(d.data[1:3])
pl <- ggplot(data = d.data, aes(x = x, y = y))
pl <- pl + geom_bar()
pl <- pl + coord_flip()
pl <- pl + geom_text(aes(label = round(y, 1)), vjust = 0.5, hjust = 4,
                     colour = "white", size = 7)
pl

This gives me a nice barchart to compare the means of my variables v1, v2 and v3. My question: how do I proceed if I want this barchart split by the variable gender, so that I get two small bars for v1 (one for female and one for male), two bars for v2, etc.? I need them all in one chart. fill=gender and position="dodge" do not work. Any ideas? Thanks a lot, Greg
[R] Using library and lib.loc
Hi, I didn't find any post on this subject, so I'll ask you for some advice. Let's say that I have two library trees. Number 1 is the default R library tree on path1. Number 2 is another library tree on a server, with all packages, on path2.

When I call library(aaMI, lib.loc=path2), it loads the package even though it is not in the default R library. When I call library(fOptions, lib.loc=path2), it doesn't load, because timeSeries is not in the default R library (timeSeries is a required package for fOptions):

library(fOptions, lib.loc=.lib.loc)
Loading required package: timeDate
Loading required package: timeSeries
Error: package 'timeSeries' could not be loaded
In addition: Warning message:
In library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc = lib.loc) :
  there is no package called 'timeSeries'

(The error messages above are translated from French. By the way, how can I set the error-message language to English? Setting the language to English in the R installer is not sufficient!)

How can I set lib.loc for every package that I will load? Or is there any global way of doing this? Thanks, Olivier
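A global alternative to passing lib.loc on every call is .libPaths(), which prepends extra trees to the library search path for the whole session, so dependencies are also resolved there. A sketch ("path2" stands for the poster's server tree; fOptions is the package from the post):

```r
## Prepend the server library tree to the search path once per session
## (e.g. in .Rprofile); library() then searches it for every package,
## including dependencies such as timeSeries.
.libPaths(c("path2", .libPaths()))
.libPaths()          # "path2" is now searched first
library(fOptions)    # dependencies found across both trees
```

Note that .libPaths() silently drops directories that do not exist, so the path must be valid on the machine in question.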
Re: [R] general question on binomial test / sign test
You state: "in reverse the p-value of 1 says that i can [be] 100% sure that the estimate of 0.5 is true". This is where your logic about significance tests goes wrong.

The general logic of a significance test is that a test statistic (say T) is chosen such that large values represent a discrepancy between possible data and the hypothesis under test. When you have the data, T evaluates to a value (say t0). The null hypothesis (NH) implies a distribution for the statistic T if the NH is true. Then the value of Prob(T >= t0 | NH) can be calculated. If this is small, then the probability of obtaining data at least as discrepant as the data you did obtain is small; if sufficiently small, then the conjunction of NH and your data (as assessed by the statistic T) is so unlikely that you can decide not to believe that it is possible. If you so decide, then you reject the NH because the data are so discrepant that you can't believe it!

This is along the same lines as the reductio ad absurdum in classical logic: Hypothesis A implies that an outcome B must occur. But I have observed that B did not occur. Therefore A cannot be true. But it does not follow that, if you observe that B did occur (which is *consistent* with A), then A must be true. A could be false, yet B still occur; the only basis on which occurrence of B could *prove* that A must be true is when you have the prior information that B will occur *if and only if* A is true. In the reductio ad absurdum, and in the parallel logic of significance testing, all you have is "B will occur *if* A is true". The "only if" part is not there. So you cannot deduce that A is true from the observation that B occurred, since what you have to start with allows B to occur even if A is false (i.e. "B will occur *if* A is true" says nothing about what may or may not happen if A is false).

So, in your single toss of a coin, it is true that "I will observe either 'succ' or 'fail' if the coin is fair."
But (as in the above) you cannot deduce that the coin is fair just because you observe either 'succ' or 'fail', since it is possible (indeed necessary) that you obtain such an observation even if the coin is not fair (even if the coin is the same, either 'succ' or 'fail', on both sides, and therefore completely unfair). This is an attempt to expand Greg Snow's reply!

Your 2-sided test takes the form T=1 if either outcome='succ' or outcome='fail'. And that is the only possible value for T, since no other outcome is possible. Hence Prob(T==1) = 1 whether the coin is fair or not. It is not possible for such data to discriminate between a fair and an unfair coin. And, as explained above, a P-value of 1 cannot prove that the null hypothesis is true. All that is possible with a significance test is that a small P-value can be taken as evidence that the NH is false. Hoping this helps! Ted.
[R] Is there any package or function perform stepwise variable selection under GEE method?
Hi, I use library(gee), library(geepack) and library(yags) to perform GEE data analysis, but none of them can do variable selection. Both step and stepAIC can do variable selection based on the AIC criterion under linear regression and glm, but they do not work when the model is based on GEE. I want to ask whether any variable-selection function or package for GEE models is available now. Thanks! Best,
Re: [R] reshape to wide format takes extremely long
Hi: I did the following test using function ddply() in the plyr package on a toy data frame with 50000 observations: five studies, 20 subjects per study, 25 cycles per subject, five days per cycle and four observations by type per day. No date-time variable was included.

# Test data frame
big <- data.frame(study   = factor(rep(1:5, each = 10000)),
                  subject = factor(rep(1:100, each = 500)),
                  cycle   = rep(rep(1:25, each = 20), 100),
                  day     = rep(rep(1:5, each = 4), 2500),
                  type    = rep(c('ALB', 'ALP', 'ALT', 'AST'), 12500),
                  obs     = rpois(50000, 70))
dim(big)
[1] 50000     6

# 64-bit R on a Windows 7 box with 8 GB RAM and a 2.93 GHz Core Duo chip.
system.time(bigtab <- ddply(big, .(study, subject, cycle, day),
                            function(x) xtabs(obs ~ type, data = x)))
   user  system elapsed
  30.22    0.02   30.60
dim(bigtab)
[1] 12500     8
head(bigtab)
  study subject cycle day ALB ALP ALT AST
1     1       1     1   1  77  80  67  70
2     1       1     1   2  60  54  70  70
3     1       1     1   3  71  77  69  65
4     1       1     1   4  62  71  73  68
5     1       1     1   5  78  67  69  78
6     1       1     2   1  71  69  74  69
tail(bigtab)
      study subject cycle day ALB ALP ALT AST
12495     5     100    24   5  75  83  72  70
12496     5     100    25   1  85  52  62  70
12497     5     100    25   2  79  64  84  68
12498     5     100    25   3  67  65  74  81
12499     5     100    25   4  62  86  66  80
12500     5     100    25   5  58  76  85  84

There may be an easier/more efficient way to do this with melt() and cast() in the reshape package, but I moved on when I couldn't figure it out within ten minutes (probably because I was thinking "xtabs of obs by type for study/subject/cycle/day combinations - that's the ticket!" :) Packages sqldf and data.table are other viable options for this sort of task, and now that there is a test data set to play with, it would be interesting to see what else can be done. I'd be surprised if this couldn't be done within a few seconds, because the data frame is not that large.
Anyway, HTH, Dennis

On Thu, Sep 2, 2010 at 12:24 AM, Coen van Hasselt coenvanhass...@gmail.com wrote: Hello, I have a data.frame with the following format:

head(clin2)
  Study   Subject Type  Obs Cycle Day       Date  Time
1 A001101   10108  ALB 44.0    98   1 2004-03-11 14:26
2 A001101   10108  ALP 95.0    98   1 2004-03-11 14:26
3 A001101   10108  ALT 61.0    98   1 2004-03-11 14:26
5 A001101   10108  AST 33.0    98   1 2004-03-11 14:26

I want to transform this data.frame so that I have Obs columns for each Type. The full dataset is 45000 rows long. For a short subset of 100 rows, reshaping takes 0.2 seconds and produces what I want. All columns are in either numeric or character format (incl. date/time).

reshape(clin2, v.names="Obs", timevar="Type", direction="wide",
        idvar=c("Study","Subject","Cycle","Day","Date","Time"))
   Study   Subject Cycle Day       Date  Time Obs.ALB Obs.ALP Obs.ALT Obs.AST
1  A001101   10108    98   1 2004-03-11 14:26      44      95      61      33
11 A001101   10108     1   1 2004-03-12 14:01      41      85      39      33
21 A001101   10108     1   8 2004-03-22 10:34      40      90      70      34
30 A001101   10108     1  15 2004-03-29 09:56      45      97      66      48
[...]

However, when using the same reshape command for the full data.frame of 45000 rows, it still wasn't finished when run overnight (8 GB RAM + 8 GB swap in use). The time to process this data.frame increases from 0.2 sec for a 100-row subset to 60 sec for a 1000-row subset. I would greatly appreciate any advice on why the time for reshaping increases exponentially with the number of rows, and on how I can do this in an elegant way. Thanks! Coen.
Re: [R] how to represent error bar in a plot legend?
On 09/02/2010 04:56 AM, josef.kar...@phila.gov wrote: I have a simple barplot of 4 mean values, each mean value has an associated 95% confidence interval drawn on the plot as an error bar. I want to make a legend on the plot that uses the error bar symbol, and explains 95% C.I. How do I show the error bar symbol in the legend? I could not find any pch values that are appropriate

Hi Josef, There is a way to do this, but it is not too easy. First, use the my.symbols function in the TeachingDemos package to define the error bar symbol. Then, you can draw a legend without one or more symbol elements and place your error bar symbol where the symbol would have been. You can see how to do this in the code of the legendg function in the plotrix package by getting the necessary coordinates from the legend function. Jim

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
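A base-graphics sketch of the coordinate approach Jim describes, without TeachingDemos or plotrix. The means and interval half-widths are made up; the key point is that legend() invisibly returns the text coordinates, where a small error bar can then be drawn by hand:

```r
means <- c(5, 7, 6, 8)       # invented group means
ci <- c(1.1, 0.9, 1.3, 0.8)  # invented 95% CI half-widths
mids <- barplot(means, ylim = c(0, 10))
arrows(mids, means - ci, mids, means + ci,
       angle = 90, code = 3, length = 0.05)  # the error bars themselves

# Reserve a legend slot with an invisible symbol (pch = NA), then draw a
# small vertical error bar at the coordinates legend() returns.
lg <- legend("topleft", legend = "95% C.I.", pch = NA)
xs <- lg$text$x - 0.6 * strwidth("m")
arrows(xs, lg$text$y - 0.25, xs, lg$text$y + 0.25,
       angle = 90, code = 3, length = 0.03)
```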
Re: [R] reshape to wide format takes extremely long
Dear Dennis, cast() is in this case much faster.

system.time(bigtab <- ddply(big, .(study, subject, cycle, day),
                            function(x) xtabs(obs ~ type, data = x)))
   user  system elapsed
  35.36    0.12   35.53
system.time(bigtab2 <- cast(data = big, study + subject + cycle + day ~ type,
                            value = "obs", fun = mean))
   user  system elapsed
   4.09    0.00    4.09

I have the feeling that ddply() has a lot of overhead when the number of levels is large. HTH, Thierry

ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest, team Biometrie Kwaliteitszorg / team Biometrics Quality Assurance, Gaverstraat 4, 9500 Geraardsbergen, Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Dennis Murphy Sent: Thursday, 2 September 2010 11:34 To: Coen van Hasselt CC: r-help@r-project.org Subject: Re: [R] reshape to wide format takes extremely long
Re: [R] R CMD check Package(Windows): Error in inDL(x, as.logical(local), as.logical(now), ...) :
On 02/09/2010 2:29 AM, raje...@cse.iitm.ac.in wrote: Hi, I've built my own package in Windows and when I run R CMD check Package-Name I get:

* install options are ' --no-html'
* installing *source* package 'AceTest' ...
** libs making DLL ... g++ ...etc. installing to PATH ... done
** R
** preparing package for lazy loading
** help
Warning: ./man/AceTest-package.Rd:34: All text must be in a section
Warning: ./man/AceTest-package.Rd:35: All text must be in a section
*** installing help indices
** building package indices ...
** testing if installed package can be loaded
Error in inDL(x, as.logical(local), as.logical(now), ...) :
  unable to load shared library 'H:/RTick/Project/Client/AceTest.Rcheck/AceTest/libs/AceTest.dll':
  LoadLibrary failure: The specified module could not be found.
ERROR: loading failed

The message "The specified module could not be found" comes from Windows. It probably means that your dll depends on some other dll and the other one is unavailable, but it might mean that AceTest.dll itself can't be loaded (permission problems, defective dll, etc.). In some cases Windows will pop up a dialog box with more information than that skimpy message, e.g. if you install the package and try to run library(AceTest) from within Rgui. Duncan Murdoch

* removing 'H:/RTick/Project/Client/AceTest.Rcheck/AceTest'

can someone point out what went wrong?

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape to wide format takes extremely long
picking up on Thierry's example, I don't think you need any function because you are just reshaping (not aggregating). Therefore:

bigtab2 <- cast(data = big, study + subject + cycle + day ~ type, value = "obs")
head(bigtab2)
  study subject cycle day ALB ALP ALT AST
1     1       1     1   1  66  71  83  76
2     1       1     1   2  66  87  78  58
3     1       1     1   3  72  84  78  61
4     1       1     1   4  72  63  68  69
5     1       1     1   5  64  68  89  89
6     1       1     2   1  78  65  65  76
system.time(bigtab2 <- cast(data = big, study + subject + cycle + day ~ type,
                            value = "obs"))
   user  system elapsed
  0.760   0.000   0.782

david -- View this message in context: http://r.789695.n4.nabble.com/reshape-to-wide-format-takes-extremely-long-tp2487153p2506575.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R graphics: Add labels to stacked bar chart
Hi, I am looking for a way to add labels, i.e. absolute values, into a stacked bar chart using the basic plot functions of R. The labels should be inside the stacked bars. For example,

### I have this dataset
height = cbind(x = c(465, 91) / 465 * 100,
               y = c(840, 200) / 840 * 100,
               z = c(37, 17) / 37 * 100)
### and use this plot
barplot(height, beside = FALSE, horiz = TRUE, col = c(2, 3))

how can I put the values from height within the respective stacked bars? Thank you!

-- + Dipl.Biol. Jens Oldeland, Biodiversity of Plants, Biocentre Klein Flottbek and Botanical Garden, University of Hamburg, Ohnhorststr. 18, 22609 Hamburg, Germany Tel: 0049-(0)40-42816-407 Fax: 0049-(0)40-42816-543 Mail: oldel...@botanik.uni-hamburg.de oldel...@gmx.de (for attachments < 2mb!!) Skype: jens.oldeland http://www.biologie.uni-hamburg.de/bzf/fbda005/fbda005.htm http://jensoldeland.wordpress.com + __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R CMD check Package(Windows): Error in inDL(x, as.logical(local), as.logical(now), ...) :
On 02/09/2010 6:46 AM, raje...@cse.iitm.ac.in wrote: It is dependent on another dll but it did not give compilation errors. It seemed to link fine at that point. Why does it have a problem at this stage?

Windows needs to be able to find the other DLL at load time. It will find it if it's in the same directory or on the PATH. Duncan Murdoch

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] draw a graph of the semi-partial R-squared /CAH
Dear r-help, I am using CAH (hierarchical cluster analysis) and would like to cut my dendrogram. What is the command in R that draws a graph of the semi-partial R-squared? Best Regards __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
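There is no single built-in command for this, but one way to sketch such a curve with base R (using the built-in USArrests data; "semi-partial R-squared" is taken here, as in SAS PROC CLUSTER, to be the drop in R-squared when merging from k to k-1 clusters):

```r
x <- scale(USArrests)
hc <- hclust(dist(x), method = "ward.D2")  # called "ward" in older R versions

tss <- sum(x^2)  # total sum of squares (columns centered by scale())
r2 <- sapply(1:10, function(k) {
  cl <- cutree(hc, k)
  # within-cluster sum of squares for this k-cluster partition
  wss <- sum(sapply(split(as.data.frame(x), cl),
                    function(g) sum(scale(g, scale = FALSE)^2)))
  1 - wss / tss
})
plot(1:10, r2, type = "b",
     xlab = "number of clusters k", ylab = "R-squared")
round(-diff(r2), 3)  # R-squared lost at each merge, largest k first
```

A sharp elbow in the plotted curve is the usual cue for where to cut the dendrogram.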
Re: [R] R CMD check Package(Windows): Error in inDL(x, as.logical(local), as.logical(now), ...) :
It is dependent on another dll but it did not give compilation errors. It seemed to link fine at that point. Why does it have a problem at this stage?

From: Duncan Murdoch murdoch.dun...@gmail.com To: raje...@cse.iitm.ac.in Cc: r-help r-help@r-project.org Sent: Thursday, September 2, 2010 4:05:14 PM Subject: Re: [R] R CMD check Package(Windows): Error in inDL(x, as.logical(local), as.logical(now), ...) :

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error: could not find function ad.test
Hi, I'm trying to run an Anderson-Darling test for normality on a given variable 'Y':

ad.test(Y)

I think I need the 'nortest' package, but since it does not appear in any of the Ubuntu repositories for 2.10.1, I am wondering if it goes by the name of something else now? Thanks -- View this message in context: http://r.789695.n4.nabble.com/Error-could-not-find-function-ad-test-tp2501293p2501293.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R graphics: Add labels to stacked bar chart
On 09/02/2010 08:50 PM, Jens Oldeland wrote: ... I am looking for a way to add labels, i.e. absolute values, into a stacked bar chart using the basic plot functions of R. The labels should be inside the stacked bars.

barpos <- barplot(height, beside = FALSE, horiz = TRUE, col = c(2, 3))
library(plotrix)
boxed.labels(c(height[1, ] / 2, height[1, ] + height[2, ] / 2),
             rep(barpos, 2), c(height[1, ], round(height[2, ], 1)))

Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
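A base-only variant of the same idea (no plotrix), using text() at the midpoint of each stacked segment, with Jens's original data:

```r
height <- cbind(x = c(465, 91) / 465 * 100,
                y = c(840, 200) / 840 * 100,
                z = c(37, 17) / 37 * 100)
barpos <- barplot(height, beside = FALSE, horiz = TRUE, col = c(2, 3))

# Label x-positions: midpoint of the first segment, then midpoint of the
# second segment stacked on top of the first (one column per bar).
xpos <- rbind(height[1, ] / 2, height[1, ] + height[2, ] / 2)
text(xpos, rep(barpos, each = 2), labels = round(height, 1))
```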
[R] nlme formula from model specification
Dear R-community, I'm analysing some noise using the nlme package. I'm writing in order to get my usage of lme verified. In practice, a number of samples have been processed by a machine measuring the same signal at four different channels. I want to model the noise. I have taken the noise (the signal is from position 1 to 3500; after that there is only noise). My data looks like this:

channel.matrix:
      pos channel0 channel1 channel2 channel3 samplenumber
1    3501        8        3       12        1            1
2    3502        3        7        0       14            1
3    3503        9        1       13        3            1
4    3504        3        7        3       14            1
5    3505        6        5        4        5            1
6    3506        7        0       16        0            1
...
495  3995        5        2        9        9            1
496  3996        2        4        6       10            1
497  3997        3        2        7        7            1
498  3998        2        4        3        9            1
499  3999        3        1        6       11            1
500  4000        0        3        6        7            1
2301 3501        1        4        3        9            2
2302 3502        3        3        4       13            2
2303 3503        4        1        8        5            2
2304 3504        3        1       10        2            2
2305 3505        2        3        5        8            2
2306 3506        0        5        8        2            2
...

The model is

channel0 ~ alpha_i + eps_{i, j} + channel1 + channel2 + channel3

where i is sample number, j is position, and:
alpha_i: fixed effect for each samplenumber
eps_{i, j}: random effect, here with correlation structure as AR(1)
channel1, ..., channel3: fixed effects for each channel, not depending on samplenumber nor position

(And then afterwards I would model channel1 ~ ... + channel2 + channel3, etc.) I then use this function call:

channel.matrix.grouped <- groupedData(channel0 ~ pos | samplenumber,
                                      data = channel.matrix)
fit <- lme(channel0 ~ pos + samplenumber + channel1 + channel2 + channel3,
           random = ~ pos | samplenumber,
           correlation = corAR1(value = 0.5, form = ~ pos | samplenumber),
           data = channel.matrix.grouped)

Is that the right way to express the model in (n)lme notation? Cheers, Mikkel. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] reshape to wide format takes extremely long
Hi: Thanks, David and Thierry. I knew what I did was inefficient, but I'm not very adept with cast() yet. Thanks for the lesson! The time is less than half without the fun = mean statement, too. On my system, the timing of David's call was 2.06 s elapsed; with Thierry's, it was 4.88 s. Both are big improvements over my band-aid attempt. Regards, Dennis

On Thu, Sep 2, 2010 at 3:36 AM, carslaw david.cars...@kcl.ac.uk wrote:

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error: could not find function ad.test
Hi: One option to search for functions in R is to download the sos package:

library(sos)
findFn('Anderson-Darling')

On my system (Win 7), it found 51 matches. The two likeliest packages are nortest and ADGofTest. To answer your question, nortest still exists on CRAN. I can't comment on Ubuntu, but it's possible you may have to install it manually from the gzip sources. I'm sure that others with hands-on Ubuntu experience can provide a more complete answer. HTH, Dennis

On Thu, Sep 2, 2010 at 2:42 AM, DrCJones matthias.godd...@gmail.com wrote:

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
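Once nortest is installed (it is still a regular CRAN package, not renamed; on Ubuntu, install.packages("nortest") inside R works even when no distribution package exists), usage looks like this sketch - guarded so it degrades gracefully when the package is absent:

```r
# ad.test() lives in the nortest package; run it only if installed.
if (requireNamespace("nortest", quietly = TRUE)) {
  set.seed(42)
  Y <- rnorm(100)             # simulated normal data
  print(nortest::ad.test(Y))  # Anderson-Darling normality test
} else {
  message("Run install.packages('nortest') first.")
}
```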
Re: [R] nlme formula from model specification
Dear Mikkel, You need to do some reading on terminology. In your model the fixed effects are channel1, channel2 and channel3; samplenumber is a random effect, and eps_{i, j} is the error term. The model you described has the notation below. You do not need to create the grouped data structure.

lme(channel0 ~ pos + samplenumber + channel1 + channel2 + channel3,
    random = ~ 1 | samplenumber,
    correlation = corAR1(value = 0.5, form = ~ pos | samplenumber),
    data = channel.matrix)

HTH, Thierry

PS There is a dedicated mailing list for mixed models: R-sig-mixed-models

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Mikkel Meyer Andersen Sent: Thursday, 2 September 2010 13:30 To: r-help@r-project.org Subject: [R] nlme formula from model specification
[R] how to cluster vectors of factors
Hello all, I wonder what I can use to cluster vectors composed of several factors. Say around 30 different factors compose a vector; if a factor is present it is encoded as 1, and if it is not present it is encoded as 0. I was thinking of using hierarchical clustering, but as far as I know the distance between two vectors is calculated with the Euclidean distance function, and I don't think this distance would separate the data correctly: two vectors with different composition could end up having a similar distance to another vector. Hope someone can give me a clue! -- View this message in context: http://r.789695.n4.nabble.com/how-to-cluster-vectors-of-factors-tp2514654p2514654.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
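A sketch of the standard remedy for 0/1 presence/absence data: use a binary (Jaccard-type) distance rather than Euclidean, then cluster as usual. The data here are simulated, since no example data were posted:

```r
set.seed(1)
# 10 vectors, each recording presence/absence of 30 factors
m <- matrix(rbinom(10 * 30, size = 1, prob = 0.3), nrow = 10)

# dist(..., method = "binary"): for each pair of rows, the proportion of
# factors present in only one vector, among factors present in at least one.
d <- dist(m, method = "binary")
hc <- hclust(d, method = "average")
plot(hc, main = "Binary-distance clustering of presence/absence vectors")
```

Unlike Euclidean distance, this ignores shared absences, so two vectors are close only when they share the factors that are actually present.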
Re: [R] Is there any package or function perform stepwise variable selection under GEE method?
The absence of stepwise methods works to your advantage, as these yield invalid statistical inference and inflated regression coefficients. Frank E Harrell Jr, Professor and Chairman, Department of Biostatistics, School of Medicine, Vanderbilt University

On Thu, 2 Sep 2010, 黃敏慈 wrote: Hi, I use library(gee), library(geepack) and library(yags) to perform GEE data analysis, but none of them can do variable selection! Both step and stepAIC can do variable selection based on the AIC criterion under linear regression and glm, but they do not work when the model is based on GEE. I want to ask whether any variable selection function or package under the GEE model is available now? Thanks! Best,

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] nlme formula from model specification
Dear Thierry, Thanks for the quick answer. I'm moving this to r-sig-mixed-models (but also posting on r-help to notify). I reserved Mixed-effects models in S and S-PLUS by Pinheiro and Bates (New York: Springer, 2000). Do you know any other good references? Cheers, Mikkel.

2010/9/2 ONKELINX, Thierry thierry.onkel...@inbo.be:

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] general question on binomial test / sign test
thanks a lot for the elaborations. Your explanations made clear to me that either binom.test(1, 1, 0.5, alternative = "two.sided") or binom.test(0, 1, 0.5) giving a p-value of 1 simply indicates I have absolutely no assurance to reject H0. Considering binom.test(0, 1, 0.5, alternative = "greater") and binom.test(1, 1, 0.5, alternative = "less"), where I get a p-value of 1 and 0.5, respectively - am I right in stating that for the first estimate (0/1) I have no assurance at all for rejecting H0, and for the second estimate (1/1) I have the same chance of being wrong whether I reject or keep H0? many thanks, kay

Quoting ted.hard...@manchester.ac.uk:

You state: "in reverse the p-value of 1 says that I can be 100% sure that the estimate of 0.5 is true." This is where your logic about significance tests goes wrong. The general logic of a significance test is that a test statistic (say T) is chosen such that large values represent a discrepancy between possible data and the hypothesis under test. When you have the data, T evaluates to a value (say t0). The null hypothesis (NH) implies a distribution for the statistic T if the NH is true. Then the value of Prob(T >= t0 | NH) can be calculated. If this is small, then the probability of obtaining data at least as discrepant as the data you did obtain is small; if sufficiently small, then the conjunction of NH and your data (as assessed by the statistic T) is so unlikely that you can decide not to believe that it is possible. If you so decide, then you reject the NH because the data are so discrepant that you can't believe it! This is on the same lines as the reductio ad absurdum in classical logic: A hypothesis A implies that an outcome B must occur. But I have observed that B did not occur. Therefore A cannot be true. But it does not follow that, if you observe that B did occur (which is *consistent* with A), then A must be true.
A could be false, yet B still occur -- the only basis on which occurrence of B could *prove* that A must be true is when you have the prior information that B will occur *if and only if* A is true. In the reductio ad absurdum, and in the parallel logic of significance testing, all you have is B will occur *if* A is true. The only if part is not there. So you cannot deduce that A is true from the observation that B occurred, since what you have to start with allows B to occur if A is false (i.e. B will occur *if* A is true says nothing about what may or may not happen if A is false). So, in your single toss of a coin, it is true that I will observe either 'succ' or 'fail' if the coin is fair. But (as in the above) you cannot deduce that the coin is fair if you observe either 'succ' or 'fail', since it is possible (indeed necessary) that you obtain such an observation if the coin is not fair (even if the coin is the same, either 'succ' or 'fail', on both sides, therefore completely unfair). This is an attempt to expand Greg Snow's reply! Your 2-sided test takes the form T=1 if either outcome='succ' or outcome='fail'. And that is the only possible value for T since no other outcome is possible. Hence Prob(T==1) = 1 whether the coin is fair or not. It is not possible for such data to discriminate between a fair and an unfair coin. And, as explained above, a P-value of 1 cannot prove that the null hypothesis is true. All that is possible with a significance test is that a small P-value can be taken as evidence that the NH is false. Hoping this helps! Ted. On 02-Sep-10 07:41:17, Kay Cecil Cichini wrote: i test the null that the coin is fair (p(succ) = p(fail) = 0.5) with one trail and get a p-value of 1. actually i want to proof the alternative H that the estimate is different from 0.5, what certainly can not be aproven here. but in reverse the p-value of 1 says that i can 100% sure that the estimate of 0.5 is true (??) - that's the point that astonishes me. 
thanks if anybody could clarify this for me, kay Zitat von Greg Snow greg.s...@imail.org: Try thinking this one through from first principles, you are essentially saying that your null hypothesis is that you are flipping a fair coin and you want to do a 2-tailed test. You then flip the coin exactly once, what do you expect to happen? The p-value of 1 just means that what you saw was perfectly consistent with what is predicted to happen flipping a single time. Does that help? If not, please explain what you mean a little better. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Kay Cichini Sent: Wednesday, September 01, 2010 3:06 PM To: r-help@r-project.org Subject: [R] general question on binomial test / sign test hello, i did several binomial tests and noticed for one sparse dataset that binom.test(1,1,0.5) gives a p-value of 1 for the null, what i can't quite grasp. that would say that the a
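The single-flip behaviour discussed in this thread is easy to verify directly; a minimal sketch in base R:

```r
# Two-sided test on one flip: both outcomes are equally extreme under H0,
# so the p-value is 1 no matter what was observed
binom.test(1, 1, p = 0.5, alternative = "two.sided")$p.value  # 1

# One-sided tests on one flip: even the outcome pointing toward the
# alternative only reaches a p-value of 0.5
binom.test(1, 1, p = 0.5, alternative = "greater")$p.value    # P(X >= 1) = 0.5
binom.test(0, 1, p = 0.5, alternative = "greater")$p.value    # P(X >= 0) = 1
```

With n = 1 the smallest attainable one-sided p-value is 0.5, so no single flip can ever reach a conventional significance level.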
Re: [R] how to represent error bar in a plot legend?
I think I found a simple solution, although it requires some tinkering to find the x, y coordinates of the plot region to place the text...

barplot( )
text(x = 2.9, y = 0.43, srt = 90, labels = "H", cex = 1.5, col = "blue") # srt rotates the "H" character, so that it resembles an error bar
text(x = 3.5, y = 0.432, labels = "95% C.I.", cex = 1.1)
rect(2.62, 0.41, 3.9, 0.45) # draws a box around the text

[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lower triangle of the correlation matrix with xtable
Dear all,

mydata <- data.frame(x1 = c(1,4,6), x2 = c(3,1,2), x3 = c(2,1,3))
cor(mydata)
           x1         x2        x3
x1  1.0000000 -0.5960396 0.3973597
x2 -0.5960396  1.0000000 0.5000000
x3  0.3973597  0.5000000 1.0000000

I wonder if it is possible to fill only the lower triangle of this correlation matrix? Using 'dist' doesn't seem to be useful, as it doesn't allow converting this table with xtable:

print(xtable(cor(mydata), digits = 3))

Any ideas? Cheers, Olga __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Kolmogorov Smirnov p-values
I was just testing out ks.test():

y <- runif(1, min = 0, max = 1)
ks.test(y, runif, min = 0, max = 1, alternative = "greater")

One-sample Kolmogorov-Smirnov test
data: y
D^+ = 0.9761, p-value < 2.2e-16
alternative hypothesis: the CDF of x lies above the null hypothesis

It seems that every time I run it, I get a highly significant p-value (for two-sided or one-sided, exact=TRUE or FALSE). Can anybody tell me what is going on? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lower triangle of the correlation matrix with xtable
Like this?

mydata <- data.frame(x1 = c(1,4,6), x2 = c(3,1,2), x3 = c(2,1,3))
as.dist(cor(mydata))
           x1        x2
x2 -0.5960396
x3  0.3973597 0.5000000

Sarah On Thu, Sep 2, 2010 at 9:51 AM, Olga Lyashevska o...@herenstraat.nl wrote: Dear all, mydata <- data.frame(x1 = c(1,4,6), x2 = c(3,1,2), x3 = c(2,1,3)) cor(mydata) gives x1 1.0000000 -0.5960396 0.3973597 / x2 -0.5960396 1.0000000 0.5000000 / x3 0.3973597 0.5000000 1.0000000. I wonder if it is possible to fill only the lower triangle of this correlation matrix? Using 'dist' doesn't seem to be useful, as it doesn't allow converting this table with xtable. print(xtable(cor(mydata), digits = 3)) Any ideas? Cheers, Olga -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Kolmogorov Smirnov p-values
Hi, Are you sure you don't want to do ks.test(y, punif, min = 0, max = 1, alternative = "greater") instead of what you tried? Alain On 02-Sep-10 15:52, Samsiddhi Bhattacharjee wrote: ks.test(y, runif, min = 0, max = 1, alternative = "greater") -- Alain Guillet Statistician and Computer Scientist SMCS - IMMAQ - Université catholique de Louvain Bureau c.316 Voie du Roman Pays, 20 B-1348 Louvain-la-Neuve Belgium tel: +32 10 47 30 50 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lower triangle of the correlation matrix with xtable
try lower.tri and see ??lower.tri

Steve Friedman Ph. D. Spatial Statistical Analyst Everglades and Dry Tortugas National Park 950 N Krome Ave (3rd Floor) Homestead, Florida 33034 steve_fried...@nps.gov Office (305) 224 - 4282 Fax (305) 224 - 4147

From: Olga Lyashevska o...@herenstraat.nl
Sent by: r-help-boun...@r-project.org
To: r-help@r-project.org
Date: 09/02/2010 09:51 AM
Subject: [R] lower triangle of the correlation matrix with xtable

Dear all,

mydata <- data.frame(x1 = c(1,4,6), x2 = c(3,1,2), x3 = c(2,1,3))
cor(mydata)
           x1         x2        x3
x1  1.0000000 -0.5960396 0.3973597
x2 -0.5960396  1.0000000 0.5000000
x3  0.3973597  0.5000000 1.0000000

I wonder if it is possible to fill only the lower triangle of this correlation matrix? Using 'dist' doesn't seem to be useful, as it doesn't allow converting this table with xtable. print(xtable(cor(mydata), digits = 3)) Any ideas? Cheers, Olga __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Kolmogorov Smirnov p-values
oops, sorry... really careless. thanks! On Thu, Sep 2, 2010 at 10:03 AM, Alain Guillet alain.guil...@uclouvain.be wrote: Hi, Are you sure you don't want to do ks.test(y, punif, min = 0, max = 1, alternative = "greater") instead of what you tried? Alain On 02-Sep-10 15:52, Samsiddhi Bhattacharjee wrote: ks.test(y, runif, min = 0, max = 1, alternative = "greater") -- Alain Guillet Statistician and Computer Scientist SMCS - IMMAQ - Université catholique de Louvain Bureau c.316 Voie du Roman Pays, 20 B-1348 Louvain-la-Neuve Belgium tel: +32 10 47 30 50 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
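The correction in this thread is entirely in the second argument: ks.test expects a cumulative distribution function such as punif, whereas runif is the random-number generator, so passing runif compares the sample against noise. A quick sketch:

```r
set.seed(42)  # arbitrary seed, for reproducibility only
y <- runif(100, min = 0, max = 1)

# Correct: test the sample against the uniform CDF
ks.test(y, punif, min = 0, max = 1, alternative = "greater")
```

With punif, draws that really are uniform produce unremarkable p-values instead of the spurious "< 2.2e-16" seen in the original post.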
Re: [R] lower triangle of the correlation matrix with xtable
If I try as.dist I get the following error: On Thu, 2010-09-02 at 09:57 -0400, Sarah Goslee wrote:

mydata <- data.frame(x1 = c(1,4,6), x2 = c(3,1,2), x3 = c(2,1,3))
as.dist(cor(mydata))
           x1        x2
x2 -0.5960396
x3  0.3973597 0.5000000

print(xtable(as.dist(cor(mydata)), digits = 3))
Error in UseMethod("xtable") : no applicable method for 'xtable' applied to an object of class "dist"

Jorge's solution with upper.tri works fine! Thanks everyone for your prompt response. Cheers Olga __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
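Jorge's upper.tri solution is referenced but not quoted in the thread; a sketch of that approach (an assumption about what was posted, not a verbatim copy) blanks the upper triangle of the plain matrix before handing it to xtable:

```r
library(xtable)

mydata <- data.frame(x1 = c(1, 4, 6), x2 = c(3, 1, 2), x3 = c(2, 1, 3))
m <- cor(mydata)
m[upper.tri(m)] <- NA           # keep only the lower triangle and the diagonal
print(xtable(m, digits = 3))    # print.xtable renders NA cells as empty by default
```

Because m stays an ordinary matrix, the xtable method dispatch problem with "dist" objects never arises.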
Re: [R] general question on binomial test / sign test
..I'd like to add that I actually wanted to test the location of differences of paired samples coming from a non-normal, asymmetric distribution. the alternative hypothesis was that negative differences occur in more than 0.5 of all cases. thus I tested binom.test(x = nr.diff.under.0, n = all.diffs, p = 0.5, alternative = "greater"). then this one of many tests for a sparse dataset came up where x = 0 and n = 1. there I thought the H0 was that x is less than 0.5, and I then had my trouble interpreting the p-value of 1. best, kay

Kay Cichini wrote: thanks a lot for the elaborations. your explanations made it clear to me that either binom.test(1, 1, 0.5, alternative = "two.sided") or binom.test(0, 1, 0.5) giving a p-value of 1 simply indicates I have absolutely no assurance to reject H0. considering binom.test(0, 1, 0.5, alternative = "greater") and binom.test(1, 1, 0.5, alternative = "less"), where I get a p-value of 1 and 0.5 respectively - am I right in stating that for the first estimate (0/1) I have no assurance at all for rejecting H0, and for the second estimate (1/1) I have the same chance of being wrong in either rejecting or keeping H0? many thanks, kay

Zitat von ted.hard...@manchester.ac.uk: You state: "in reverse the p-value of 1 says that I can be 100% sure that the estimate of 0.5 is true". This is where your logic about significance tests goes wrong. The general logic of a significance test is that a test statistic (say T) is chosen such that large values represent a discrepancy between possible data and the hypothesis under test. When you have the data, T evaluates to a value (say t0). The null hypothesis (NH) implies a distribution for the statistic T if the NH is true. Then the value of Prob(T >= t0 | NH) can be calculated. If this is small, then the probability of obtaining data at least as discrepant as the data you did obtain is small; if sufficiently small, then the conjunction of NH and your data (as assessed by the statistic T) is so unlikely that you can decide not to believe that it is possible.
If you so decide, then you reject the NH because the data are so discrepant that you can't believe it! This is on the same lines as the reductio ad absurdum in classical logic: An hypothesis A implies that an outcome B must occur. But I have observed that B did not occur. Therefore A cannot be true. But it does not follow that, if you observe that B did occur (which is *consistent* with A), then A must be true. A could be false, yet B still occur -- the only basis on which occurrence of B could *prove* that A must be true is when you have the prior information that B will occur *if and only if* A is true. In the reductio ad absurdum, and in the parallel logic of significance testing, all you have is "B will occur *if* A is true". The "only if" part is not there. So you cannot deduce that A is true from the observation that B occurred, since what you have to start with allows B to occur if A is false (i.e. "B will occur *if* A is true" says nothing about what may or may not happen if A is false). So, in your single toss of a coin, it is true that "I will observe either 'succ' or 'fail'" if the coin is fair. But (as in the above) you cannot deduce that the coin is fair if you observe either 'succ' or 'fail', since it is possible (indeed necessary) that you obtain such an observation even if the coin is not fair (even if the coin is the same, either 'succ' or 'fail', on both sides, and therefore completely unfair). This is an attempt to expand Greg Snow's reply! Your 2-sided test takes the form T=1 if either outcome='succ' or outcome='fail'. And that is the only possible value for T, since no other outcome is possible. Hence Prob(T==1) = 1 whether the coin is fair or not. It is not possible for such data to discriminate between a fair and an unfair coin. And, as explained above, a P-value of 1 cannot prove that the null hypothesis is true. All that is possible with a significance test is that a small P-value can be taken as evidence that the NH is false. Hoping this helps! Ted.
On 02-Sep-10 07:41:17, Kay Cecil Cichini wrote: I test the null that the coin is fair (p(succ) = p(fail) = 0.5) with one trial and get a p-value of 1. actually I want to prove the alternative H that the estimate is different from 0.5, which certainly cannot be proven here. but in reverse the p-value of 1 says that I can be 100% sure that the estimate of 0.5 is true (??) - that's the point that astonishes me. thanks if anybody could clarify this for me, kay Zitat von Greg Snow greg.s...@imail.org: Try thinking this one through from first principles: you are essentially saying that your null hypothesis is that you are flipping a fair coin and you want to do a 2-tailed test. You then flip the coin exactly once; what do you expect to happen? The p-value of 1 just means that what you saw was perfectly consistent with what is predicted to happen when flipping a single time. Does that help? If not,
[R] date
Hello all, I've 2 strings representing the start and end values of a date and time. For example,

time1 <- c("21/04/2005", "23/05/2005", "11/04/2005")
time2 <- c("15/07/2009", "03/06/2008", "15/10/2005")
as.difftime(time1, time2)
Time differences in secs
[1] NA NA NA
attr(,"tzone")
[1] ""

How can I calculate the difference between these 2 strings? Regards, Dunia [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] star models
Hi, Have you gotten help on this? Lexi -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of ma...@unizar.es Sent: Saturday, August 28, 2010 5:08 PM To: r-help@r-project.org Subject: [R] star models Hi, I am trying to implement an STAR model, but I have some problems. I am following the instructions of the model, which are at: http://bm2.genes.nig.ac.jp/RGM2/R_current/library/tsDyn/man/star.html and come from: http://bm2.genes.nig.ac.jp/RGM2/pkg.php?p=tsDyn The model is: star(x, m=2, noRegimes, d = 1, steps = d, series, rob = FALSE, mTh, thDelay, thVar, sig=0.05, trace=TRUE, control=list(), ...) I have implemented the example:

mod.star <- star(log10(lynx), m = 2, mTh = c(0, 1), control = list(maxit = 3000))
mod.star

However, when I put in variables from my data.frame, R does not find my variables, so for this reason I have tried to create a matrix with my data variables and to fit the model with the matrix, but I cannot create the matrix either. Please, could someone help me to implement this model? Thanks. Mercedes. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Comparing COXPH models, one with age as a continuous variable, one with age as a three-level factor
sorry to bump in late, but I am doing similar things now and was browsing. IMHO anova is not appropriate here: it applies when the richer model has p more variables than the simpler model, and that is not the case here, since the competing models use different variables. You are left with information criteria (IC). By transforming a continuous variable into a categorical one you are smoothing, which is the idea of a GAM. If you look at what is offered in GAMs you may find better approximations f(age), as well as tools for testing among different f(age) transformations. regards. S. -- View this message in context: http://r.789695.n4.nabble.com/how-to-get-design-matrix-tp891987p2524289.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] date
Hi Dunia, You need to convert the character strings to Dates.

time1 <- as.Date(c("21/04/2005", "23/05/2005", "11/04/2005"), "%d/%m/%Y")
time2 <- as.Date(c("15/07/2009", "03/06/2008", "15/10/2005"), "%d/%m/%Y")
time2 - time1

Best, Ista On Thu, Sep 2, 2010 at 10:32 AM, Dunia Scheid dunia.sch...@gmail.com wrote: Hello all, I've 2 strings representing the start and end values of a date and time. For example, time1 <- c("21/04/2005", "23/05/2005", "11/04/2005") and time2 <- c("15/07/2009", "03/06/2008", "15/10/2005"); as.difftime(time1, time2) gives Time differences in secs [1] NA NA NA attr(,"tzone") [1] "". How can I calculate the difference between these 2 strings? Regards, Dunia [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Ista Zahn Graduate student University of Rochester Department of Clinical and Social Psychology http://yourpsyche.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
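as.difftime() returned NAs in the original post because it was handed bare character strings. Once the strings are parsed with as.Date, the units of the difference can also be chosen explicitly; a small sketch:

```r
time1 <- as.Date(c("21/04/2005", "23/05/2005", "11/04/2005"), "%d/%m/%Y")
time2 <- as.Date(c("15/07/2009", "03/06/2008", "15/10/2005"), "%d/%m/%Y")

difftime(time2, time1, units = "days")              # difftime object in days
as.numeric(difftime(time2, time1, units = "days"))  # plain numeric vector
```

If the strings also carried a time of day, as.POSIXct with a matching format string would play the same role as as.Date here.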
[R] How using the weights argument in nls2?
Good morning gentlemen! How do I use a weighted model in nls2? The values fitted with nls look sensible, while those from nls2 do not. I believe this discrepancy is because I did not include the weights argument in nls2. Here's an example:

MOISTURE <- c(28.41640, 28.47340, 29.05821, 28.52201, 30.92055, 31.07901, 31.35840, 31.69617,
              32.07168, 31.87296, 31.35525, 32.66118, 33.23385, 32.72256, 32.57929, 32.12674,
              52.35225, 52.77275, 64.90770, 64.90770, 85.23800, 84.43300, 68.96560, 68.41395,
              70.82880, 71.18400, 96.13240, 96.07920, 95.35160, 94.71660, 87.59190, 88.63250,
              89.78760, 90.17820, 88.46160, 87.10860, 94.86660, 94.51830, 75.79000, 76.98780,
              144.70950, 143.88950, 111.58620, 112.71510, 120.85300, 121.43100, 116.34840, 114.87420,
              195.35040, 191.36040, 265.35220, 267.25450, 227.13700, 228.78000, 238.37120, 242.70700,
              299.54890, 291.04110, 220.09920, 219.82650, 236.79150, 243.70710, 208.79880, 208.12260,
              417.21420, 429.59480, 360.91080, 371.66400, 357.72520, 360.53640, 383.82600, 383.82600,
              434.02700, 432.57500, 440.56260, 438.32340, 468.69600, 469.82140, 497.93680, 497.17010)
YEARS <- rep(c(86, 109, 132, 158, 184, 220, 254, 276, 310, 337), each = 8)
VARIANCE <- rep(c(2.0879048, 0.3442724, 151.9481710, 115.3208995, 51.9965027,
                  180.0496045, 791.3223240, 1280.0179973, 728.9582154, 752.4591144), each = 8)

test <- data.frame(YEARS, MOISTURE, VARIANCE)
mod.nls <- nls(MOISTURE ~ A/(1 + B*exp(-k*YEARS)), data = test,
               weights = 1/VARIANCE,
               start = list(A = 1500, B = 200, k = 0.03345),
               control = list(maxiter = 500), trace = TRUE)
summary(mod.nls)

Following the example in the pdf:

st1 <- expand.grid(A = seq(0, 2000, len = 10), B = seq(0, 500, len = 10), k = seq(-1, 10, len = 10))
st1
mod.nls2 <- nls2(MOISTURE ~ A/(1 + B*exp(-k*YEARS)), data = test, start = st1, algorithm = "brute-force")
mod.nls2

I appreciate everyone's attention. -- View this message in context: http://r.789695.n4.nabble.com/How-using-the-weights-argument-in-nls2-tp2524328p2524328.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
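nls2 passes extra arguments on to nls, so a weights argument should be accepted the same way as in the nls call above; treat this as an assumption to check against ?nls2 rather than a tested answer:

```r
library(nls2)

# Assumption: 'weights' is forwarded to nls, so the brute-force grid
# search evaluates candidate starts with the same weighted criterion
mod.nls2 <- nls2(MOISTURE ~ A/(1 + B*exp(-k*YEARS)), data = test,
                 weights = 1/VARIANCE,   # same weights as the nls fit
                 start = st1, algorithm = "brute-force")
```

Without weights, the grid search minimizes the unweighted residual sum of squares, which can pick a very different corner of the start grid than the weighted nls fit.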
[R] Help on glm and optim
Dear all, I'm trying to use the optim function to replicate the results from glm, using an example from the help page of glm, but I could not get the optim function to work. Would you please point out where I went wrong? Thanks a lot. The following is the code:

# Step 1: fit the glm
clotting <- data.frame(
  u = c(5,10,15,20,30,40,60,80,100),
  lot1 = c(118,58,42,35,27,25,21,19,18),
  lot2 = c(69,35,26,21,18,16,13,12,12))
fit1 <- glm(lot1 ~ log(u), data = clotting, family = Gamma)

# Step 2: use optim
# define the log-likelihood function to be maximized
# theta is a vector of three parameters: intercept, coefficient for log(u), and dispersion parameter
loglik <- function(theta, data){
  E <- 1/(theta[1] + theta[2]*log(data$u))
  V <- theta[3]*E^2
  loglik <- sum(dgamma(data$lot1, shape = 1/theta[3], rate = 1/(E*theta[3]), log = TRUE))
  return(loglik)
}

# use the glm result as initial values
theta <- c(as.vector(coef(fit1)), 0.002446059)
fit2 <- optim(theta, loglik, clotting, gr = NULL, hessian = TRUE, control = list(fnscale = -1))

# Then I got the following error message:
Error in optim(theta, loglik, clotting, gr = NULL, hessian = TRUE, control = list(fnscale = -1)) :
  non-finite finite-difference value [3]

Wayne (Yanwei) Zhang Statistical Research CNA Phone: 312-822-6296 Email: yanwei.zh...@cna.com NOTICE: This e-mail message, including any attachments and appended messages, is for the sole use of the intended recipients and may contain confidential and legally privileged information. If you are not the intended recipient, any review, dissemination, distribution, copying, storage or other use of all or any portion of this message is strictly prohibited. If you received this message in error, please immediately notify the sender by reply e-mail and delete this message in its entirety.
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Linear models (lme4) - basic question
Hi, Sorry for a basic question on linear models. I am trying to adjust raw data for both fixed and random effects. The data that I output should account for these effects, so that I can use the adjusted data for further analysis. For example, if I have the blood sugar levels for 30 patients, and I know that 'weight' is a fixed effect and that 'height' is a random effect, what I'd want as output is blood sugar levels that have been adjusted for these effects.

library(lme4)
sugar <- c(1:10, sample(1:100, 20))
weight <- 1:30
height <- rep(sample(1:15, 15, replace = FALSE), 2)
lm.adj <- lmer(sugar ~ weight + (1|height))
adjusted.sugar <- fitted(lm.adj)

I'm not sure if I'm doing this right...? thanks! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help on glm and optim
On Thu, 2 Sep 2010, Zhang,Yanwei wrote: Dear all, I'm trying to use the optim function to replicate the results from glm, using an example from the help page of glm, but I could not get the optim function to work. Would you please point out where I went wrong? Thanks a lot. The following is the code:

# Step 1: fit the glm
clotting <- data.frame(
  u = c(5,10,15,20,30,40,60,80,100),
  lot1 = c(118,58,42,35,27,25,21,19,18),
  lot2 = c(69,35,26,21,18,16,13,12,12))
fit1 <- glm(lot1 ~ log(u), data = clotting, family = Gamma)

# Step 2: use optim
# define the log-likelihood function to be maximized
# theta is a vector of three parameters: intercept, coefficient for log(u), and dispersion parameter
loglik <- function(theta, data){
  E <- 1/(theta[1] + theta[2]*log(data$u))
  V <- theta[3]*E^2
  loglik <- sum(dgamma(data$lot1, shape = 1/theta[3], rate = 1/(E*theta[3]), log = TRUE))
  return(loglik)
}

# use the glm result as initial values
theta <- c(as.vector(coef(fit1)), 0.002446059)
fit2 <- optim(theta, loglik, clotting, gr = NULL, hessian = TRUE, control = list(fnscale = -1))

# Then I got the following error message:
Error in optim(theta, loglik, clotting, gr = NULL, hessian = TRUE, control = list(fnscale = -1)) :
  non-finite finite-difference value [3]

If you use trace(loglik, tracer = quote(print(theta))) to trace the inputs to loglik() you will find that it is being called with negative values of theta[3] to get finite differences. One fix is to reparametrize and use the log scale rather than the scale as a parameter. -thomas Thomas Lumley Professor of Biostatistics University of Washington, Seattle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help on glm and optim
Thomas, Thanks a lot. This solves my problem. Wayne (Yanwei) Zhang Statistical Research CNA -Original Message- From: Thomas Lumley [mailto:tlum...@u.washington.edu] Sent: Thursday, September 02, 2010 11:24 AM To: Zhang,Yanwei Cc: r-help@r-project.org Subject: Re: [R] Help on glm and optim On Thu, 2 Sep 2010, Zhang,Yanwei wrote: Dear all, I'm trying to use the optim function to replicate the results from glm, using an example from the help page of glm, but I could not get the optim function to work. Would you please point out where I went wrong? Thanks a lot. The following is the code: # Step 1: fit the glm clotting <- data.frame( u = c(5,10,15,20,30,40,60,80,100), lot1 = c(118,58,42,35,27,25,21,19,18), lot2 = c(69,35,26,21,18,16,13,12,12)) fit1 <- glm(lot1 ~ log(u), data = clotting, family = Gamma) # Step 2: use optim # define the log-likelihood function to be maximized # theta is a vector of three parameters: intercept, coefficient for log(u), and dispersion parameter loglik <- function(theta, data){ E <- 1/(theta[1] + theta[2]*log(data$u)); V <- theta[3]*E^2; loglik <- sum(dgamma(data$lot1, shape = 1/theta[3], rate = 1/(E*theta[3]), log = TRUE)); return(loglik) } # use the glm result as initial values theta <- c(as.vector(coef(fit1)), 0.002446059) fit2 <- optim(theta, loglik, clotting, gr = NULL, hessian = TRUE, control = list(fnscale = -1)) # Then I got the following error message: Error in optim(theta, loglik, clotting, gr = NULL, hessian = TRUE, control = list(fnscale = -1)) : non-finite finite-difference value [3] If you use trace(loglik, tracer = quote(print(theta))) to trace the inputs to loglik() you will find that it is being called with negative values of theta[3] to get finite differences. One fix is to reparametrize and use the log scale rather than the scale as a parameter.
-thomas Thomas Lumley Professor of Biostatistics University of Washington, Seattle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
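Thomas's suggestion can be sketched as follows: treat theta[3] as log(dispersion), so optim may propose any real value while the dispersion itself stays positive. Same data and notation as the original post; a sketch, not a verified fit:

```r
# theta[3] is now log(dispersion); exp() keeps the dispersion positive
loglik2 <- function(theta, data) {
  phi <- exp(theta[3])                        # dispersion > 0 by construction
  E   <- 1/(theta[1] + theta[2]*log(data$u))  # inverse link, as in the Gamma glm
  sum(dgamma(data$lot1, shape = 1/phi, rate = 1/(E*phi), log = TRUE))
}

theta0 <- c(as.vector(coef(fit1)), log(0.002446059))
fit2 <- optim(theta0, loglik2, data = clotting, hessian = TRUE,
              control = list(fnscale = -1))
```

Note the data frame is passed as the named argument data = clotting so that optim routes it through "..." to loglik2 instead of matching it positionally to gr.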
Re: [R] Linear models (lme4) - basic question
Why should height be a random effect? I suspect you may need a tutorial on what a random effect in a mixed model is. I see no obvious reason why one would cluster on height. Perhaps if you clarify, it may become obvious either what your concerns are (and that your model is correct) or that you misunderstand and need a tutorial on random effects in mixed models. Some kind soul on this list may provide that (not me, though) -- or you can talk with your local statistician. Cheers, Bert On Thu, Sep 2, 2010 at 9:22 AM, James Nead james_n...@yahoo.com wrote: Hi, Sorry for a basic questions on linear models. I am trying to adjust raw data for both fixed and mixed effects. The data that I output should account for these effects, so that I can use the adjusted data for further analysis. For example, if I have the blood sugar levels for 30 patients, and I know that 'weight' is a fixed effect and that 'height' is a random effect, what I'd want as output is blood sugar levels that have been adjusted for these effects. library(lme4) sugar - c(1:10,sample(1:100,20)) weight - 1:30 height - rep(sample(1:15,15,replace=F),2) lm.adj - lmer(sugar ~ weight + (1|height)) adjusted.sugar - fitted(lm.adj) = I'm not sure if I'm doing this right...? thanks! [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Linear models (lme4) - basic question
Hi Bert, Thanks for the reply. Height was just an example; perhaps I should have said 'x' instead of height. Essentially, what I want to do is adjust the data for known effects. After I've done this, I can conduct further analysis on the data (for example, whether another variable 'z' has an effect on sugar). So, if I have to adjust the data for a fixed effect (weight) and a random effect (height/x), am I using the model correctly? Thanks again for the reply! From: Bert Gunter gunter.ber...@gene.com Cc: r-help@r-project.org Sent: Thu, September 2, 2010 12:46:13 PM Subject: Re: [R] Linear models (lme4) - basic question Why should height be a random effect? I suspect you may need a tutorial on what a random effect in a mixed model is. I see no obvious reason why one would cluster on height. Perhaps if you clarify, it may become obvious either what your concerns are (and that your model is correct) or that you misunderstand and need a tutorial on random effects in mixed models. Some kind soul on this list may provide that (not me, though) -- or you can talk with your local statistician. Cheers, Bert Hi, Sorry for a basic question on linear models. I am trying to adjust raw data for both fixed and random effects. The data that I output should account for these effects, so that I can use the adjusted data for further analysis. For example, if I have the blood sugar levels for 30 patients, and I know that 'weight' is a fixed effect and that 'height' is a random effect, what I'd want as output is blood sugar levels that have been adjusted for these effects. library(lme4) sugar <- c(1:10, sample(1:100, 20)) weight <- 1:30 height <- rep(sample(1:15, 15, replace = FALSE), 2) lm.adj <- lmer(sugar ~ weight + (1|height)) adjusted.sugar <- fitted(lm.adj) I'm not sure if I'm doing this right...? thanks!
Re: [R] Linear models (lme4) - basic question
Sorry, forgot to mention that the processed data will be used as input for a classification algorithm. So, I need to adjust for known effects before I can use the data. thanks From: Bert Gunter gunter.ber...@gene.com Cc: r-help@r-project.org Sent: Thu, September 2, 2010 12:46:13 PM Subject: Re: [R] Linear models (lme4) - basic question Why should height be a random effect? I suspect you may need a tutorial on what a random effect in a mixed model is. I see no obvious reason why one would cluster on height. Perhaps if you clarify, it may become obvious either what your concerns are (and that your model is correct) or that you misunderstand and need a tutorial on random effects in mixed models. Some kind soul on this list may provide that (not me, though) -- or you can talk with your local statistician. Cheers, Bert Hi, Sorry for a basic question on linear models. I am trying to adjust raw data for both fixed and random effects. The data that I output should account for these effects, so that I can use the adjusted data for further analysis. For example, if I have the blood sugar levels for 30 patients, and I know that 'weight' is a fixed effect and that 'height' is a random effect, what I'd want as output is blood sugar levels that have been adjusted for these effects. library(lme4) sugar <- c(1:10, sample(1:100, 20)) weight <- 1:30 height <- rep(sample(1:15, 15, replace = FALSE), 2) lm.adj <- lmer(sugar ~ weight + (1|height)) adjusted.sugar <- fitted(lm.adj) I'm not sure if I'm doing this right...? thanks!
Re: [R] general question on binomial test / sign test
Just to add to Ted's addition to my response. I think you are moving towards better understanding (and your misunderstandings are common), but to further clarify: First, make sure that you understand that the probability of A given B, p(A|B), is different from the probability of B given A, p(B|A). A simple example from probability class: I have 3 coins in my pocket; one has 2 heads, one has 2 tails, and the other is fair (p(H)=p(T)=0.5). I take one coin out at random (each has probability 1/3 of being chosen) and flip it, and I observe Heads. What is the probability that it is the 2-headed coin? Here the probability of Heads given that the coin is 2-headed is 1.0, but that does not mean that the probability that I chose the 2-headed coin given that I saw heads is 1 (it is 2/3rds). The same applies with testing: the p-value is the probability of the data given that the null hypothesis is TRUE. Many people try to interpret it as the probability that the null hypothesis is true given the data, but that is not the case (not even for Bayesians, unless they use a really weird prior). I think that you are starting to understand this part: high p-values mean that the data are consistent with the null, we cannot reject the null, but they do not prove the null. A great article for help in understanding p-values better is: Murdoch, D. J., Tsai, Y.-L., and Adcock, J. (2008) _P-Values are Random Variables_. The American Statistician, 62(3), 242-245. It talks about p-values as random variables. There are a couple of functions in the TeachingDemos package that implement some of the simulations discussed in the article; I would suggest playing with run.Pvalue.norm.sim and run.Pvalue.binom.sim. Notice that when the null hypothesis is true (and you have a big enough sample size for the binomial case) the p-values follow a uniform distribution: the p-values 1.0, 0.5, 0.1, and 0.001 are all equally likely.
As the difference between the null hypothesis and the truth increases you get more and more p-values close to 0. The real tricky bit about hypothesis testing is that we compute a single p-value, a single observation from a distribution, and based on that try to decide if the process that produced that observation is a uniform distribution or something else (that may be close to a uniform or very different). Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kay Cecil Cichini Sent: Thursday, September 02, 2010 6:40 AM To: ted.hard...@manchester.ac.uk Cc: r-help@r-project.org Subject: Re: [R] general question on binomial test / sign test thanks a lot for the elaborations. your explanations clearly brought to me that either binom.test(1,1,0.5,"two.sided") or binom.test(0,1,0.5) giving a p-value of 1 simply indicates i have absolutely no assurance to reject H0. considering binom.test(0,1,0.5,alternative="greater") and binom.test(1,1,0.5,alternative="less") where i get a p-value of 1 and 0.5, respectively - am i right in stating that for the first estimate 0/1 i have no assurance at all for rejection of H0, and for the second estimate 1/1 i have the same chance of being wrong in either rejecting or keeping H0? many thanks, kay Zitat von ted.hard...@manchester.ac.uk: You state: in reverse the p-value of 1 says that i can be 100% sure that the estimate of 0.5 is true. This is where your logic about significance tests goes wrong. The general logic of a significance test is that a test statistic (say T) is chosen such that large values represent a discrepancy between possible data and the hypothesis under test. When you have the data, T evaluates to a value (say t0). The null hypothesis (NH) implies a distribution for the statistic T if the NH is true.
Then the value of Prob(T >= t0 | NH) can be calculated. If this is small, then the probability of obtaining data at least as discrepant as the data you did obtain is small; if sufficiently small, then the conjunction of NH and your data (as assessed by the statistic T) is so unlikely that you can decide to not believe that it is possible. If you so decide, then you reject the NH because the data are so discrepant that you can't believe it! This is on the same lines as the reductio ad absurdum in classical logic: An hypothesis A implies that an outcome B must occur. But I have observed that B did not occur. Therefore A cannot be true. But it does not follow that, if you observe that B did occur (which is *consistent* with A), then A must be true. A could be false, yet B still occur -- the only basis on which occurrence of B could *prove* that A must be true is when you have the prior information that B will occur *if and only if* A is true. In the reductio ad absurdum, and
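Greg's description of the null distribution of p-values is easy to check by simulation. A minimal sketch in base R (the sample size, replication count, and seed are arbitrary choices for illustration):

```r
# Simulate two-sided binomial-test p-values when H0 (p = 0.5) is TRUE.
set.seed(1)
pvals <- replicate(2000, {
  x <- rbinom(1, size = 1000, prob = 0.5)  # data generated under H0
  binom.test(x, n = 1000, p = 0.5)$p.value
})
# Under H0 the p-values are approximately uniform on [0, 1]:
mean(pvals < 0.05)  # close to 0.05
mean(pvals)         # close to 0.5
```

Increasing the true probability away from 0.5 in `rbinom()` shifts the simulated p-values toward 0, as described above.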
Re: [R] Linear models (lme4) - basic question
James Nead james_nead at yahoo.com writes: Sorry, forgot to mention that the processed data will be used as input for a classification algorithm. So, I need to adjust for known effects before I can use the data. I am trying to adjust raw data for both fixed and random effects. The data that I output should account for these effects, so that I can use the adjusted data for further analysis. For example, if I have the blood sugar levels for 30 patients, and I know that 'weight' is a fixed effect and that 'height' is a random effect, what I'd want as output is blood sugar levels that have been adjusted for these effects. What's not clear to me is what you mean by 'adjusted for'. fitted(lm.adj) will give predicted values based on the height and weight. I don't really know what the justification for/meaning of the adjustment is, so I don't know whether you want to predict on the basis of the heights, or whether you want to get a 'population-level' prediction, i.e. one with height effects set to zero. Maybe you want residuals(lm.adj) ...? I suggest that follow-ups go to r-sig-mixed-mod...@r-project.org
Re: [R] general question on binomial test / sign test
On Sep 2, 2010, at 2:01 PM, Greg Snow wrote: snipped much good material The real tricky bit about hypothesis testing is that we compute a single p-value, a single observation from a distribution, and based on that try to decide if the process that produced that observation is a uniform distribution or something else (that may be close to a uniform or very different). My friendly addition would be to point the OP in the direction of using qqplot() for the examination of distributional properties rather than doing any sort of hypothesis testing. There is a learning curve for using that tool, but it will pay off in the end. -- David. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kay Cecil Cichini Sent: Thursday, September 02, 2010 6:40 AM To: ted.hard...@manchester.ac.uk Cc: r-help@r-project.org Subject: Re: [R] general question on binomial test / sign test thanks a lot for the elaborations. your explanations clearly brought to me that either binom.test(1,1,0.5,"two.sided") or binom.test(0,1,0.5) giving a p-value of 1 simply indicates i have absolutely no assurance to reject H0. considering binom.test(0,1,0.5,alternative="greater") and binom.test(1,1,0.5,alternative="less") where i get a p-value of 1 and 0.5, respectively - am i right in stating that for the first estimate 0/1 i have no assurance at all for rejection of H0, and for the second estimate 1/1 i have the same chance of being wrong in either rejecting or keeping H0? many thanks, kay David Winsemius, MD West Hartford, CT
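The qqplot() suggestion applied to simulated p-values might look like the following sketch (all numbers are made-up illustration values): sorted p-values plotted against uniform quantiles hug the identity line when H0 is true and pile up near 0 otherwise.

```r
# qqplot() of simulated t-test p-values against the uniform distribution.
set.seed(2)
p.null <- replicate(1000, t.test(rnorm(20))$p.value)              # H0 true
p.alt  <- replicate(1000, t.test(rnorm(20, mean = 0.8))$p.value)  # H0 false
op <- par(mfrow = c(1, 2))
qqplot(qunif(ppoints(1000)), p.null, main = "H0 true"); abline(0, 1)
qqplot(qunif(ppoints(1000)), p.alt,  main = "H0 false"); abline(0, 1)
par(op)
```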
Re: [R] Linear models (lme4) - basic question
My apologies - I have made this more confusing than it needs to be. I had microarray gene expression data which I want to use for classification algorithms. However, I want to 'adjust' the data for all confounding factors (such as age, experiment number etc.) before I can use the data as input for the classification algorithms. Since the phenotype is known to be affected by age, I thought that this would be a fixed effect whereas something like 'beadchip' would be a random effect. Should I be looking at something else for this? From: Ben Bolker bbol...@gmail.com To: r-h...@stat.math.ethz.ch Sent: Thu, September 2, 2010 2:06:47 PM Subject: Re: [R] Linear models (lme4) - basic question James Nead james_nead at yahoo.com writes: Sorry, forgot to mention that the processed data will be used as input for a classification algorithm. So, I need to adjust for known effects before I can use the data. I am trying to adjust raw data for both fixed and random effects. The data that I output should account for these effects, so that I can use the adjusted data for further analysis. For example, if I have the blood sugar levels for 30 patients, and I know that 'weight' is a fixed effect and that 'height' is a random effect, what I'd want as output is blood sugar levels that have been adjusted for these effects. What's not clear to me is what you mean by 'adjusted for'. fitted(lm.adj) will give predicted values based on the height and weight. I don't really know what the justification for/meaning of the adjustment is, so I don't know whether you want to predict on the basis of the heights, or whether you want to get a 'population-level' prediction, i.e. one with height effects set to zero. Maybe you want residuals(lm.adj) ...?
I suggest that follow-ups go to r-sig-mixed-mod...@r-project.org
Re: [R] Linear models (lme4) - basic question
On 10-09-02 02:26 PM, James Nead wrote: My apologies - I have made this more confusing than it needs to be. I had microarray gene expression data which I want to use for classification algorithms. However, I want to 'adjust' the data for all confounding factors (such as age, experiment number etc.) before I can use the data as input for the classification algorithms. Since the phenotype is known to be affected by age, I thought that this would be a fixed effect whereas something like 'beadchip' would be a random effect. Should I be looking at something else for this? Sounds to me as though you should use residuals() rather than fitted() if you want to adjust for confounding factors. But since you've made up a nice small example, I think you should look at the results of fitted() and residuals() for your example and see if it's doing what you want. From: Ben Bolker bbol...@gmail.com To: r-h...@stat.math.ethz.ch Sent: Thu, September 2, 2010 2:06:47 PM Subject: Re: [R] Linear models (lme4) - basic question James Nead james_nead at yahoo.com writes: Sorry, forgot to mention that the processed data will be used as input for a classification algorithm. So, I need to adjust for known effects before I can use the data. I am trying to adjust raw data for both fixed and random effects. The data that I output should account for these effects, so that I can use the adjusted data for further analysis. For example, if I have the blood sugar levels for 30 patients, and I know that 'weight' is a fixed effect and that 'height' is a random effect, what I'd want as output is blood sugar levels that have been adjusted for these effects. What's not clear to me is what you mean by 'adjusted for'. fitted(lm.adj) will give predicted values based on the height and weight.
I don't really know what the justification for/meaning of the adjustment is, so I don't know whether you want to predict on the basis of the heights, or whether you want to get a 'population-level' prediction, i.e. one with height effects set to zero. Maybe you want residuals(lm.adj) ...? I suggest that follow-ups go to r-sig-mixed-mod...@r-project.org
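To make the fitted()/residuals() distinction concrete, here is a small sketch using nlme (which ships with R; lme4's lmer() is analogous). The data are invented purely for illustration:

```r
library(nlme)  # recommended package, shipped with R

set.seed(1)
dat <- data.frame(weight = 1:30,
                  height = factor(rep(1:15, 2)))  # 15 'height' clusters of 2
dat$sugar <- 50 + 0.5 * dat$weight + rnorm(30, sd = 5)

fit <- lme(sugar ~ weight, random = ~ 1 | height, data = dat)

head(fitted(fit))            # predictions from the weight and height effects
adjusted <- residuals(fit)   # sugar with those effects removed ("adjusted")
```

Whether the residuals are the right notion of "adjusted" depends on the downstream analysis, which is exactly the point being made in this thread.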
Re: [R] general question on binomial test / sign test
David, The original poster was not looking at distributions and testing distributions; I referred to the distribution of the p-value to help them understand (in reference to the paper mentioned). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Thursday, September 02, 2010 12:12 PM To: Greg Snow Cc: Kay Cecil Cichini; ted.hard...@manchester.ac.uk; r-h...@r-project.org Subject: Re: [R] general question on binomial test / sign test On Sep 2, 2010, at 2:01 PM, Greg Snow wrote: snipped much good material The real tricky bit about hypothesis testing is that we compute a single p-value, a single observation from a distribution, and based on that try to decide if the process that produced that observation is a uniform distribution or something else (that may be close to a uniform or very different). My friendly addition would be to point the OP in the direction of using qqplot() for the examination of distributional properties rather than doing any sort of hypothesis testing. There is a learning curve for using that tool, but it will pay off in the end. -- David. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kay Cecil Cichini Sent: Thursday, September 02, 2010 6:40 AM To: ted.hard...@manchester.ac.uk Cc: r-help@r-project.org Subject: Re: [R] general question on binomial test / sign test thanks a lot for the elaborations. your explanations clearly brought to me that either binom.test(1,1,0.5,"two.sided") or binom.test(0,1,0.5) giving a p-value of 1 simply indicates i have absolutely no assurance to reject H0.
considering binom.test(0,1,0.5,alternative="greater") and binom.test(1,1,0.5,alternative="less") where i get a p-value of 1 and 0.5, respectively - am i right in stating that for the first estimate 0/1 i have no assurance at all for rejection of H0, and for the second estimate 1/1 i have the same chance of being wrong in either rejecting or keeping H0? many thanks, kay David Winsemius, MD West Hartford, CT
Re: [R] Linear models (lme4) - basic question
Perhaps even more to the point, covariate adjustment and classification should not be separate. One should fit the appropriate model that does both. -- Bert On Thu, Sep 2, 2010 at 11:34 AM, Ben Bolker bbol...@gmail.com wrote: On 10-09-02 02:26 PM, James Nead wrote: My apologies - I have made this more confusing than it needs to be. I had microarray gene expression data which I want to use for classification algorithms. However, I want to 'adjust' the data for all confounding factors (such as age, experiment number etc.) before I can use the data as input for the classification algorithms. Since the phenotype is known to be affected by age, I thought that this would be a fixed effect whereas something like 'beadchip' would be a random effect. Should I be looking at something else for this? Sounds to me as though you should use residuals() rather than fitted() if you want to adjust for confounding factors. But since you've made up a nice small example, I think you should look at the results of fitted() and residuals() for your example and see if it's doing what you want. From: Ben Bolker bbol...@gmail.com To: r-h...@stat.math.ethz.ch Sent: Thu, September 2, 2010 2:06:47 PM Subject: Re: [R] Linear models (lme4) - basic question James Nead james_nead at yahoo.com writes: Sorry, forgot to mention that the processed data will be used as input for a classification algorithm. So, I need to adjust for known effects before I can use the data. I am trying to adjust raw data for both fixed and random effects. The data that I output should account for these effects, so that I can use the adjusted data for further analysis. For example, if I have the blood sugar levels for 30 patients, and I know that 'weight' is a fixed effect and that 'height' is a random effect, what I'd want as output is blood sugar levels that have been adjusted for these effects. What's not clear to me is what you mean by 'adjusted for'.
fitted(lm.adj) will give predicted values based on the height and weight. I don't really know what the justification for/meaning of the adjustment is, so I don't know whether you want to predict on the basis of the heights, or whether you want to get a 'population-level' prediction, i.e. one with height effects set to zero. Maybe you want residuals(lm.adj) ...? I suggest that follow-ups go to r-sig-mixed-mod...@r-project.org -- Bert Gunter Genentech Nonclinical Biostatistics 467-7374 http://devo.gene.com/groups/devo/depts/ncb/home.shtml
Re: [R] Why is vector assignment in R recreates the entire vector ?
Tal wrote: A friend recently brought to my attention that vector assignment actually recreates the entire vector on which the assignment is performed. ... I brought this up in r-devel a few months ago. You can read my posting, and the various replies, at http://www.mail-archive.com/r-de...@r-project.org/msg20089.html Some of the replies not only explain the process, but list lines in the source code where this takes place, enabling a closer look at how/when duplication occurs. Norm Matloff
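The copy-on-modify behaviour being discussed is easy to observe directly with tracemem(), which is available in standard CRAN builds of R:

```r
# R's copy-on-modify semantics: modifying a shared vector forces a duplicate,
# so the original binding is left untouched.
x <- c(1, 2, 3)
y <- x           # no copy yet: x and y refer to the same memory
tracemem(y)      # report when y gets duplicated
y[1] <- 0        # this assignment triggers the copy (tracemem prints a line)
untracemem(y)
x                # still 1 2 3
```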
[R] Simultaneous equations
Dear all, I am relatively new to R and have had some difficulty in understanding the user manual for a package that I have downloaded to evaluate non-linear simultaneous equations. The package is called systemfit. Does anyone have any experience of this package? What I am trying to do is solve a set of non-linear simultaneous equations... Could anyone give me an example (please) of the code that would be needed to return solutions for the following system using systemfit (or any other better package): y = 1/x y = sin(x) within a range of say x = -10 to 10 (x in radians). Thanks, I really appreciate your help in advance. Ben
Re: [R] Simultaneous equations
benhartley903 wrote: Dear all, I am relatively new to R and have had some difficulty in understanding the user manual for a package that I have downloaded to evaluate non-linear simultaneous equations. The package is called systemfit. Does anyone have any experience of this package? What I am trying to do is solve a set of non-linear simultaneous equations... Could anyone give me an example (please) of the code that would be needed to return solutions for the following system using systemfit (or any other better package): y = 1/x y = sin(x) within a range of say x = -10 to 10 (x in radians). Thanks, I really appreciate your help in advance. Ben systemfit is intended for estimating a system of simultaneous equations. You seem to want to solve a system. I assume you want to solve this system for x and y. You could use package nleqslv as follows: library(nleqslv) fun <- function(x) { f <- numeric(length(x)); f[1] <- x[2] - 1/x[1]; f[2] <- x[2] - sin(x[1]); f } x.start <- c(1,1) nleqslv(x.start, fun) # will give 1.1141571 0.8975395 x.start <- c(2,2) nleqslv(x.start, fun) # will give 2.7726047 0.3606717 Berend
Re: [R] Simultaneous equations
On 09/02/2010 09:25 PM, benhartley903 wrote: Dear all, I am relatively new to R and have had some difficulty in understanding the user manual for a package that I have downloaded to evaluate non-linear simultaneous equations. The package is called systemfit. Does anyone have any experience of this package? What I am trying to do is solve a set of non-linear simultaneous equations... Could anyone give me an example (please) of the code that would be needed to return solutions for the following system using systemfit (or any other better package): y = 1/x y = sin(x) within a range of say x = -10 to 10 (x in radians). Thanks, I really appreciate your help in advance. Ben Systemfit is not even in the same ballpark...! I'd just rewrite the above as x * sin(x) - 1 = 0, make a graph to bracket the roots, and then use uniroot. f <- function(x) x*sin(x) - 1 curve(f, -10, 10) f(c(0,2,5,7,10)) [1] -1.0000000 0.8185949 -5.7946214 3.5989062 -6.4402111 So the roots are uniroot(f, interval=c(7,10))$root [1] 9.317241 uniroot(f, interval=c(5,7))$root [1] 6.43914 uniroot(f, interval=c(2,5))$root [1] 2.772631 uniroot(f, interval=c(0,2))$root [1] 1.114161 and 4 more symmetrically below zero. -- Peter Dalgaard Center for Statistics, Copenhagen Business School Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] Comparing COXPH models, one with age as a continuous variable, one with age as a three-level factor
On Thu, 2 Sep 2010, stephenb wrote: sorry to bump in late, but I am doing similar things now and was browsing. IMHO anova is not appropriate here. it applies when the richer model has p more variables than the simpler model. this is not the case here. the competing models use different variables. A simple approach is to have the factor variable in the model and to formally test for added information given by the continuous variable (linear, quadratic, spline, etc). AIC could also be used. you are left with IC. by transforming a continuous variable into categorical you are smoothing, which is the idea of GAM. if you look at what is offered in GAMs you may find better approximations f(age) as well as tools for testing among different f(age) transformations. I don't follow that comment. Smoothing uses the full continuous variable to begin with. A restricted cubic spline function in age is a simple approach. E.g.: require(rms) dd <- datadist(mydata); options(datadist='dd') f <- cph(Surv(dtime,death) ~ rcs(age,4) + sex, data=mydata) plot(Predict(f, age)) Note that you can always expect the categorized version of age not to fit the data except sometimes when behavior is dictated by law (driving, drinking, military service, medicare). Frank regards. S.
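To illustrate the comparison on made-up data (the survival package ships with R; every number below is an arbitrary assumption, not from the original thread): since the continuous-age and three-level-factor models are not nested, they can be compared by AIC rather than anova().

```r
library(survival)  # recommended package, shipped with R

set.seed(1)
n      <- 200
age    <- runif(n, 30, 80)
time   <- rexp(n, rate = exp(0.03 * (age - 55)) / 50)  # log-hazard linear in age
status <- rbinom(n, 1, 0.8)

fit.cont <- coxph(Surv(time, status) ~ age)          # continuous age
fit.cat  <- coxph(Surv(time, status) ~ cut(age, 3))  # three-level factor

AIC(fit.cont, fit.cat)  # non-nested models: compare information criteria
```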
[R] Power analysis
I am aware this is fairly simple, but it is currently driving me mad! Could someone help me out with conducting a post-hoc power analysis on a Wilcoxon test? I am being driven slightly mad by some conflicting advice! Thanks in advance, Lewis
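In the absence of a closed-form routine in base R, one common approach is to estimate the power of a Wilcoxon test by simulation. A sketch; the group size (25), normal shift alternative (0.5 SD), and alpha (0.05) are placeholder assumptions to replace with your own values:

```r
# Monte Carlo power estimate for a two-sample Wilcoxon (Mann-Whitney) test.
set.seed(42)
power.sim <- mean(replicate(2000, {
  x <- rnorm(25)              # group 1 under the assumed model
  y <- rnorm(25, mean = 0.5)  # group 2, shifted by 0.5 SD
  wilcox.test(x, y)$p.value < 0.05
}))
power.sim
```

Note that "post-hoc" power computed from the observed effect size is widely criticized; a prospective simulation like the one above, over a range of plausible effect sizes, is usually more informative.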
Re: [R] Simultaneous equations
Thanks Peter. Actually, I should have specified: these are not the functions I ultimately want to solve, and those can't be rearranged explicitly and substituted. I do need a way to solve simultaneously. Ben
[R] Ordering data by variable
Hi listers, I could order data like this: x <- c(2,6,8,8,1) y <- c(1,6,3,5,4) o <- order(x) frame <- rbind(x,y)[,o] But I would like to know if there is a way to order my data without setting up a data frame. I would like to keep independent vectors x and y. Any suggestions? Thanks in advance, Marcio
Re: [R] Ordering data by variable
Hi Marcio, Is this what you want? x <- c(2,6,8,8,1) y <- c(1,6,3,5,4) o <- order(x) # If you want each vector ordered by x x[o] y[o] You can also use sort(), but then each vector would be sorted by itself, not both by x. HTH, Josh On Thu, Sep 2, 2010 at 1:48 PM, Mestat mes...@pop.com.br wrote: Hi listers, I could order data like this: x <- c(2,6,8,8,1) y <- c(1,6,3,5,4) o <- order(x) frame <- rbind(x,y)[,o] But I would like to know if there is a way to order my data without setting up a data frame. I would like to keep independent vectors x and y. Any suggestions? Thanks in advance, Marcio -- Joshua Wiley Ph.D. Student, Health Psychology University of California, Los Angeles http://www.joshuawiley.com/
Re: [R] How using the weights argument in nls2?
On Thu, Sep 2, 2010 at 11:09 AM, Ivan Allaman ivanala...@yahoo.com.br wrote: Good morning gentlemen! How do I use a weighted model in nls2? Values with nls are logical, whereas values with nls2 are not. I believe that this discrepancy is because I did not include the weights argument in nls2. Just to follow up, nls2 was ignoring the weights argument. This is now fixed in the development version of nls2. The weighted and unweighted versions below give different residual sums of squares, showing that weights is no longer ignored: library(nls2) # grab the development version source("http://nls2.googlecode.com/svn/trunk/R/nls2.R") BOD2 <- cbind(BOD, w = 1:6) nls2(demand ~ a + b*Time, data = BOD2, start = c(a = 1, b = 1), weights = w, alg = "brute") # compare against the unweighted fit nls2(demand ~ a + b*Time, data = BOD2, start = c(a = 1, b = 1), alg = "brute") -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
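For comparison, base nls() accepts a weights argument the same way; a quick check on the built-in BOD data (the asymptotic-exponential model and start values below are a standard illustration, not from the thread):

```r
# Weighted vs. unweighted fits with base nls() on the built-in BOD data.
w <- 1:6  # same weights as the BOD2 example above
fit.uw <- nls(demand ~ a * (1 - exp(-b * Time)), data = BOD,
              start = c(a = 20, b = 0.5))
fit.w  <- nls(demand ~ a * (1 - exp(-b * Time)), data = BOD,
              start = c(a = 20, b = 0.5), weights = w)
coef(fit.uw)
coef(fit.w)  # the weights pull the fit toward the later observations
```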
Re: [R] Comparing COXPH models, one with age as a continuous variable, one with age as a three-level factor
Totally agreed. I made a mistake in calling the categorization a GAM. If we apply a step function to the continuous age we get a limited-range ordinal variable. Categorizing is creating several binary variables from the continuous one (with treatment contrasts). Stephen B -Original Message- From: Frank Harrell [mailto:harre...@gmail.com] On Behalf Of Frank Harrell Sent: Thursday, September 02, 2010 4:23 PM To: Bond, Stephen Cc: r-help@r-project.org Subject: Re: [R] Comparing COXPH models, one with age as a continuous variable, one with age as a three-level factor On Thu, 2 Sep 2010, stephenb wrote: sorry to bump in late, but I am doing similar things now and was browsing. IMHO anova is not appropriate here. it applies when the richer model has p more variables than the simpler model. this is not the case here. the competing models use different variables. A simple approach is to have the factor variable in the model and to formally test for added information given by the continuous variable (linear, quadratic, spline, etc). AIC could also be used. you are left with IC. by transforming a continuous variable into categorical you are smoothing, which is the idea of GAM. if you look at what is offered in GAMs you may find better approximations of f(age) as well as tools for testing among different f(age) transformations. I don't follow that comment. Smoothing uses the full continuous variable to begin with. A restricted cubic spline function in age is a simple approach. E.g.: require(rms) dd <- datadist(mydata); options(datadist='dd') f <- cph(Surv(dtime,death) ~ rcs(age,4) + sex, data=mydata) plot(Predict(f, age)) Note that you can always expect the categorized version of age not to fit the data except sometimes when behavior is dictated by law (driving, drinking, military service, medicare). Frank regards. S. 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Power analysis
Be happy, don't do post-hoc power analyses. The standard post-hoc power analysis is actually counterproductive. It is much better to just create confidence intervals. Or give a better description/justification showing that your case is not the standard/worse than useless version. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Lewis G. Dean Sent: Thursday, September 02, 2010 9:45 AM To: r-help@r-project.org Subject: [R] Power analysis I am aware this is fairly simple, but is currently driving me mad! Could someone help me out with conducting a post-hoc power analysis on a Wilcoxon test. I am being driven slightly mad by some conflicting advice! Thanks in advance, Lewis [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
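Greg's alternative can be sketched in a few lines: wilcox.test() will report a confidence interval for the location shift (the Hodges-Lehmann estimate) when asked, which is generally more informative than a post-hoc power figure. The data below are invented for illustration.

```r
# Report a confidence interval instead of a post-hoc power analysis
set.seed(1)
x <- rnorm(20)              # invented sample 1
y <- rnorm(20, mean = 0.5)  # invented sample 2
wt <- wilcox.test(x, y, conf.int = TRUE)
wt$conf.int                 # 95% CI for the location shift
```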
Re: [R] Ordering data by variable
Suggestion: use the power of R. If x and y are independent then sorting y based on x is meaningless. If sorting y based on x is meaningful, then they are not independent. Trying to force non-independent things to pretend that they are independent just causes future headaches. Part of the great power of R is the ability to group things together that should be grouped. The wise learn this and use it (in some cases (mine) that wisdom comes at the expense of not having properly grouped in the past). Learn the power of with/within, data= arguments, and apply-style functions, then you will be eager to combine things into data frames (or lists or ...) when appropriate. descend from soapbox -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Mestat Sent: Thursday, September 02, 2010 2:49 PM To: r-help@r-project.org Subject: [R] Ordering data by variable Hi listers, I could order data like this: x <- c(2,6,8,8,1) y <- c(1,6,3,5,4) o <- order(x) frame <- rbind(x,y)[,o] But, I would like to know if there is a way to order my data without setting up a data frame. I would like to keep independent vectors x and y. Any suggestions? Thanks in advance, Marcio -- View this message in context: http://r.789695.n4.nabble.com/Ordering- data-by-variable-tp2524754p2524754.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
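Greg's point about grouping can be sketched with the poster's own numbers: once x and y sit in one data frame, ordering and subsetting keep the pairs aligned automatically.

```r
# Group the paired vectors so they stay aligned
d <- data.frame(x = c(2, 6, 8, 8, 1), y = c(1, 6, 3, 5, 4))
d.sorted <- d[order(d$x), ]   # both columns reordered together
with(d, y[x > 5])             # 'with' keeps the x/y pairing explicit
```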
Re: [R] general question on binomial test / sign test
On 02-Sep-10 18:01:55, Greg Snow wrote: Just to add to Ted's addition to my response. I think you are moving towards better understanding (and your misunderstandings are common), but to further clarify: [Wise words about P(A|B), P(B|A), P-values, etc., snipped] The real tricky bit about hypothesis testing is that we compute a single p-value, a single observation from a distribution, and based on that try to decide if the process that produced that observation is a uniform distribution or something else (that may be close to a uniform or very different). Indeed. And this is precisely why I began my original reply as follows: Quoting ted.hard...@manchester.ac.uk: [...] The general logic of a significance test is that a test statistic (say T) is chosen such that large values represent a discrepancy between possible data and the hypothesis under test. When you have the data, T evaluates to a value (say t0). The null hypothesis (NH) implies a distribution for the statistic T if the NH is true. Then the value of Prob(T >= t0 | NH) can be calculated. If this is small, then the probability of obtaining data at least as discrepant as the data you did obtain is small; if sufficiently small, then the conjunction of NH and your data (as assessed by the statistic T) is so unlikely that you can decide not to believe that it is possible. If you so decide, then you reject the NH because the data are so discrepant that you can't believe it! The point is that the test statistic T represents *discrepancy* between data and NH in some sense. In what sense? That depends on what you are interested in finding out; and, whatever it is, there will be some T that represents it. It might be whether two samples come from distributions with equal means, or not. Then you might use T = mean(Ysample) - mean(Xsample). Large values of |T| represent discrepancy (in either direction) between data and an NH that the true means are equal. 
Large values of T represent discrepancy in the positive direction, large values of -T discrepancy in the negative direction. Or it might be whether or not the two samples are drawn from populations with equal variances, when you might use T = var(Ysample)/var(Xsample). Or it might be whether the distribution from which X was sampled is symmetric, in which case you might use skewness(Xsample). Or you might be interested in whether the numbers falling into disjoint classes are consistent with hypothetical probabilities p1,...,pk of falling into these classes -- in which case you might use the chi-squared statistic T = sum(((ni - N*pi)^2)/(N*pi)). And so on. Once you have decided on what discrepant means, and chosen a statistic T to represent discrepancy, then the NH implies a distribution for T and you can calculate P-value = Prob(T >= t0 | NH) where t0 is the value of T calculated from the data. *THEN* a small P-value is in direct correspondence with large T, i.e. small P is equivalent to large discrepancy. And it is also the direct measure of how likely you were to get so large a discrepancy if the NH really was true. Thus the P-values, calculated from the distribution of (T | NH), are ordered, not just numerically from small P to large, but also equivalently by discrepancy (from large discrepancy to small). Thus the uniform distribution of P under the NH does not just mean that any value of P is as likely as any other (so what? why prefer one P-value to another?). We also have that different parts of the [0,1] P-scale have different *meanings* -- the parts near 0 are highly discrepant from NH, the parts near 1 are highly consistent with NH, *with respect to the meaning of discrepancy implied by the choice of test statistic T*. So it helps to understand hypothesis testing if you keep in mind what the test statistic T *represents* in real terms. 
Greg's point about try to decide if the process that produced that observation is a uniform distribution or something else (that may be close to a uniform or very different) is not in the first instance relevant to the direct interpretation of small P-value as large discrepancy -- that involves only the Null Hypothesis NH, under which the P-values have a uniform distribution. Where it comes into its own is that an Alternative Hypothesis AH would correspond to some degree of discrepancy of a certain kind, and if T is well chosen then its distribution under AH will give large values of T greater probability than they would get under NH. Thus the AHs that are implied by a large value of a certain test statistic T are those AHs that give such values of T greater probability than they would get under NH. Thus we are now getting into the domain of the Power of the test to detect discrepancy. Ted. E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk Fax-to-email: +44 (0)870 094 0861 Date: 02-Sep-10 Time: 22:59:23
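The uniform-under-the-null point is easy to verify by simulation; a quick sketch (a t-test on null data here, but any exact test of a true null behaves the same way):

```r
# Simulate p-values when the null hypothesis is true
set.seed(42)
pvals <- replicate(2000, t.test(rnorm(20), rnorm(20))$p.value)
hist(pvals, main = "P-values under a true NH")  # roughly flat
mean(pvals < 0.05)                              # close to the nominal 0.05
```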
[R] specify the covariance matrix for random effect
Hi, I'm doing a generalized linear mixed model, and I currently use an R function called glmm. However, in this function they use a standard normal distribution for the random effect, which doesn't satisfy my case, i.e. my random effect also follows a normal distribution, but observations over time are somehow correlated, so the covariance matrix would be different from the identity matrix. Besides, it's also different from the commonly used AR(1), AR(2) structures, so actually I need a way to write down the covariance matrix for the random effect in GLMM by myself. Does anyone know how to do this in R? Thank you in advance for any help. -- Best, Vicky [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] NLS equation self starting non linear
These data are kilojoules of energy that are consumed in starving fish over a time period (Days). The KJ reach a lower asymptote and level off, and I would like to use a non-linear plot to show this leveling off. The data are noisy and the sample sizes not the largest. I have tried self-starting weibull curves and tried the following; both end with errors. Days <- c(12, 12, 12, 12, 22, 22, 22, 22, 32, 32, 32, 32, 37, 38, 38, 39, 39, 39, 39, 39, 39, 39, 1, 1, 1, 1) Joules <- c(8.782010, 8.540524, 8.507526, 11.296904, 4.232690, 13.026588, 10.213342, 4.771482, 4.560359, 6.146684, 9.651727, 8.064329, 9.419335, 7.129264, 6.079012, 7.095888, 17.996794, 7.028287, 8.028352, 5.497564, 11.607090, 9.987215, 11.065307, 21.433094, 20.366385, 22.475717) X11() par(cex=1.6) plot(Joules~Days, xlab="Days after fasting was initiated", ylab="Mean energy per fish (KJ)") model <- nls(joules~a+b*exp(-c*Days), start=list(a=8,b=9,c=-.229), control=list(minFactor=1e-12), trace=TRUE) summary(model) init <- nls(Joules ~ SSweibull(Days,Asym,Drop,lrc,pwr), control=list(minFactor=1e-12), data=.GlobalEnv) init Data here is reproducible. As always, thank you...keith -- M. Keith Cox, Ph.D. Alaska NOAA Fisheries, National Marine Fisheries Service Auke Bay Laboratories 17109 Pt. Lena Loop Rd. Juneau, AK 99801 keith@noaa.gov marlink...@gmail.com U.S. (907) 789-6603 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] specify the covariance matrix for random effect
Qiu, Weiyu weiyu at ualberta.ca writes: Hi, I'm doing a generalized linear mixed model, and I currently use an R function called glmm. However, in this function they use a standard normal distribution for the random effect, which doesn't satisfy my case, i.e. my random effect also follows a normal distribution, but observations over time are somehow correlated, so the covariance matrix would be different the identity matrix. Besides, it's also different from the commonly used AR(1), AR(2) structures, so actually I need a way to write down the covariance matrix for the random effect in GLMM by myself. If you could get by with an AR or ARMA structure then you could use lme() with the 'correlation' argument from the nlme package. If you have enough data/are willing to fit a completely unstructured correlation matrix, you could use corSymm. See ?corStruct in the nlme package documentation: it is *in principle* possible to write functions to implement your own correlation structures, e.g. see corSymm nlme:::corSymm.corMatrix nlme:::coef.corNatural However, this will not be easy unless you are already a good R programmer and familiar with the nlme package. If you really need to do this I definitely recommend that you buy and study Pinheiro and Bates's 2000 book. It might also be worth taking a look at the cor* objects in the ape package, which are coded slightly more transparently. Followups should probably go to r-sig-mixed-mod...@r-project.org good luck Ben Bolker __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
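A hedged sketch of the lme() route Ben mentions, fitting a completely unstructured within-subject correlation with corSymm. The simulated data frame and its column names are invented for illustration, not from the original post.

```r
library(nlme)

# Invented longitudinal data: 30 subjects, 4 time points each
set.seed(1)
subj <- factor(rep(1:30, each = 4))
time <- rep(1:4, 30)
b <- rnorm(30, sd = 1)   # subject-level random intercepts
y <- 1 + 0.5 * time + b[as.integer(subj)] + rnorm(120, sd = 0.5)
mydata <- data.frame(y, time, subject = subj)

# Random intercept plus an unstructured serial correlation via corSymm
fm <- lme(y ~ time, random = ~ 1 | subject,
          correlation = corSymm(form = ~ 1 | subject),
          data = mydata)
summary(fm)
```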
Re: [R] testing for emptyenv
Is there a complete list of these very handy and powerful functions in base R? -- View this message in context: http://r.789695.n4.nabble.com/testing-for-emptyenv-tp2432922p2525031.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] pairs with same xlim and ylim scale
When pairs draws plots, lower.panel invokes f.xy. Maybe there is something in f.xy incompatible with pairs. You can read the code of pairs to see what happens. pairs has two methods, as you can see in the help page (?pairs). According to your code, pairs is supposed to invoke the default S3 method. methods(pairs) [1] pairs.default pairs.formula* Non-visible functions are asterisked Therefore, you should check the code of the function pairs.default to see how the error occurs. Just type pairs.default at the R command prompt and press enter; you will get the source code of pairs.default. On 2010-9-2 15:15, Shi, Tao wrote: Hi Dejian, You're right on this! Do you know how to pass those two arguments into lower.panel? Thanks! ...Tao From: Dejian Zhaozha...@ioz.ac.cn To:r-help@r-project.org Sent: Tue, August 31, 2010 6:10:16 PM Subject: Re: [R] pairs with same xlim and ylim scale I think you have successfully passed the xlim and ylim into the function pairs1. Compare the two graphs produced by the code you provided: you can find the xlim and ylim in the second graph have been reset to the assigned values. It seems that the program halted in producing the second plot after adding xlim and ylim. According to the error message, the two added parameters were not used in lower.panel, or the customized function f.xy. On 2010-9-1 2:26, Shi, Tao wrote: Hi list, I have a function which basically is a wrapper of pairs with some useful panel functions. However, I'm having trouble passing the xlim and ylim into the function so the x and y axes are on the same scale and 45 degree lines are exactly diagonal. I've looked at some old posts, they didn't help much. I [[elided Yahoo spam]] Thanks! ...Tao pairs1 <- function(x, ...) { f.xy <- function(x, y, ...) { points(x, y, ...) 
abline(0, 1, col = 2) } panel.cor <- function(x, y, digits=2, prefix="", cex.cor) { usr <- par("usr"); on.exit(par(usr)) par(usr = c(0, 1, 0, 1)) r <- abs(cor(x, y, method="p", use="pairwise.complete.obs")) txt <- format(c(r, 0.123456789), digits=digits)[1] txt <- paste(prefix, txt, sep="") if(missing(cex.cor)) cex <- 0.8/strwidth(txt) text(0.5, 0.5, txt, cex = cex * r) } panel.hist <- function(x, ...) { usr <- par("usr"); on.exit(par(usr)) par(usr = c(usr[1:2], 0, 1.5) ) h <- hist(x, plot = FALSE) breaks <- h$breaks; nB <- length(breaks) y <- h$counts; y <- y/max(y) rect(breaks[-nB], 0, breaks[-1], y, col="cyan", ...) } pairs(x, lower.panel=f.xy, upper.panel=panel.cor, diag.panel=panel.hist, ...) } x <- rnorm(100, sd=0.2) x <- cbind(x=x-0.1, y=x+0.1) pairs1(x) pairs1(x, xlim=c(-1,1), ylim=c(-1,1)) Error in lower.panel(...) : unused argument(s) (xlim = c(-1, 1), ylim = c(-1, 1)) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
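The "unused argument(s)" error arises because pairs() forwards its extra arguments (here xlim and ylim) to every panel function, so any panel without a ... formal rejects them. A minimal standalone sketch (simplified from the thread's pairs1; the correlation and histogram panels are omitted for brevity):

```r
# Panels must tolerate the extra arguments that pairs() forwards to them
pairs1 <- function(x, ...) {
  f.xy <- function(x, y, ...) {
    points(x, y, ...)   # xlim/ylim in ... pass through harmlessly
    abline(0, 1, col = 2)
  }
  # use the tolerant panel for both triangles
  pairs(x, lower.panel = f.xy, upper.panel = f.xy, ...)
}

set.seed(1)
x <- rnorm(100, sd = 0.2)
x <- cbind(x = x - 0.1, y = x + 0.1)
pairs1(x, xlim = c(-1, 1), ylim = c(-1, 1))
```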
Re: [R] Making plots in big scatterplot matrix large enough to see
William, Thanks. I adapted your example by doing: library(psych) pdf(file="myfile.pdf", width=30, height=30) pairs.panels(data, gap=0) dev.off() The R part worked. I could see it doing so when I replaced the call to 'pdf' by windows(width=30, height=30). The remaining problem was that Adobe Reader only displayed ten rows of my file, and then hung. That doesn't surprise me, because it's an unreliable piece of software, often hanging on PDFs I find on Google. Annoying, though. Jocelyn Paine http://www.j-paine.org http://www.spreadsheet-parts.org +44 (0)7768 534 091 On Tue, 31 Aug 2010, William Revelle wrote: Jocelyn, In a partial answer to your question, try setting gap=0 in the calls to pairs. This will make the plots closer together. (You might also find pairs.panels in the psych package useful -- it implements one of the help examples for pairs to report the histogram on the diagonal and reports the correlations in the upper off-diagonal). On a Mac, I just tried setting quartz(width=30, height=30) # make a big graphics window # then library(psych) my.data <- sim.item(24) # create 500 cases of 24 variables pairs.panels(my.data, gap=0) # gap=0 makes the plots right next to each other # And then save the graphics window as a pdf. I can open this in a pdf and scroll around pretty easily. Bill At 5:21 AM +0100 8/31/10, Jocelyn Paine wrote: I've got a data frame with 23 columns, and wanted to plot a scatterplot matrix of it. I called pairs( df ) where 'df' is my data frame. This did generate the matrix, but the plotting window did not expand to make the individual plots large enough to see. Each one was only about 10 pixels high and wide. 
I tried sending the plot to a file, with a high and wide image, by doing png("plot.png", width = 4000, height = 4000) but I got these errors: Error in png("plot.png", width = 4000, height = 4000) : unable to start device In addition: Warning messages: 1: In png("plot.png", width = 4000, height = 4000) : Unable to allocate bitmap 2: In png("plot.png", width = 4000, height = 4000) : opening device failed The messages aren't helpful, because they don't tell you _why_ R can't start the device, allocate it, or open it. The documentation for png says: Windows imposes limits on the size of bitmaps: these are not documented in the SDK and may depend on the version of Windows. It seems that width and height are each limited to 2^15-1. However, 2^15-1 is 32767, so that isn't the problem here. I tried various values for height and width. 2400 was OK, but 2500 wasn't. So it seems R can't produce plots that are more than about 2400 pixels square. This is with R 2.10.1. Why is png failing on big images? Also, what's the recommended way to make a file containing a scatterplot matrix when you have lots of variables? 'pairs' is a very useful function, but obviously one does need to be careful when doing this, and I don't know what experts would recommend. Do you loop round the variables plotting each pair to a different file? I was hoping that I could put them all into one very big image and view parts of it at a time. Thanks, Jocelyn Paine http://www.j-paine.org http://www.spreadsheet-parts.org +44 (0)7768 534 091 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
-- William Revelle http://personality-project.org/revelle.html Professor http://personality-project.org Department of Psychology http://www.wcas.northwestern.edu/psych/ Northwestern University http://www.northwestern.edu/ Use R for psychology http://personality-project.org/r It is 6 minutes to midnight http://www.thebulletin.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Making plots in big scatterplot matrix large enough to see
Greg, thanks for the suggestion. That's useful to know for future work. It's not so good in this case, because I'm making the plots for a colleague who doesn't know R, and it would be a bother for me to have to send him several files and him reassemble them. What I did was to use pairs.panels, as suggested by William Revelle on this thread. I'd like to ask a general question about the interface though. There's a size below which individual scatterplots are not legible. It makes no sense at all for a scatterplot routine to plot them at that size or smaller. So why didn't the author(s) of 'pairs' program it so that it always makes them at least legible size, and expands the image window until it can fit them all in? Regards, Jocelyn Paine http://www.j-paine.org +44 (0)7768 534 091 Jocelyn's Cartoons: http://www.j-paine.org/blog/jocelyns_cartoons/ On Tue, 31 Aug 2010, Greg Snow wrote: Look at the pairs2 function in the TeachingDemos package, this lets you produce smaller portions of the total scatterplot matrix at a time (with bigger plots), you could print the smaller portions then assemble the full matrix on a large wall, or just use it to look at potentially interesting parts. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of Jocelyn Paine Sent: Monday, August 30, 2010 10:21 PM To: r-help@r-project.org Subject: [R] Making plots in big scatterplot matrix large enough to see I've got a data frame with 23 columns, and wanted to plot a scatterplot matrix of it. I called pairs( df ) where 'df' is my data frame. This did generate the matrix, but the plotting window did not expand to make the individual plots large enough to see. Each one was only about 10 pixels high and wide. 
I tried sending the plot to a file, with a high and wide image, by doing png("plot.png", width = 4000, height = 4000) but I got these errors: Error in png("plot.png", width = 4000, height = 4000) : unable to start device In addition: Warning messages: 1: In png("plot.png", width = 4000, height = 4000) : Unable to allocate bitmap 2: In png("plot.png", width = 4000, height = 4000) : opening device failed The messages aren't helpful, because they don't tell you _why_ R can't start the device, allocate it, or open it. The documentation for png says: Windows imposes limits on the size of bitmaps: these are not documented in the SDK and may depend on the version of Windows. It seems that width and height are each limited to 2^15-1. However, 2^15-1 is 32767, so that isn't the problem here. I tried various values for height and width. 2400 was OK, but 2500 wasn't. So it seems R can't produce plots that are more than about 2400 pixels square. This is with R 2.10.1. Why is png failing on big images? Also, what's the recommended way to make a file containing a scatterplot matrix when you have lots of variables? 'pairs' is a very useful function, but obviously one does need to be careful when doing this, and I don't know what experts would recommend. Do you loop round the variables plotting each pair to a different file? I was hoping that I could put them all into one very big image and view parts of it at a time. Thanks, Jocelyn Paine http://www.j-paine.org http://www.spreadsheet-parts.org +44 (0)7768 534 091 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Making plots in big scatterplot matrix large enough to see
On 2010-08-31 13:49, Greg Snow wrote: Look at the pairs2 function in the TeachingDemos package, this lets you produce smaller portions of the total scatterplot matrix at a time (with bigger plots), you could print the smaller portions then assemble the full matrix on a large wall, or just use it to look at potentially interesting parts. If I remember correctly, there's also the xysplom() function in pkg:HH. -Peter Ehlers __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] calculate monthly mean
Hello group, I'm trying to plot 3d with the scatterplot3d package; I got an error saying length(color) must be equal length(x) or 1. My data has dimensions (lon,lat,lev,time), the time in months. I want to calculate the monthly mean for the time. How can I do that? Is there any function doing that? Thanks a lot ## load packages library(rgl) library(fields) library(ncdf) library(scatterplot3d) ## open NetCDF file to read nc <- open.ncdf("/srv/ccrc/data05/z3236814/mm5/co2/2000/q.21.mon.nc") v1 <- nc$var[[1]] v2 <- nc$var[[2]] v3 <- nc$var[[3]] data1 <- get.var.ncdf(nc,v1) data2 <- get.var.ncdf(nc,v2) data3 <- get.var.ncdf(nc,v3) coldat = data1[1:111,1:101,23,1:60] ## create colour hcol = cumsum(coldat) coldat = hcol hcol = hcol/max(hcol,na.rm=TRUE) col <- hsv(h=hcol,s=1,v=1) X <- scatterplot3d(data3[1:111,1:101],data2[1:111,1:101],data1[1:111,1:101,23,1:60],color=col) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
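On the monthly-mean part of the question: if the time dimension holds, say, 60 monthly steps covering 5 years, tapply() inside apply() averages each grid cell by calendar month. The small array below is a stand-in for the NetCDF variable, since the original file isn't available.

```r
# Stand-in for a (lon, lat, time) slice of the NetCDF data
dat <- array(rnorm(10 * 8 * 60), dim = c(10, 8, 60))
month <- rep(1:12, times = 5)   # calendar month of each of the 60 time steps
clim <- apply(dat, c(1, 2), function(ts) tapply(ts, month, mean))
dim(clim)   # 12 x 10 x 8: month index first, then lon and lat
```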
Re: [R] Power analysis
Agree with Greg's point. In fact it does not make logical sense in many cases. Similar to the use of the statistically unreliable reliability measure Cronbach's alpha in some non-statistical fields. -- View this message in context: http://r.789695.n4.nabble.com/Power-analysis-tp2524729p2524907.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] date
try to use difftime() instead of as.difftime(). On Thu, Sep 2, 2010 at 10:32 PM, Dunia Scheid dunia.sch...@gmail.com wrote: Hello all, I've 2 strings representing the start and end values of a date and time. For example, time1 <- c("21/04/2005","23/05/2005","11/04/2005") time2 <- c("15/07/2009", "03/06/2008", "15/10/2005") as.difftime(time1,time2) Time differences in secs [1] NA NA NA attr(,"tzone") [1] "" How can I calculate the difference between these 2 strings? Regards, Dunia [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
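The suggestion above can be sketched as follows: convert the strings to Date objects first (as.difftime() cannot parse "dd/mm/yyyy" strings, which is why it returned NA), then difference them. This uses the poster's own values.

```r
# Parse "dd/mm/yyyy" strings as Dates before differencing
time1 <- c("21/04/2005", "23/05/2005", "11/04/2005")
time2 <- c("15/07/2009", "03/06/2008", "15/10/2005")
d1 <- as.Date(time1, format = "%d/%m/%Y")
d2 <- as.Date(time2, format = "%d/%m/%Y")
difftime(d2, d1, units = "days")   # elapsed days for each pair
```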
Re: [R] NLS equation self starting non linear
On Thu, Sep 2, 2010 at 7:39 PM, Marlin Keith Cox marlink...@gmail.com wrote: These data are kilojoules of energy that are consumed in starving fish over a time period (Days). The KJ reach a lower asymptote and level off, and I would like to use a non-linear plot to show this leveling off. The data are noisy and the sample sizes not the largest. I have tried self-starting weibull curves and tried the following; both end with errors. Days <- c(12, 12, 12, 12, 22, 22, 22, 22, 32, 32, 32, 32, 37, 38, 38, 39, 39, 39, 39, 39, 39, 39, 1, 1, 1, 1) Joules <- c(8.782010, 8.540524, 8.507526, 11.296904, 4.232690, 13.026588, 10.213342, 4.771482, 4.560359, 6.146684, 9.651727, 8.064329, 9.419335, 7.129264, 6.079012, 7.095888, 17.996794, 7.028287, 8.028352, 5.497564, 11.607090, 9.987215, 11.065307, 21.433094, 20.366385, 22.475717) X11() par(cex=1.6) plot(Joules~Days, xlab="Days after fasting was initiated", ylab="Mean energy per fish (KJ)") model <- nls(joules~a+b*exp(-c*Days), start=list(a=8,b=9,c=-.229), control=list(minFactor=1e-12), trace=TRUE) summary(model) Note that Joules is defined above but joules is used as well. We have assumed they are the same. Also note that the formula is linear in log(joules) if a=0, so let's assume a=0 and use lm to solve. Also switch to the plinear algorithm which only requires starting values for the nonlinear parameters -- in this case only c (which we get from the lm). joules <- Joules coef(lm(log(joules) ~ Days)) (Intercept) Days 2.61442015 -0.01614845 # so use 0.016 as starting value of c fm <- nls(joules~cbind(1, exp(-c*Days)), start = list(c = 0.016), alg = "plinear"); fm Nonlinear regression model model: joules ~ cbind(1, exp(-c * Days)) data: parent.frame() c .lin1 .lin2 0.2314 8.3630 13.2039 residual sum-of-squares: 290.3 -- Statistics Software Consulting GKX Group, GKX Associates Inc. 
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
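Since the subject line asks about self-starting models: a minimal sketch (not from the thread) of fitting the same asymptotic shape with the built-in SSasymp self-start, which computes its own starting values so no start= list is needed. It assumes the Days/Joules vectors posted above.

```r
Days   <- c(12, 12, 12, 12, 22, 22, 22, 22, 32, 32, 32, 32, 37, 38, 38,
            39, 39, 39, 39, 39, 39, 39, 1, 1, 1, 1)
Joules <- c(8.782010, 8.540524, 8.507526, 11.296904, 4.232690, 13.026588,
            10.213342, 4.771482, 4.560359, 6.146684, 9.651727, 8.064329,
            9.419335, 7.129264, 6.079012, 7.095888, 17.996794, 7.028287,
            8.028352, 5.497564, 11.607090, 9.987215, 11.065307, 21.433094,
            20.366385, 22.475717)
# SSasymp fits y = Asym + (R0 - Asym) * exp(-exp(lrc) * x); Asym is the
# lower asymptote, R0 the value at x = 0, lrc the log rate constant.
fit <- nls(Joules ~ SSasymp(Days, Asym, R0, lrc))
coef(fit)
```

With this parameterization, Asym should come out near the 8.36 asymptote of the plinear fit above, and exp(lrc) near its rate constant 0.2314.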
Re: [R] Making plots in big scatterplot matrix large enough to see
On 2010-09-02 22:16, Jocelyn Paine wrote:

Greg, thanks for the suggestion. That's useful to know for future work. It's not so good in this case, because I'm making the plots for a colleague who doesn't know R, and it would be a bother for me to have to send him several files and have him reassemble them. What I did was to use pairs.panels, as suggested by William Revelle on this thread.

I'd like to ask a general question about the interface, though. There's a size below which individual scatterplots are not legible. It makes no sense at all for a scatterplot routine to plot them at that size or smaller. So why didn't the author(s) of 'pairs' program it so that it always makes them at least legible size, and expands the image window until it can fit them all in? Regards, Jocelyn Paine http://www.j-paine.org +44 (0)7768 534 091

Hmm, I had no trouble creating and viewing William's pdf file. I was also able to view the pairs plot fairly well on my screen, and that's on my laptop. Perhaps my display has better resolution than yours. Your suggestion to expand the image window until it can fit them all in would, at the very least, involve determining the resolution of the display device. But even then, there's bound to be someone who'll want a pairs plot for 1000 variables. As usual with R, improvements are always welcome. You could submit appropriate code and, if it is deemed useful, I'm fairly sure that a better pairs() function will become part of Rx.xx.x.

-Peter Ehlers

Jocelyn's Cartoons: http://www.j-paine.org/blog/jocelyns_cartoons/

On Tue, 31 Aug 2010, Greg Snow wrote: Look at the pairs2 function in the TeachingDemos package; this lets you produce smaller portions of the total scatterplot matrix at a time (with bigger plots). You could print the smaller portions and then assemble the full matrix on a large wall, or just use it to look at potentially interesting parts.

-- Gregory (Greg) L. Snow Ph.D. 
Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jocelyn Paine Sent: Monday, August 30, 2010 10:21 PM To: r-help@r-project.org Subject: [R] Making plots in big scatterplot matrix large enough to see

I've got a data frame with 23 columns, and wanted to plot a scatterplot matrix of it. I called pairs(df), where 'df' is my data frame. This did generate the matrix, but the plotting window did not expand to make the individual plots large enough to see. Each one was only about 10 pixels high and wide. I tried sending the plot to a file, with a high and wide image, by doing

png("plot.png", width = 4000, height = 4000)

but I got these errors:

Error in png("plot.png", width = 4000, height = 4000) :
  unable to start device
In addition: Warning messages:
1: In png("plot.png", width = 4000, height = 4000) :
  Unable to allocate bitmap
2: In png("plot.png", width = 4000, height = 4000) :
  opening device failed

The messages aren't helpful, because they don't tell you _why_ R can't start the device, allocate it, or open it. The documentation for png says: "Windows imposes limits on the size of bitmaps: these are not documented in the SDK and may depend on the version of Windows. It seems that width and height are each limited to 2^15-1." However, 2^15-1 is 32767, so that isn't the problem here. I tried various values for height and width: 2400 was OK, but 2500 wasn't. So it seems R can't produce plots that are more than about 2400 pixels square. This is with R 2.10.1. Why is png failing on big images?

Also, what's the recommended way to make a file containing a scatterplot matrix when you have lots of variables? 'pairs' is a very useful function, but obviously one does need to be careful when doing this, and I don't know what experts would recommend. Do you loop round the variables, plotting each pair to a different file? 
I was hoping that I could put them all into one very big image and view parts of it at a time. Thanks, Jocelyn Paine http://www.j-paine.org http://www.spreadsheet-parts.org +44 (0)7768 534 091 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
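To make the "portions at a time" idea concrete, here is a hedged sketch (not from the thread) of writing sub-blocks of a large scatterplot matrix to separate PNG files, each kept under the roughly 2400-pixel cap reported above. The stand-in data frame and file names are assumptions; in practice df would be the poster's 23-column data frame.

```r
# Stand-in for the poster's 23-column data frame.
df <- as.data.frame(matrix(rnorm(100 * 23), ncol = 23))

vars   <- names(df)
groups <- split(vars, ceiling(seq_along(vars) / 6))  # blocks of up to 6 variables

# One PNG per pair of blocks; each stays within the device size limit.
for (i in seq_along(groups)) {
  for (j in seq(i, length(groups))) {
    png(sprintf("pairs_block_%d_%d.png", i, j), width = 2000, height = 2000)
    pairs(df[, unique(c(groups[[i]], groups[[j]]))])
    dev.off()
  }
}
```

The recipient then views each block at full size instead of one illegible 23-by-23 matrix.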
Re: [R] NLS equation self starting non linear
On 2010-09-02 22:32, Gabor Grothendieck wrote: On Thu, Sep 2, 2010 at 7:39 PM, Marlin Keith Cox marlink...@gmail.com wrote: These data are kilojoules of energy consumed by starving fish over a time period (Days). The KJ reach a lower asymptote and level off, and I would like to use a non-linear plot to show this leveling off. The data are noisy and the sample sizes are not the largest. I have tried self-starting Weibull curves and tried the following; both end with errors.

Days <- c(12, 12, 12, 12, 22, 22, 22, 22, 32, 32, 32, 32, 37, 38, 38, 39, 39, 39, 39, 39, 39, 39, 1, 1, 1, 1)
Joules <- c(8.782010, 8.540524, 8.507526, 11.296904, 4.232690, 13.026588, 10.213342, 4.771482, 4.560359, 6.146684, 9.651727, 8.064329, 9.419335, 7.129264, 6.079012, 7.095888, 17.996794, 7.028287, 8.028352, 5.497564, 11.607090, 9.987215, 11.065307, 21.433094, 20.366385, 22.475717)
X11()
par(cex=1.6)
plot(Joules~Days, xlab="Days after fasting was initiated", ylab="Mean energy per fish (KJ)")
model <- nls(joules~a+b*exp(-c*Days), start=list(a=8, b=9, c=-.229), control=list(minFactor=1e-12), trace=TRUE)
summary(model)

Note that Joules is defined above but joules is used as well. We have assumed they are the same. Also note that the formula is linear in log(joules) if a=0, so let's assume a=0 and use lm to solve. Also switch to the "plinear" algorithm, which only requires starting values for the nonlinear parameters -- in this case only c (which we get from the lm).

joules <- Joules
coef(lm(log(joules) ~ Days))
(Intercept)        Days
 2.61442015 -0.01614845

# so use 0.016 as starting value of c
fm <- nls(joules~cbind(1, exp(-c*Days)), start = list(c = 0.016), alg = "plinear"); fm
Nonlinear regression model
  model: joules ~ cbind(1, exp(-c * Days))
   data: parent.frame()
     c  .lin1  .lin2
0.2314 8.3630 13.2039
 residual sum-of-squares: 290.3

Keith, You would get Gabor's solution from your nls() call if you did:
1. replace 'Joules' with 'joules'
2. replace 'c=-.229' with 'c=.229' in your start vector. 
You already have the 'minus' in your formula. I find it useful to add the curve you get with your start values to the scatterplot to see how reasonable the values are:

plot(Joules~Days)
curve(8 + 9 * exp(-.229 * x), add=TRUE)

After you get a convergent solution, you can add a curve with the model values.

-Peter Ehlers __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
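A small self-contained sketch of that last step: refit with the corrected names and sign, then overlay the converged curve on the scatterplot. The coefficient names follow the a + b*exp(-c*Days) formula from the thread.

```r
Days   <- c(12, 12, 12, 12, 22, 22, 22, 22, 32, 32, 32, 32, 37, 38, 38,
            39, 39, 39, 39, 39, 39, 39, 1, 1, 1, 1)
Joules <- c(8.782010, 8.540524, 8.507526, 11.296904, 4.232690, 13.026588,
            10.213342, 4.771482, 4.560359, 6.146684, 9.651727, 8.064329,
            9.419335, 7.129264, 6.079012, 7.095888, 17.996794, 7.028287,
            8.028352, 5.497564, 11.607090, 9.987215, 11.065307, 21.433094,
            20.366385, 22.475717)
# Note c = +.229 (the minus is already in the formula), per the fix above.
model <- nls(Joules ~ a + b * exp(-c * Days),
             start = list(a = 8, b = 9, c = .229))
co <- coef(model)
plot(Joules ~ Days)
curve(co[["a"]] + co[["b"]] * exp(-co[["c"]] * x), add = TRUE, lty = 2)
```

The dashed curve should level off near the fitted asymptote a, around 8.36 for these data.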
[R] running an exe in the background
Hi, I'd like to be able to run a .exe in the background whenever library(package-x) is called. Is this possible?

~Aks

[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
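One possible approach (an assumption, not an answer from the thread): launch the program from the package's .onLoad hook with a non-blocking system() call, so it starts whenever the package is attached or loaded. The "bin" subdirectory and "helper.exe" name are hypothetical.

```r
# In the package's R source; runs automatically when library(packagex)
# loads the namespace.
.onLoad <- function(libname, pkgname) {
  # Hypothetical layout: the .exe shipped in the package's inst/bin/,
  # so it installs to bin/ under the package directory.
  exe <- system.file("bin", "helper.exe", package = pkgname)
  if (nzchar(exe)) {
    system(exe, wait = FALSE)  # wait = FALSE returns control to R immediately
  }
}
```

On Windows, system2() or shell.exec() are alternatives; remember to stop or clean up the process, e.g. from .onUnload, if it should not outlive the R session.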