[R] Significance test, Cronbach's Alpha

2006-06-05 Thread Tom Backer Johnsen
Hello:

I am reviewing a paper at the moment where the author reports a 
Cronbach's Alpha and adds a significance test of the value.  I was not 
aware that such a test exists.  Does it?  If so, how is it done in R?

Tom
++
| Tom Backer Johnsen, Psychometrics Unit,  Faculty of Psychology |
| University of Bergen, Christies gt. 12, N-5015 Bergen,  NORWAY |
| Tel : +47-5558-9185Fax : +47-5558-9879 |
| Email : [EMAIL PROTECTED]URL : http://www.galton.uib.no/ |
++

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Functions starting with underscores

2006-06-05 Thread Gabor Grothendieck
Use backticks:

attr(`_foo`, "bar") <- "pow"

On 6/5/06, Joseph Wang [EMAIL PROTECTED] wrote:
 I'm having problems with functions starting with underscores

 '_foo' <- function(x) {1}

 seems to work

 but I can't assign an attribute to this function

 attr('_foo', 'bar') <- 'pow'

 Any way of doing this?  This is for a C++-to-R wrapping system, so I'd like to
 keep the C++ names, which start with underscores.

 (please cc: responses to me personally)





Re: [R] Functions starting with underscores

2006-06-05 Thread Philippe Grosjean
A variable name is not allowed to start with an underscore, so you 
must use backticks (`) to refer to this non-standard name:

> '_foo' <- function(x) 1
> attr('_foo', 'bar') <- 'pow'
Error: target of assignment expands to non-language object
> attr(`_foo`, 'bar') <- 'pow'
> '_foo'
[1] "_foo"
> `_foo`
function(x) 1
attr(,"bar")
[1] "pow"
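Put together as one runnable snippet (the `_foo` name is from the thread; the rest is illustrative):

```r
## Backticks quote a syntactically non-standard name anywhere,
## including as the target of a replacement function like attr<-.
`_foo` <- function(x) 1
attr(`_foo`, "bar") <- "pow"
print(attr(`_foo`, "bar"))   # "pow"
print(`_foo`(10))            # 1
```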
 

Best,

Philippe Grosjean


Joseph Wang wrote:
 I'm having problems with functions starting with underscores
 
 '_foo' <- function(x) {1}
 
 seems to work
 
 but I can't assign an attribute to this function
 
 attr('_foo', 'bar') <- 'pow'
 
 Any way of doing this?  This is for a C++-to-R wrapping system, so I'd like to 
 keep the C++ names, which start with underscores.
 
 (please cc: responses to me personally)
 
 




Re: [R] understanding the verbose output in nlme

2006-06-05 Thread Greg Distiller
Thanks for the reply...will give some thought to your suggestion about 
stepping through the function.
I have read the Pinheiro and Bates book, in fact its my primary reference 
for getting into the nonlinear mixed models with R.
Lastly wrt the bit under subjects 1-6, I had thought about it being an 
estimated random effect but in this model there are 2 random effects so not 
sure if that holds...
thanks again...

- Original Message - 
From: Spencer Graves [EMAIL PROTECTED]
To: Greg Distiller [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Sunday, June 04, 2006 8:49 PM
Subject: Re: [R] understanding the verbose output in nlme


   I don't know, but if it were my question, I think I could find
 out by making local copies of the functions involved and stepping
 through the algorithm line by line using debug (see, e.g.,
 http://finzi.psych.upenn.edu/R/Rhelp02a/archive/68215.html).

   Have you read Pinheiro and Bates (2000) Mixed-Effects Models
 in S and S-Plus?  If not, I encourage you to do so.  Over the past 4
 years or so, I've probably spent more time with this book and referred 
 more people to it than any other.  Doug Bates is a leading original 
 contributor in this area, and I believe you will find this book well worth 
 your money and  your time.

   Regarding the numbers under subjectno1-6, I'm guessing that
 these may be the current estimates of the random effects for the first 6 
 of the 103 subjects.  The purpose of verbose is NOT to dump everything
 but only enough to help you evaluate whether the algorithm seems to be
 converging.

   hope this helps.
   Spencer Graves

 Greg Distiller wrote:
 Hi
 I have found some postings referring to the fact that one can try and 
 understand why a particular model is failing to solve/converge from the 
 verbose output one can generate when fitting a nonlinear mixed model. I 
 am trying to understand this output and have not been able to find out 
 much:

 **Iteration 1
 LME step: Loglik: -237.4517 , nlm iterations: 22
 reStruct  parameters:
   subjectno1   subjectno2   subjectno3   subjectno4   subjectno5 
 subjectno6
  -0.87239181   2.75772772  -0.72892919 -10.36636391   0.55290322 
 0.09878685

 PNLS step: RSS =  60.50164
  fixed effects:2.59129  0.00741764  0.57155
  iterations: 7

 Convergence:
fixed reStruct
 5.740688 2.159285

 I know that the Loglik must refer to the value of the log likelihood 
 function, that the values after fixed effects are the parameter 
 estimates, and that the bit after Convergence obviously has something to 
 so with the convergence criteria for the fixed effects and the random 
 effects structure. I did manage to find a posting where somebody said 
 that the restruct parameter is the log of the relative precision of the 
 random effects? The one thing that is a bit confusing to me is that it 
 appears as if the fixed effects convergence must be zero (or close to it) 
 as one would expect but in one of my converged models the output showed a 
 restruct value of 0.72 ?



 Then I have no idea what the numbers under subjectno1-6 are, especially 
 as I have 103 subjects in the data!



 Can anyone help shed some light on this output and how it can be used to 
 diagnose issues with a model?



 Many thanks



 Greg






Re: [R] Significance test, Cronbach's Alpha

2006-06-05 Thread falissard
This test can be meaningful, but only in rare circumstances (are the items
(or raters) more consistent than could be expected by chance?).
I would do it in R using a bootstrap procedure (Efron & Tibshirani, An
Introduction to the Bootstrap, p. 156).
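A minimal sketch of the idea (the data and the hand-rolled cronbach.alpha() below are illustrative, not from any package):

```r
## Bootstrap the sampling distribution of Cronbach's alpha.
cronbach.alpha <- function(items) {
  k <- ncol(items)
  k / (k - 1) * (1 - sum(apply(items, 2, var)) / var(rowSums(items)))
}
set.seed(1)
## 100 'subjects' x 5 'items' with a shared person effect (hypothetical data)
items <- matrix(rnorm(100 * 5), 100, 5) + rnorm(100)
boot.alpha <- replicate(500, {
  idx <- sample(nrow(items), replace = TRUE)
  cronbach.alpha(items[idx, , drop = FALSE])
})
quantile(boot.alpha, c(0.025, 0.975))  # percentile confidence interval
```

If the whole interval sits well above 0, the items are more consistent than chance alone would suggest.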
Best regards,
Bruno

PS : I am not really a specialist...


Bruno Falissard
INSERM U669, PSIGIAM
Paris Sud Innovation Group in Adolescent Mental Health
Maison de Solenn
97 Boulevard de Port Royal
75679 Paris cedex 14, France
tel : (+33) 6 81 82 70 76
fax : (+33) 1 45 59 34 18
web site : http://perso.wanadoo.fr/bruno.falissard/

 

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On behalf of Tom Backer Johnsen
Sent: Monday, 5 June 2006 08:05
To: r-help@stat.math.ethz.ch
Subject: [R] Significance test, Cronbach's Alpha

Hello:

I am reviewing a paper at the moment where the author reports a 
Cronbach's Alpha and adds a significance test of the value.  I was not 
aware that such a test exists.  Does it?  If so, how is it done in R?

Tom
++
| Tom Backer Johnsen, Psychometrics Unit,  Faculty of Psychology |
| University of Bergen, Christies gt. 12, N-5015 Bergen,  NORWAY |
| Tel : +47-5558-9185Fax : +47-5558-9879 |
| Email : [EMAIL PROTECTED]URL : http://www.galton.uib.no/ |
++




Re: [R] Problem with mle

2006-06-05 Thread Rainer M Krug
Hi Ben

Thanks a lot for your help - it solved my problem and I understand a
little bit more.

Just one more question along this line:
If I want to use another distribution, e.g. the negative binomial,
I guess I have to use
-sum(dnbinom(SumSeeds, size = ???, prob = ???, log = TRUE))
instead of
-sum(dpois(SumSeeds, lambda = est, log = TRUE))

But what do I use for size and prob?
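For reference, the two dnbinom() parameterizations line up as follows (SumSeeds is simulated here; in dnbinom, prob = size/(size + mu)):

```r
## Negative binomial negative log-likelihood, both parameterizations.
set.seed(1)
SumSeeds <- rnbinom(100, size = 2, mu = 5)   # hypothetical seed counts
nll.prob <- function(size, prob)
  -sum(dnbinom(SumSeeds, size = size, prob = prob, log = TRUE))
nll.mu <- function(size, mu)
  -sum(dnbinom(SumSeeds, size = size, mu = mu, log = TRUE))
nll.prob(2, 2 / (2 + 5))   # identical to nll.mu(2, 5)
```

The size/mu form is usually the more convenient one for count data, since mu is simply the mean.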

Rainer

-- 
Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation
Biology (UCT)

Department of Conservation Ecology and Entomology
University of Stellenbosch
Matieland 7602
South Africa

Tel:+27 - (0)72 808 2975 (w)
Fax:+27 - (0)21 808 3304
Cell:   +27 - (0)83 9479 042

email:  [EMAIL PROTECTED]
[EMAIL PROTECTED]



[R] Calculation of AIC BIC from mle

2006-06-05 Thread Rainer M KRug
R 2.3.0, all packages up to date
Linux, SuSE 10.0

Hi

I want to calculate AIC or BIC from several results from mle calculation.

I found the AIC function, but it does not seem to work with objects of
class mle.
If I execute the following:
ml1 <- mle(...)
AIC(ml1)

I get the following error message:
Error in logLik(object) : no applicable method for logLik

Therefore I am using the following to calculate the AIC:

# AICmle: calculate AIC from an mle object
AICmle <- function(m, k = 2)
{
    lL <- logLik(m)
    edf <- attr(lL, "df")
    LL <- lL[1]
    -2 * LL + k * edf
}

1) Why is AIC not working with objects of class mle - am I doing
something wrong, is it a bug or by design?

2) Just for confirmation - is my calculation of AIC correct?

Thanks

Rainer



Re: [R] How to call a value labels attribute?

2006-06-05 Thread Martin Maechler
 HeinzT == Heinz Tuechler [EMAIL PROTECTED]
 on Sun, 04 Jun 2006 13:07:22 +0100 writes:

HeinzT At 14:12 03.06.2006 +0200, Martin Maechler wrote:
 Heinz == Heinz Tuechler [EMAIL PROTECTED]
 on Tue, 23 May 2006 01:17:21 +0100 writes:
 
Heinz Dear All, after searching on CRAN I got the
Heinz impression that there is no standard way in R to
Heinz label values of a numerical variable.  
 
 Hmm, there's  names(.)  and  names(.) <- ..
 Why are those not sufficient?
 
 x <- 1:3
 names(x) <- c("apple", "banana", NA)

HeinzT Martin,

HeinzT I will consider this. For now I am using an
HeinzT attribute value.labels and a corresponding class to
HeinzT preserve this and other attributes after inclusion
HeinzT in a data.frame and indexing/subsetting, but using
HeinzT names should do as well.  My idea was more like
HeinzT defining a set of value labels for a variable and
HeinzT apply it to the whole variable, as e.g. in the
HeinzT following _pseudocode_:

HeinzT ### not run
HeinzT ### pseudocode
HeinzT x - c(1, 2, 3, 3, 2, 3, 1)
HeinzT value.labels(x) <- c(apple=1, banana=2, "NA"=3)
HeinzT x
HeinzT ### desired result
HeinzT apple banana  NA  NA banana NA apple
HeinzT 1  2   3   3  2  3 1

HeinzT value.labels(x) <- c(Apfel=1, Banane=2, Birne=3) # redefine labels
HeinzT x
HeinzT ### desired result
HeinzT Apfel Banane Birne Birne Banane Birne Apfel
HeinzT 1  2 3 3  2 3 1

HeinzT value.labels(x) # inspect labels
HeinzT ### desired result
HeinzT Apfel Banane Birne
HeinzT 1  2 3

HeinzT These value.labels should persist even after
HeinzT inclusion in a data.frame and after
HeinzT indexing/subsetting.

As far as I can see,  factor()s and their levels/labels provide
all that.

HeinzT I did not yet try your idea concerning these
HeinzT aspects, but I will do it. My final goal is to do
HeinzT all the data handling on numerically coded variables
HeinzT and to transform to factors on the fly when needed
HeinzT for statistical procedures. Given the presence of
HeinzT value.labels a factor function could use them for
HeinzT the conversion.

HeinzT I described my motivation for all this in a previous post, titled:
HeinzT How to represent a metric categorical variable?
HeinzT There was no response at all and I wonder, if this is such a rare 
problem.

Probably.  I don't see why this can't be done by (ordered)
factors.  But I'm really not particularly expert nor interested
here; I just wanted to make sure you didn't overlook the
obvious.

Martin



HeinzT Thanks,
HeinzT Heinz

 
 
Heinz Since this
Heinz would be useful for me I intend to create such an
Heinz attribute, at the moment for my personal use.  Still
Heinz I would like to choose a name which does not conflict
Heinz with names of commonly used attributes.
 
Heinz Would "value.labels" or "vallabs" create conflicts?
 
Heinz The attribute should be structured as data.frame with
Heinz two columns, levels (numeric) and labels
Heinz (character). These could then also be used to
Heinz transform from numeric to factor. If the attribute is
Heinz copied to the factor variable it could also serve to
Heinz retransform the factor to the original numerical
Heinz variable.
 
Heinz Comments? Ideas?
 
Heinz Thanks
 
Heinz Heinz Tüchler
 
 
 


[R] [R-pkgs] new package QCAGUI

2006-06-05 Thread Adrian DUSA

Dear list members,

I'm pleased to let you know there's a new package on CRAN called QCAGUI, a 
graphical user interface for the QCA package.
This is a stripped down version of John Fox's Rcmdr package, plus a couple of 
menus for QCA.

My thanks to John Fox for his encouragement and advice.

Regards,
Adrian

-- 
Adrian DUSA
Romanian Social Data Archive
1, Schitu Magureanu Bd
050025 Bucharest sector 5
Romania
Tel./Fax: +40 21 3126618 \
  +40 21 3120210 / int.101

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages



Re: [R] Significance test, Cronbach's Alpha

2006-06-05 Thread Jim Lemon
Tom Backer Johnsen wrote:
 Hello:
 
 I am reviewing a paper at the moment where the author reports a 
 Cronbach's Alpha and adds a significance test of the value.  I was not 
 aware that such a test exists.  Does it?  If so, how is it done in R?
 
Hi Tom,

This may be due to the fact that some interpret Cronbach's alpha as a 
correlation between items, thus encouraging the unwary to assume that 
the probability of a numerically equivalent correlation can be used as a 
test of significance.

Jim



Re: [R] How to call a value labels attribute?

2006-06-05 Thread Heinz Tuechler
Dimitris,

thank you. I have to consider all the responses to this question, and then
your functions may prove to be useful.

Heinz

At 17:01 04.06.2006 +0200, Dimitrios Rizopoulos wrote:
maybe you could consider something like the following:

varlabs <- function(x) {
    if (is.null(names(x))) NULL else x[!duplicated(x)]
}
"varlabs<-" <- function(x, value) {
    names(x) <- names(value[x])
    x
}
###
x <- c(1, 2, 3, 3, 2, 3, 1)
x
varlabs(x)
varlabs(x) <- c(apple = 1, banana = 2, "NA" = 3)
x
varlabs(x)
varlabs(x) <- c(Apfel = 1, Banane = 2, Birne = 3)
x
varlabs(x)


I hope it helps.

Best,
Dimitris

 
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


Quoting Heinz Tuechler [EMAIL PROTECTED]:

 At 14:12 03.06.2006 +0200, Martin Maechler wrote:
  Heinz == Heinz Tuechler [EMAIL PROTECTED]
  on Tue, 23 May 2006 01:17:21 +0100 writes:
 
 Heinz Dear All, after searching on CRAN I got the
 Heinz impression that there is no standard way in R to
 Heinz label values of a numerical variable.  
 
 Hmm, there's  names(.)  and  names(.) <- ..
 Why are those not sufficient?
 
 x <- 1:3
 names(x) <- c("apple", "banana", NA)
 
 Martin,
 
 I will consider this. For now I am using an attribute value.labels
 and a
 corresponding class to preserve this and other attributes after
 inclusion
 in a data.frame and indexing/subsetting, but using names should do as
 well.
 My idea was more like defining a set of value labels for a variable
 and
 apply it to all the variable, as e.g. in the following _pseudocode_:
 
 ### not run
 ### pseudocode
 x - c(1, 2, 3, 3, 2, 3, 1)
 value.labels(x) <- c(apple=1, banana=2, "NA"=3)
 x
 ### desired result
 apple banana  NA  NA banana NA apple
 1  2   3   3  2  3 1
 
 value.labels(x) <- c(Apfel=1, Banane=2, Birne=3) # redefine labels
 x
 ### desired result
 Apfel Banane Birne Birne Banane Birne Apfel
 1  2 3 3  2 3 1
 
 value.labels(x) # inspect labels
 ### desired result
 Apfel Banane Birne
 1  2 3
 
 These value.labels should persist even after inclusion in a
 data.frame and
 after indexing/subsetting.
 I did not yet try your idea concerning these aspects, but I will do
 it. My
 final goal is to do all the data handling on numerically coded
 variables
 and to transform to factors on the fly when needed for statistical
 procedures. Given the presence of value.labels a factor function
 could use
 them for the conversion.
 
 I described my motivation for all this in a previous post, titled:
 How to represent a metric categorical variable?
 There was no response at all and I wonder, if this is such a rare
 problem.
 
 Thanks,
 Heinz
 
 
 
 Heinz Since this
 Heinz would be useful for me I intend to create such an
 Heinz attribute, at the moment for my personal use.  Still
 Heinz I would like to choose a name which does not conflict
 Heinz with names of commonly used attributes.
 
 Heinz Would value.labels or vallabs create conflicts?
 
 Heinz The attribute should be structured as data.frame with
 Heinz two columns, levels (numeric) and labels
 Heinz (character). These could then also be used to
 Heinz transform from numeric to factor. If the attribute is
 Heinz copied to the factor variable it could also serve to
 Heinz retransform the factor to the original numerical
 Heinz variable.
 
 Heinz Comments? Ideas?
 
 Heinz Thanks
 
 Heinz Heinz Tüchler
 
 
 
 
 
 


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm





Re: [R] How to call a value labels attribute?

2006-06-05 Thread Heinz Tuechler
Richard, Martin,

the example is not ideal, I see. I was starting from the question of how to
represent a metric categorical variable.
By metric categorical variable I mean a variable which has only a few
distinct values and an inherent metric. An example would be a risk score
which classifies patients into several groups like low, intermediate, high,
and extreme, with corresponding risk estimates of 0, 1, 2, 5.5.
Other examples could be items in a questionnaire. These items often have
numerical values that may range from -5 to 5.
In some cases, like tables and box-plots, these scores/items should be
treated like a factor (with labelled values); in other cases, like
Cox regression or when forming an overall score, they should be treated
like numeric variables.
I was asking for a convenient way to represent a variable like this, but
there was no response.
The crucial point is that the variables should retain both their numerical
values and their value labels.
Without value labels they could be defined as a factor and used either
directly or via as.numeric(), because the levels still represent the
numerical values; but as soon as labels are used, the original numerical
values get lost.
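One way to sketch the requirement (purely illustrative; the risk-score numbers are the ones above):

```r
## Keep the numeric codes; look up labels only for display.
score <- c(0, 1, 2, 5.5, 1, 0)                        # stays numeric
labs <- c("0" = "low", "1" = "intermediate",
          "2" = "high", "5.5" = "extreme")
labelled <- labs[as.character(score)]                 # labels for tables/plots
mean(score)                                           # numeric use untouched
table(factor(labelled, levels = labs))                # factor on the fly
```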


Thanks,

Heinz Tüchler

At 12:12 04.06.2006 -0400, Richard M. Heiberger wrote:
How is what you are doing any different from factors?


> x <- factor(c(1, 2, 3, 3, 2, 3, 1), labels = c("apple", "banana", "other"))
> x
[1] apple  banana other  other  banana other  apple 
Levels: apple banana other
> as.numeric(x)
[1] 1 2 3 3 2 3 1
> levels(x)[3] <- "birne"
> x
[1] apple  banana birne  birne  banana birne  apple 
Levels: apple banana birne
 


 Original message 
### not run
### pseudocode
x - c(1, 2, 3, 3, 2, 3, 1)
value.labels(x) <- c(apple=1, banana=2, "NA"=3)
x
### desired result
apple banana  NA  NA banana NA apple
1  2   3   3  2  3 1

value.labels(x) <- c(Apfel=1, Banane=2, Birne=3) # redefine labels
x
### desired result
Apfel Banane Birne Birne Banane Birne Apfel
1  2 3 3  2 3 1

value.labels(x) # inspect labels
### desired result
Apfel Banane Birne
1  2 3





Re: [R] nested design

2006-06-05 Thread Mark Difford
Jorn,

For your model,

model <- lme(Biomass ~ Age, random = ~ 1 | Age/Stand)

think about nesting age in stand (doesn't that make more sense, anyway?).  If 
you're lucky the NaN will zip.  So, do

model <- lme(Biomass ~ Age, random = ~ 1 | Stand/Age)

I've had a similar problem with unbalanced data when I've tried to nest 
Season%in%Site (which is what I wanted), with Site as a fixed effect as well.  
Then, like you, I've got NaN's for my Site(s).  Changing the nesting to 
Site%in%Season has fixed the problem.  However, random effects are now being 
specified somewhat differently.  Hopefully a statistician will come in and tell 
you and me quite what the difference is, because I get lost in the logic of it.

Note that lmer() copes with the problematic Season%in%Site nesting without 
problems, using the same dataset.
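To see the slash syntax in action, here is the same shape of call on nlme's built-in Oats data (illustrative only, not the poster's Biomass data):

```r
## The grouping formula reads left to right: Variety is nested within Block.
library(nlme)
m1 <- lme(yield ~ nitro, random = ~ 1 | Block/Variety, data = Oats)
VarCorr(m1)   # components for Block, Variety %in% Block, and residual
```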


Regards,
Mark

 
Mark Difford    Ph.D. candidate, Botany Department,
Nelson Mandela Metropolitan University,
Port Elizabeth, SA.



Re: [R] strange (to me) ncolumns default value in 'write'

2006-06-05 Thread Martin Maechler
 AntonioF == Antonio, Fabio Di Narzo [EMAIL PROTECTED]
 on Sun, 4 Jun 2006 18:42:35 +0200 writes:

AntonioF Hi all.
AntonioF In 'write' documentation, one can read:

AntonioF write(x, file = "data",
AntonioF   ncolumns = if(is.character(x)) 1 else 5,
AntonioF   append = FALSE, sep = " ")

AntonioF Now my question is: why the default value of 1 column for character
AntonioF vectors and the special value '5' for non-character vectors?

well, most of us have five fingers per hand, so many consider
'5' to be a 'nice' number :-)
For R, the main reason for using this default was that S already
had it.  And S has had it not only for the above reason, but I
guess also because, nicely, 5 * 16 = 80, and 80 used to be the
customary terminal width in the 70s and 80s, i.e., the good old
times when S was created.
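The defaults are easy to see with a throwaway file:

```r
## Numeric vectors wrap 5 per line; character vectors go 1 per line.
tf <- tempfile()
write(1:12, file = tf)            # default ncolumns = 5
readLines(tf)                     # "1 2 3 4 5" "6 7 8 9 10" "11 12"
tfc <- tempfile()
write(letters[1:3], file = tfc)   # default ncolumns = 1
readLines(tfc)                    # "a" "b" "c"
```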

AntonioF Nice sunday to all,

and Pentecoste Monday, for those with an extra free day 
like me.

Martin Maechler, ETH Zurich

AntonioF Antonio, Fabio Di Narzo.

AntonioF Note: this is just curiosity. Don't read as: WHY? WHY!?!?, but 
just as
AntonioF why?



Re: [R] slanted ends of horizontal lines for certain line widths

2006-06-05 Thread Martin Maechler
 Charles == Charles Annis, P E [EMAIL PROTECTED]
 on Sun, 4 Jun 2006 11:43:23 -0400 writes:

Charles I think you want to change par()$lend
Charles Type 

Charles par() <enter>

Charles to see the defaults.

Actually, that's one place, where using str(.) is particularly
beneficial:  Use

   str(par())

Charles []



Re: [R] Calculation of AIC BIC from mle

2006-06-05 Thread Martin Maechler
 Rainer == Rainer M KRug [EMAIL PROTECTED]
 on Mon, 05 Jun 2006 10:22:02 +0200 writes:

Rainer R 2.3.0, all packages up to date
Rainer Linux, SuSE 10.0

Rainer Hi

Rainer I want to calculate AIC or BIC from several results from mle 
calculation.

Rainer I found the AIC function, but it does not seem to work with objects 
of
Rainer class mle -
Rainer If I execute the following:
Rainer ml1 <- mle(...)
Rainer AIC(ml1)

Rainer I get the following error message:
Rainer Error in logLik(object) : no applicable method for logLik

which is really not helpful as an error message, since it is *wrong*:
There is a logLik(.) method for mle objects, and it even
works.

There's some embarrassing bug here, since if you quickly look at
the 'stats4' package source, it's very clear that it was designed
to have AIC(), BIC(), logLik() all working, but only the last
one does.


Rainer Therefore I am using the following to calculate the AIC:

Rainer # AICmle: calculate AIC from an mle object
Rainer AICmle <- function(m, k = 2)
Rainer {
Rainer     lL <- logLik(m)
Rainer     edf <- attr(lL, "df")
Rainer     LL <- lL[1]
Rainer     -2 * LL + k * edf
Rainer }

Rainer 1) Why is AIC not working with objects of class mle - am I doing
Rainer something wrong, is it a bug or by design?

a bug.

Rainer 2) Just for confirmation - is my calculation of AIC correct?

it looks so, but I didn't check.
Martin
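A quick numerical check of that formula, using an illustrative Poisson fit (not the poster's model; the rate is put on the log scale so BFGS stays unconstrained):

```r
## AIC from a logLik object vs. the hand-rolled -2*LL + k*df.
library(stats4)
set.seed(1)
x <- rpois(50, 4)
nll <- function(loglambda = 0) -sum(dpois(x, exp(loglambda), log = TRUE))
fit <- mle(nll, start = list(loglambda = 0))
ll <- logLik(fit)
AIC(ll)                                     # via stats:::AIC.logLik
-2 * as.numeric(ll) + 2 * attr(ll, "df")    # same number
```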



[R] random sample from a file

2006-06-05 Thread Tibi Codilean
Dear All,


Thanks to all that responded to my earlier query.


I have a file with two columns: an ID field and a Data field. I wish to extract 
multiple random samples of 20 numbers from this file in such a way that the ID 
field is extracted with the data as well. At the moment I am using the 'sample' 
function but this only extracts the data without the IDs. The file is a comma 
separated text file and I read it in using read.csv


Could you please tell me if there is a way of doing this such that every random 
sample contains both the ID and the data.
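One standard approach (sketched with made-up data): sample row indices, so each draw keeps the ID and Data columns together:

```r
## Sampling rows of a data frame keeps ID and Data paired.
dat <- data.frame(ID = sprintf("id%03d", 1:100), Data = rnorm(100))
one.sample <- dat[sample(nrow(dat), 20), ]   # a single sample of 20 rows
many <- replicate(10, dat[sample(nrow(dat), 20), ], simplify = FALSE)
head(one.sample)
```

With the poster's file, dat would come from read.csv() and the sampling lines are unchanged.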


Thanks



Re: [R] Significance test, Cronbach's Alpha

2006-06-05 Thread Tom Backer Johnsen
Jim Lemon wrote:
 Tom Backer Johnsen wrote:
 Hello:

 I am reviewing a paper at the moment where the author reports a 
 Cronbach's Alpha and adds a significance test of the value.  I was not 
 aware that such a test exists.  Does it?  If so, how is it done in R?

 Hi Tom,
 
 This may be due to the fact that some interpret Cronbach's alpha as a 
 correlation between items, thus encouraging the unwary to assume that 
 the probability of a numerically equivalent correlation can be used as a 
 test of significance.

That is exactly what I suspect.  Actually, it is not too far off, but 
definitely not the same.

Tom

-- 
++
| Tom Backer Johnsen, Psychometrics Unit,  Faculty of Psychology |
| University of Bergen, Christies gt. 12, N-5015 Bergen,  NORWAY |
| Tel : +47-5558-9185Fax : +47-5558-9879 |
| Email : [EMAIL PROTECTED]URL : http://www.galton.uib.no/ |
++



Re: [R] Calculation of AIC BIC from mle

2006-06-05 Thread Martin Maechler
 MM == Martin Maechler [EMAIL PROTECTED]
 on Mon, 5 Jun 2006 12:04:32 +0200 writes:

 Rainer == Rainer M KRug [EMAIL PROTECTED]
 on Mon, 05 Jun 2006 10:22:02 +0200 writes:

Rainer R 2.3.0, all packages up to date
Rainer Linux, SuSE 10.0

Rainer Hi

Rainer I want to calculate AIC or BIC from several results from mle 
calculation.

Rainer I found the AIC function, but it does not seem to work with objects 
of
Rainer class mle -
Rainer If I execute the following:
Rainer ml1 <- mle(...)
Rainer AIC(ml1)

Rainer I get the following error message:
Rainer Error in logLik(object) : no applicable method for logLik

MM which is really not helpful as an error message, since it is *wrong*:
MM There is a logLik(.) method for mle objects, and it even
MM works.

Hence -- as I forgot to mention in my last e-mail --
for now, you can just use

AIC(logLik(ml1)) ## AIC
or  AIC(logLik(ml1), k = log(nobs)) ## BIC

where you have to fill in  nobs  yourself.
Martin

MM There's some embarrassing bug here, since if you quickly look at
MM the 'stats4' package source, it's very clear that it was designed
MM to have AIC(), BIC(), logLik() all working, but only the last
MM one does.


Rainer Therefore I am using the following to calculate the AIC:

Rainer # AICmle: calculate AIC from an mle object
Rainer AICmle <- function(m, k = 2)
Rainer {
Rainer     lL <- logLik(m)
Rainer     edf <- attr(lL, "df")
Rainer     LL <- lL[1]
Rainer     -2 * LL + k * edf
Rainer }

Rainer 1) Why is AIC not working with objects of class mle - am I doing
Rainer something wrong, is it a bug or by design?

MM a bug.

Rainer 2) Just for confirmation - is my calculation of AIC correct?

MM it looks so, but I didn't check.
MM Martin




Re: [R] Calculation of AIC BIC from mle

2006-06-05 Thread Prof Brian Ripley
You are mixing S3 and S4 classes and generics here.  AIC(logLik(ml1)) will 
work.

This is a namespace issue: stats4 contains a version of logLik, but the 
default method for AIC is from stats, and so dispatches on stats::logLik 
and not stats4::logLik.  There is something fundamentally unsatisfactory 
about converting S3 generics to S4 generics where namespaces are involved.

I have put a workaround in R-devel.

On Mon, 5 Jun 2006, Rainer M KRug wrote:

 R 2.3.0, all packages up to date
 Linux, SuSE 10.0

 Hi

 I want to calculate AIC or BIC from several results from mle calculation.

 I found the AIC function, but it does not seem to work with objects of
 class mle -
 If I execute the following:
 ml1 <- mle(...)
 AIC(ml1)

 I get the following error message:
 Error in logLik(object) : no applicable method for logLik

 Therefore I am using the following to calculate the AIC:

 #AICmle calculate AIC from mle object
 AICmle <- function(m, k = 2)
 {
   lL <- logLik(m)
   edf <- attr(lL, "df")
   LL <- lL[1]
   -2 * LL + k * edf
 }

 1) Why is AIC not working with objects of class mle - am I doing
 something wrong, is it a bug or by design?

 2) Just for confirmation - is my calculation of AIC correct?

Not quite.  The correct function is

> stats:::AIC.logLik
function (object, ..., k = 2)
-2 * c(object) + k * attr(object, "df")
<environment: namespace:stats>

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK              Fax:  +44 1865 272595



Re: [R] plot with different color

2006-06-05 Thread Jim Lemon
array chip wrote:
 Hi how can I plot a series of number as a line, but
 with lines above a threshould as one color, and with
 lines below the threshold as another color. for
 example, a numeric vector: rnorm(1:100), and plot
 these numbers in the original order, but lines above
 the horizontal line at 0 are in red, and lines below
 in blue?
 
Hi array (or is it, Hi chip?),

Is this what you want?

x <- rnorm(100)
x11(width=7, height=7)
oldpar <- par(no.readonly=TRUE)
xmai <- par("mai")
par(yaxs="i")
plot(x, type="n", ylim=c(-3,3), xlab="Observation", ylab="Value")
par(mai=c(3.5, xmai[2:4]), yaxs="i", new=TRUE)
plot(x, type="l", col="blue", ylim=c(0,3), axes=FALSE, xlab="", ylab="")
par(mai=c(xmai[1:2], 3.5, xmai[4]), new=TRUE)
plot(x, type="l", col="red", ylim=c(-3,0), axes=FALSE, xlab="", ylab="")
par(oldpar)

This _is_ an ugly kludge.

Jim



Re: [R] significance test and cronbach

2006-06-05 Thread Tom Backer Johnsen
Erin Hodgess wrote:
 Date: Mon, 05 Jun 2006 08:05:04 +0200
 From: Tom Backer Johnsen [EMAIL PROTECTED]
 To: r-help@stat.math.ethz.ch
 Subject: [R] Significance test, Cronbach's Alpha
 Hi Tom!
 
 There is a package in R called psy
 
 One of the functions is cronbach

Yes, and that function computes Cronbach's Alpha.
 
 I don't know about psychology, so I don't know really
 how it is all constructed.

That is a standard procedure, and rather easy to write in R.  My 
problem is different and has to do with a significance test of the value.

Tom

++
| Tom Backer Johnsen, Psychometrics Unit,  Faculty of Psychology |
| University of Bergen, Christies gt. 12, N-5015 Bergen,  NORWAY |
| Tel : +47-5558-9185           Fax : +47-5558-9879             |
| Email : [EMAIL PROTECTED]     URL : http://www.galton.uib.no/ |
++



Re: [R] FW: How to create a new package?

2006-06-05 Thread Rita Sousa
Hi,
Thanks very much for your help. I have already created a test package with
the package.skeleton() function and the procedures in the article R Help
Desk: Make R CMD work under Windows - R News 5 (2), 27-28.
Now, my question is:
How can I install this package in others PC's without install there the Perl
and the remaining components?
Thanks, 
---
Rita Sousa
DME - ME: Departamento de Metodologia Estatística - Métodos Estatísticos
INE - DRP: Instituto Nacional de Estatística - Delegação Regional do Porto 
Tel.: 22 6072016 (Extensão: 4116)
--- 

-Original Message-
From: Uwe Ligges [mailto:[EMAIL PROTECTED] 
Sent: Thursday, 1 June 2006 12:43
To: michael watson (IAH-C)
Cc: Gabor Grothendieck; Rita Sousa; r-help@stat.math.ethz.ch
Subject: Re: [R] FW: How to create a new package?

michael watson (IAH-C) wrote:

 ?package.skeleton 


Folks, please!


Rita Sousa told you she already has a DESCRIPTION file.

Obviously, Rita forgot to build and *install* the package using
R CMD build
and
R CMD INSTALL

(the Meta directory is created during the installation procedure)

Please note that using the Windows GUI, you can only install binary 
packages, but you have got a source package so far. Hence you need to 
install from the Windows command shell using R CMD INSTALL.

Please see the R Installation and Administration manual on how to 
install packages. For some examples of what is mentioned in that manual 
(how to set stuff up in Windows), you might additionally want to take a 
look into the article R Help Desk: Make `R CMD' Work under Windows - an 
Example in R News 5 (2), 27-28.

Best,
Uwe Ligges



 -Original Message-
 From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Gabor Grothendieck
 Sent: 01 June 2006 12:20
 To: Rita Sousa
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] FW: How to create a new package?
 
 The minimum is to create a DESCRIPTION file, plus R and man directories
containing R code and .Rd files respectively.
 It might help to run  Rcmd CHECK mypkg  before installation and fix any
problems it finds.
 
 Googling for   creating R package   will locate some tutorials.
 
 
 On 6/1/06, Rita Sousa [EMAIL PROTECTED] wrote:
 
Hi,



I have a group of functions and I would like to create a package to load in
R.
I have created a directory named INE and a directory below that named 
"R", for the files of R functions. I have created the files DESCRIPTION 
and INDEX in the INE directory. The installation from local zip files, 
in the R 2.3.0, results but to load the package I get an error like:



'INE' is not a valid package -- installed < 2.0.0?



I think it is necessary to create a Meta directory with a package.rds 
file, but I don't know how to make it! I have read the manual 'Writing R
Extensions - 1.
Creating R packages' but I don't understand the procedure...

Can I create it automatically?



Could you help me with this?



Thanks,

---
Rita Sousa
DME - ME: Departamento de Metodologia Estatística - Métodos 
Estatísticos INE - DRP: Instituto Nacional de Estatística - Delegação 
Regional do Porto
Tel.: 22 6072016 (Extensão: 4116)
---




   [[alternative HTML version deleted]]





 
 
 



Re: [R] Significance test, Cronbach's Alpha

2006-06-05 Thread Tom Backer Johnsen
falissard wrote:
 This test can be meaningfull, but only in rare circumstancies (are the items
 (or raters) more consistent than what could be expected by chance?).

Do you have a reference to such a test?

 I would do it with R using a bootstrap procedure (Efron & Tibshirani, An
 introduction to the bootstrap, p.156).

I agree.  That should be possible, actually quite simple.  Run through 
the data a large number of times, randomize the sequence of the 
observations for each case and compute alpha for each run.
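A rough sketch of that permutation idea, assuming the item responses sit in a numeric matrix `items` (subjects in rows, items in columns); shuffling each item's column independently gives a no-consistency null:

```r
alpha.f <- function(m) {   # Cronbach's alpha for an item matrix
  k <- ncol(m)
  (k / (k - 1)) * (1 - sum(apply(m, 2, var)) / var(rowSums(m)))
}
obs  <- alpha.f(items)
null <- replicate(1000, alpha.f(apply(items, 2, sample)))
mean(null >= obs)          # one-sided permutation p-value
```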

Tom
++
| Tom Backer Johnsen, Psychometrics Unit,  Faculty of Psychology |
| University of Bergen, Christies gt. 12, N-5015 Bergen,  NORWAY |
| Tel : +47-5558-9185           Fax : +47-5558-9879             |
| Email : [EMAIL PROTECTED]     URL : http://www.galton.uib.no/ |
++



Re: [R] Significance test, Cronbach's Alpha

2006-06-05 Thread Jonathan Baron
On 06/05/06 19:19, Jim Lemon wrote:
 Tom Backer Johnsen wrote:
  Hello:
 
  I am reviewing a paper at the moment where the author reports a
  Cronbach's Alpha and adds a significance test of the value.  I was not
  aware that such a test exists.  Does it?  If so, how is it done in R?
 
 Hi Tom,
 
 This may be due to the fact that some interpret Cronbach's alpha as a
 correlation between items, thus encouraging the unwary to assume that
 the probability of a numerically equivalent correlation can be used as a
 test of significance.

SPSS has a test for alpha.  I don't know what it does.

It also seems to me that, if the assumptions of alpha are met,
then the assumptions of analysis of variance are also met.  In
particular, alpha equals the reliability if the measurements
(items) are parallel (Lord and Novick, Statistical theories of
mental test scores, 1968, pp. 47 and 90 in particular).  That
means (among other things) that they have equal true variances.
If this can be assumed, then you can do an ANOVA using items and
subjects, and look for a significant effect of subjects.

If the measures are not parallel, then alpha is a lower bound on
the reliability of the test, so an ANOVA might be conservative,
but I have not thought this through.
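A hedged sketch of that ANOVA route (Hoyt's method, alpha = 1 - MS(residual)/MS(subjects)); the data below are simulated and the object names are illustrative:

```r
set.seed(1)
scores <- matrix(rnorm(100 * 5), 100, 5) + rnorm(100)  # 100 subjects, 5 items
long <- data.frame(score   = as.vector(scores),
                   subject = factor(rep(1:100, 5)),
                   item    = factor(rep(1:5, each = 100)))
tab <- anova(aov(score ~ subject + item, data = long))
1 - tab["Residuals", "Mean Sq"] / tab["subject", "Mean Sq"]  # alpha
```

The F statistic on 'subject' in the same table then serves as the significance test.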

It is rare to see anyone report a test for alpha because it is
usually used descriptively.  If it isn't .7 or higher, people get 
upset, yet even .5 would be wildly significant in most cases.

Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
Editor: Judgment and Decision Making (http://journal.sjdm.org)



Re: [R] FW: How to create a new package?

2006-06-05 Thread Prof Brian Ripley

R CMD INSTALL --build  produces a binary package in a zip file.
See the `R Installation and Administration manual'.

On Mon, 5 Jun 2006, Rita Sousa wrote:


Hi,
Thanks very much for your help. I have already created a test package with
the package.skeleton() function and the procedures in the article R Help
Desk: Make R CMD work under Windows - R News 5 (2), 27-28.
Now, my question is:
How can I install this package in others PC's without install there the Perl
and the remaining components?
Thanks,
---
Rita Sousa
DME - ME: Departamento de Metodologia Estatística - Métodos Estatísticos
INE - DRP: Instituto Nacional de Estatística - Delegação Regional do Porto
Tel.: 22 6072016 (Extensão: 4116)
---

-Original Message-
From: Uwe Ligges [mailto:[EMAIL PROTECTED]
Sent: Thursday, 1 June 2006 12:43
To: michael watson (IAH-C)
Cc: Gabor Grothendieck; Rita Sousa; r-help@stat.math.ethz.ch
Subject: Re: [R] FW: How to create a new package?

michael watson (IAH-C) wrote:


?package.skeleton



Folks, please!


Rita Sousa told you she already has a DESCRIPTION file.

Obviously, Rita forgot to build and *install* the package using
R CMD build
and
R CMD INSTALL

(the Meta directory is created during the installation procedure)

Please note that using the Windows GUI, you can only install binary
packages, but you have got a source package so far. Hence you need to
install from the Windows command shell using R CMD INSTALL.

Please see the R Installation and Administration manual on how to
install packages. For some examples of what is mentioned in that manual
(how to set stuff up in Windows), you might additionally want to take a
look into the article R Help Desk: Make `R CMD' Work under Windows - an
Example in R News 5 (2), 27-28.

Best,
Uwe Ligges




-Original Message-
From: [EMAIL PROTECTED]

[mailto:[EMAIL PROTECTED] On Behalf Of Gabor Grothendieck

Sent: 01 June 2006 12:20
To: Rita Sousa
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] FW: How to create a new package?

The minimum is to create a DESCRIPTION file, plus R and man directories

containing R code and .Rd files respectively.

It might help to run  Rcmd CHECK mypkg  before installation and fix any

problems it finds.


Googling for   creating R package   will locate some tutorials.


On 6/1/06, Rita Sousa [EMAIL PROTECTED] wrote:


Hi,



I have a group of functions and I would like to create a package to load in

R.

I have created a directory named INE and a directory below that named
"R", for the files of R functions. I have created the files DESCRIPTION
and INDEX in the INE directory. The installation from local zip files,
in the R 2.3.0, results but to load the package I get an error like:



'INE' is not a valid package -- installed < 2.0.0?



I think it is necessary to create a Meta directory with a package.rds
file, but I don't know how to make it! I have read the manual 'Writing R

Extensions - 1.

Creating R packages' but I don't understand the procedure...

Can I create it automatically?



Could you help me with this?



Thanks,

---
Rita Sousa
DME - ME: Departamento de Metodologia Estatística - Métodos
Estatísticos INE - DRP: Instituto Nacional de Estatística - Delegação
Regional do Porto
Tel.: 22 6072016 (Extensão: 4116)
---


















--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK              Fax:  +44 1865 272595

[R] making matrix monotonous

2006-06-05 Thread vincent
Dear all,

I'm currently working on correlation matrix.
The function image() is quite useful for visualization.

Before using image(), I'd like to sort the matrix rows/columns,
in order to make the matrix the more monotonous/smooth possible.

Before reinventing the wheel, is there somebody here aware if
such a function already exists ?

Thanks.
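In case it helps while waiting for a pointer to an existing function, one common ad-hoc approach is to order rows and columns by a hierarchical clustering of 1 - correlation (toy data; not a canned seriation routine):

```r
cm  <- cor(matrix(rnorm(500), 50, 10))  # toy 10 x 10 correlation matrix
ord <- hclust(as.dist(1 - cm))$order    # cluster-based ordering
image(cm[ord, ord])                     # similar variables now adjacent
```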



Re: [R] random sample from a file

2006-06-05 Thread Petr Pikal
Hi

sample a vector from 1 to the number of rows in of your file and 
subset a file with sampled values

> iris[sample(1:nrow(iris), 20),]
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
99           5.1         2.5          3.0         1.1 versicolor
53           6.9         3.1          4.9         1.5 versicolor
40           5.1         3.4          1.5         0.2     setosa
146          6.7         3.0          5.2         2.3  virginica
93           5.8         2.6          4.0         1.2 versicolor
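Applied to the original question, the same idea might look like this (the file and column names are assumptions):

```r
dat  <- read.csv("mydata.csv")          # columns: ID, Data
samp <- dat[sample(1:nrow(dat), 20), ]  # each sample keeps ID and Data together
```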


HTH
Petr

On 5 Jun 2006 at 11:09, Tibi Codilean wrote:

Date sent:  Mon, 5 Jun 2006 11:09:50 +0100
From:   Tibi Codilean [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject:[R] random sample from a file

 Dear All,
 
 
 Thanks to all that responded to my earlier query.
 
 
 I have a file with two columns: an ID field and a Data field. I wish
 to extract multiple random samples of 20 numbers from this file in
 such a way that the ID field is extracted with the data as well. At
 the moment I am using the 'sample' function but this only extracts the
 data without the IDs. The file is a comma separated text file and I
 read it in using read.csv
 
 
 Could you please tell me if there is a way of doing this such that
 every random sample contains both the ID and the data.
 
 
 Thanks
 

Petr Pikal
[EMAIL PROTECTED]



Re: [R] Calculation of AIC BIC from mle

2006-06-05 Thread Rainer M Krug
Thanks a lot for your and Martin's help - I understand now why it is not
working and am using AIC(logLik(ml1)).

Thanks

Rainer


Prof Brian Ripley wrote:
 You are mixing S3 and S4 classes and generics here.  AIC(logLik(ml1))
 will work.
 
 This a namespace issue: stats4 contains a version of logLik, but the
 default method for AIC is from stats, and so dispatches on stats::logLik
 and not stats4::logLik.  There is something fundamentally unsatisfactory
 about converting S3 generics to S4 generics where namespaces are involved.
 
 I have put a workaround in R-devel.
 
 On Mon, 5 Jun 2006, Rainer M KRug wrote:
 
 R 2.3.0, all packages up to date
 Linux, SuSE 10.0

 Hi

 I want to calculate AIC or BIC from several results from mle calculation.

 I found the AIC function, but it does not seem to work with objects of
 class mle -
 If I execute the following:
 ml1 <- mle(...)
 AIC(ml1)

 I get the following error messale:
 Error in logLik(object) : no applicable method for logLik

 Therefore I am using the following to calculate the AIC:

 #AICmle calculate AIC from mle object
 AICmle <- function(m, k = 2)
 {
 lL <- logLik(m)
 edf <- attr(lL, "df")
 LL <- lL[1]
 -2 * LL + k * edf
 }

 1) Why is AIC not working with objects of class mle - am I doing
 something wrong, is it a bug or by design?

 2) Just for confirmation - is my calculation of AIC correct?
 
 Not quite.  The correct function is
 
 > stats:::AIC.logLik
 function (object, ..., k = 2)
 -2 * c(object) + k * attr(object, "df")
 <environment: namespace:stats>
 


-- 
Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation
Biology (UCT)

Department of Conservation Ecology and Entomology
University of Stellenbosch
Matieland 7602
South Africa

Tel:+27 - (0)72 808 2975 (w)
Fax:+27 - (0)21 808 3304
Cell:   +27 - (0)83 9479 042

email:  [EMAIL PROTECTED]
[EMAIL PROTECTED]



[R] Status of data.table package

2006-06-05 Thread Eric Rexstad
List:

This package was described in R News of May 2006.  However, I cannot 
find it on CRAN mirrors, and use of RSiteSearch is unsatisfying.  I am 
likewise unable to find an email address for the maintainer of the 
package.  Thank you in advance for assistance.

-- 
Eric Rexstad
Research Unit for Wildlife Population Assessment
Centre for Research into Ecological and Environmental Modelling
University of St. Andrews
St. Andrews Scotland KY16 9LZ
+44 (0)1334 461833



Re: [R] Tcltk Default Background Color Question

2006-06-05 Thread Dirk Eddelbuettel
On Sun, Jun 04, 2006 at 09:55:20PM -0700, Crimson Turquoise wrote:
 Hello,
 
 I am new at tcltk and would like to identify the default background
 color for widgets.  I have found things like systemBackground or
 tk_setPalette, but am not able to get them to work in R.  Is there a
 name for this color such as lightgray, etc?

This works for me:

   tkcmd("tk_setPalette", "gray93") # set an overall background color

Dirk


-- 
Hell, there are no rules here - we're trying to accomplish something. 
  -- Thomas A. Edison



[R] evir: generalized pareto dist

2006-06-05 Thread Werner Tell
Hi,

I'm fitting a generalized Pareto distribution to POT exceedances of a data 
set. The practical stuff works ok, but I have a question regarding theory.
Is there an equation relating parameters of a gpd tail to its (first) 
moments? According to theory for certain parameters either the first moment 
does not exist or the distribution has an upper bound, but I haven't found 
the mentioned relation about moments in the literature.

Thanks a lot.
wt




[R] negative binomial: expected number of events?

2006-06-05 Thread Werner Tell
Hi

I'm fitting poisson and negative binomial distributions to event data. I'm 
interested in the expected number of events occuring in a time period. For 
the poisson this is determined by the parameter lambda only.
For the neg. binomial, is the expected number of events determined by the 
parameter mu only or does parameter size influence the first moment as 
well?

thank you,
wt




Re: [R] Status of data.table package

2006-06-05 Thread Liaw, Andy
It was announced on the R-devel list:
http://tolstoy.newcastle.edu.au/R/devel/06/04/4923.html

I don't see it on CRAN, either, nor could I find mention of it in the R News
you cited.

Andy

From: Eric Rexstad
 
 List:
 
 This package was described in R News of May 2006.  However, I 
 cannot find it on CRAN mirrors, and use of RSiteSearch is 
 unsatisfying.  I am likewise unable to find an email address 
 for the maintainer of the package.  Thank you in advance for 
 assistance.
 
 --
 Eric Rexstad
 Research Unit for Wildlife Population Assessment Centre for 
 Research into Ecological and Environmental Modelling 
 University of St. Andrews St. Andrews Scotland KY16 9LZ
 +44 (0)1334 461833
 
 




[R] case insensitive regular expression

2006-06-05 Thread JeeBee
Can anybody fix the following, to match a string
disregarding case?

I tried: 
with(my.data.frame, regexpr(pattern = '(?i)J\\. Test', 
  text = TARGET_COLUMN, perl = TRUE) != -1)

R yields: invalid regular expression

with(my.data.frame, regexpr(pattern = 'J\\. Test', 
  text = TARGET_COLUMN, perl = TRUE) != -1)
works, but it does not match "J. TEST".
It does match "J. Test" correctly.

I have the feeling I'm not reading the documentation
good enough, but cannot seem to figure this out, sorry.
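For what it is worth, the (?i) flag itself works under perl = TRUE once the call is balanced and the dot is escaped; a minimal stand-alone check:

```r
x <- c("J. Test", "J. TEST", "other")
regexpr("(?i)J\\. Test", x, perl = TRUE) != -1
#  TRUE  TRUE FALSE
```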

Thanks in advance,
JeeBee.



Re: [R] case insensitive regular expression

2006-06-05 Thread JeeBee
Found the problem already.
Without the 'with' it works, so I know where to look for the problem now.

Thanks anyways,
JeeBee.

On Mon, 05 Jun 2006 15:16:34 +0200, JeeBee wrote:

 Can anybody fix the following, to match a string
 disregarding case?
 
 I tried: 
 with(my.data.frame, regexpr(pattern = '(?i)J\\. Test', 
   text = TARGET_COLUMN, perl = TRUE) != -1)
 
 R yields: invalid regular expression
 
 with(my.data.frame, regexpr(pattern = 'J\\. Test', 
   text = TARGET_COLUMN, perl = TRUE) != -1)
 works, but it does not match "J. TEST".
 It does match "J. Test" correctly.
 
 I have the feeling I'm not reading the documentation
 good enough, but cannot seem to figure this out, sorry.
 
 Thanks in advance,
 JeeBee.



Re: [R] Status of data.table package

2006-06-05 Thread Roger Bivand
On Mon, 5 Jun 2006, Liaw, Andy wrote:

 It was announced on the R-devel list:
 http://tolstoy.newcastle.edu.au/R/devel/06/04/4923.html
 
 I don't see it on CRAN, either, nor could I find mention of it in the R News
 you cited.

In the archive:

http://cran.r-project.org/src/contrib/Archive/D/data.table_1.0.tar.gz

Roger

 
 Andy
 
 From: Eric Rexstad
  
  List:
  
  This package was described in R News of May 2006.  However, I 
  cannot find it on CRAN mirrors, and use of RSiteSearch is 
  unsatisfying.  I am likewise unable to find an email address 
  for the maintainer of the package.  Thank you in advance for 
  assistance.
  
  --
  Eric Rexstad
  Research Unit for Wildlife Population Assessment Centre for 
  Research into Ecological and Environmental Modelling 
  University of St. Andrews St. Andrews Scotland KY16 9LZ
  +44 (0)1334 461833
  
  
 
 
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]



Re: [R] negative binomial: expected number of events?

2006-06-05 Thread Ritwik Sinha
Hi,

This page should answer your questions.

http://mathworld.wolfram.com/NegativeBinomialDistribution.html
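A quick numerical check with R's dnbinom() in its (size, mu) parameterisation: the mean depends on mu alone, while size enters only the variance, mu + mu^2/size:

```r
mu <- 4
sapply(c(0.5, 2, 10), function(size)
  sum(0:5000 * dnbinom(0:5000, size = size, mu = mu)))
# all three sums are ~4, whatever 'size' is
```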

Ritwik Sinha
http://darwin.cwru.edu

On 6/5/06, Werner Tell [EMAIL PROTECTED] wrote:

 Hi

 I'm fitting poisson and negative binomial distributions to event data. I'm
 interested in the expected number of events occuring in a time period. For
 the poisson this is determined by the parameter lambda only.
 For the neg. binomial, is the expected number of events determined by the
 parameter mu only or does parameter size influence the first moment as
 well?

 thank you,
 wt

 _
 Die MSN Homepage liefert Ihnen alle Infos zu Ihren Lieblingsthemen.






Re: [R] Status of data.table package

2006-06-05 Thread Bjørn-Helge Mevik
Liaw, Andy wrote:

 I don't see it on CRAN, either, nor could I find mention of it in the R News
 you cited.

p. 66

-- 
Bjørn-Helge Mevik



Re: [R] negative binomial: expected number of events?

2006-06-05 Thread Ritwik Sinha
Hi,

The wikipedia page http://en.wikipedia.org/wiki/Negative_binomial also does
a great job of explaining the negative binomial distribution.

-- 
Ritwik Sinha
Graduate Student
Epidemiology and Biostatistics
Case Western Reserve University

http://darwin.cwru.edu/~rsinha




Re: [R] How to call a value labels attribute?

2006-06-05 Thread Richard M. Heiberger
Aha!  Thank you for the more detailed example.

My solution for that situation is an attribute position and function
as.position().  I use this in my book

Statistical Analysis and Data Display
Richard M. Heiberger and Burt Holland

The online files for the book are available at
http://springeronline.com/0-387-40270-5



For this example, you need the function as.position() included in this
email.


### example 
x <- ordered(c(1,2,3,2,4,3,1,2,4,3,2,1,3),
 labels=c("small", "medium", "large", "very.large"))
x
attr(x, "position") <- c(1,2,4,8)
x
as.position(x)

y <- rnorm(length(x))
y

xyplot(y ~ x)
source("~/h2/library/code/as.position.s")
xyplot(y ~ as.position(x))
xyplot(y ~ as.position(x),
   scale=list(x=list(at=attr(x,"position"), labels=levels(x))))
xyplot(y ~ as.position(x),
   scale=list(x=list(at=attr(x,"position"), labels=levels(x))),
   xlab="x")
### end example 



### as.position.s #
as.position <- function(x) {
  if (is.numeric(x))
    x
  else {
    if (!is.factor(x)) stop("x must be either numeric or factor.")

    if (!is.null(attr(x, "position")))
      x <- attr(x, "position")[x]
    else {
      lev.x <- levels(x)
      if (inherits(x, "ordered")) {
        on.exit(options(old.warn))
        old.warn <- options(warn=-1)
        if (!any(is.na(as.numeric(lev.x))))
          x <- as.numeric(lev.x)[x]
        else
          x <- as.numeric(ordered(lev.x, lev.x))[x]
      }
      else
        x <- as.numeric(x)
    }
  }
  x
}


## tmp <- ordered(c("c","b","f","f","c","b"), c("c","b","f"))
## as.numeric(tmp)
## as.position(tmp)
## 
## tmp <- factor(c("c","b","f","f","c","b"))
## as.numeric(tmp)
## as.position(tmp)
## 
## tmp <- factor(c(1,3,5,3,5,1))
## as.numeric(tmp)
## as.position(tmp)
## 
## tmp <- ordered(c(1,3,5,3,5,1))
## as.numeric(tmp)
## as.position(tmp)
## 
## tmp <- c(1,3,5,3,5,1)
## as.numeric(tmp)
## as.position(tmp)

### end as.position.s #



[R] How to compute Commutation Matrix With R

2006-06-05 Thread justin bem
Hi 
Is there as specific function to compute a commutation matrix ?
 
Sincerely!
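Not that I know of a base function; a direct construction is short, assuming the usual definition K %*% vec(A) == vec(t(A)) for an m x n matrix A (names here are illustrative):

```r
commutation <- function(m, n) {
  K <- matrix(0, m * n, m * n)
  for (i in 1:m) for (j in 1:n)
    K[(i - 1) * n + j, (j - 1) * m + i] <- 1  # sends vec(A) to vec(t(A))
  K
}
A <- matrix(1:6, 2, 3)
all(commutation(2, 3) %*% as.vector(A) == as.vector(t(A)))  # TRUE
```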



[R] Problem with Sweave

2006-06-05 Thread David Hajage
I was trying to use Sweave function.

Do you know why the following code in my .Rnw file:

<<results=tex, echo=FALSE>>=
print(xtable(stat.table(list("Couverture du livret" = couv),
  list(n = count(), "\\%" = percent(couv)))), floating = F)
@
\\

NA = \Sexpr{table(is.na(couv))[2]}

gives me this result in my .tex file :

% latex table generated in R 2.3.1 by xtable 1.3-2 package
% Mon Jun  5 13:24:15 2006
\begin{tabular}{rrr}
\hline
 & n & \% \\
\hline
Plutot negatif & 8.00 & 21.62 \\
Plutot positif & 29.00 & 78.38 \\
\hline
\end{tabular}\\

NA = \Sexpr{table( is.na(couv))[2]}

and not :

NA = 26

(where 26 is the result of table(is.na(couv))[2])

?

I don't understand !

-- 
David




Re: [R] merge function in R

2006-06-05 Thread Don MacQueen
(See insertions below)

At 8:49 PM -0300 6/2/06, Juan Santiago Ramseyer wrote:
On Thu, 2006-06-01 at 05:59 -0700, Ahamarshan jn wrote:
  hi list,

  This question must be very basic but I am just 3 days
  old to R. I am trying to find a function to merge two
  tables of data in two different files as one.

  Does the merge function only fill in columns between two
  tables where data is missing, or is there a way that
  merge can be used to merge data between two matrices
  with common dimensions?

  say i have

 v1 v2 v3 v4  
  h1
  h2
  h3
  h4
  h5

  and and another table with

 x1 x2 x3 x4
  h1
  h2
  h3
  h4
  h5


  can i merge the data as

 v1 x1 v2 x2 v3 x3 v4 x4
  h1
  h2
  h3
  h4
  h5

  Thanks

hi

try the following sequence

# matrix examples
x <- seq(from = -1, to = -20, by = -1)   # note: seq() uses 'by', not 'step'
v <- 1:20
dim(x) <- c(5, 4)
dim(v) <- c(5, 4)

vx <- rep(NA, times = 40)
dim(vx) <- c(5, 8)

# answer init
for (i in 1:4) {
   vx[, 2*i - 1] <- v[, i]
   vx[, 2*i] <- x[, i]
}
# answer end

The loop is more complicated than necessary.

vx2 <- matrix(numeric(40), ncol = 8)
vx2[, c(1,3,5,7)] <- v
vx2[, c(2,4,6,8)] <- x

## or more generally
vx2[, seq(1, by = 2, len = 4)] <- v
vx2[, seq(2, by = 2, len = 4)] <- x

## if the order of columns did not matter then the simplest is
vx3 <- cbind(v, x)
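A compact one-liner variant (my own sketch, not from the reply above): bind the two matrices and then reorder the columns so they interleave as v1, x1, v2, x2, and so on.

```r
## Interleave the columns of two 5 x 4 matrices (toy data as in the thread)
v <- matrix(1:20, 5, 4)
x <- matrix(-(1:20), 5, 4)
## order(rep(1:4, 2)) gives column indices 1,5,2,6,3,7,4,8 into cbind(v, x)
vx4 <- cbind(v, x)[, order(rep(1:4, 2))]
vx4[1, ]   # first row interleaves: 1 -1 6 -6 11 -11 16 -16
```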

-Don

Juan Santiago Ramseyer
Eng. Recursos Hídricos





-- 
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA



[R] anova for the objects of list

2006-06-05 Thread Hai Lin
Dear R experts,

I have two list objects from running an lme model for each
subject: my.x is the full model with all the terms I
am interested in, and my.x1 is the null model with only an
intercept.

I am trying to compare two models at each subject
level. (Doing anova(my.x[[1]], my.x1[[1]]) is to
produce stats for the first subject). My goal is to
obtain stats. for an entire set with hundreds of
subjects. I did something like this, lapply(my.Tem,
function(x)anova(x)), but my.Tem actually is not a
list after combining. It did not work the way as I
expected.

I am very thankful if anyone here could point me out.

Kevin


5 subjects extracted as my pilot data.
 > class(my.x)
[1] "list"
 > length(my.x)
[1] 5
 > class(my.x1)
[1] "list"
 > length(my.x1)
[1] 5


my.Tem <- cbind(my.x, my.x1)
 > class(my.Tem)
[1] "matrix"
 > my.Tem
   my.x    my.x1
s1 List,15 List,15
s2 List,15 List,15
s3 List,15 List,15
s4 List,15 List,15
s5 List,15 List,15



[R] Survey - twophase

2006-06-05 Thread Mark Hempelmann
Dear WizaRds,

I am struggling with the use of twophase in package survey. My goal 
is to compute a simple example in two phase sampling:

phase 1: I sample n1=1000 circuit boards and find 80 non functional
phase 2: Given the n1=1000 sample I sample n2=100 and find 15 non 
functional. Let's say, phase 2 shows this result together with phase 1:
                  phase 1
                  ok   defunct   sum
phase 2  ok       85      0       85
         defunct   5     10       15
         sum      90     10      100

That is in R:
fail <- data.frame(id = 1:1000, x = c(rep(0,920), rep(1,80)),
y = c(rep(0,985), rep(1,15)), n1 = rep(1000,1000), n2 = rep(100,1000),
N = rep(5000,1000))

des.fail <- twophase(id = list(~id, ~id), data = fail, subset = ~I(x==1))
#fpc=list(~n1,~n2)
svymean(~y, des.fail)

gives mean y 0.1875, SE 0.0196, but theoretically,
we have x.bar1 (phase1)=0.08 and y.bar2 (phase2)=0.15 defect boards.

Two phase sampling assumes some relation between the easily/ fast 
received x-information and the elaborate/ time-consuming y-information, 
say a ratio r=sum y (phase2)/ sum x (phase2)=15/10=1.5 (out of the above 
table)
Ergo, the y.ratio estimator = r*x.bar(phase1) = 1.5*0.08 = 0.12 with 
variance = (n1-n2)/n1 * s_regression^2/n2 + s_y^2/n1 = 900/1000 * 
0.0765/100 + 0.129/1000 = .00081 SE .02846
with s_regression^2 =
yk=c(rep(0,85), rep(1,15)); xk=c(rep(0,90), rep(1,10))
1/98*sum((yk-1.5*xk)^2)
and
s_yk^2 =
1/99 * sum( (yk-.15)^2)=0.1287879


I am sorry to bother you with my false calculations, but I just don't 
know how to receive the correct results. Please help. My example is 
taken from Kauermann/ Kuechenhoff 2006, p. 111f.

thank you so much
yours always

mark



Re: [R] Survey - twophase

2006-06-05 Thread Thomas Lumley
On Mon, 5 Jun 2006, Mark Hempelmann wrote:

 Dear WizaRds,

I am struggling with the use of twophase in package survey. My goal
 is to compute a simple example in two phase sampling:

 phase 1: I sample n1=1000 circuit boards and find 80 non functional
 phase 2: Given the n1=1000 sample I sample n2=100 and find 15 non
 functional. Let's say, phase 2 shows this result together with phase 1:
                   phase 1
                   ok   defunct   sum
 phase 2  ok       85      0       85
          defunct   5     10       15
          sum      90     10      100

 That is in R:
 fail <- data.frame(id = 1:1000, x = c(rep(0,920), rep(1,80)),
 y = c(rep(0,985), rep(1,15)), n1 = rep(1000,1000), n2 = rep(100,1000),
 N = rep(5000,1000))

 des.fail <- twophase(id = list(~id, ~id), data = fail, subset = ~I(x==1))
 #fpc=list(~n1,~n2)

The second-phase sample is described by subset=~I(x==1), so you have 
sampled only 80 in phase two, not 100.

 svymean(~y, des.fail)

 gives mean y 0.1875, SE 0.0196, but theoretically,
 we have x.bar1 (phase1)=0.08 and y.bar2 (phase2)=0.15 defect boards.

15/80=0.1875

 Two phase sampling assumes some relation between the easily/ fast
 received x-information and the elaborate/ time-consuming y-information,
 say a ratio r=sum y (phase2)/ sum x (phase2)=15/10=1.5 (out of the above
 table)

Not quite. Two-phase sampling is *useful* only where there is a 
relationship. No relationship is *assumed*.

There are two ways you can take advantage of a relationship. The first is 
to stratify the phase-two sampling based on phase one information.  In 
this case you need a strata= argument to twophase().

The second way to use a relationship is to calibrate phase two to phase 
one, using the calibrate() function.  This is analogous to the regression 
estimator you describe.
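For concreteness, both options might look roughly like the sketch below. The data are simulated here and all variable names (x, in2, y) are illustrative, not the poster's; this is not a verified reproduction of the example in the thread.

```r
## Sketch: stratified phase-two sampling, then calibration to phase one.
library(survey)
set.seed(1)
n1 <- 1000
x  <- rbinom(n1, 1, 0.08)                            # cheap phase-one indicator
in2 <- as.logical(rbinom(n1, 1, ifelse(x == 1, 0.9, 0.05)))  # phase-two sample
y  <- ifelse(in2, rbinom(n1, 1, 0.1 + 0.6 * x), NA)  # measured in phase two only
d  <- data.frame(id = 1:n1, x = factor(x), in2 = in2, y = y)

## (1) phase-two sampling stratified on the phase-one variable x
des <- twophase(id = list(~id, ~id), strata = list(NULL, ~x),
                subset = ~in2, data = d)
svymean(~y, des)

## (2) calibrate (post-stratify) phase two to phase one on x
des.cal <- calibrate(des, phase = 2, formula = ~x)
svymean(~y, des.cal)
```

The strata = list(NULL, ~x) argument says phase one was unstratified while phase two was stratified on x, which matches the design Thomas describes; vignette("epi") in the survey package shows the same pattern on real data.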

A good example to look at is in vignette(epi).  This describes a 
two-phase sample where about 4000 people are in the first stage (a cancer 
clinical trial) and then the second phase is sampled based on relapse and 
on disease type (histology) determined at the local hospital.
  Disease type is determined more accurately at a central lab for everyone 
who relapses, everyone whose locally-determined disease type is bad, and 
20% of the rest.

There is also an example of calibration, post-stratifying the second phase 
to the first phase on disease stage, for the same data.

Finally, note that twophase() does not use the unbiased estimator of 
variance. It uses a modification that is easier to compute for cluster 
samples, as described in vignette(phase1).  There is no difference if 
the first phase is sampled from an infinite population (or with 
replacement), which is the case in vignette(epi).


-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle



Re: [R] Multivariate skew-t cdf

2006-06-05 Thread Spencer Graves
  You want to evaluate skewed t probabilities in how many dimensions? 
If 27 is your maximum, the problem won't be as difficult as if 27 is 
your minimum.  Also, do you want to compute the multivariate cumulative 
probability function for arbitrary points, location, covariance, shape 
and degrees of freedom?  Or are you really only interested in certain 
special case(s)?  If you have a simpler special case, it might be easier 
to get a solution.

  I was able to replicate your result, even when I reduced the 
dimensionality down to 20;  with 19 dimensions, the function seemed to 
return a sensible answer.  If it were my problem, I might first make a 
local copy of the pmst function and modify it to use the mvtnorm package 
in place of mnormt.  That might get you answers with somewhat higher 
dimensionality, though it might not be adequate -- and I wouldn't trust 
the numbers I got without serious independent testing.  I'd try to think 
how I could modify the skewness so I could check the accuracy that way.

  Have you studied the reference mentioned in the dmst help page, and 
reviewed some of their sources?  Computing multivariate probabilities 
like this is still a research project, I believe.  In this regard, I 
found the following two books helpful:

  * Evans and Schwarz (2000) Approximating Integrals via Monte Carlo 
and Deterministic Methods (Oxford)

  * Kythe and Schaeferkotter (2005) Handbook of Computational Methods 
for Integration (Chapman and Hall).

  Also, have you asked about this directly to the maintainers of the 
sn, mnormt and mvtnorm packages?  They might have other suggestions.

  Hope this helps.
  Spencer Graves
p.s.  Thanks for the self-contained example.  There seems to be a typo 
in your example:  Omega = diag(0, 27) is the matrix of all zeros, which 
produces a point mass at the center of the distribution.  I got your 
answers after changing it to diag(1, 27).

  Making the dimension a variable, I found a sharp transition between k 
= 19 and 20:

> k <- 19
> xi <- alpha <- x <- rep(0,k)
> Omega <- diag(1,k)
> (p19 <- pmst(x, xi, Omega, alpha, df = 5))
[1] 1.907349e-06
attr(,"error")
[1] 2.413537e-20
attr(,"status")
[1] "normal completion"
> k <- 20
> xi <- alpha <- x <- rep(0,k)
> Omega <- diag(1,k)
> (p20 <- pmst(x, xi, Omega, alpha, df = 5))
[1] 0
attr(,"error")
[1] 1
attr(,"status")
[1] "oversize"

Konrad Banachewicz wrote:
 Dear All,
 I am using the pmst function from the sn package (version 0.4-0). After
 inserting the example from the help page, I get non-trivial answers, so
 everything is fine. However, when I try to extend it to higher dimension:
  xi <- alpha <- x <- rep(0,27)
  Omega <- diag(0,27)
  p1 <- pmst(x, xi, Omega, alpha, df = 5)
  
  I get the following result:
  
  > p1
  [1] 0
  attr(,"error")
  [1] 1
  attr(,"status")
  [1] "oversize"
 
 So it seems like the dimension is a problem here (and not the syntax or type
 mismatch, as I inititally thought - the function is evaluated) - although I
 found no warning about it in the help page.
 
 Can anyone give me a hint as to how to work around this problem and evaluate
 the skew-t cdf in a large-dimensional space? It's pretty crucial to my
 current research. Thanks in advance,
 
 Konrad
 
   [[alternative HTML version deleted]]
 



Re: [R] Multivariate skew-t cdf

2006-06-05 Thread Adelchi Azzalini
On Mon, Jun 05, 2006 at 09:43:24AM -0700, Spencer Graves wrote:


Thanks to Spencer Graves for providing this clarification about pmst.
As the author of pmst, I was really the one expected to answer the
query, but I was travelling in the last two weeks and did not read
this query.

As a complement to what was already explained, the reason for the sharp
change from k=19 to k=20 is as follows: pmst (package sn) calls pmt (package
mnormt) with k increased by 1, and pmt calls a Fortran routine (written by
Alan Genz), which issues an error if k > 20 or k < 1. Hence the effective
maximum number of dimensions allowed by pmst is 19. 

best wishes,

Adelchi Azzalini



 You want to evaluate skewed t probabilities in how many dimensions? 
 If 27 is your maximum, the problem won't be as difficult as if 27 is 
 your minimum.  Also, do you want to compute the multivariate cumulative 
 probability function for arbitrary points, location, covariance, shape 
 and degrees of freedom?  Or are you really only interested in certain 
 special case(s)?  If you have a simpler special case, it might be easier 
 to get a solution.
 
 I was able to replicate your result, even when I reduced the 
 dimensionality down to 20;  with 19 dimensions, the function seemed to 
 return a sensible answer.  If it were my problem, I might first make a 
 local copy of the pmst function and modify it to use the mvtnorm package 
 in place of mnormt.  That might get you answers with somewhat higher 
 dimensionality, though it might not be adequate -- and I wouldn't trust 
 the numbers I got without serious independent testing.  I'd try to think 
 how I could modify the skewness so I could check the accuracy that way.
 
 Have you studied the reference mentioned in the dmst help page, and 
 reviewed some of their sources?  Computing multivariate probabilities 
 like this is still a research project, I believe.  In this regard, I 
 found the following two books helpful:
 
 * Evans and Schwarz (2000) Approximating Integrals via Monte Carlo 
 and Deterministic Methods (Oxford)
 
 * Kythe and Schaeferkotter (2005) Handbook of Computational Methods 
 for Integration (Chapman and Hall).
 
 Also, have you asked about this directly to the maintainers of the 
 sn, mnormt and mvtnorm packages?  They might have other suggestions.
 
 Hope this helps.
 Spencer Graves
 p.s.  Thanks for the self-contained example.  There seems to be a typo 
 in your example:  Omega = diag(0, 27) is the matrix of all zeros, which 
 produces a point mass at the center of the distribution.  I got your 
 answers after changing it to diag(1, 27).
 
 Making the dimension a variable, I found a sharp transition between k 
 = 19 and 20:

 
  > k <- 19
  > xi <- alpha <- x <- rep(0,k)
  > Omega <- diag(1,k)
  > (p19 <- pmst(x, xi, Omega, alpha, df = 5))
 [1] 1.907349e-06
 attr(,"error")
 [1] 2.413537e-20
 attr(,"status")
 [1] "normal completion"
  > k <- 20
  > xi <- alpha <- x <- rep(0,k)
  > Omega <- diag(1,k)
  > (p20 <- pmst(x, xi, Omega, alpha, df = 5))
 [1] 0
 attr(,"error")
 [1] 1
 attr(,"status")
 [1] "oversize"
 
 Konrad Banachewicz wrote:
  Dear All,
  I am using the pmst function from the sn package (version 0.4-0). After
  inserting the example from the help page, I get non-trivial answers, so
  everything is fine. However, when I try to extend it to higher dimension:
   xi <- alpha <- x <- rep(0,27)
   Omega <- diag(0,27)
   p1 <- pmst(x, xi, Omega, alpha, df = 5)
   
   I get the following result:
   
   > p1
   [1] 0
   attr(,"error")
   [1] 1
   attr(,"status")
   [1] "oversize"
  
  So it seems like the dimension is a problem here (and not the syntax or type
  mismatch, as I inititally thought - the function is evaluated) - although I
  found no warning about it in the help page.
  
  Can anyone give me a hint as to how to work around this problem and evaluate
  the skew-t cdf in a large-dimensional space? It's pretty crucial to my
  current research. Thanks in advance,
  
  Konrad
  
  [[alternative HTML version deleted]]
  
 
 

-- 
Adelchi Azzalini  [EMAIL PROTECTED]
Dipart.Scienze Statistiche, Università di Padova, Italia
tel. +39 049 8274147,  http://azzalini.stat.unipd.it/



Re: [R] anova for the objects of list

2006-06-05 Thread Hai Lin
Dear R experts,

Sorry for the previous email. I just figured it out
myself. I would welcome other solutions.

Thanks.
Kevin

my.anova <- sapply(names(my.x),
                   function(x) {
                       anova(my.x[[x]], my.x1[[x]])
                   })
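A compact alternative (my own sketch, not from the thread): Map() pairs the two lists positionally, so no lookup by name is needed.

```r
## Illustrative stand-ins for the poster's per-subject fits
## (lm instead of lme, 3 "subjects" instead of hundreds):
my.x  <- lapply(1:3, function(i) lm(dist ~ speed, data = cars))
my.x1 <- lapply(1:3, function(i) lm(dist ~ 1, data = cars))
## Map() calls anova(my.x[[i]], my.x1[[i]]) for each i
my.anova <- Map(anova, my.x, my.x1)
length(my.anova)  # 3, one comparison per subject
```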

--- Hai Lin [EMAIL PROTECTED] wrote:

 Dear R experts,
 
 I have two list objects from runing lme model for
 each
 subjects. And my.x is the full model with all terms
 I
 am interested in and my.x1 is the null model with
 only
 intercept.
 
 I am trying to compare two models at each subject
 level. (Doing anova(my.x[[1]], my.x1[[1]]) is to
 produce stats for the first subject). My goal is to
 obtain stats. for an entire set with hundreds of
 subjects. I did something like this, lapply(my.Tem,
 function(x)anova(x)), but my.Tem actually is not a
 list after combining. It did not work the way as I
 expected.
 
 I am very thankful if anyone here could point me
 out.
 
 Kevin
 
 
 5 subjects extracted as my pilot data.
  > class(my.x)
 [1] "list"
  > length(my.x)
 [1] 5
  > class(my.x1)
 [1] "list"
  > length(my.x1)
 [1] 5
 
 
 my.Tem <- cbind(my.x, my.x1)
  > class(my.Tem)
 [1] "matrix"
  > my.Tem
    my.x    my.x1
 s1 List,15 List,15
 s2 List,15 List,15
 s3 List,15 List,15
 s4 List,15 List,15
 s5 List,15 List,15
 
 




Re: [R] Significance test, Cronbach's Alpha

2006-06-05 Thread Chris Evans
Jonathan Baron sent the following  at 05/06/2006 12:05:

... some snipped ...

 It is rare to see anyone report a test for alpha because it is
 usually used descriptively.  If it isn't .7 or higher, people get 
 upset, yet even .5 would be wildly significant in most cases.
 
 Jon

Feldt did a lot of good work on ANOVA models of alpha, well summarised
in his paper with Salih.  Here are some functions.  Please don't
ridicule my R style, I'm a psychotherapist first, researcher second, and
R enthusiast third.  Amused advice on how to write better code warmly
received.

I'm sure that jackknifing or bootstrapping would give much more
distributionally robust CIs and p values but, as Jon's point above makes
so simply, the real problem is that people don't think through what
they're looking for from an alpha.  I find there are situations in which
I'm genuinely interested in the null: is there some evidence of
covariance here?; and other situations where I want a high alpha because
I'm postulating that we have a useful measure, in the latter case, all
these totemistic values that are acceptable, excellent etc. are
often misleading and a CI around the observed alpha and some exploration
of the factor structure, EFA or CFA, or IRT model explorations, will be
far more important than exactly what alpha you got.

Oops /FLAME  not quite sure where I should have put the starter on that!

I'll go back to enjoying the fact that I think this is the first time
I've posted something that might be useful to someone!

Very best all:



Chris

feldt1.return <- function(obs.a, n, k, ci = 0.95, null.a = 0)
{
    if (obs.a < null.a)
        f <- (1 - obs.a)/(1 - null.a)
    else f <- (1 - null.a)/(1 - obs.a)
                                # allows for testing against a higher null
    n.den <- (n - 1) * (k - 1)
    n.num <- n - 1
    null.p <- pf(f, n.num, n.den)
                                # set the upper and lower p values for the desired C.I.
    p1 <- (1 - ci)/2
    p2 <- ci + p1               # corresponding F values
    f1 <- qf(p1, n.num, n.den)
    f2 <- qf(p2, n.num, n.den)  # confidence interval
    lwr <- 1 - (1 - obs.a) * f2
    upr <- 1 - (1 - obs.a) * f1
    cat(round(lwr, 2), "to", round(upr, 2), "\n")
    interval <- list(lwr, upr)
    return(interval)
}

feldt1.lwr <- function(obs.a, n, k, ci = 0.95, null.a = 0)
{
    if (obs.a < null.a)
        f <- (1 - obs.a)/(1 - null.a)
    else f <- (1 - null.a)/(1 - obs.a)
                                # allows for testing against a higher null
    n.den <- (n - 1) * (k - 1)
    n.num <- n - 1
    null.p <- pf(f, n.num, n.den)
                                # set the upper and lower p values for the desired C.I.
    p1 <- (1 - ci)/2
    p2 <- ci + p1               # corresponding F values
    f1 <- qf(p1, n.num, n.den)
    f2 <- qf(p2, n.num, n.den)  # confidence interval
    lwr <- 1 - (1 - obs.a) * f2
    return(lwr)
}

feldt1.upr <- function(obs.a, n, k, ci = 0.95, null.a = 0)
{
    if (obs.a < null.a)
        f <- (1 - obs.a)/(1 - null.a)
    else f <- (1 - null.a)/(1 - obs.a)
                                # allows for testing against a higher null
    n.den <- (n - 1) * (k - 1)
    n.num <- n - 1
    null.p <- pf(f, n.num, n.den)
                                # set the upper and lower p values for the desired C.I.
    p1 <- (1 - ci)/2
    p2 <- ci + p1               # corresponding F values
    f1 <- qf(p1, n.num, n.den)
    f2 <- qf(p2, n.num, n.den)  # confidence interval
    upr <- 1 - (1 - obs.a) * f1
    return(upr)
}



-- 
Chris Evans [EMAIL PROTECTED]
Hon. Professor of Psychotherapy, Nottingham University;
Consultant Psychiatrist in Psychotherapy, Rampton Hospital;
Research Programmes Director, Nottinghamshire NHS Trust;
Hon. SL Institute of Psychiatry, Hon. Con., Tavistock  Portman Trust
**If I am writing from one of those roles, it will be clear. Otherwise**

**my views are my own and not representative of those institutions**



[R] Editor/environment problem

2006-06-05 Thread Richard Valliant
I have encountered an odd problem in editing a function. I found no
mention of this in the archives but may have missed something. I'm using
Version 2.3.0 (2006-04-24). The problem is that after creating a
function with fix, making some typos and correcting them, my function
loses contact with everything in the search path except (apparently)
package:base. Here is an example:

> fix(tmp)
> tmp
function (x) {
    var(x)
}
> tmp(1:5)
[1] 2.5
> fix(tmp)
# Typos (,/ at end of var(x)) were intentionally introduced
# line is now: var(x),/
Error in edit(name, file, title, editor) : an error occurred on line 3
 use a command like x <- edit() to recover
> tmp <- edit()
# Typos corrected but new ones introduced
# Line is now var(x)
Error in edit(name, file, title, editor) : an error occurred on line 3
 use a command like x <- edit() to recover
> tmp <- edit()
# Typos fixed.
# List the function. Note the <environment: base> at end. Not there before.
> tmp
function (x) {
    var(x)
}
<environment: base>
> tmp(1:5)
Error in tmp(1:5) : could not find function "var"

TIA
Richard 

[[alternative HTML version deleted]]



[R] Fastest way to do HWE.exact test on 100K SNP data?

2006-06-05 Thread Anna Pluzhnikov
Hi everyone,

I'm using the function 'HWE.exact' of 'genetics' package to compute p-values of
the HWE test. My data set consists of ~600 subjects (cases and controls) typed
at ~ 10K SNP markers; the test is applied separately to cases and controls. The
genotypes are stored in a list of 'genotype' objects, all.geno, and p-values are
calculated inside the loop over all SNP markers. 

I wish to repeat this procedure multiple times (~1000) permuting the cases and
controls (affection status). It seems straightforward to implement it like this:

#

for (iter in 1:1000) {
  set.seed(iter)
# get the permuted affection status
  permut <- sample(affSt)
  for (j in 1:nSNPs) {
    test <- tapply(all.geno[[j]], permut, HWE.exact)
    pvalControls[j] <- test[["1"]]$p.value
    pvalCases[j] <- test[["2"]]$p.value
  }
}

##

The problem is that it takes ~1 min/iteration (on AMD Opteron 252 processor
running Linux). 

Is there a faster/more efficient way to do this? 
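One possible speed-up (an unbenchmarked sketch of my own, not a tested answer): HWE.exact depends only on the three genotype counts, and across 1000 permutations the same count tables recur constantly, so the p-values can be cached. The function name hwe.p.cached is illustrative; it assumes the 'genetics' package is loaded and that the genotype categories appear in a consistent hom/het/hom level order, so equal count keys imply equal p-values.

```r
## Cache HWE.exact p-values keyed by the genotype count table.
hwe.cache <- new.env(hash = TRUE)
hwe.p.cached <- function(g) {
  key <- paste(table(g), collapse = "/")   # counts in level order
  p <- hwe.cache[[key]]
  if (is.null(p)) {
    p <- HWE.exact(g)$p.value              # compute once per count table
    hwe.cache[[key]] <- p
  }
  p
}
## then inside the loop, for example:
##   pvalControls[j] <- hwe.p.cached(all.geno[[j]][permut == 1])
```

Whether this helps depends on how often count tables repeat in your data; profiling one iteration first would tell you whether HWE.exact or the list indexing dominates the minute per iteration.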

Thanks,
-- 
Anna Pluzhnikov, PhD
Section of Genetic Medicine
Department of Medicine
The University of Chicago




Re: [R] Editor/environment problem

2006-06-05 Thread Duncan Murdoch
This has been fixed in 2.3.1.  See

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/76657.html

or

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/76953.html

Duncan Murdoch



On 6/5/2006 3:19 PM, Richard Valliant wrote:
 I have encountered an odd problem in editing a function. I found no
 mention of this in the archives but may have missed something. I'm using
 Version 2.3.0 (2006-04-24). The problem is that after creating a
 function with fix, making some typos and correcting them, my function
 loses contact with everything in the search path except (apparently)
  package:base. Here is an example:
 
  > fix(tmp)
  > tmp
  function (x) {
      var(x)
  }
  > tmp(1:5)
  [1] 2.5
  > fix(tmp)
  # Typos (,/ at end of var(x)) were intentionally introduced
  # line is now: var(x),/
  Error in edit(name, file, title, editor) : an error occurred on line 3
   use a command like x <- edit() to recover
  > tmp <- edit()
  # Typos corrected but new ones introduced
  # Line is now var(x)
  Error in edit(name, file, title, editor) : an error occurred on line 3
   use a command like x <- edit() to recover
  > tmp <- edit()
  # Typos fixed.
  # List the function. Note the <environment: base> at end. Not there before.
  > tmp
  function (x) {
      var(x)
  }
  <environment: base>
  > tmp(1:5)
  Error in tmp(1:5) : could not find function "var"
  TIA
  Richard 
 
   [[alternative HTML version deleted]]
 



[R] Constrained regression

2006-06-05 Thread Daniel Bogod
Hi,
I would like to run a constrained OLS, with the following constraints:
1. sum of coefficients must be 1
2. all the coefficients have to be positive.

Is there an eas way to specify that in R

Thank you,
 Daniel Bogod
[EMAIL PROTECTED]

[[alternative HTML version deleted]]

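For what it's worth, one common route (a sketch of my own, not from this thread) is to pose the constrained least-squares problem as a quadratic program; this assumes the quadprog package is installed, and the data here are simulated purely for illustration.

```r
## Minimize ||y - X b||^2 subject to sum(b) = 1 and b >= 0,
## solved with quadprog::solve.QP.
library(quadprog)
set.seed(1)
X <- matrix(rnorm(100 * 3), 100, 3)                  # toy predictors
y <- X %*% c(0.2, 0.5, 0.3) + rnorm(100, sd = 0.1)   # toy response
Dmat <- crossprod(X)               # X'X
dvec <- drop(crossprod(X, y))      # X'y
Amat <- cbind(1, diag(3))          # column 1: sum(b) = 1; columns 2-4: b >= 0
bvec <- c(1, rep(0, 3))
fit  <- solve.QP(Dmat, dvec, Amat, bvec, meq = 1)    # meq = 1: first row is equality
round(fit$solution, 2)             # close to c(0.2, 0.5, 0.3)
```

solve.QP treats the first meq columns of Amat as equality constraints and the remainder as inequalities A'b >= bvec, which maps directly onto the two conditions in the question.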


Re: [R] Status of data.table package

2006-06-05 Thread Uwe Ligges
Roger Bivand wrote:
 On Mon, 5 Jun 2006, Liaw, Andy wrote:
 
 It was announced on the R-devel list:
 http://tolstoy.newcastle.edu.au/R/devel/06/04/4923.html

 I don't see it on CRAN, either, nor could I find mention of it in the R News
 you cited.
 
 In the archive:
 
 http://cran.r-project.org/src/contrib/Archive/D/data.table_1.0.tar.gz

AFAIR, the author (I am CCing) decided to remove it from CRAN's main 
repository because R itself has improved in the areas the package tries 
to address, so the package is no longer that beneficial in newer versions of R.

Uwe Ligges


 Roger
 
 Andy

 From: Eric Rexstad
 List:

 This package was described in R News of May 2006.  However, I 
 cannot find it on CRAN mirrors, and use of RSiteSearch is 
 unsatisfying.  I am likewise unable to find an email address 
 for the maintainer of the package.  Thank you in advance for 
 assistance.

 --
 Eric Rexstad
 Research Unit for Wildlife Population Assessment Centre for 
 Research into Ecological and Environmental Modelling 
 University of St. Andrews St. Andrews Scotland KY16 9LZ
 +44 (0)1334 461833








[R] simpel POSIXlt question

2006-06-05 Thread Omar Lakkis
How can I access, either to get or set, the "sec", "min", "hour",
... fields of a POSIXlt value, as in this example?

> t <- as.POSIXlt("2004-11-01")
> t
[1] "2004-11-01"
> unclass(t)
$sec
[1] 0

$min
[1] 0

$hour
[1] 0

$mday
[1] 1

$mon
[1] 10

$year
[1] 104

$wday
[1] 1

$yday
[1] 305

$isdst
[1] -1




Re: [R] Constrained regression

2006-06-05 Thread Berton Gunter
If you haven't already done so, please make use of R's search capabilities
before posting.

help.search('constrained regression')
RSiteSearch('constrained regression') ## also available through CRAN's
search functionality.

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
The business of the statistician is to catalyze the scientific learning
process.  - George E. P. Box
 
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Daniel Bogod
 Sent: Monday, June 05, 2006 1:01 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Constrained regression
 
 Hi,
 I would like to run a constrained OLS, with the following constraints:
 1. sum of coefficients must be 1
 2. all the coefficients have to be positive.
 
 Is there an eas way to specify that in R
 
 Thank you,
  Daniel Bogod
 [EMAIL PROTECTED]
 
   [[alternative HTML version deleted]]
 




Re: [R] simpel POSIXlt question

2006-06-05 Thread Prof Brian Ripley
On Mon, 5 Jun 2006, Omar Lakkis wrote:

 How can I access, either to get or set, the "sec", "min", "hour",
 ... fields of a POSIXlt value, as in this example?

Just as for any other list:

> t <- as.POSIXlt("2004-11-01")
> t$sec
[1] 0
> t$sec <- 30
> t
[1] "2004-11-01 00:00:30"

and via [["sec"]] if you prefer.

[ ... ]

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



[R] CCA Plot

2006-06-05 Thread Virgil Hawkes
I am a beginner with R and have managed to use vegan to run a CCA on 
my data. I am evaluating the habitat relationships of select species 
in riparian areas around creeks before and after logging across three 
sampling intervals (pre-logging, two years post-logging, and 10 
years post-logging). I sampled from 18 sites of 3 treatments (n = 6 
sites for each treatment).  When I plot my pre-treatment data, I am 
plotting data from all 18 sites and have selected 12 species and 
boiled my habitat variables down to 10.  The plot shows all 18 sites, 
all 12 species, and all 10 habitat variables - this is good. However, 
when I do the CCA on a sub-set of my data, (e.g., Control sites only 
[n = 6]) the plot displays all 6 sites, all 12 species, but only 5 of 
the 10 habitat variables.  Any ideas why the tri-plot of the sub-set 
of my data will not show all of the habitat variables? Am I 
encountering a minimum sample size problem? How can I plot all 
environmental variables?

Thanks in advance,

Virgil Hawkes 
[[alternative HTML version deleted]]



[R] Selective Survival Statistics with R

2006-06-05 Thread Gregory Pierce
Hello friends and fellow R users,

I have a problem to which I have been unable to find a solution: I am
gathering survival data on patients undergoing treatment with a new kind
of stent. I want to generate survival data and plot survival curves of
these patients based (among other things) on the treating physician. My
data set has been tabulated in the following manner:

Date(the date the stent was implanted)   
Description (diameter of the stent)
Patient (name)
MRN (ID number)
Age (age in years of the patient)
Physician (last name of the implanting physician)
status (0 or 1)
days (days alive since the stenting procedure)
Cr 
INR
BR
MELD

The data set has over ten physicians. Following the examples in
"Introductory Statistics with R" I have been able to draw cumulative
survival curves and even the survival curves by physician but there are
so many curves on the graph when I do, it is uninterpretable.

I would just like to plot the survival curves of say the top two or
three. That would be much more useful. I have been able to create a
Surv object out of my own survival data and that of a colleague in the
following way using indexing:

Surv(days[viatorr[6]=="Pierce"|
viatorr[6]=="Ed"],status[viatorr[6]=="Pierce"|viatorr[6]=="Ed"]==1)
[1] 558+ 474+ 472+ 446+ 443+ 429+ 401  395+ 390  390+ 362+ 354  344+ 342
326+
[16] 300+ 284+ 280+ 246+ 237+ 233+ 233  230+ 230+ 225+ 215  199+ 191+
191  184+
[31] 161+ 153  150  129+  69+  52+  38+

I can get even further:

surv.ee.gp <- Surv(days[viatorr[6]=="Pierce"|
viatorr[6]=="Ed"],status[viatorr[6]=="Pierce"|viatorr[6]=="Ed"]==1)
> survfit(surv.ee.gp)
Call: survfit(formula = surv.ee.gp)

  n  events  median 0.95LCL 0.95UCL
 37   9 Inf 390 Inf

But now if I want to plot the data using the following command

plot(survfit(surv.ee.gp)~Physician)

I receive an error:

Error in model.frame(formula, rownames, variables, varnames, extras,
extranames,  :
invalid variable type

I have tried indexing Physician, but that fails as well:

> plot(survfit(surv.ee.gp)~Physician[viatorr[6]=="Pierce"|
viatorr[6]=="Ed"]==1)
Error in model.frame(formula, rownames, variables, varnames, extras,
extranames,  :
invalid variable type

I apologize for this long-winded explanation but I am new to this
program and even newer to Statistics and I wanted to make sure the
context and problem were clear (hopefully).

I would appreciate any guidance in a way forward. 

Thank you,
Greg

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Selective Survival Statistics with R

2006-06-05 Thread Gregory Pierce
On Mon, 2006-06-05 at 18:54 -0400, Barker, Chris [SCIUS] wrote:
 Its probably easiest/fastest for you either to subset your dataset
 first, or else simply use the subset option in survfit()
 
  e.g. 
 
 survfit( ) has a subset option,  
 survfit( Surv( , ) ~ physician , subset = physician == 'Jones')
 
Chris,

Thank you very much for your kind reply. Using subset worked. I had to
modify the syntax a little from what you posted:

survfit(Surv(days,status==1)~Physician, subset(viatorr, viatorr[6]=="Pierce"))

where viatorr is the name of my data set.
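
The same mechanics on a toy data frame (column names invented here, just to show what subset() returns):

```r
# Invented toy data, not the real viatorr set.
df <- data.frame(days = c(10, 20, 30, 40),
                 status = c(1, 0, 1, 0),
                 Physician = c("Pierce", "Ed", "Pierce", "Jones"))
one.doc <- subset(df, Physician == "Pierce")
nrow(one.doc)   # 2: only that physician's rows survive
```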

Applying plot to the function above generated a survival curve for my
patients. I was also able to plot survival curves for other physicians
as well. This is great!

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Selective Survival Statistics with R

2006-06-05 Thread Thomas Lumley
On Mon, 5 Jun 2006, Gregory Pierce wrote:
 On Mon, 2006-06-05 at 18:54 -0400, Barker, Chris [SCIUS] wrote:
 Its probably easiest/fastest for you either to subset your dataset
 first, or else simply use the subset option in survfit()

  e.g.

 survfit( ) has a subset option,
 survfit( Surv( , ) ~ physician , subset = physician == 'Jones')

 Chris,

 Thank you very much for your kind reply. Using subset worked. I had to
 modify the syntax a little from what you posted:

 survfit(Surv(days,status==1)~Physician, subset(viatorr, viatorr[6]=="Pierce"))

 where viatorr is the name of my data set.

An example of another approach is given in the example on the help page 
for survfit

You could do
   curves <- survfit(Surv(days,status==1)~Physician, data=viatorr)
   plot(curves[1])
   plot(curves[2])
etc.
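
On made-up data (survival is a recommended package shipped with R; all column names below are invented) the subscripting looks like:

```r
library(survival)
set.seed(1)
d <- data.frame(days = round(rexp(40, 1/200)),  # invented survival times
                status = rbinom(40, 1, 0.5),
                Physician = rep(c("A", "B"), each = 20))
curves <- survfit(Surv(days, status == 1) ~ Physician, data = d)
plot(curves[1])   # stratum for Physician A only
plot(curves[2])   # stratum for Physician B only
```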

-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Selective Survival Statistics with R

2006-06-05 Thread Gregory Pierce
On Mon, 2006-06-05 at 17:04 -0700, Thomas Lumley wrote:
 On Mon, 5 Jun 2006, Gregory Pierce wrote:
  On Mon, 2006-06-05 at 18:54 -0400, Barker, Chris [SCIUS] wrote:
  Its probably easiest/fastest for you either to subset your dataset
  first, or else simply use the subset option in survfit()
 
   e.g.
 
  survfit( ) has a subset option,
  survfit( Surv( , ) ~ physician , subset = physician == 'Jones')
 
  Chris,
 
  Thank you very much for your kind reply. Using subset worked. I had to
  modify the syntax a little from what you posted:
 
  survfit(Surv(days,status==1)~Physician, subset(viatorr, viatorr[6]=="Pierce"))
 
  where viatorr is the name of my data set.
 
 An example of another approach is given in the example on the help page 
 for survfit
 
 You could do
curves <- survfit(Surv(days,status==1)~Physician, data=viatorr)
plot(curves[1])
plot(curves[2])
 etc.
 
   -thomas

That also worked great! Thanks. This makes me very, very happy! I
can now go to our statistician without looking like a complete idiot and
I can get to work collecting survival data on our patients who had the
old-fashioned bare-metal Wallstent. Looking back in this way provides a
very interesting and humbling perspective.

Greg

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] time series clustering

2006-06-05 Thread Spencer Graves
  I know of no software for time series clustering in R.  Google 
produced some interesting hits for "time series clustering".  If you 
find an algorithm you like, the author might have software. 
Alternatively, the algorithm might be a modification of something 
already available in R.  If that's the case, you wouldn't need to start 
from scratch to program something since R is open source.

  hope this helps.
  Spencer Graves

Weiwei Shi wrote:
 Dear Listers:
 
 I happened to have a problem requiring time-series clustering since the
 clusters will change with time (too old data need to be removed from data
 while new data comes in). I am wondering if there is some paper or reference
 on this topic and there is some kind of implementation in R?
 
 Thanks,
 
 Weiwei


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] error bars in lattice xyplot *with groups*

2006-06-05 Thread Mike Lawrence
Hi all,

I'm trying to plot error bars in a lattice plot generated with xyplot. Deepayan
Sarkar has provided a very useful solution for simple circumstances
(https://stat.ethz.ch/pipermail/r-help/2005-October/081571.html), yet I am
having trouble getting it to work when the groups setting is enabled in
xyplot (i.e. multiple lines). To illustrate this, consider the singer data
generated by the above linked solution previously submitted:

#
library(lattice)
singer.split <-
    with(singer,
         split(height, voice.part))

singer.ucl <-
    sapply(singer.split,
           function(x) {
               st <- boxplot.stats(x)
               c(st$stats[3], st$conf)
           })

singer.ucl <- as.data.frame(t(singer.ucl))
names(singer.ucl) <- c("median", "lower", "upper")
singer.ucl$voice.part <-
    factor(rownames(singer.ucl),
           levels = rownames(singer.ucl))

#now let's split up the voice.part factor into two factors,
singer.ucl$voice=factor(rep(c(1,2),4))
singer.ucl$range=factor(rep(c("Bass","Tenor","Alto","Soprano"),each=2))

#here's Deepayan's previous solution, slightly modified to depict
#  the dependent variable (median) and the error bars on the y-axis
#  and the independent variable (voice.part) on the x-axis
prepanel.ci <- function(x, y, ly, uy, subscripts, ...)
{
    x <- as.numeric(x)
    ly <- as.numeric(ly[subscripts])
    uy <- as.numeric(uy[subscripts])
    list(ylim = range(y, uy, ly, finite = TRUE))
}
panel.ci <- function(x, y, ly, uy, subscripts, pch = 16, ...)
{
    x <- as.numeric(x)
    y <- as.numeric(y)
    ly <- as.numeric(ly[subscripts])
    uy <- as.numeric(uy[subscripts])
    panel.arrows(x, ly, x, uy, col = "black",
                 length = 0.25, unit = "native",
                 angle = 90, code = 3)
    panel.xyplot(x, y, pch = pch, ...)
}


#this graph works
xyplot(median ~ voice.part,
data=singer.ucl,
ly = singer.ucl$lower,
uy = singer.ucl$upper,
prepanel = prepanel.ci,
panel = panel.ci,
type="b"
)

#this one does not (it will plot, but will not separate the groups)
xyplot(median ~ voice,
groups=range,
data=singer.ucl,
ly = singer.ucl$lower,
uy = singer.ucl$upper,
prepanel = prepanel.ci,
panel = panel.ci,
type="b"
)



Any suggestions?

-- 

Mike Lawrence

[EMAIL PROTECTED]

The road to Wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less.
- Piet Hein

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Fastest way to do HWE.exact test on 100K SNP data?

2006-06-05 Thread Martin Morgan
Anna --

This might be a little faster ...

all.geno.df <- data.frame(all.geno) # exploit shared structure
set.seed(123)                       # once only
for (iter in 1:1000) {
    permut <- sample(affSt)
    control <- all.geno.df[permut==1,]
    case <- all.geno.df[permut==2,]
    pvalControls <- unlist(sapply(control, HWE.exact)["p.value",])
    pvalCases <- unlist(sapply(case, HWE.exact)["p.value",])
    # presumably do something with these, before next iteration
}

but probably most of the time is being spent in HWE.exact so these
cosmetic changes won't help much.

Looking at the code for HWE.exact, it looks like it generates a table
based on numbers of alleles, and then uses this table to determine the
probability. Any time two SNPs have the same number of alleles, the
same table gets generated. It seems like this will happen a lot. So one
strategy is to only calculate the table once, 'remember' it, and then
the next time a value is needed from that table look it up rather than
calculate it.

This is a bad strategy if most SNPs have different numbers of alleles
(because then the tables get calculated, and a lot of space and some
time is used to store results that are never accessed again).
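
Stripped of the HWE details, the "remember" strategy is plain memoization, which can be sketched with an environment-backed closure (a generic illustration, not part of the genetics package):

```r
# Cache results of an expensive one-argument function in an
# environment, keyed by the argument's string form.
memoize <- function(f) {
    cache <- new.env(hash = TRUE)
    function(x) {
        key <- paste(x, collapse = ".")
        if (!exists(key, envir = cache))
            assign(key, f(x), envir = cache)   # compute once, 'remember'
        get(key, envir = cache)                # reuse on repeat calls
    }
}

square.once <- memoize(function(x) x^2)
square.once(12)   # computed: 144
square.once(12)   # second call hits the cache
```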

Here is the code (not tested extensively, so please use with care!):

HWE.exact.lookup <- function() {
    HWE.lookup.tbl <- new.env(hash=TRUE)
    dhwe2 <- function(n11, n12, n22) {
        ## from body of HWE.exact
        f <- function(x) lgamma(x + 1)
        n <- n11 + n12 + n22
        n1 <- 2 * n11 + n12
        n2 <- 2 * n22 + n12
        exp(log(2) * (n12) + f(n) - f(n11) - f(n12) - f(n22) -
            f(2 * n) + f(n1) + f(n2))
    }
    function (x) {
        ## adopted from HWE.exact
        if (!is.genotype(x))
            stop("x must be of class 'genotype' or 'haplotype'")
        nallele <- length(na.omit(allele.names(x)))
        if (nallele != 2)
            stop("Exact HWE test can only be computed for 2 markers with 2 alleles")
        allele.tab <- table(factor(allele(x, 1), levels = allele.names(x)),
                            factor(allele(x, 2), levels = allele.names(x)))
        n11 <- allele.tab[1, 1]
        n12 <- allele.tab[1, 2] + allele.tab[2, 1]
        n22 <- allele.tab[2, 2]
        n1 <- 2 * n11 + n12
        n2 <- 2 * n22 + n12

        nm <- paste(n1, n2, sep=".")
        if (!exists(nm, envir=HWE.lookup.tbl)) { # 'remember'
            x12 <- seq(n1%%2, min(n1, n2), 2)
            x11 <- (n1 - x12)/2
            x22 <- (n2 - x12)/2
            dist <- data.frame(n11 = x11, n12 = x12, n22 = x22,
                               density = dhwe2(x11, x12, x22))
            dist <- dist[order(dist$density),]
            dist$density <- cumsum(dist$density)
            assign(nm, dist, envir=HWE.lookup.tbl)
        }

        dist <- get(nm, HWE.lookup.tbl)
        dist$density[dist$n11 == n11 & dist$n12 == n12 & dist$n22 == n22]
    }
}

calcHWE <- HWE.exact.lookup()       # create a new lookup table & function
all.geno.df <- data.frame(all.geno) # exploit shared structure
set.seed(123)                       # once only
for (iter in 1:1000) {
    permut <- sample(affSt)
    control <- all.geno.df[permut==1,]
    case <- all.geno.df[permut==2,]
    pvalControls <- sapply(control, calcHWE)
    pvalCases <- sapply(case, calcHWE)
}

Let me know how it goes if you adopt this approach!

Martin

Anna Pluzhnikov [EMAIL PROTECTED] writes:

 Hi everyone,

 I'm using the function 'HWE.exact' of 'genetics' package to compute p-values 
 of
 the HWE test. My data set consists of ~600 subjects (cases and controls) typed
 at ~ 10K SNP markers; the test is applied separately to cases and controls. 
 The
 genotypes are stored in a list of 'genotype' objects, all.geno, and p-values 
 are
 calculated inside the loop over all SNP markers. 

 I wish to repeat this procedure multiple times (~1000) permuting the cases and
 controls (affection status). It seems straightforward to implement it like 
 this:

 #

 for (iter in 1:1000) {
   set.seed(iter)
 # get the permuted affection status
   permut <- sample(affSt)
   for (j in 1:nSNPs) {
 test <- tapply(all.geno[[j]], permut, HWE.exact)
 pvalControls[j] <- test$"1"$p.value
 pvalCases[j] <- test$"2"$p.value
   }
 }
 }

 ##

 The problem is that it takes ~1 min/iteration (on AMD Opteron 252 processor
 running Linux). 

 Is there a faster/more efficient way to do this? 

 Thanks,
 -- 
 Anna Pluzhnikov, PhD
 Section of Genetic Medicine
 Department of Medicine
 The University of Chicago


 -
 This email is intended only for the use of the individual or...{{dropped}}

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing 

Re: [R] Fastest way to do HWE.exact test on 100K SNP data?

2006-06-05 Thread Ritwik Sinha

Hi Anna,

I am not really answering your question, but, here goes my (unsolicited)
suggestion. Some time back I did some simulations to see how the HWE tests
performed. In particular I compared the exact test and the chi squared test.
Look at the attached figure Rplots.ps. I saw that for null simulations with
40 individuals, the HWE chi squared test was reasonably close to the
(expected) uniform distribution. However this was obtained using the
function HWE.chisq(X, simulate.p.value=F), the default seemed to have some
issues. Hence my suggestion would be, if you feel comfortable, to replace
the exact test with a chi sq test, at least at the screening level. Once you
identify a set of SNPs with small p-values, you could follow them up with
the exact test.

--
Ritwik Sinha
Graduate Student
Epidemiology and Biostatistics
Case Western Reserve University

http://darwin.cwru.edu/~rsinha


Rplots.ps
Description: PostScript document
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] default value for cutoff in gstat variogram()

2006-06-05 Thread Patrick Giraudoux


Edzer J Pebesma wrote:
 Patrick Giraudoux wrote:

 I wonder what is the default value for the argument 'cutoff' when not 
 specified in the variogram.formula function of gstat. Computing 
 variogram envelops within gstat, I am comparing the results obtained 
 with variog in geoR and variogram in gstat, and it took me a while 
 before understanding that the cutoff default value is not the maximum 
 distance.

 Can Edzer tell us about it?

 Yes, of course :

 the default value is computed in the C code. Without checking
 (meaning: from 10 years' memory) I do recall that gstat uses
 one third of the diagonal of the rectangle (or block for 3D)
 that spans the data locations.

 Why? In time series you compute ACF's up to one half
 of the length of the series; after this things start to oscillate
 because you lack independent replication at large distance;
 look at what is meant by ergodicity for further reading.
 Variograms are basically flipped & unscaled acf's for higher
 dimensions.

 Some books (Journel & Huijbregts?) gave suggestions
 that half the max. distance in the data set is a good guideline,
 back in 1978. I used one third of the diagonal because
 I thought finding the maximum distance between any
 two point pairs may be expensive for large data
 sets. The parameter "one third" can be overridden by
 those who don't like it.

 Please keep us updated about your milage comparing
 gstat and geoR; I once spent an afternoon on this, trying
 to reproduce sample variogram across the packages and
 found this hard (but not impossible). I had the feeling
 it had to do with using < or <= to decide whether a
 point pair falls in a distance interval or not, but didn't 100%
 assure myself.
 -- 
 Edzer
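
The default is easy to override explicitly; a sketch on gstat's bundled meuse data (assumes the sp and gstat packages are installed):

```r
library(sp)
library(gstat)
data(meuse)
coordinates(meuse) <- ~x+y                                # promote to spatial
v.default <- variogram(log(zinc) ~ 1, meuse)              # cutoff = diagonal/3
v.short <- variogram(log(zinc) ~ 1, meuse, cutoff = 500)  # explicit cutoff
max(v.short$dist)   # no lag centre lies beyond 500
```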


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html