date:20060803

Re: [R] (robust) mixed-effects model with covariate

2006-08-03 Thread Spencer Graves

  What do you get from the following:

  max(with(df1, table(Subj, Time)))?

  With cases like this, lme gives an answer with a bogus distinction 
between variance components for time and residuals.  I don't know 
about aov or JMP, but I know that varcomp in S-Plus also produces 
garbage answers in such cases as well.

  With mixed models, there seem to be many ways to specify models with 
random effects that are not estimable.  Some some standard software 
(like varcomp in S-Plus or lme) does not (adequately?) test for 
these situations, with the result that the algorithm sometimes returns 
answers that are not correct, at least with the distinction between 
time and residual.

  I don't know if this applies to your example since it is not self 
contained.

  Your code raises other question.  For example, what is the class of 
Time in df1?  Might it be treated differently between aov and 
lme?  How many levels does Group have, etc.?

  Hope this helps.
  Spencer Graves
p.s.  If you'd like more help from this listserve, please submit another 
post that includes commented, minimal, self-contained, reproducible 
code, as suggested in  the posting guide 
www.R-project.org/posting-guide.html.

Giuseppe Pagnoni wrote:
 Dear Thilo,
 
 many thanks for your reply.  I realized that there was an error in my 
 formula which should have been:
 
 aov(y ~ Group * (Time + Age) + Error (Subj/Time), data=df1)
 
 or alternatively:
 
 lme(RVP.A ~ Group*(Time+Age), random = ~ 1|Subj/Time,data=df1)
 
 but I get different results in each case, and different still from the 
 results of another stat program (JMP).
 The problem is that I am not sure which one (if one indeed is) correct!
 
 Also, in the model you proposed:
 
 lme(y~Group*Time, random ~ age | Subj, data = df1)
 
 it appears that age is not between the effects of interests, so I do not 
 get an estimate of the significance of the Age or the Age*Group effect.
 
 I have Pinheiro  Bates, and I read the first chapter but it didn't seem 
 to provide an example analogous to my case.  Also, it looks like it 
 would take me some months to study the book thoroughly and frankly that 
 seems a bit excessive for such a (apparently?) simple problem  I was 
 hoping somebody would magically provide the correct syntax :-)  !
 
 thanks again anyway for your help
 
 best regards
 
giuseppe
 
 
 
 Thilo Kellermann wrote:
 On Monday 24 July 2006 20:16, Giuseppe Pagnoni wrote:
   
 Dear all,

 First of all I apologize if you received this twice: I was checking the
 archive and I noticed that the text was scrubbed from the message,
 probably due to some setting in my e-mail program.


 I am unsure about how to specify a model in R and I thought of asking
 some advice to the list. I have two groups (Group= A, B) of subjects,
 with each subject undertaking a test before and after a certain
 treatment (Time= pre, post). Additionally, I want to enter
 the age of the subject as a covariate (the performance on the test is
 affected by age), and I also want to allow different slopes for the
 effect of age in the two groups of subjects (age might affect the
 performance of the two groups differentially).

 Is the right model to use something like the following?

 aov (y ~ Group*Time + Group*Age + Error(Subj/Group), data=df1 )

 (If I enter that command, within summary, I get the following:
 Error() model is singular in: aov(y ~ Group * Time + Group * Age +
 Error(Subj/Group), data = df1))

 
 try:
 aov(y~Group*Time*Age + Error(Subj*Time*Age), data = df1)
 which specifies an ANOVA (but not with mixed effects) with three main 
 effects 
 and all interaction terms plus an error term that is independent between 
 groups (!) and relates to within subjects variability.

 For a real mixed effects analysis you should use the (n)lme function from 
 the nlme package and one possible model could look like this:

 lme(y~Group*Time, random ~ age | Subj, data = df1)

 but the exact specification depends on your assumptions, in which it is 
 possible to specify two or three models and compare their fits with anova(). 
 For more information on mixed effects you should consult:
 Jose C. Pinheiro  Douglas M. Bates (2000) Mixed-Effects Models in S and 
 S-PLUS. Springer, New York.

 Good luck,
 Thilo

   
 As a second question: I have an outlier in one of the two groups. The
 outlier is not due to a measurement error but simply to the performance
 of the subject (possibly related to his medical history, but I have no
 way to determine that with certainty). This subject is
 signaled to be an outlier within its group: averaging the pre and post
 values for the performance of the subjects in his group, the Grubbs test
 yields a probability of 0.002 for the subject to be an outlier (the
 subject is marked as a significant outlier also if I
 perform the test separately on the pre and the post data).

 If I remove this subject from its group, I

[R] run self written functions

2006-08-03 Thread Antje

Hello,

I'm not sure if I'm in the right place with my question...
I'm running R on Windows and wrote a function and saved it as .R file.
It looks like this:

bmi - function(weight, height) {
bmi - weight / height^2
bmi
}

If I want to use this function, I have to mark everything and then press 
Ctrl-R. But then everything single line is executed on the command line, 
which means that I will loose my history when the code becomes longer.
Further, I wonder if there is any way to do some output for control 
within the function (or any other possibilities to debug in a way).

Maybe, I have chosen a completely wrong way? I only want to make it easy 
to create some graphical visualizations of data which will be read in by 
csv. files, has to be converted and then displayed depending on some 
displaying parameters.

Ciao,
Antje

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] From 2.2.1 to 2.3

2006-08-03 Thread Antonio_Paredes

Hello everyone.

Currently I am running R 2.2.1 (windows), and I will like to update to 
2.3. I wanted to ask if it is possible to update without having to removed 
2.2.1; or do I first need to delete 2.2.1?

Thank you very much.

Tony


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] [R-pkgs] spatstat 1.9-4

2006-08-03 Thread Adrian Baddeley


Version 1.9-4 of package 'spatstat' has been sent to CRAN.

It includes new code for perfect simulation of point processes
and various improvements.

The release notes are available at
http://www.spatstat.org/spatstat/current/spatstatRELEASE-NOTES-1.9-4

Adrian Baddeley

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Math elements in panel headers of lattice plots?

2006-08-03 Thread Fredrik Karlsson

Hi,

I would like to put a math expression in the header of a panel.
Acctually, I need to substitute the 'a' to a script-a in a
transcription contained in the factor by which the data set is
divided. So, /spak/ should be /spAk/ where A should be a script-a.
I guessed that math expressions would be the way to go on this..


Is this possible?

/Fredrik

-- 

Give up learning, and put an end to your troubles.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] From 2.2.1 to 2.3

2006-08-03 Thread Prof Brian Ripley

On Wed, 2 Aug 2006, [EMAIL PROTECTED] wrote:

 Hello everyone.
 
 Currently I am running R 2.2.1 (windows), and I will like to update to 
 2.3. I wanted to ask if it is possible to update without having to removed 
 2.2.1; or do I first need to delete 2.2.1?

There is no such version as `2.3', as the posting guide points out.

Updating is discussed in detail the rw-FAQ, to which the posting guide 
referred you.

 
 Thank you very much.
 
 Tony
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

PLEASE do!


-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] From 2.2.1 to 2.3

2006-08-03 Thread Ulrich Keller

http://cran.r-project.org/bin/windows/base/rw-FAQ.html#What_0027s-the-best-way-to-upgrade_003f

[EMAIL PROTECTED] wrote:
 Hello everyone.

 Currently I am running R 2.2.1 (windows), and I will like to update to 
 2.3. I wanted to ask if it is possible to update without having to removed 
 2.2.1; or do I first need to delete 2.2.1?

 Thank you very much.

 Tony


   [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read.spss and umlaut

2006-08-03 Thread Thomas Kuster

Hello

Am Mittwoch, 2. August 2006 17.11 schrieb Thomas Lumley:
 This sounds like a conflict between encodings -- eg if R is assuming UTF-8
 and the file is encoding in Latin-1 then the sequence
 U+00FC : LATIN SMALL LETTER U WITH DIAERESIS
 U+0072 : LATIN SMALL LETTER R
 is coded as FC72 in the file, which is an illegal byte sequence in UTF-8.

Hex:  74 65 20 66 fc 72 20 61 6c 6c 65 53 45 2f 31 36
Text:  t  e f  ü  r a  l  l  e  S  E  /  1  6

 The underlying C code (being written in the US quite a long time ago)
 doesn't know about encodings, and I don't know what the rules are in SPSS
 for valid characters (I suspect that in these old portable file formats it
 probably just reads and writes bytes, leaving it up to the OS to interpret
 them.

But why stopp the C code reading? Is / not the endmark of the string? What 
is the problem, if I chance that in the source?

 You could try running R in a non-UTF-8 locale to see if it helps.

I think my local is non-UTF-8 (de_CH, isolatin). How can I check that, and set 
an other temporary?

A dirty hack like this:
sed s/ä/ae/g | sed s/ö/oe/g | sed s/ü/ue/g | sed s/Ä/Ae/g | sed s/Ö/Oe/g | sed 
s/Ü/Ue/g
didn't work (file 'projets_non_umlaut.por' is not in any supported SPSS 
format).

Thomas

 If anyone has definitive information about how SPSS represents strings and
 decides on valid characters that might be useful too.

   -thomas

  library(foreign)
  spssdaten - read.spss(projets.por)
  attr(spssdaten$PROJETX, value.labels)[1:20]
 
   Bg Stammzellenforschung  Bb
   863  
  862 Bb Neugestaltung des Finanzausgleichs
   861  
  854 EV Postdienste f   Bb 853
852 Bb Bg Steuerpaket 851  
  843 Bb Anhebung der Mehrwertsteuer s 
  11. AHV-Revision 842  
  841 Volkinitiative Lebenslange Verwahrung
   833  
  832 Gegenentwurf zur Avanti EV Lehrstellen-Initiative 831
824 EV Moratorium Plus   
  EV Strom ohne Atom 823   822 EV Ja zu
  fairen Mieten   EV Gleiche Rechte f 821  
  815 EV GesundheitsinitiativeEV
  Sonntags-Initiative 814   813
 
  The SPSS-File is okay:
  system(cat projets.por |grep Postdienste)
 
  echtserwerb 3. GenerationSD/N/EV Postdienste für alleSE/16/Änderrung Bg 
  EOG Mut
 
  How can I read the SPSS-File with the Umlaut?
 
  Bye
  Thomas Kuster
 
  R: 2.1.0 (2005-04-18)
  OS: Debian Linux, 2.6.10-isgee-neptun-1
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html and provide commented,
  minimal, self-contained, reproducible code.

 Thomas Lumley Assoc. Professor, Biostatistics
 [EMAIL PROTECTED] University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] run self written functions

2006-08-03 Thread Thomas Kuster

Hello

Am Donnerstag, 3. August 2006 09.36 schrieb Antje:
 Hello,

 I'm not sure if I'm in the right place with my question...
 I'm running R on Windows and wrote a function and saved it as .R file.
 It looks like this:

 bmi - function(weight, height) {
 bmi - weight / height^2
 bmi
 }

 If I want to use this function, I have to mark everything and then press
 Ctrl-R. But then everything single line is executed on the command line,
 which means that I will loose my history when the code becomes longer.
 Further, I wonder if there is any way to do some output for control
 within the function (or any other possibilities to debug in a way).

source(your-r-file.R)
mybmi - bmi(180, 70)

 Maybe, I have chosen a completely wrong way? I only want to make it easy
 to create some graphical visualizations of data which will be read in by
 csv. files, has to be converted and then displayed depending on some
 displaying parameters.

look here (directory R):
http://tomix.homelinux.org/~thomas/eth/5_semester/semesterarbeit_WS_2005_2006/
auswertung.R is the main file.


 Ciao,
 Antje

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented, minimal,
 self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Summary method needed?

2006-08-03 Thread Heinz Tuechler

At 21:04 02.08.2006 +0100, Prof Brian Ripley wrote:
On Wed, 2 Aug 2006, Christian Hennig wrote:

 Thank you Brian!
 
   I'm updating my fpc package at the moment and will add some new
functions.
   I learned that there should be print and summary methods for the key
   functions.
 
  for 'classes', I think.
 
 Yes.
 
   But in some cases the print method will make use of more or less all
the
   output information of the function. Is there any reason to implement a
   summary method in these cases?
 
  Would a more concise print() method be useful?  If so the existing
print()
  could become summary().
 
 :-)
 What I initially did some years ago was to write summary methods to
print out
 the required informations. Then M. Maechler told me that this is not the
 purpose of a summary method and I should write a print.summary method for
 this. Now I realise that I actually just want to print, and I don't really
 need the extra synopsis to be done by summary().
 
 Now is there any recommendation on this? My intuition would be to write a
 print, but not a summary method.

That sounds fine for your purposes.


Maybe I am wrong, but as far as I see, print() has the disadvantage that it
has to return x and must not return the summarized results as an object.
You remember the difficulties with print.survfit.
Instead it seems to be allowed that summary does not only print but also
return summarized results.
Is that right?

Greetings,
Heinz

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] geodesic distance

2006-08-03 Thread stefano iacus

Hi,
has anyone ever seen implemented in R the following geodesic  
distance between positive definite pxp matrices A and B?

d(A,B) = \sum_{i=1}^p (\log \lambda_i)^2

were \lambda is the solution of det(A -\lambda B)  = 0

thanks
stefano

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Summary method needed?

2006-08-03 Thread Martin Maechler

 HeinzT == Heinz Tuechler [EMAIL PROTECTED]
 on Thu, 03 Aug 2006 09:39:35 +0100 writes:

HeinzT At 21:04 02.08.2006 +0100, Prof Brian Ripley wrote:
 On Wed, 2 Aug 2006, Christian Hennig wrote:
 
 Thank you Brian!
 
   I'm updating my fpc package at the moment and will add some new 
functions.
   I learned that there should be print and summary methods for the key
   functions.
 
  for 'classes', I think.
 
 Yes.
 
   But in some cases the print method will make use of
   more or less all the output information of the
   function. Is there any reason to implement a summary
   method in these cases?
 
  Would a more concise print() method be useful?  If so
  the existing print() could become summary().
 
 :-)

 What I initially did some years ago was to write summary
 methods to print out the required informations. Then
 M. Maechler told me that this is not the purpose of a
 summary method and I should write a print.summary method
 for this. Now I realise that I actually just want to
 print, and I don't really need the extra synopsis to
 be done by summary().
 
 Now is there any recommendation on this? My intuition
 would be to write a print, but not a summary method.
 
 That sounds fine for your purposes.
 

HeinzT Maybe I am wrong, but as far as I see, print() has
HeinzT the disadvantage that it has to return x and must
HeinzT not return the summarized results as an object.  You
HeinzT remember the difficulties with print.survfit.

HeinzT Instead it seems to be allowed that summary does not
HeinzT only print but also return summarized results.

HeinzT Is that right?

yes, that's right,  But let me tell more on the story:

I think I had recommended  summary.FOO() and print.summary.FOO()
to Christian because he *did* compute a few `useful' things on
his FOO object. In such a situation, we (R-core) recommend and
usually (very rarely not; for back-compatibility reasons only) implement
the following:

print.FOO: gives a one (sometimes two) paragraph overview of
   the fitted model object, and
   --yes-- *should* return its *unchanged* argument invisibly.

summary.FOO: computes more interesting things from the original
 FOO object, does *NOT print* anything (explicitly)
 and returns an object of class summary.FOO.

print.summary.FOO: now prints (an overview of) the summary.FOO
 object and -- since it's a print() method -- also
 returns its argument unchanged and invisibly.

  Examples of the above, can be inspected e.g., for 
  'FOO' in  { lm, glm, aov, nls, loess, princomp, prcomp, .. }

For Joe Average User, of course it *looks* like 
summary( FOO ) would print , but that's just because it
returns a summary.FOO object which is auto-printed subsequently
-- unless it's assigned or used in another expression.

My reasons for *not* providing a  summary.* and print.summary.*
method would have to be either *both* of 1) and 2)  *or* 3) :

1) print(obj) gives enough information 

2) the 'obj' already contains the interesting quantities, or
   these are either trivially computable from the contents of 'obj'
   or are already provided by other FOO methods, e.g., vcov.FOO().

3) lack of time and/or motivation; lazyness


Martin Maechler, ETH Zurich

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] NLME: Problem with plotting ranef vs a factor

2006-08-03 Thread Greg Distiller

Hi
I am following the model building strategy that is outlined in the Pinheiro and 
Bates book wrt including covariates but am having a problem with the plot. 
Basically I am using 4 covariates (1 of them is continuous) and 3 of them are 
fine but the 4th one is being shown as a scatterplot despite the fact that it 
is a factor. I have explicitly declared this to be a factor 
(pcat-as.factor(pcat)) and have also checked by using the is.factor and the 
levels command that it is a factor. Yet despite this the plot command is not 
recognising it as a factor.
 
Here is more information about my problem:

I am reading in the data by:

Data-read.csv(Data1_93_2.csv,header=T)
attach(Data)
Data1_93-transform(Data,log2game=log2(gamedens+1))
pcat-as.factor(pcat)
Data1_93-groupedData(log2game ~ day | subjectno, data=Data1_93)
detach(Data)

Here is the code to check that the covariate called pcat is indeed a factor:
 levels(pcat)
[1] 1 2 3

 is.factor(pcat)
[1] TRUE

and then after the model is fitted I extract the random effects:

D1C2.ran - ranef(mod11.103nlme,augFrame=T)

and here is an extract from the object:

C R  day   gamedens pcat   site   
mutcat1  pdens0  log2game
NA02_259 -1.016987007  0.0162825099 15.75000   23.51   Namaacha 
Mixed   15018  3.761099
NA02_073 -0.939355374  0.0132589702 10.5   23.750001   Namaacha 
Resistant6170  3.675543
M00_12   -0.775048474  0.0047124742 10.5   25.01 Mpumulanga 
Sensitive   17525  3.768326
M00_93   -0.555801118  0.0053872868 14.0   37.52 Mpumulanga 
Sensitive  332000  4.254319
NA02_053 -0.327990343 -0.0037659864  6.0   39.250001   Namaacha 
Resistant   65529  4.292481

Note that this output also seems to indicate that pcat is a factor as it is 
summarised correctly.

I then generate plots for my random effects:

plot(D1C2.ran,form= C ~site+mutcat2+pcat+pdens0)

and the problem is that the panel for my random effects vs pcat is displayed as 
a scatterplot rather than as a boxplot.
I am getting told to check warnings and these warnings look like:

Warning messages:
1: at  0.99
2: radius  0.0001
3: all data on boundary of neighborhood. make span bigger
4: pseudoinverse used at 0.99
5: neighborhood radius 0.01
6: reciprocal condition number  -1.#IND
7: zero-width neighborhood. make span bigger

I do not get these warnings if I exclude the problematic variable pcat so must 
be something to do with this. Any ideas?

Many thanks

Greg

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] question about dll crashing R

2006-08-03 Thread Benn Fine

I have ported some R code to C to make it faster.

I can perform .Call(foobar,) once and it
works fine. Absolutely correct answer.

 If I put a loop inside foobar and run the  main code
routine more than 100 times, it crashes R.

Or if I call .Call(foobar) seperately more than
two tims it crashes R.


For the most part I am doing matirx multiplies using

EXP matrixprod (SEXP x , SEXP y )
{ int nrx , ncx , nry , ncy , mode;
SEXP xdims , ydims , ans ;
char *transa = N , *transb = N ;
double one = 1.0 , zero = 0.0 ;
xdims = getAttrib (x , R_DimSymbol ) ;
ydims = getAttrib (y , R_DimSymbol ) ;
mode = REALSXP;
nrx = INTEGER( xdims ) [ 0 ] ;
ncx = INTEGER( xdims ) [ 1 ] ;
nry = INTEGER( ydims ) [ 0 ] ;
ncy = INTEGER( ydims ) [ 1 ] ;
PROTECT( ans = allocMatrix (mode, nrx , ncy ) ) ;
F77_CALL(dgemm) ( transa , transb , nrx , ncy , ncx
, one ,
REAL( x ) , nrx , REAL( y ) , nry , zero ,
REAL( ans ) , nrx ) ;
UNPROTECT( 1 ) ;
return( ans ) ;
}

I am also generating random multiavriate normals using
the (not pretty) code

*
 1) Generate P independent standard normal deviates -
Ei ~ N(0,1)
 2) Using Cholesky decomposition find A s.t.
trans(A)*A = COVM
 3) trans(A)E + MEANV ~ N(MEANV,COVM)
*/

SEXP mvntest (SEXP mean, SEXP cov, SEXP temp)


{ int nrx , ncx , nry , ncy ,info,mode;
SEXP xdims , ydims , ans;

int i,j, one=1;
info = 1;

xdims = getAttrib (mean , R_DimSymbol ) ;
ydims = getAttrib (cov , R_DimSymbol ) ;
mode = REALSXP;
nrx = INTEGER( xdims ) [ 0 ] ;
ncx = INTEGER( xdims ) [ 1 ] ;
nry = INTEGER( ydims ) [ 0 ] ;
ncy = INTEGER( ydims ) [ 1 ] ;



/* create the upper trianglular matrix A */

/* such that t(A) %*% A = Sigma */

GetRNGstate();  

F77_CALL(dpofa) ( REAL( cov ), nry , ncy , info);
Rprintf(Info = %d\n,info);


for(i=0;inry;i++)
  for(j=0;ji;j++)
REAL(cov)[i+j*ncy] = 0.0;


PROTECT( ans = allocMatrix (mode, nrx , one ) ) ;
for(i=0;inry;i++)
  REAL(temp)[i] = rnorm(0,1);
ans = tmatrixprod(cov,temp);
for(i=0;inry;i++)
  REAL(ans)[i] = REAL(ans)[i]+REAL(mean)[i];
UNPROTECT( 1 ) ;
PutRNGstate();
return( ans ) ;


}


I have a feeling I am messing up memory usage
somewhere but haven't a clue. Do I need to do garbage
collecting inside the C program ?The fact that the
code
runs a few times before R crashes is driving me nuts. 
I send most of what I need into the C routine from R,
so I am not creating that many SEXP objects within the
program.

Any hints or ideas ?

Thanks!

Benn

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] best way to calculate per-parameter differences in across-subject means

2006-08-03 Thread René J.V. Bertin

Thanks, I'll look at that.

In the meantime, the code below is what I came up with myself. It does
what I want

# SelectCases(dat,crit) == subset(dat, crit, drop=FALSE)

SENSICK.AvScores- function( dat=SENSICK.items.tr )
{
n-nlevels(dat$Symptom)
data.frame(
Patient=c( rep(1,n), rep(0, 3*n)),
WasSick=c( rep(1,2*n), rep(NA,2*n)),
StrictSick=c( rep(NA,2*n), rep(-1,n), rep(1,n)),
Symptom=rep(levels(dat$Symptom),4),
AvScore=c(
with( SelectCases(dat, 'Patient==1  WasSick==1'), 
tapply(Score,
Symptom , mean) ),
with( SelectCases(dat, 'Patient==0  WasSick==1'), 
tapply(Score,
Symptom , mean) ),
with( SelectCases(dat, 'Patient==0  StrictSick==-1'),
tapply(Score, Symptom , mean) ),
with( SelectCases(dat, 'Patient==0  StrictSick==1'), 
tapply(Score,
Symptom , mean) )
)
)
}

AvScores-SENSICK.AvScores

with( AvScores, (barchart( AvScore[Patient==1] - AvScore[Patient==0 
WasSick==1]) ~ Symptom, scales=list( rot=c(45,0)) )

-- Dieter wrote:
Maybe it's a bit more than you want, but possibly you are happy with it: see the
example under TukeyHSD.

summary(fm1 - aov(breaks ~ wool + tension, data = warpbreaks))
TukeyHSD(fm1, tension, ordered = TRUE)
plot(TukeyHSD(fm1, tension))


Dieter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] mult comp significance

2006-08-03 Thread Spencer Graves

  I read two questions:  physical meaning and how to display only 
significant differences.


PHYSICAL MEANING

  A p value is the probability of obtaining by chance alone a result at 
least as extreme as what we observe.  A simultaneous or adjusted p 
value is the probability that at least one of the multiple tests was at 
least as extreme as what observed.

  A (1-alpha) confidence interval or region is a random set that 
includes the true but unknown value with probability at least (1-alpha). 
  A set of (1-alpha) simultaneous confidence intervals is a random set 
in the multidimensional space that includes the true but unknown values 
of all parameter comparisons with probability at least (1-alpha).


SELECTING ONLY SIGNIFICANT DIFFERENCES

  Consider the following modification of the first example in the 
simint help page:

library(multcomp)
data(recovery)
RecInts - simint(minutes~blanket, data=recovery, type=Dunnett,
 conf.level=0.9, alternative=less,eps=0.0001)

  To see the structure of RecInts, look at str(RecInts).  This 
indicates that RecInts is a list, and one of its components is a named 
vector called p.value.adj.  This suggests we try the following:

  with(RecInts, p.value.adj[p.value.adj0.05])
blanketb2-blanketb0
5.484862e-05

  Hope this helps.
  Spencer Graves

Nair, Murlidharan T wrote:
 This has a stats question and a R question.  I am sure there are many
 core statisticians here how would know the answer to this simple
 question. In determining the significant comparisons using the methods
 in multcomp, the ones that are designated as significant are the ones
 that do not intersect the zero line. What is the physical meaning of
 this and why are those considered significant? I can sort those out and
 pick out by their adjusted pvalues to pick the top ones correct? Is
 there an method the multcomp that will output only the significant ones?
 
 Thanks ../Murli
 

 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Tcltk package

2006-08-03 Thread Adrian DUSA

On Tuesday 01 August 2006 19:24, John McHenry wrote:
 [...]
 Yes, I built R myself. I couldn't find a debian package for R 2.3.1. The
 latest available is 2.2.1.

Oh, but there is... right on CRAN. For Dapper just add this line to your 
sources.list:

deb http://cran.R-project.org/bin/linux/ubuntu/ dapper/

This repository has lots of other packages compiled for Ubuntu, feel free to 
take a look.

HTH,
Adrian

-- 
Adrian DUSA
Romanian Social Data Archive
1, Schitu Magureanu Bd
050025 Bucharest sector 5
Romania
Tel./Fax: +40 21 3126618 \
  +40 21 3120210 / int.101

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Default first argument in assignment function possible?

2006-08-03 Thread Heinz Tuechler

Dear All,

is there a possibility to provide a default for the first argument in an
assignment function?
I could not find out if and how. You see in the example below that not
explicitly stating the first argument leads to errors.

Thanks,
Heinz Tüchler

### example of assignment function
'testfun-' - function(x=a, y=b, value)
  { x - x+ y+ value
print(x)
x }

a - 0; b - 1; c - 2
testfun(c, 2) - 2  # result: 6
c - 2
testfun(c   ) - 2  # result: 5
testfun( , 2) - 2  # Error: argument is missing, with no default
testfun(y=2) - 2   # Error: target of assignment expands to non-language
object
testfun() - 2  # Error: invalid (NULL) left side of assignment

   _
platform   i386-pc-mingw32  
arch   i386 
os mingw32  
system i386, mingw32
status Patched  
major  2
minor  3.1  
year   2006 
month  07   
day23   
svn rev38687
language   R
version.string Version 2.3.1 Patched (2006-07-23 r38687)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] NLME: Problem with plotting ranef vs a factor

2006-08-03 Thread Dirk Enzmann

Greg,

be careful using attach() and detach(). From the syntax snippets you 
showed it seems that you did create an object pcat (factor 
variable), but you did not change the respective variable in your 
data frame.

Try to remove pcat and see what happens do the results of lme()!

Dirk

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] random effects with lmer() and lme(), three random factors

2006-08-03 Thread Spencer Graves

  In theory, 'lme' may be able to handle crossed random effects, but I 
don't know how to do it.  I've used 'lmer' for such models, and your 
'lmer' code looks plausible to me.  Therefore, I would be inclined to 
believe the 'lmer' results.

  To remove any lingering doubt, I suggest you construct Monte Carlo 
simulations where you know the structure.  Then feed the simulations to 
'lme' and 'lmer' and see what you get.

  Hope this helps.
  Spencer Graves

Xianqun (Wilson) Wang wrote:
 Hi, all,
 
  
 
 I have a question about random effects model. I am dealing with a
 three-factor experiment dataset. The response variable y is modeled
 against three factors: Samples, Operators, and Runs. The experimental
 design is as follow:
 
  
 
 4 samples were randomly chosen from a large pool of test samples. Each
 of the 4 samples was analyzed by 4 operators, randomly selected from a
 group of operators. Each operator independently analyzed same samples
 over 5 runs (runs nested in operator). I would like to know the
 following things:
 
  
 
 (1) the standard deviation within each run;
 
 (2) the standard deviation between runs;
 
 (3) the standard deviation within operator
 
 (4) the standard deviation between operator.
 
  
 
 With this data, I assumed the three factors are all random effects. So
 the model I am looking for is
 
  
 
 Model:  y  = Samples(random) + Operator(random) + Operator:Run(random) +
 Error(Operator) + Error(Operator:Run)  + Residuals
 
  
 
 I am using lme function in nlme package. Here is the R code I have
 
  
 
 1. using lme:
 
 First I created a grouped data using
 
 gx - groupedData(y ~ 1 | Sample, data=x)
 
 gx$dummy - factor(rep(1,nrow(gx)))
 
  
 
 then I run the lme
 
  
 
 fm- lme(y ~ 1, data=gx,
 random=list(dummy=pdBlocked(list(pdIdent(~Sample-1),
 
 pdIdent(~Operator-1), 
 
 pdIdent(~Operator:Run-1)
 
  
 
 finally, I use VarCorr to extract the variance components
 
  
 
vc - VarCorr(fm)
 
  
 
  Variance   StdDev  
 
 Operator:Run 1.595713e-10:20   1.263215e-05:20  
 
 Sample   5.035235e+00: 4   2.243933e+00: 4  
 
 Operator 5.483145e-04: 4   2.341612e-02: 4  
 
 Residuals8.543601e-02: 1   2.922944e-01: 1  
 
  
 
  
 
 2.Using lmer in Matrix package:
 
  
 
 fm - lmer(y ~ (1 | Sample) + (1 | Operator) + 
 
(1|Operator:Run), data=x)
 
  summary(fm)
 
  
 
 Linear mixed-effects model fit by REML 
 
 Formula: H.I.Index ~ (1 | Sample.Name) + (1 | Operator) + (1 |
 Operator:Run) 
 
   Data: x 
 
   AIC  BIClogLik MLdeviance REMLdeviance
 
  96.73522 109.0108 -44.36761   90.80064 88.73522
 
 Random effects:
 
  Groups   NameVariance   Std.Dev.  
 
  Operator:Run (Intercept) 4.2718e-11 6.5359e-06
 
  Operator (Intercept) 5.4821e-04 2.3414e-02
 
  Sample   (Intercept) 5.0352e+00 2.2439e+00
 
  Residual 8.5436e-02 2.9229e-01
 
 number of obs: 159, groups: Operator:Run, 20; Operator, 4; Sample.Name,
 4
 
  
 
 Fixed effects:
 
  Estimate Std. Error  t value
 
 (Intercept) 0.0020818  1.1222683 0.001855
 
  
 
  
 
 There is a difference between lmer and lme is for the factor
 Operator:Run.  I cannot find where the problem is. Could anyone point me
 out if my model specification is correct for the problem I am dealing
 with. I am pretty new user to lme and lmer. Thanks for your help!
 
  
 
  
 
 Wilson Wang
 
  
 
  
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] tcl/tk bind destroy event

2006-08-03 Thread Jean Coursol

Perhaps Destroy key is unknown by Tcl; it is not in the
Event modifiers table in Welch Book...
But try with Control-L or Shift-Control_L, it runs
(but not with Control_L-Shift ??).

Use xmodmap to see the current mappings from keys to modifiers.

Jean Coursol

On Wed, 2 Aug 2006, Franco Mendolia wrote:

 Hello!

 I want to create a messagebox whenever the user wants to destroy the window
 (e.g. Alt-F4 or the 'x' in the right top corner) and ask if a modified file
 should be saved or not.

 If 'cancel' is chosen then nothing should happen and the windows still should 
 be
 existing. This doesn't work. When I press cancel the window will be destroyed
 although.

 I also implemented a menu item 'Quit' where I show the same messagebox and 
 there
 it works fine.

 How can I make it work or is there another method to do this? I'm very new to 
 R
 and tcl/tk.

 Here is part of my code:


exitProg - function()
{
  returnVal - tkmessageBox(title=Question,
message=Save modified file?,
icon=question, type=yesnocancel, default=yes)

  returnVal - as.character(returnVal)

  if( returnVal == yes )
  {
# save file
value - saveFile()
# destroy window when save was successfull
if( value == 1 )
  tkdestroy(mw)
  }
  if( returnVal == no )
  {
tkdestroy(mw)
  }
  if( returnVal == cancel )
  {
# do nothing
cat(Cancel was pressed.\n)
  }
}

# bind the destroy event in order to show a message box
tkbind(mw,Destroy,exitProg)

# menu item which works fine
tkadd(fileMenu, command, label=Quit, command=exitProg)



 Thank you.

 Franco Mendolia

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] between-within anova: aov and lme

2006-08-03 Thread William Simpson

I have 2 questions on ANOVA with 1 between subjects factor and 2 within factors.

1. I am confused on how to do the analysis with aov because I have seen two 
examples
on the web with different solutions.

a) Jon Baron (http://www.psych.upenn.edu/~baron/rpsych/rpsych.html) does
6.8.5 Example 5: Stevens pp. 468 - 474 (one between, two within)

between: gp
within: drug, dose
aov(effect ~ gp * drug * dose + Error(subj/(dose*drug)), data=Ela.uni)

b) Bill Venables answered a question on R help as follows.

- factor A between subjects
- factors B*C within subjects.

aov(response ~ A*B*C + Error(subject), Kirk)
An alternative formula would be response ~ A/(B*C) + Error(subject), which
would only change things by grouping together some of the sums of squares.

---
SO: which should I do
aov(response ~ A*B*C + Error(subject), Kirk)
aov(response ~ A/(B*C) + Error(subject), Kirk)
aov(response ~ A*B*C + Error(subject/(B*C)), Kirk)


2. How would I do the analysis in lme()?
Something like
lme(response~A*B*C,random=~1|subject/(B*C))???


Thanks very much for any help!
Bill Simpson

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] tcl/tk bind destroy event

2006-08-03 Thread Franco Mendolia

Hi!

 Perhaps Destroy key is unknown by Tcl; it is not in the
 Event modifiers table in Welch Book...

I think the Destroy key is known, because when destroying the window with 
Alt-F4 
or the littel x in the topcorner my function exitProg is executed.

What I actually search for is a possibility to let the window not be destroyed 
when user closes the window by Alt-F4 or the little x.

Franco Mendolia

exitProg - function()
{
  returnVal - tkmessageBox(title=Question,
message=Save modified file?,
icon=question, type=yesnocancel, default=yes)

  returnVal - as.character(returnVal)

  if( returnVal == yes )
  {
# save file
value - saveFile()
# destroy window when save was successfull
if( value == 1 )
  tkdestroy(mw)
  }
  if( returnVal == no )
  {
tkdestroy(mw)
  }
  if( returnVal == cancel )
  {
# do nothing
cat(Cancel was pressed.\n)
  }
}

# bind the destroy event in order to show a message box
tkbind(mw,Destroy,exitProg)

# menu item which works fine
tkadd(fileMenu, command, label=Quit, command=exitProg)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Invoking R on UNIX Command line from a PERL CGI

2006-08-03 Thread Shalini Sridhar


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] tcl/tk bind destroy event

2006-08-03 Thread Peter Dalgaard

Jean Coursol [EMAIL PROTECTED] writes:

 Perhaps Destroy key is unknown by Tcl; it is not in the
 Event modifiers table in Welch Book...
 But try with Control-L or Shift-Control_L, it runs
 (but not with Control_L-Shift ??).

I don't think that is it. As I read it, Tk catches the Destroy event
alright, but the problem is that it is too late: the window
destruction is already in motion at the time and there's no way to
undestroy it from within binding. I think you need to create a wm
protocol handler for WM_DELETE_WINDOW. As in
 
x - tktoplevel()
tcl(wm, protocol, x, WM_DELETE_WINDOW, quote(cat(I'm staying!\n)))

--- now try to destroy the window using window controls ---

tcl(wm, protocol, x, WM_DELETE_WINDOW, )

--- things should be back to normal now ---


 Use xmodmap to see the current mappings from keys to modifiers.
 
 Jean Coursol
 
 On Wed, 2 Aug 2006, Franco Mendolia wrote:
 
  Hello!
 
  I want to create a messagebox whenever the user wants to destroy the window
  (e.g. Alt-F4 or the 'x' in the right top corner) and ask if a modified 
  file
  should be saved or not.
 
  If 'cancel' is chosen then nothing should happen and the windows still 
  should be
  existing. This doesn't work. When I press cancel the window will be 
  destroyed
  although.
 
  I also implemented a menu item 'Quit' where I show the same messagebox and 
  there
  it works fine.
 
  How can I make it work or is there another method to do this? I'm very new 
  to R
  and tcl/tk.
 
  Here is part of my code:
 
 
 exitProg - function()
 {
   returnVal - tkmessageBox(title=Question,
 message=Save modified file?,
 icon=question, type=yesnocancel, default=yes)
 
   returnVal - as.character(returnVal)
 
   if( returnVal == yes )
   {
 # save file
 value - saveFile()
 # destroy window when save was successfull
 if( value == 1 )
   tkdestroy(mw)
   }
   if( returnVal == no )
   {
 tkdestroy(mw)
   }
   if( returnVal == cancel )
   {
 # do nothing
 cat(Cancel was pressed.\n)
   }
 }
 
 # bind the destroy event in order to show a message box
 tkbind(mw,Destroy,exitProg)
 
 # menu item which works fine
 tkadd(fileMenu, command, label=Quit, command=exitProg)
 
 
 
  Thank you.
 
  Franco Mendolia
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] tcl/tk bind destroy event

2006-08-03 Thread Franco Mendolia


 I don't think that is it. As I read it, Tk catches the Destroy event
 alright, but the problem is that it is too late: the window
 destruction is already in motion at the time and there's no way to
 undestroy it from within binding. I think you need to create a wm
 protocol handler for WM_DELETE_WINDOW. As in
  
 x - tktoplevel()
 tcl(wm, protocol, x, WM_DELETE_WINDOW, quote(cat(I'm staying!\n)))
 
 --- now try to destroy the window using window controls ---
 
 tcl(wm, protocol, x, WM_DELETE_WINDOW, )
 
 --- things should be back to normal now ---

Yeah! That's it! Thanks!

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] problem with factor in list (same name for diffrent category)

2006-08-03 Thread Thomas Kuster

Hello

I try to ignore the problems with the umlaut.
I read the file in with:

 library(foreign)
 daten - read.spss(filename)

Then I want select a Vorlage (vote), but if I want select for example [6] 
and not 6 and 7 how can I select it? The real name is a other problem, but I 
can solve this via date.

 levels(daten$PROJETX)
  [1] Bg Stammzellenforschung
  [2] Bb  
  [3] Bb Neugestaltung des Finanzausgleichs
  [4] 
  [5] EV Postdienste f
  [6] Bb 
  [7] Bb 
  [8] Bg Steuerpaket
  [9] Bb Anhebung der Mehrwertsteuer s
 [10] 11. AHV-Revision
 [11] Volkinitiative Lebenslange Verwahrung
 [12] 
   .
   .
   .

Select with:
 pos - daten$PROJETX==11. AHV-Revision
 summary(pos)
   Mode   FALSETRUE
logical  2003731002

This is okay.

 pos - daten$PROJETX==(levels(daten$PROJETX)[6])
 summary(pos)
   Mode   FALSETRUE
logical  1949046471

Too many TRUE's.

Thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Vector DF [a, b] to replace values :was Finding the position of a variable in a data.frame

2006-08-03 Thread John Kane


--- jim holtman [EMAIL PROTECTED] wrote:

 ?which
 
  which(Df = 50, arr.ind=T)
   row col
 5   5   4

This works very nicely as has some other suggestions
on how to replace a value. Assuming that I have more
than one correction to make where Df = 50, can I use
vectors in the Df[] to do this.

My attempt shows that I can use vectors but it appears
thatthere is something wrong with my logic 

Eample

cat - c( 3,5,6,8,0)
dog - c(3,5,3,6, 0)
rat - c (5, 5, 4, 9, 51)
bat - c( 12, 42, 45, 32, 54)

Df - data.frame(cbind(cat, dog, rat, bat))
post - which(Df = 50, arr.ind=T)  # find values .=
50
post
correction - c(77, 88)  # new correct values
row - post[ ,1]  # vector for row number
col - post[ ,2]   # vector for column number

Df[row,col] -  correction
Df

---result-
  cat dog rat bat
1   3   3   5  12
2   5   5   5  42
3   6   3   4  45
4   8   6   9  32
5   0   0  88  88

I am replacing both instances with 88 which is not
correct

Thanks



 On 8/2/06, John Kane [EMAIL PROTECTED] wrote:
 
  Simple problem but I don't see the answer. I'm
 trying
  to clean up some data
  I have 120 columns in a data.frame.  I have one
 value
  in a column named blaw that I want to change.
 How do
  I find the coordinates. I can find the row by
 doing a
  subset on the data.frame but how do I find out
 here
  blaw  is in columns without manually counting
 them
  or converting names(Df) to a list and reading down
 the
  list.
 
  Simple example
 
  cat - c( 3,5,6,8,0)
  dog - c(3,5,3,6, 0)
  rat - c (5, 5, 4, 9, 0)
  bat - c( 12, 42, 45, 32, 54)
 
  Df - data.frame(cbind(cat, dog, rat, bat))
  Df
  subset(Df, bat = 50)
 
  results
  cat dog rat bat
  5   0   0   0  54
 
 
  Thus I know that my target is in row 5 but how do
 I
  figure out where 'bat' is?
 
  All I want to do is be able to say
  Df[5,4] - 100
 
  Is there some way to have function(bat) return the
  column number: some kind of a colnum() function? 
 I
  had thought that I had found somthing  in
  library(gdata) matchcols but no luck.
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
 reproducible code.
 
 
 
 
 -- 
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390
 
 What is the problem you are trying to solve?


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] how to use the EV AND condEV from BMA's results?

2006-08-03 Thread zhijie zhang

Dear friends,
In R, the help of bic.glm tells the difference between postmean(the
posterior mean of each coefficient from model averaging) and
condpostmean(the posterior mean of each coefficient conditional on the
variable being included in the model), But it's still unclear about the
results explanations, and the artile of Rnews in 2005 on BMA still don't
give more detail on it.
Suppose my results of logistic regression analyzed by bic.glm (BMA) as
follows:(dataset is birthwt(MASS) and i include the interaction)



  p!=0  EV  SD condEV  cond SDmodel 1   model
2   model 3   model 4model 5

Intercept 100 0.1841  1.2204   0.1841.2201.017
1.175-0.853-1.057 0.532

age17.8   -0.0113  0.0285  -0.0630.036 .
. . .   -0.071

lwt   50.0   -0.0079  0.0093   -0.016   0.007   -0.017-
0.017  . . .

smokeTRUE  9.5   0.0469  0.1798   0.4960.345 .
.
   . .  .

ptdTRUE   99.41.5161  0.4751   1.526   0.461   1.407
1.596 1.732 1.463  1.608

htTRUE54.4   0.9477  1.02691.742   0.7441.894
1.930  . . .

uiTRUE13.30.0976  0.2987   0.7310.453 .
. . . .

ftv   12.3


   .1-0.0257  0.5117   -0.209   2.438.
.
-0.867  . .

   .2+0.7470  2.1277   6.0813.371.
.6.024  . .

age.ftv1  33.7   -0.0136  0.0278  -0.0400.035 .   -
0.036  . . .

age.ftv2. 15.9   -0.0340  0.0950  -0.2140.135 .
.   -0.271  . .

smokeTRUE.uiTRUE   2.4   0.0103  0.12090.422   0.652.
   .
. .  .




nVar3
4
3 1 2
post prob 0.117
0.086  0.083 0.061 0.044

1. how should I write my final logistic model?
2. Which parameter estimation should be used, condEV OR EV? How should I use
the two different parameter estimations correctly?
Thanks for your precious time!


-- 
Kind Regards,
Zhi Jie,Zhang

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Vector DF [a, b] to replace values :was Finding the position of a variable in a data.frame

2006-08-03 Thread Peter Dalgaard

John Kane [EMAIL PROTECTED] writes:

 --- jim holtman [EMAIL PROTECTED] wrote:
 
  ?which
  
   which(Df = 50, arr.ind=T)
row col
  5   5   4
 
 This works very nicely as has some other suggestions
 on how to replace a value. Assuming that I have more
 than one correction to make where Df = 50, can I use
 vectors in the Df[] to do this.
 
 My attempt shows that I can use vectors but it appears
 thatthere is something wrong with my logic 
 
 Eample
 
 cat - c( 3,5,6,8,0)
 dog - c(3,5,3,6, 0)
 rat - c (5, 5, 4, 9, 51)
 bat - c( 12, 42, 45, 32, 54)
 
 Df - data.frame(cbind(cat, dog, rat, bat))
 post - which(Df = 50, arr.ind=T)  # find values .=
 50
 post
 correction - c(77, 88)  # new correct values
 row - post[ ,1]  # vector for row number
 col - post[ ,2]   # vector for column number
 
 Df[row,col] -  correction
 Df
 
 ---result-
   cat dog rat bat
 1   3   3   5  12
 2   5   5   5  42
 3   6   3   4  45
 4   8   6   9  32
 5   0   0  88  88
 
 I am replacing both instances with 88 which is not
 correct

You're being bitten by two issues here: One is that Df[row,col] does
not vectorize in parallel in the two indices: Df[1:2,1:2] has *four*
elements, not two. Another is that data frames are not matrices: you
can have different types of values in different columns, so you cannot
expect to pick an arbitrary set of elements and assign a vector into
it (well, it doesn't work, anyway).

This sort of stuff works much easier with matrices, where we also have
the wonderful feature of indexing with a matrix:

 M - cbind(cat, dog, rat, bat)
 M[post]- correction
 M
 cat dog rat bat
[1,]   3   3   5  12
[2,]   5   5   5  42
[3,]   6   3   4  45
[4,]   8   6   9  32
[5,]   0   0  77  88


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] bringToTop without focus?

2006-08-03 Thread christian.ritter

Hi all who know R on Windows,

Quick question: Is there a way to do bringToTop(stay=TRUE) without giving 
focus? I would like to pop a graph window but I would like to preserve focus 
in the window which I was in before. 

Thanks for any lead,

Chris

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] bringToTop without focus?

2006-08-03 Thread Duncan Murdoch

On 8/3/2006 9:13 AM, [EMAIL PROTECTED] wrote:
 Hi all who know R on Windows,
 
 Quick question: Is there a way to do bringToTop(stay=TRUE) without giving 
 focus? I would like to pop a graph window but I would like to preserve 
 focus in the window which I was in before. 

No, there's no way to do that.  The closest is

{ bringToTop(stay=TRUE); bringToTop(-1) }

but that only works if the previous window was the console.

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Looking for transformation to overcome heterogeneity of variances

2006-08-03 Thread Paul Smith

Dear All

My data consists in 96 groups, each one with 10 observations. Levene's
test suggests that the variances are not equal, and therefore I have
tried to apply the classical transformations to have homocedasticity
in order to be able to use ANOVA. Unfortunately, no transformation
that I have used transforms my data into data with homocedasticity.
The histogram of variances is at

http://phhs80.googlepages.com/hist1.png

Is someone able to suggest to me a transformation to overcome the
problem of heterocedasticity?

Thanks in advance,

Paul

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read.spss and umlaut

2006-08-03 Thread Thomas Lumley


On Thu, 3 Aug 2006, Thomas Kuster wrote:


Hello

Am Mittwoch, 2. August 2006 17.11 schrieb Thomas Lumley:

This sounds like a conflict between encodings -- eg if R is assuming UTF-8
and the file is encoding in Latin-1 then the sequence
U+00FC : LATIN SMALL LETTER U WITH DIAERESIS
U+0072 : LATIN SMALL LETTER R
is coded as FC72 in the file, which is an illegal byte sequence in UTF-8.


Hex:  74 65 20 66 fc 72 20 61 6c 6c 65 53 45 2f 31 36
Text:  t  e f  ?  r a  l  l  e  S  E  /  1  6


Ok, so that looks like Latin-1 encoding in the file


The underlying C code (being written in the US quite a long time ago)
doesn't know about encodings, and I don't know what the rules are in SPSS
for valid characters (I suspect that in these old portable file formats it
probably just reads and writes bytes, leaving it up to the OS to interpret
them.


But why stopp the C code reading? Is / not the endmark of the string? What
is the problem, if I chance that in the source?


You haven't shown anything that indicates that the C code stopped reading. 
More likely R just stops displaying when it gets to an illegal byte 
sequence.  You could use nchar() to count the bytes in the string to find 
out.



You could try running R in a non-UTF-8 locale to see if it helps.


I think my local is non-UTF-8 (de_CH, isolatin). How can I check that, and set
an other temporary?


You can use charToRaw() to see what R thinks the byte sequence is for a 
word with a u-umlaut.


Sys.setlocale() will let you change the locale, but your locale does look 
non-UTF-8.


This is all guesswork since we can't see the file.

-thomas__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Looking for transformation to overcome heterogeneity of variances

2006-08-03 Thread Peter Dalgaard

Paul Smith [EMAIL PROTECTED] writes:

 Dear All
 
 My data consists in 96 groups, each one with 10 observations. Levene's
 test suggests that the variances are not equal, and therefore I have
 tried to apply the classical transformations to have homocedasticity
 in order to be able to use ANOVA. Unfortunately, no transformation
 that I have used transforms my data into data with homocedasticity.
 The histogram of variances is at
 
 http://phhs80.googlepages.com/hist1.png
 
 Is someone able to suggest to me a transformation to overcome the
 problem of heterocedasticity?

Not based on that information. Try the following instead:

fit - lm(y~g)
par(mfrow=c(2,2)); plot(fit)

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] get() in sapply() in with()

2006-08-03 Thread Heinz Tuechler

Dear All,

applying some function within a with() function I wanted to use also
sapply() and get() to form a data.frame, but did not succede.
Below is a simplified example.
It is possible to use sapply() within a with() function, it is also
possible to use get() within a with() function, but when I try to use get
within sapply within with I arrive at Error in get(x, envir, mode,
inherits) : variable v5 was not found.
Is there a solution?

Thanks,
Heinz Tüchler


## example
df1 - data.frame(v5=16:20, v6=21:25, v7=I(letters[16:20]), v8=letters[16:20])

with(df1, sapply(c('v5', 'v6'), get) ) ## Error, see next line
## Error in get(x, envir, mode, inherits) : variable v5 was not found

with(df1, sapply(list(v5, v6), mean) ) # does work
with(df1, get('v5') ) # does work

platform   i386-pc-mingw32  
arch   i386 
os mingw32  
system i386, mingw32
status Patched  
major  2
minor  3.1  
year   2006 
month  07   
day23   
svn rev38687
language   R
version.string Version 2.3.1 Patched (2006-07-23 r38687)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Looking for transformation to overcome heterogeneity of variances

2006-08-03 Thread Paul Smith

On 03 Aug 2006 15:45:10 +0200, Peter Dalgaard [EMAIL PROTECTED] wrote:
  My data consists in 96 groups, each one with 10 observations. Levene's
  test suggests that the variances are not equal, and therefore I have
  tried to apply the classical transformations to have homocedasticity
  in order to be able to use ANOVA. Unfortunately, no transformation
  that I have used transforms my data into data with homocedasticity.
  The histogram of variances is at
 
  http://phhs80.googlepages.com/hist1.png
 
  Is someone able to suggest to me a transformation to overcome the
  problem of heterocedasticity?

 Not based on that information. Try the following instead:

 fit - lm(y~g)
 par(mfrow=c(2,2)); plot(fit)

Thanks, Peter. By 'g', you mean

factor1* factor2*factor3*factor4

?

Paul

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] efficient way to make NAs of empty cells in a factor (or character)

2006-08-03 Thread Henrik Parn

Dear all,

I have some csv-files (originating from Excel-files) containing empty 
cells. In my example file I have four variables of different classes, 
each with some empty cells in the original csv-file:

  test - read.csv2(test.csv, dec=.)

  test
  id id2  x   y
1  a  1  NA
2  b   e NA 2.2
3  f  3 3.3
4  c   g  4 4.4


  class(test$id)
[1] factor
  class(test$id2)
[1] factor
  class(test$x)
[1] integer
  class(test$y)
[1] numeric

In the help text of read.csv2 you can read 'Blank fields are also 
considered to be missing values in logical, integer, numeric and complex 
fields.'. Thus, empty cells in a factor (or a character I assume) is not 
considered as missing values but an own level:

  is.na(test$id)
[1] FALSE FALSE FALSE FALSE
  levels(test$id)
[1]   a b c

When I work with my real (larger) dataset I would like to use functions 
like 'is.na' and '!is.na' on factors. Now I wonder if there is an R 
alternativ to do 'search (for empty cells) and replace (with NA)' in Excel?

I have tried a modification of Uwe Ligges suggestion on missing value 
posted 2 Aug:
  is.na(test[test==]) - TRUE

...but it did not work on the data set:

Error in [-.data.frame(`*tmp*`, test == , value = c(NA, NA, NA, NA :
rhs is the wrong length for indexing by a logical matrix


However it worked fine when applied to a single vector:

  is.na(test$id[test$id==]) - TRUE
  test$id
[1] abNA c  
Levels:  a b c

  is.na(test$id)
[1] FALSE FALSE  TRUE FALSE

Is there a more efficient way to fill empty cells in all my factors in R 
or should I just do it in advance in Excel by 'search and replace'?

Thanks in advance!

-- 

Henrik Pärn
Department of Biology
NTNU
7491 Trondheim
Norway

+47 735 96282 (office)
+47 909 89 255 (mobile)
+47 735 96100 (fax)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] bringToTop without focus?

2006-08-03 Thread christian.ritter

Thanks for fast response.

'Had another look at the problem and found a partial but nice solution which 
will solve my present problem and might be useful to others.

Here is the context:
In fact I have all my data and stuff in Excel and I use R(D)COM with Rexcel to 
communicate with R and do some of the work there. One of the things I do is to 
create a graph in R and to display it. To make sure it stays visible, I use 
bringToTop(stay=TRUE) from R. But then I would like to keep working in Excel 
without having to re-activate the workbook window. This is where the problem 
originated. Within excel there is a somewhat crude solution:
Submit the code which makes the graph via a VBA procedure (macro) and end it 
with SendKeys %{TAB}. Here is an example:

  Sub Refilter
  Range(Alldata).AdvancedFilter Action:=xlFilterCopy, CriteriaRange:=Range( _
Criteria), CopyToRange:=Range(Extractrange), Unique:=False
  SendCommands Range(DiagPlot)
  SendKeys %{TAB}
  End sub

Here DiagPlot is a range which contains the code to construct the plot such 
as plot(x,y). 

The VBA macro should be run when the worksheet of interest has focus. The call 
SendKeys %{TAB} simulates pressing ALT TAB which goes back to the previous 
window on the stack. This was the one which had focus when bringToTop. There we 
go. 

Have a nice day,

Chris.

Morale: if impossible / change the point of view

-Original Message-
From: Duncan Murdoch [mailto:[EMAIL PROTECTED]
Sent: Thursday, 03 August, 2006 3:18 PM
To: Ritter, Christian C GSMCIL-GSTMS/2
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] bringToTop without focus?


On 8/3/2006 9:13 AM, [EMAIL PROTECTED] wrote:
 Hi all who know R on Windows,
 
 Quick question: Is there a way to do bringToTop(stay=TRUE) without giving 
 focus? I would like to pop a graph window but I would like to preserve 
 focus in the window which I was in before. 

No, there's no way to do that.  The closest is

{ bringToTop(stay=TRUE); bringToTop(-1) }

but that only works if the previous window was the console.

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Vectorizing a for loop

2006-08-03 Thread Daniel Gerlanc

Hello all,

Consider the following problem:

There are two vectors:

rows - c(1, 2, 3, 4, 5)
columns - c(10, 11, 12, 13, 14)

I want to create a matrix with dimensions length(rows) x length(columns):

res - matrix(nrow = length(rows), ncol = length(columns))

If i and j are the row and column indexes respectively, the values
of the cells are abs(rows[i] - columns[j]).  The resultant matrix
follows:

 [,1] [,2] [,3] [,4] [,5]
[1,]9   10   11   12   13
[2,]8910   11   12
[3,]78  9   10   11
[4,]67  8 9   10
[5,]56  7 89

This matrix may be generated by using a simple for loop:

for(i in 1:length(rows)){
  for(j in 1:length(columns)){
res[i,j] - abs(rows[i] - columns[j])
  }
}

Is there a quicker, vector-based approach for doing this or a function
included in the recommended packages that does this?

Thanks!

-- Dan Gerlanc
Williams College

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] questions on plotting dedrograms

2006-08-03 Thread scheidler

Hi,

i've two questions concerning the plot of a dendrogram. first, i use 
hclust for clustering and if i plot the dendrogram, then the maximal 
height is the maximal dissimilarity found in my data. but i want to have 
a arbitary maximal height. for example if the maximal dissimilarity in 
my data is 50 and i want a height of 100, the plot should be compressed 
by 1/2 and the line to the root node should be as long as the rest of 
the plot. How can i do that ? (i want to use it for comparing 2 
clusterings with the same maximal possible dissimilarity)
The other question is, how can i plot the dendrogram horizontal, but 
with the root node to the right ?

thx for your efford
alexander scheidler

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] questions on plotting dedrograms

2006-08-03 Thread scheidler

Hi,

i've two questions concerning the plot of a dendrogram. first, i use 
hclust for clustering and if i plot the dendrogram, then the maximal 
height is the maximal dissimilarity found in my data. but i want to have 
a arbitary maximal height. for example if the maximal dissimilarity in 
my data is 50 and i want a height of 100, the plot should be compressed 
by 1/2 and the line to the root node should be as long as the rest of 
the plot. How can i do that ? (i want to use it for comparing 2 
clusterings with the same maximal possible dissimilarity)
The other question is, how can i plot the dendrogram horizontal, but 
with the root node to the right ?

thx for your efford
alexander scheidler

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] efficient way to make NAs of empty cells in a factor (orcharacter)

2006-08-03 Thread Dimitris Rizopoulos

try to use the 'na.strings' argument of read.csv(), e.g.,

test - read.csv(test.csv, na.strings = )


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm



- Original Message - 
From: Henrik Parn [EMAIL PROTECTED]
To: R-help r-help@stat.math.ethz.ch
Sent: Thursday, August 03, 2006 3:46 PM
Subject: [R] efficient way to make NAs of empty cells in a factor 
(orcharacter)


Dear all,

I have some csv-files (originating from Excel-files) containing empty
cells. In my example file I have four variables of different classes,
each with some empty cells in the original csv-file:

  test - read.csv2(test.csv, dec=.)

  test
  id id2  x   y
1  a  1  NA
2  b   e NA 2.2
3  f  3 3.3
4  c   g  4 4.4


  class(test$id)
[1] factor
  class(test$id2)
[1] factor
  class(test$x)
[1] integer
  class(test$y)
[1] numeric

In the help text of read.csv2 you can read 'Blank fields are also
considered to be missing values in logical, integer, numeric and 
complex
fields.'. Thus, empty cells in a factor (or a character I assume) is 
not
considered as missing values but an own level:

  is.na(test$id)
[1] FALSE FALSE FALSE FALSE
  levels(test$id)
[1]   a b c

When I work with my real (larger) dataset I would like to use 
functions
like 'is.na' and '!is.na' on factors. Now I wonder if there is an R
alternativ to do 'search (for empty cells) and replace (with NA)' in 
Excel?

I have tried a modification of Uwe Ligges suggestion on missing value
posted 2 Aug:
  is.na(test[test==]) - TRUE

...but it did not work on the data set:

Error in [-.data.frame(`*tmp*`, test == , value = c(NA, NA, NA, 
NA :
rhs is the wrong length for indexing by a logical matrix


However it worked fine when applied to a single vector:

  is.na(test$id[test$id==]) - TRUE
  test$id
[1] abNA c
Levels:  a b c

  is.na(test$id)
[1] FALSE FALSE  TRUE FALSE

Is there a more efficient way to fill empty cells in all my factors in 
R
or should I just do it in advance in Excel by 'search and replace'?

Thanks in advance!

-- 

Henrik Pärn
Department of Biology
NTNU
7491 Trondheim
Norway

+47 735 96282 (office)
+47 909 89 255 (mobile)
+47 735 96100 (fax)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Vectorizing a for loop

2006-08-03 Thread Neuro LeSuperHéros

Hi,

Try this:

rows - c(1, 2, 3, 4, 5)
columns - c(10, 11, 12, 13, 14)
expanded -expand.grid(rows,columns)
abslist -abs(ex$Var1-ex$Var2)
res - matrix(abslist,nrow = length(rows), ncol = length(columns))

Neurorox


From: Daniel Gerlanc [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject: [R] Vectorizing a for loop
Date: Thu, 3 Aug 2006 10:10:46 -0400

Hello all,

Consider the following problem:

There are two vectors:

rows - c(1, 2, 3, 4, 5)
columns - c(10, 11, 12, 13, 14)

I want to create a matrix with dimensions length(rows) x length(columns):

res - matrix(nrow = length(rows), ncol = length(columns))

If i and j are the row and column indexes respectively, the values
of the cells are abs(rows[i] - columns[j]).  The resultant matrix
follows:

  [,1] [,2] [,3] [,4] [,5]
[1,]9   10   11   12   13
[2,]8910   11   12
[3,]78  9   10   11
[4,]67  8 9   10
[5,]56  7 89

This matrix may be generated by using a simple for loop:

for(i in 1:length(rows)){
   for(j in 1:length(columns)){
 res[i,j] - abs(rows[i] - columns[j])
   }
}

Is there a quicker, vector-based approach for doing this or a function
included in the recommended packages that does this?

Thanks!

-- Dan Gerlanc
Williams College

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Vectorizing a for loop

2006-08-03 Thread Thomas Lumley

On Thu, 3 Aug 2006, Daniel Gerlanc wrote:

 Hello all,

 Consider the following problem:

 There are two vectors:

 rows - c(1, 2, 3, 4, 5)
 columns - c(10, 11, 12, 13, 14)

 I want to create a matrix with dimensions length(rows) x length(columns):

 res - matrix(nrow = length(rows), ncol = length(columns))

 If i and j are the row and column indexes respectively, the values
 of the cells are abs(rows[i] - columns[j]).  The resultant matrix
 follows:

 [,1] [,2] [,3] [,4] [,5]
 [1,]9   10   11   12   13
 [2,]8910   11   12
 [3,]78  9   10   11
 [4,]67  8 9   10
 [5,]56  7 89

 This matrix may be generated by using a simple for loop:

 for(i in 1:length(rows)){
  for(j in 1:length(columns)){
res[i,j] - abs(rows[i] - columns[j])
  }
 }

 Is there a quicker, vector-based approach for doing this or a function
 included in the recommended packages that does this?


abs(outer(rows,columns,-))


-thomas

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Vectorizing a for loop

2006-08-03 Thread Petr Pikal

Hi

outer(rows, columns, function(x,y) abs(x-y))
shall do it.

HTH
Petr


On 3 Aug 2006 at 10:10, Daniel Gerlanc wrote:

Date sent:  Thu, 3 Aug 2006 10:10:46 -0400
From:   Daniel Gerlanc [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject:[R] Vectorizing a for loop
Send reply to:  [EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]

 Hello all,
 
 Consider the following problem:
 
 There are two vectors:
 
 rows - c(1, 2, 3, 4, 5)
 columns - c(10, 11, 12, 13, 14)
 
 I want to create a matrix with dimensions length(rows) x
 length(columns):
 
 res - matrix(nrow = length(rows), ncol = length(columns))
 
 If i and j are the row and column indexes respectively, the values
 of the cells are abs(rows[i] - columns[j]).  The resultant matrix
 follows:
 
  [,1] [,2] [,3] [,4] [,5]
 [1,]9   10   11   12   13
 [2,]8910   11   12
 [3,]78  9   10   11
 [4,]67  8 9   10
 [5,]56  7 89
 
 This matrix may be generated by using a simple for loop:
 
 for(i in 1:length(rows)){
   for(j in 1:length(columns)){
 res[i,j] - abs(rows[i] - columns[j])
   }
 }
 
 Is there a quicker, vector-based approach for doing this or a function
 included in the recommended packages that does this?
 
 Thanks!
 
 -- Dan Gerlanc
 Williams College
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented,
 minimal, self-contained, reproducible code.

Petr Pikal
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Vectorizing a for loop

2006-08-03 Thread Neuro LeSuperHéros

Sorry,

I used ex instead of expanded (was working name).

rows - c(1, 2, 3, 4, 5)
columns - c(10, 11, 12, 13, 14)
expanded -expand.grid(rows,columns)
abslist -abs(expanded$Var1-expanded$Var2)
res - matrix(abslist,nrow = length(rows), ncol = length(columns))

Désolé encore

Neuro


From: Daniel Gerlanc [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject: [R] Vectorizing a for loop
Date: Thu, 3 Aug 2006 10:10:46 -0400

Hello all,

Consider the following problem:

There are two vectors:

rows - c(1, 2, 3, 4, 5)
columns - c(10, 11, 12, 13, 14)

I want to create a matrix with dimensions length(rows) x length(columns):

res - matrix(nrow = length(rows), ncol = length(columns))

If i and j are the row and column indexes respectively, the values
of the cells are abs(rows[i] - columns[j]).  The resultant matrix
follows:

  [,1] [,2] [,3] [,4] [,5]
[1,]9   10   11   12   13
[2,]8910   11   12
[3,]78  9   10   11
[4,]67  8 9   10
[5,]56  7 89

This matrix may be generated by using a simple for loop:

for(i in 1:length(rows)){
   for(j in 1:length(columns)){
 res[i,j] - abs(rows[i] - columns[j])
   }
}

Is there a quicker, vector-based approach for doing this or a function
included in the recommended packages that does this?

Thanks!

-- Dan Gerlanc
Williams College

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Vectorizing a for loop

2006-08-03 Thread Marc Schwartz (via MN)

On Thu, 2006-08-03 at 10:10 -0400, Daniel Gerlanc wrote:
 Hello all,
 
 Consider the following problem:
 
 There are two vectors:
 
 rows - c(1, 2, 3, 4, 5)
 columns - c(10, 11, 12, 13, 14)
 
 I want to create a matrix with dimensions length(rows) x length(columns):
 
 res - matrix(nrow = length(rows), ncol = length(columns))
 
 If i and j are the row and column indexes respectively, the values
 of the cells are abs(rows[i] - columns[j]).  The resultant matrix
 follows:
 
  [,1] [,2] [,3] [,4] [,5]
 [1,]9   10   11   12   13
 [2,]8910   11   12
 [3,]78  9   10   11
 [4,]67  8 9   10
 [5,]56  7 89
 
 This matrix may be generated by using a simple for loop:
 
 for(i in 1:length(rows)){
   for(j in 1:length(columns)){
 res[i,j] - abs(rows[i] - columns[j])
   }
 }
 
 Is there a quicker, vector-based approach for doing this or a function
 included in the recommended packages that does this?
 
 Thanks!

See ?outer

 outer(rows, columns, function(x, y) abs(x - y))
 [,1] [,2] [,3] [,4] [,5]
[1,]9   10   11   12   13
[2,]89   10   11   12
[3,]789   10   11
[4,]6789   10
[5,]56789

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Vectorizing a for loop

2006-08-03 Thread Eik Vettorazzi

res-outer(rows,columns,FUN=function(x,y) abs(x-y))

will help you.

Am Thu, 03 Aug 2006 16:10:46 +0200 schrieb Daniel Gerlanc  
[EMAIL PROTECTED]:

 Hello all,

 Consider the following problem:

 There are two vectors:

 rows - c(1, 2, 3, 4, 5)
 columns - c(10, 11, 12, 13, 14)

 I want to create a matrix with dimensions length(rows) x length(columns):

 res - matrix(nrow = length(rows), ncol = length(columns))

 If i and j are the row and column indexes respectively, the values
 of the cells are abs(rows[i] - columns[j]).  The resultant matrix
 follows:

  [,1] [,2] [,3] [,4] [,5]
 [1,]9   10   11   12   13
 [2,]8910   11   12
 [3,]78  9   10   11
 [4,]67  8 9   10
 [5,]56  7 89

 This matrix may be generated by using a simple for loop:

 for(i in 1:length(rows)){
   for(j in 1:length(columns)){
 res[i,j] - abs(rows[i] - columns[j])
   }
 }

 Is there a quicker, vector-based approach for doing this or a function
 included in the recommended packages that does this?

 Thanks!

 -- Dan Gerlanc
 Williams College

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide  
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 


Universität Hamburg
Institut für Statistik und Ökonometrie
Dipl.-Wi.-Math. Eik Vettorazzi
Von-Melle-Park 5
20146 Hamburg

Tel.: +49 40-42838-3540

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Vectorizing a for loop

2006-08-03 Thread Dimitris Rizopoulos

try this:

abs(outer(rows, columns, -))


Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


- Original Message - 
From: Daniel Gerlanc [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Thursday, August 03, 2006 4:10 PM
Subject: [R] Vectorizing a for loop


 Hello all,

 Consider the following problem:

 There are two vectors:

 rows - c(1, 2, 3, 4, 5)
 columns - c(10, 11, 12, 13, 14)

 I want to create a matrix with dimensions length(rows) x 
 length(columns):

 res - matrix(nrow = length(rows), ncol = length(columns))

 If i and j are the row and column indexes respectively, the 
 values
 of the cells are abs(rows[i] - columns[j]).  The resultant matrix
 follows:

 [,1] [,2] [,3] [,4] [,5]
 [1,]9   10   11   12   13
 [2,]8910   11   12
 [3,]78  9   10   11
 [4,]67  8 9   10
 [5,]56  7 89

 This matrix may be generated by using a simple for loop:

 for(i in 1:length(rows)){
  for(j in 1:length(columns)){
res[i,j] - abs(rows[i] - columns[j])
  }
 }

 Is there a quicker, vector-based approach for doing this or a 
 function
 included in the recommended packages that does this?

 Thanks!

 -- Dan Gerlanc
 Williams College

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Looking for transformation to overcome heterogeneity of variances

2006-08-03 Thread Paul Smith

On 03 Aug 2006 16:32:38 +0200, Peter Dalgaard [EMAIL PROTECTED] wrote:
My data consists in 96 groups, each one with 10 observations. Levene's
test suggests that the variances are not equal, and therefore I have
tried to apply the classical transformations to have homocedasticity
in order to be able to use ANOVA. Unfortunately, no transformation
that I have used transforms my data into data with homocedasticity.
The histogram of variances is at
   
http://phhs80.googlepages.com/hist1.png
   
Is someone able to suggest to me a transformation to overcome the
problem of heterocedasticity?
  
   Not based on that information. Try the following instead:
  
   fit - lm(y~g)
   par(mfrow=c(2,2)); plot(fit)
 
  Thanks, Peter. By 'g', you mean
 
  factor1* factor2*factor3*factor4

 If that defines your 96 groups, yes.

Thanks, Peter. The result of

 fit - lm(tardiness ~ interaction(factor1,factor2,factor3,factor4))
 par(mfrow=c(2,2)); plot(fit)
Warning message:
X11 used font size 8 when 7 was requested


is at

http://phhs80.googlepages.com/2transform1.png

Paul

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Looking for transformation to overcome heterogeneity of variances

2006-08-03 Thread Peter Dalgaard

Paul Smith [EMAIL PROTECTED] writes:

 On 03 Aug 2006 15:45:10 +0200, Peter Dalgaard [EMAIL PROTECTED] wrote:
   My data consists in 96 groups, each one with 10 observations. Levene's
   test suggests that the variances are not equal, and therefore I have
   tried to apply the classical transformations to have homocedasticity
   in order to be able to use ANOVA. Unfortunately, no transformation
   that I have used transforms my data into data with homocedasticity.
   The histogram of variances is at
  
   http://phhs80.googlepages.com/hist1.png
  
   Is someone able to suggest to me a transformation to overcome the
   problem of heterocedasticity?
 
  Not based on that information. Try the following instead:
 
  fit - lm(y~g)
  par(mfrow=c(2,2)); plot(fit)
 
 Thanks, Peter. By 'g', you mean
 
 factor1* factor2*factor3*factor4

If that defines your 96 groups, yes.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] get() in sapply() in with()

2006-08-03 Thread Thomas Lumley

On Thu, 3 Aug 2006, Heinz Tuechler wrote:

 Dear All,

 applying some function within a with() function I wanted to use also
 sapply() and get() to form a data.frame, but did not succede.
 Below is a simplified example.
 It is possible to use sapply() within a with() function, it is also
 possible to use get() within a with() function, but when I try to use get
 within sapply within with I arrive at Error in get(x, envir, mode,
 inherits) : variable v5 was not found.
 Is there a solution?

This isn't what you want to hear, but the solution is probably to not use 
get(). How to not use get() will depend on what your problem really is.


 ## example
 df1 - data.frame(v5=16:20, v6=21:25, v7=I(letters[16:20]), v8=letters[16:20])

 with(df1, sapply(c('v5', 'v6'), get) ) ## Error, see next line
 ## Error in get(x, envir, mode, inherits) : variable v5 was not found

get() looks in the environment it was called from, which is the 
environment inside lapply(), whose parent is the environment of 
the base package. There is no v5 there.

 with(df1, sapply(list(v5, v6), mean) ) # does work

This works because list(v5,v6) is evaluated in df1.

 with(df1, get('v5') ) # does work

This works because get() looks in the environment it was called from, 
which is df1.


-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] bullseye or polar display of circular data

2006-08-03 Thread Michael Jerosch-Herold


I have data for several rings of a left heart chamber, and which I would like 
to display in concentric rings, with color-encoding of the values. Each ring 
corresponds to one slice through the heart, and the rings correspond to 
positions from the base to the apex of the heart as you move from the outermost 
ring to the innermost one. The data have a circular pattern. These types of 
displays are referred to as bullseye displays in the nuclear medicine 
literature. Does any reader of these messages know of a R function/package that 
offers this functionality?

Also I noticed that in some contexts you can define a circular attribute for 
your data. Are there plot routines for such circular data?

thank you!

Michael Jerosch-Herold

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question about dll crashing R

2006-08-03 Thread Thomas Lumley

On Thu, 3 Aug 2006, Benn Fine wrote:

 I have ported some R code to C to make it faster.

 I can perform .Call(foobar,) once and it
 works fine. Absolutely correct answer.

 If I put a loop inside foobar and run the  main code
 routine more than 100 times, it crashes R.

 Or if I call .Call(foobar) seperately more than
 two tims it crashes R.

snip
 SEXP mvntest (SEXP mean, SEXP cov, SEXP temp)


 { int nrx , ncx , nry , ncy ,info,mode;
 SEXP xdims , ydims , ans;

 int i,j, one=1;
 info = 1;

 xdims = getAttrib (mean , R_DimSymbol ) ;
 ydims = getAttrib (cov , R_DimSymbol ) ;
 mode = REALSXP;
 nrx = INTEGER( xdims ) [ 0 ] ;
 ncx = INTEGER( xdims ) [ 1 ] ;
 nry = INTEGER( ydims ) [ 0 ] ;
 ncy = INTEGER( ydims ) [ 1 ] ;



 /* create the upper trianglular matrix A */

 /* such that t(A) %*% A = Sigma */

 GetRNGstate();

 F77_CALL(dpofa) ( REAL( cov ), nry , ncy , info);
 Rprintf(Info = %d\n,info);


 for(i=0;inry;i++)
  for(j=0;ji;j++)
REAL(cov)[i+j*ncy] = 0.0;


 PROTECT( ans = allocMatrix (mode, nrx , one ) ) ;
 for(i=0;inry;i++)
  REAL(temp)[i] = rnorm(0,1);
 ans = tmatrixprod(cov,temp);

^
Here you are returning a new SEXP from tmatrixprod but not protecting it. 
Remember, PROTECT() operates on the *value* of its argument, so it 
protects the thing that ans points to. When ans points to a new thing, it 
is still the old thing that is protected.

-thomas

 for(i=0;inry;i++)
  REAL(ans)[i] = REAL(ans)[i]+REAL(mean)[i];
 UNPROTECT( 1 ) ;
 PutRNGstate();
 return( ans ) ;


 }


 I have a feeling I am messing up memory usage
 somewhere but haven't a clue. Do I need to do garbage
 collecting inside the C program ?The fact that the
 code
 runs a few times before R crashes is driving me nuts.
 I send most of what I need into the C routine from R,
 so I am not creating that many SEXP objects within the
 program.

 Any hints or ideas ?

 Thanks!

 Benn

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] get() in sapply() in with()

2006-08-03 Thread Heinz Tuechler

Thank you, Thomas!
It helps a lot to know that something is impossible and that I have to look
for a different kind of solution.
Heinz

At 07:33 03.08.2006 -0700, Thomas Lumley wrote:
On Thu, 3 Aug 2006, Heinz Tuechler wrote:

 Dear All,

 applying some function within a with() function I wanted to use also
 sapply() and get() to form a data.frame, but did not succede.
 Below is a simplified example.
 It is possible to use sapply() within a with() function, it is also
 possible to use get() within a with() function, but when I try to use get
 within sapply within with I arrive at Error in get(x, envir, mode,
 inherits) : variable v5 was not found.
 Is there a solution?

This isn't what you want to hear, but the solution is probably to not use 
get(). How to not use get() will depend on what your problem really is.


 ## example
 df1 - data.frame(v5=16:20, v6=21:25, v7=I(letters[16:20]),
v8=letters[16:20])

 with(df1, sapply(c('v5', 'v6'), get) ) ## Error, see next line
 ## Error in get(x, envir, mode, inherits) : variable v5 was not found

get() looks in the environment it was called from, which is the 
environment inside lapply(), whose parent is the environment of 
the base package. There is no v5 there.

 with(df1, sapply(list(v5, v6), mean) ) # does work

This works because list(v5,v6) is evaluated in df1.

 with(df1, get('v5') ) # does work

This works because get() looks in the environment it was called from, 
which is df1.


   -thomas

Thomas Lumley  Assoc. Professor, Biostatistics
[EMAIL PROTECTED]  University of Washington, Seattle



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] efficient way to make NAs of empty cells in a factor (or character)

2006-08-03 Thread Petr Pikal

Hi

try to set

na.strings = 

in calling read.csv2. Works for me

 is.na(read.delim(clipboard, na.strings=)$mono)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE

 read.delim(clipboard, na.strings=)$mono
[1] hruby hruby jemny jemny nejhrubsi nejhrubsi 
standard  standard  NA 
Levels: hruby jemny nejhrubsi standard

or you can try

test[(test==)] - NA

HTH
Petr


On 3 Aug 2006 at 15:46, Henrik Parn wrote:

Date sent:  Thu, 03 Aug 2006 15:46:32 +0200
From:   Henrik Parn [EMAIL PROTECTED]
Organization:   NTNU
To: R-help r-help@stat.math.ethz.ch
Subject:[R] efficient way to make NAs of empty cells in a 
factor (or
character)
Send reply to:  [EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]

 Dear all,
 
 I have some csv-files (originating from Excel-files) containing empty
 cells. In my example file I have four variables of different classes,
 each with some empty cells in the original csv-file:
 
   test - read.csv2(test.csv, dec=.)
 
   test
   id id2  x   y
 1  a  1  NA
 2  b   e NA 2.2
 3  f  3 3.3
 4  c   g  4 4.4
 
 
   class(test$id)
 [1] factor
   class(test$id2)
 [1] factor
   class(test$x)
 [1] integer
   class(test$y)
 [1] numeric
 
 In the help text of read.csv2 you can read 'Blank fields are also
 considered to be missing values in logical, integer, numeric and
 complex fields.'. Thus, empty cells in a factor (or a character I
 assume) is not considered as missing values but an own level:
 
   is.na(test$id)
 [1] FALSE FALSE FALSE FALSE
   levels(test$id)
 [1]   a b c
 
 When I work with my real (larger) dataset I would like to use
 functions like 'is.na' and '!is.na' on factors. Now I wonder if there
 is an R alternativ to do 'search (for empty cells) and replace (with
 NA)' in Excel?
 
 I have tried a modification of Uwe Ligges suggestion on missing value
 posted 2 Aug:
   is.na(test[test==]) - TRUE
 
 ...but it did not work on the data set:
 
 Error in [-.data.frame(`*tmp*`, test == , value = c(NA, NA, NA,
 NA :
 rhs is the wrong length for indexing by a logical matrix
 
 
 However it worked fine when applied to a single vector:
 
   is.na(test$id[test$id==]) - TRUE
   test$id
 [1] abNA c  
 Levels:  a b c
 
   is.na(test$id)
 [1] FALSE FALSE  TRUE FALSE
 
 Is there a more efficient way to fill empty cells in all my factors in
 R or should I just do it in advance in Excel by 'search and replace'?
 
 Thanks in advance!
 
 -- 
 
 Henrik Pärn
 Department of Biology
 NTNU
 7491 Trondheim
 Norway
 
 +47 735 96282 (office)
 +47 909 89 255 (mobile)
 +47 735 96100 (fax)
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented,
 minimal, self-contained, reproducible code.

Petr Pikal
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] run self written functions

2006-08-03 Thread Dieter Menne

Antje niederlein-rstat at yahoo.de writes:

 I'm not sure if I'm in the right place with my question...
 I'm running R on Windows and wrote a function and saved it as .R file.
 It looks like this:
 
 bmi - function(weight, height) {
 bmi - weight / height^2
 bmi
 }
 
 If I want to use this function, I have to mark everything and then press 
 Ctrl-R. But then everything single line is executed on the command line, 
 which means that I will loose my history when the code becomes longer.

I suggest that you forget the history function, it tends to accumulate all the
nonsense we make. Use the following method instead: Keep a RGui (assuming
Windows) on the left half of your screen, and an editor (Tinn-R, I suggest, if
you are no emacian) on the right half. Write everything you plan to keep in the
editor, and send it to RGui using the menu items of the editor provided. That
way, you end up with a nice little program that you can take home and re-run
tomorry, when your BMI has changed (I hope it stays constant).

R is a bit different from other programming languages, it does not need the
assignment to the function name. The following is the same as your function.

bmi - function(weight, height) weight / height^2

Dieter
(TÜ)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] levels of an array (strings and numbers)

2006-08-03 Thread Alessandro Antonucci

Reading a csv file (db.csv) as:

yes,full
no,full
no,empty

I can use the command 'levels' to extract
the different values appearing for each
column as follows:

 d-read.csv(db.csv)
 levels(d[,2])

which returns

 [1] empty full 

while, doing the same with a numerical csv file as:

1,6
0,6
0,7

the same instruction returns an empty output instead of 6 7

Any idea about that?

Thanks in advance,
Alessandro

-- 

Alessandro Antonucci
Dalle Molle Institute for Artificial Intelligence (IDSIA)
at Idsiae-mail: [EMAIL PROTECTED]
Galleria 2  web:   idsia.ch/~alessandro
Via Cantonale   mobile:   +39 339-567-23-28
CH-6928 tel:   +41 58-666-66-70
Manno - Lugano  fax:   +41 58-666-66-61
Switzerland skype: alessandro.antonucci

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] bullseye or polar display of circular data

2006-08-03 Thread Marc Schwartz (via MN)

On Thu, 2006-08-03 at 07:46 -0700, Michael Jerosch-Herold wrote:
 I have data for several rings of a left heart chamber, and which I
 would like to display in concentric rings, with color-encoding of the
 values. Each ring corresponds to one slice through the heart, and the
 rings correspond to positions from the base to the apex of the heart
 as you move from the outermost ring to the innermost one. The data
 have a circular pattern. These types of displays are referred to as
 bullseye displays in the nuclear medicine literature. Does any reader
 of these messages know of a R function/package that offers this
 functionality?
 
 Also I noticed that in some contexts you can define a circular
 attribute for your data. Are there plot routines for such circular
 data?
 
 thank you!
 
 Michael Jerosch-Herold

You might want to take a look at the 'circular' or 'CircStats' packages
on CRAN:

http://cran.us.r-project.org/src/contrib/Descriptions/circular.html

http://cran.us.r-project.org/src/contrib/Descriptions/CircStats.html

There are some examples of plots generated using the packages in the R
Graphics Gallery here:

http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=121
http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=97


HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Vector DF [a, b] to replace values :was Finding the position of a variable in a data.frame

2006-08-03 Thread John Kane


--- Peter Dalgaard [EMAIL PROTECTED] wrote:

 John Kane [EMAIL PROTECTED] writes:
 
  --- jim holtman [EMAIL PROTECTED] wrote:
  
   ?which
   
which(Df = 50, arr.ind=T)
 row col
   5   5   4
  
  This works very nicely as has some other
 suggestions
  on how to replace a value. Assuming that I have
 more
  than one correction to make where Df = 50, can I
 use
  vectors in the Df[] to do this.
  
  My attempt shows that I can use vectors but it
 appears
  thatthere is something wrong with my logic 
  
  Eample
  
  cat - c( 3,5,6,8,0)
  dog - c(3,5,3,6, 0)
  rat - c (5, 5, 4, 9, 51)
  bat - c( 12, 42, 45, 32, 54)
  
  Df - data.frame(cbind(cat, dog, rat, bat))
  post - which(Df = 50, arr.ind=T)  # find values
 .=
  50
  post
  correction - c(77, 88)  # new correct values
  row - post[ ,1]  # vector for row number
  col - post[ ,2]   # vector for column number
  
  Df[row,col] -  correction
  Df
  
  ---result-
cat dog rat bat
  1   3   3   5  12
  2   5   5   5  42
  3   6   3   4  45
  4   8   6   9  32
  5   0   0  88  88
  
  I am replacing both instances with 88 which is not
  correct
 
 You're being bitten by two issues here: One is that
 Df[row,col] does
 not vectorize in parallel in the two indices:
 Df[1:2,1:2] has *four*
 elements, not two. 

It took a moment to see it but it's rather obvious
now. I'm still fuzzy on the differences in behaviour
between a data frame and a matrix.  


  Another is that data frames are
 not matrices: you
 can have different types of values in different
 columns, so you cannot
 expect to pick an arbitrary set of elements and
 assign a vector into
 it (well, it doesn't work, anyway).
 
 This sort of stuff works much easier with matrices,
 where we also have
 the wonderful feature of indexing with a matrix:
 
  M - cbind(cat, dog, rat, bat)
  M[post]- correction
  M
  cat dog rat bat
 [1,]   3   3   5  12
 [2,]   5   5   5  42
 [3,]   6   3   4  45
 [4,]   8   6   9  32
 [5,]   0   0  77  88

Which of course I, stupidly  thought that I was doing
in the data frame.

Thank you very much.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question about dll crashing R

2006-08-03 Thread Prof Brian Ripley

On Thu, 3 Aug 2006, Thomas Lumley wrote:

 On Thu, 3 Aug 2006, Benn Fine wrote:
 
  I have ported some R code to C to make it faster.
 
  I can perform .Call(foobar,) once and it
  works fine. Absolutely correct answer.
 
  If I put a loop inside foobar and run the  main code
  routine more than 100 times, it crashes R.
 
  Or if I call .Call(foobar) seperately more than
  two tims it crashes R.
 
 snip
  SEXP mvntest (SEXP mean, SEXP cov, SEXP temp)
 
 
  { int nrx , ncx , nry , ncy ,info,mode;
  SEXP xdims , ydims , ans;
 
  int i,j, one=1;
  info = 1;
 
  xdims = getAttrib (mean , R_DimSymbol ) ;
  ydims = getAttrib (cov , R_DimSymbol ) ;
  mode = REALSXP;
  nrx = INTEGER( xdims ) [ 0 ] ;
  ncx = INTEGER( xdims ) [ 1 ] ;
  nry = INTEGER( ydims ) [ 0 ] ;
  ncy = INTEGER( ydims ) [ 1 ] ;
 
 
 
  /* create the upper trianglular matrix A */
 
  /* such that t(A) %*% A = Sigma */
 
  GetRNGstate();
 
  F77_CALL(dpofa) ( REAL( cov ), nry , ncy , info);
  Rprintf(Info = %d\n,info);
 
 
  for(i=0;inry;i++)
   for(j=0;ji;j++)
 REAL(cov)[i+j*ncy] = 0.0;
 
 
  PROTECT( ans = allocMatrix (mode, nrx , one ) ) ;
  for(i=0;inry;i++)
   REAL(temp)[i] = rnorm(0,1);
  ans = tmatrixprod(cov,temp);
 
 ^
 Here you are returning a new SEXP from tmatrixprod but not protecting it. 
 Remember, PROTECT() operates on the *value* of its argument, so it 
 protects the thing that ans points to. When ans points to a new thing, it 
 is still the old thing that is protected.

and the old value of 'ans' is never used.

  for(i=0;inry;i++)
   REAL(ans)[i] = REAL(ans)[i]+REAL(mean)[i];
  UNPROTECT( 1 ) ;
  PutRNGstate();
  return( ans ) ;

but the next two lines do not do any allocations.  However, PutRNGstate 
does, so you need to UNPROTECT *after* that call.

While we are at it, REAL(ans) is a function call, and you will do better 
to assign the value once outside the loop.

  I have a feeling I am messing up memory usage
  somewhere but haven't a clue. Do I need to do garbage
  collecting inside the C program ?The fact that the
  code
  runs a few times before R crashes is driving me nuts.
  I send most of what I need into the C routine from R,
  so I am not creating that many SEXP objects within the
  program.
 
  Any hints or ideas ?

See the debugging sections in `Writing R Extensions'.  Using 
gctorture(TRUE) and valgrind (if possible) will find these things faster.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Looking for transformation to overcome heterogeneity ofvariances

2006-08-03 Thread Berton Gunter

I know I'm coming late to this, but ...

  Is someone able to suggest to me a transformation to overcome the
  problem of heterocedasticity?

It is not usually useful to worry about this. In my experience, the gain in
efficiency from using an essentially ideal weighted analysis vs. an
approximate unweighted one is usually small and unimportant (transformation
to simplify a model is another issue ...). Of far greater importance usually
is the loss in efficiency due to the presence of a few unusual extreme
values; have you carefully checked to make sure that none of the large
sample variances you have are due merely to the presence of a small number
of highly discrepant values?


-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
The business of the statistician is to catalyze the scientific learning
process.  - George E. P. Box

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read.spss and umlaut

2006-08-03 Thread Thomas Lumley


I have gone and looked at the code for reading SPSS portable files, and 
the file format appears to specify that you cannot read many legal 
characters.

Part of the header information in the file format is a 256-byte 
translation table apparently designed for translating between character 
representations.  It can mark characters as untranslateable, and the 
code for reading character strings replaces untranslateable characters 
with NULs.

In the example file in the foreign package the only translatable 
characters are the ASCII alphanumeric characters and 
.(+0[]!$*);^-/|,%_?`:#@'=~{}\

So, it looks as though your SPSS portable file may be marking character 
code FC as untranslatable.  This is easy to check -- look at the start of 
the file and find the sequence ABCDEF..., which is in the middle of the 
translation table. See if u-umlaut is in the table.  It might even work to 
modify the translation table to allow the accented characters.

It looks as though SPSS .sav files don't have this limitation.

-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Help Building packages for windows

2006-08-03 Thread Denise Mauldin


Hello all,

I've tried to build my linux source package with the following commands 
and then run it through the RCrossBuild package with R 2.3.1, but I get 
errors about how the built field is incorrect and that my Rd file has a 
non-empty \name.  I've looked through the Rd file and it seems to not have 
any empty \name values, but I'm not sure how to tell if any are 'missing'.  
The built field is: Built: R 2.3.1; ; 2006-08-02 21:42:59; unix

R CMD INSTALL --build pkgName
R CMD INSTALL --build --use-zip pkgName
R CMD BUILD --use-zip pkgName
R CMD BUILD pkgName

An RCHECK on the package gives back the warning that subdirectory data 
contains no data sets, despite the presence of Rdata.rdb, Rdata.rds, and 
Rdata.rdx in the directory.  This package is an annotation package for an 
array and can be used fine in *nix R.  If I try to install the package in 
R for windows it gives me a bunch of errors with package names that look 
like random strings of characters.

Any help would be appreciated.

Thanks,
Denise

 sessionInfo()
Version 2.3.1 (2006-06-01)
i686-pc-linux-gnu
 
attached base packages: 
[1] methods  stats  graphics  grDevices utils  datasets
[7] base
 
other attached packages: 
pkgName
   2.3

Linux 2.4.20-31.9 #1 Tue Apr 13 17:38:16 EDT 2004 i686 athlon i386 GNU/Linux

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] levels of an array (strings and numbers)

2006-08-03 Thread Uwe Ligges



Alessandro Antonucci wrote:
 Reading a csv file (db.csv) as:
 
 yes,full
 no,full
 no,empty
 
 I can use the command 'levels' to extract
 the different values appearing for each
 column as follows:
 
 d-read.csv(db.csv)
 levels(d[,2])
 
 which returns
 
 [1] empty full 
 
 while, doing the same with a numerical csv file as:
 
 1,6
 0,6
 0,7

R (i.e. read.csv) assumes this is numeric rather than categorical data.
Hence specify the class of each variable in read.cvs' argument colClasses.

Uwe Ligges



 the same instruction returns an empty output instead of 6 7
 
 Any idea about that?
 
 Thanks in advance,
 Alessandro


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] DOE in R

2006-08-03 Thread Spencer Graves

see in line

Petr Pikal wrote:
 See
 
 ?aov
 ?lm
 
 something like
  
 lm(result~yourfactor1+yourfactor2+yourfactor3+yourfactor5, 
 data=yourdataframe)

or

lm(result~(yourfactor1+yourfactor2+yourfactor3+yourfactor5)^2,
data=yourdataframe)

if you want all interactions.  Make sure all yourfactors are of class 
factor.

  Hope this helps.
  Spencer Graves

  or
 aov(result~yourfactor1+yourfactor2+yourfactor3+yourfactor5, 
 data=yourdataframe)
 
 but the exact structure of lm or aov construction depends on what you 
 want to test.
 
 HTH
 Petr
 
 
 On 28 Jul 2006 at 21:43, [EMAIL PROTECTED] wrote:
 
 Date sent:Fri, 28 Jul 2006 21:43:38 -0700
 From: [EMAIL PROTECTED]
 To:   r-help@stat.math.ethz.ch
 Subject:  [R] DOE in R
 Send reply to:[EMAIL PROTECTED]
   mailto:[EMAIL PROTECTED]
   mailto:[EMAIL PROTECTED]
 
 Hi.

 I'm a student in a graduate program at Simon Fraser University in
 Canada.

 I am trying to run a simple screening experiment with some simulated
 data.

 I simply want to do an ANOVA of an experiemnt with 5 factors (4 have 2
 levels, the last has 3 levels) and 48 runs (ie, full factorial).

 The thing is that I have multiple observations for each level
 combination (run).

 So,

 1) How do I do the anova based on the setup above?

 and 

 2) More importantly, because of convergence issues for my simulations,
 I will likely have an unequal number of observations for the 48 runs.
 How can I do this?

 Seems like a straightforward enough situation.

 I am trying to avaoid writing my own C code to do the analysis since I
 am working under some pretty tight time constraints.

 ANy help would be appreciated.

 Thanks very much.

 Dean Vrecko

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented,
 minimal, self-contained, reproducible code.
 
 Petr Pikal
 [EMAIL PROTECTED]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Looking for transformation to overcome heterogeneity ofvariances

2006-08-03 Thread John Sorkin

Peter
You question is difficult to answer without more information about the
distribution of your residuals. Different residual patterns call for
different transformations to stabilize the variance. One very common
form of  heterocedasticity is increasing variance with increasing values
of an independent predictor, i.e. the variance of the residuals of y=x
increase as x increases. In this case a log transformation of some, or
all, of the independent variables of the helps. Please also note the
comment by Bert Gunter (included below) in which some important points
are raised, particularly about extreme values. 

If you want more help, please describe the pattern of your residuals. 


John Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
Baltimore VA Medical Center GRECC,
University of Maryland School of Medicine Claude D. Pepper OAIC,
University of Maryland Clinical Nutrition Research Unit, and
Baltimore VA Center Stroke of Excellence

University of Maryland School of Medicine
Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524

(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
[EMAIL PROTECTED]

 Berton Gunter [EMAIL PROTECTED] 8/3/2006 11:56:28 AM 
I know I'm coming late to this, but ...

  Is someone able to suggest to me a transformation to overcome the
  problem of heterocedasticity?

It is not usually useful to worry about this. In my experience, the
gain in
efficiency from using an essentially ideal weighted analysis vs. an
approximate unweighted one is usually small and unimportant
(transformation
to simplify a model is another issue ...). Of far greater importance
usually
is the loss in efficiency due to the presence of a few unusual
extreme
values; have you carefully checked to make sure that none of the large
sample variances you have are due merely to the presence of a small
number
of highly discrepant values?


-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
The business of the statistician is to catalyze the scientific
learning
process.  - George E. P. Box

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read.spss 'error reading system-file header'

2006-08-03 Thread Finn Sandø

Thank you for your answer.
I did try this but got similar errors. I tried all the other 
spss-specific formats with same result.

A way to get access to the data  is: save in sas-xport format (though 
all labels are lost) then save in another sas-format and asking spss to 
also save value labels in a sas-format file which can then be edited to 
extract both value labels and variable labels.
These methods are quite laborious and quite complicated, though, and you 
need access to spss.

Ulrich Keller wrote:
 Question 2: Try saving the data as an SPSS portable file. I never had 
 trouble reading these in R.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read.spss 'error reading system-file header'

2006-08-03 Thread Finn Sandø

Thank you Michael
I think you point to the real cause of the problem.
I solved my own immediate problem by using StatTransfer to transfer from 
sav to sav.

My real concern is that the read.spss() function will become obsolete. 
Most of the data I have received in the last year have had those 
problems (and increasingly so), now they have become so serious that the 
import process fails all together.
Therefore I believe it is important to do something about



Michael Bibo wrote:
 I  don't know if this is relevant in your particular case, but the error
 messages you quote are precisely what you get if the SPSS .sav file has
 been created with the SPSS Data Entry product.  If this is the case, it
 is covered by section 3.1 of the R Data Import/Export document.

 If that is the problem, and if you don't have access to SPSS Data
 Entry, I could do a data export for you.  If I can help, please respond
 on both my work email (below) and [EMAIL PROTECTED], as I'm not sure
 which one I would see first.







 Michael Bibo
 Research Officer
 Projects and Research Service
 West Moreton Community Health Services
 West Moreton Health Service District
 Queensland Health

 [EMAIL PROTECTED]

 Ph. +61 7 3817 2400

 P.O. Box 878
 Ipswich,  4305
 Australia

 This e-mail, including any attachments sent with it, is confidential 
 and for the sole use of the intended recipient(s). This confidentiality

 is not waived or lost if you receive it and you are not the intended 
 recipient(s), or if it is transmitted/ received in error.  
  
 Any unauthorised use, alteration, disclosure, distribution or review 
 of this e-mail is prohibited.  It may be subject to a statutory duty of

 confidentiality if it relates to health service matters.
  
 If you are not the intended recipient(s), or if you have received this

 e-mail in error, you are asked to immediately notify the sender by 
 telephone or by return e-mail.  You should also delete this e-mail 
 message and destroy any hard copies produced.


 *
 This email, including any attachments sent with it, is
 confidential and for the sole use of the intended recipient(s).
 This confidentiality is not waived or lost, if you receive it and
 you are not the intended recipient(s), or if it is transmitted/
 received in error.

 Any unauthorised use, alteration, disclosure, distribution or
 review of this email is strictly prohibited.  The information
 contained in this email, including any attachment sent with
 it, may be subject to a statutory duty of confidentiality if it
 relates to health service matters.

 If you are not the intended recipient(s), or if you have
 received this email in error, you are asked to immediately
 notify the sender by telephone collect on Australia
 +61 1800 198 175 or by return email.  You should also
 delete this email, and any copies, from your computer
 system network and destroy any hard copies produced.

 If not an intended recipient of this email, you must not copy,
 distribute or take any action(s) that relies on it; any form of
 disclosure, modification, distribution and/or publication of this
 email is also prohibited.

 Although Queensland Health takes all reasonable steps to
 ensure this email does not contain malicious software,
 Queensland Health does not accept responsibility for the
 consequences if any person's computer inadvertently suffers
 any disruption to services, loss of information, harm or is
 infected with a virus, other malicious computer programme or
 code that may occur as a consequence of receiving this
 email.

 Unless stated otherwise, this email represents only the views
 of the sender and not the views of the Queensland Government.
 




__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] gsummary

2006-08-03 Thread Mike Saunders

Could someone give me a hand with the format of the gsummary function?  
Basically, I have a large set of xyz coordinates generated by LiDAR data (37 
million points) and I am trying to derive various summary statistics on the 
z-coordinates by a grid cell.  I wrote a function to do this by creating 
factors from the x- and y- coordinates and then using gsummary.  However, I 
want the function to calculate BOTH the max z-value in a grid cell and the 
number of z-values in that cell.  gsummary is supposed to be able to use a list 
a functions as input to FUN, but I am having trouble coding it correctly.  Here 
is what I came up to so far.

gr-gsummary(xyz.packet,FUN=list(numeric=max,numeric=length),omitGroupingFactor=T,form=~X/Y)

The help for the function gives no examples using multiple functions as a FUN 
value, so the help pages haven't been overly useful in this case.

Thanks in advance,
Mike

Mike R. Saunders
Forest Biometrician
Cooperative Forest Research Unit
University of Maine
5755 Nutting Hall
Orono, ME  04469-5755

207-581-2763 (O)
207-581-2833 (F)

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] fitting a model with the nlme package

2006-08-03 Thread Frank Johannes

Dear all,
I am analyzing some data that requires a mixed model. I have been
reading Pinheiro and Bates' book,
but cannot find the notation to fit the following model:

Suppose I have the dataset below. Here I am fitting variable p as a
fixed effect, variable h
as a random effect and variable t as nested within h.  I would like to
include the variable j as well
as an independent (non-nested) random effect. I can't find the notation
in the book to tell R
to do this. Does anybody know? 

library(nlme)

h-c(1,1,1,2,2,2,3,3,3)
j-c(2,3,8,3,4,3,9,5,4)
t-c(1,2,3,1,2,3,1,2,3)
f-c(1,2,1,2,1,2,1,2,1)
p-c(45,32,35,12,23,12,2,9,12)
set-data.frame(p,h,j,t)

out-lme(p~ -1 + f,data=set, random=~1|h/t)

Thanks a alot,
frank.

-- 

  unladen european swallow

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] gsummary

2006-08-03 Thread Douglas Bates

The gsummary function was intended to apply a summary function to each
column in each of the groups of observations.  The (implicit)
definition of a summary function is that it produces a scalar from a
vector.  (See the groupings of functions in Table A.1, p. 472 of
Statistical Models in S, a.k.a. The White Book.)

I don't think you need to use gsummary for your application.  You may
be able to use tapply or a combination of split and lapply instead.


On 8/3/06, Mike Saunders [EMAIL PROTECTED] wrote:
 Could someone give me a hand with the format of the gsummary function?  
 Basically, I have a large set of xyz coordinates generated by LiDAR data (37 
 million points) and I am trying to derive various summary statistics on the 
 z-coordinates by a grid cell.  I wrote a function to do this by creating 
 factors from the x- and y- coordinates and then using gsummary.  However, I 
 want the function to calculate BOTH the max z-value in a grid cell and the 
 number of z-values in that cell.  gsummary is supposed to be able to use a 
 list a functions as input to FUN, but I am having trouble coding it 
 correctly.  Here is what I came up to so far.

 gr-gsummary(xyz.packet,FUN=list(numeric=max,numeric=length),omitGroupingFactor=T,form=~X/Y)

 The help for the function gives no examples using multiple functions as a FUN 
 value, so the help pages haven't been overly useful in this case.

 Thanks in advance,
 Mike

 Mike R. Saunders
 Forest Biometrician
 Cooperative Forest Research Unit
 University of Maine
 5755 Nutting Hall
 Orono, ME  04469-5755

 207-581-2763 (O)
 207-581-2833 (F)

 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Looking for transformation to overcome heterogeneity ofvariances

2006-08-03 Thread Peter Dalgaard

[Resending -- recipient list length issue]

John Sorkin [EMAIL PROTECTED] writes:

 Peter

Erm, that was Paul's question, not mine! If you want to help, please
look at the pattern of residuals which he put up on the web on my
request 

 You question is difficult to answer without more information about the
 distribution of your residuals. Different residual patterns call for
 different transformations to stabilize the variance. One very common
 form of  heterocedasticity is increasing variance with increasing values
 of an independent predictor, i.e. the variance of the residuals of y=x
 increase as x increases. In this case a log transformation of some, or
 all, of the independent variables of the helps. Please also note the
 comment by Bert Gunter (included below) in which some important points
 are raised, particularly about extreme values. 
 
 If you want more help, please describe the pattern of your residuals. 
 
 
 John Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 Baltimore VA Medical Center GRECC,
 University of Maryland School of Medicine Claude D. Pepper OAIC,
 University of Maryland Clinical Nutrition Research Unit, and
 Baltimore VA Center Stroke of Excellence
 
 University of Maryland School of Medicine
 Division of Gerontology
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
 
 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing)
 [EMAIL PROTECTED]
 
  Berton Gunter [EMAIL PROTECTED] 8/3/2006 11:56:28 AM 
 I know I'm coming late to this, but ...
 
   Is someone able to suggest to me a transformation to overcome the
   problem of heterocedasticity?
 
 It is not usually useful to worry about this. In my experience, the
 gain in
 efficiency from using an essentially ideal weighted analysis vs. an
 approximate unweighted one is usually small and unimportant
 (transformation
 to simplify a model is another issue ...). Of far greater importance
 usually
 is the loss in efficiency due to the presence of a few unusual
 extreme
 values; have you carefully checked to make sure that none of the large
 sample variances you have are due merely to the presence of a small
 number
 of highly discrepant values?
 
 
 -- Bert Gunter
 Genentech Non-Clinical Statistics
 South San Francisco, CA
  
 The business of the statistician is to catalyze the scientific
 learning
 process.  - George E. P. Box
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help 
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html 
 and provide commented, minimal, self-contained, reproducible code.
 

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to access a column by its label?

2006-08-03 Thread Neil McLeod

Hi all,

Is there any way to access a column of a data frame by its label (title)
rather than its column index? For example, I'd like to be able to select
animals[,weight] rather than animals[,3], if the third column of the
animals data frame has the label weight.

Thank you!

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] read.spss 'error reading system-file header'

2006-08-03 Thread Thomas Lumley


On Thu, 3 Aug 2006, Finn Sand? wrote:

My real concern is that the read.spss() function will become obsolete.
Most of the data I have received in the last year have had those
problems (and increasingly so), now they have become so serious that the
import process fails all together.
Therefore I believe it is important to do something about



Well, we would be happy if someone did something about it.  It still reads 
all the files it used to read (which includes all the SPSS files I have 
ever encountered in my work).


As has been pointed out several times, PSPP now has newer code to read 
SPSS files than the code in the foreign package, and that could be 
adapted.


-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] fitting a model with the nlme package

2006-08-03 Thread Douglas Bates

On 8/3/06, Frank Johannes [EMAIL PROTECTED] wrote:
 Dear all,
 I am analyzing some data that requires a mixed model. I have been
 reading Pinheiro and Bates' book,
 but cannot find the notation to fit the following model:

 Suppose I have the dataset below. Here I am fitting variable p as a
 fixed effect, variable h
 as a random effect and variable t as nested within h.  I would like to
 include the variable j as well
 as an independent (non-nested) random effect. I can't find the notation
 in the book to tell R
 to do this. Does anybody know?

 library(nlme)

 h-c(1,1,1,2,2,2,3,3,3)
 j-c(2,3,8,3,4,3,9,5,4)
 t-c(1,2,3,1,2,3,1,2,3)
 f-c(1,2,1,2,1,2,1,2,1)
 p-c(45,32,35,12,23,12,2,9,12)
 set-data.frame(p,h,j,t)

 out-lme(p~ -1 + f,data=set, random=~1|h/t)

It is not easy to fit mixed models with non-nested grouping factors
for the random effects using lme.  I would recommend using lmer from
the lme4/Matrix package instead.  Even with lmer you can't fit the
model you want to fit because you have 9 levels of the interaction h:t
and only 9 observations.  The version of lmer in the
soon-to-be-released Matrix_0.995-13 even produces a warning about
this.

 library(lme4)
Loading required package: Matrix
Loading required package: lattice
Loading required package: lattice
 set - data.frame(h = factor(c(1,1,1,2,2,2,3,3,3)),
+   j = factor(c(2,3,8,3,4,3,9,5,4)),
+   t = factor(c(1,2,3,1,2,3,1,2,3)),
+   f = c(1,2,1,2,1,2,1,2,1),
+   p = c(45,32,35,12,23,12,2,9,12))
 (out - lmer(p ~ f - 1 + (1|h/t) + (1|j), set))
Linear mixed-effects model fit by REML
Formula: p ~ f - 1 + (1 | h/t) + (1 | j)
   Data: set
  AIC  BIClogLik MLdeviance REMLdeviance
 39.21916 40.00806 -15.60958   36.17505 31.21916
Random effects:
 Groups   NameVariance   Std.Dev.
 t:h  (Intercept) 1.7053e-22 1.3059e-11
 j(Intercept) 1.0067e+02 1.0033e+01
 h(Intercept) 1.2655e+02 1.1249e+01
 Residual 3.4106e-13 5.8400e-07
number of obs: 9, groups: t:h, 9; j, 6; h, 3

Fixed effects:
  Estimate Std. Error t value
f   12.000  4.899  2.4495
Warning messages:
1: Estimated variance for group(s) t:h is zero in:
LMEoptimize-(`*tmp*`, value = list(maxIter = 200, tolerance =
1.49011611938477e-08,
2: nlminb returned message singular convergence (7)
 in: LMEoptimize-(`*tmp*`, value = list(maxIter = 200, tolerance =
1.49011611938477e-08,

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to access a column by its label?

2006-08-03 Thread vincent

?colnames
hih

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to access a column by its label?

2006-08-03 Thread Marc Schwartz (via MN)

On Thu, 2006-08-03 at 14:44 -0400, Neil McLeod wrote:
 Hi all,
 
 Is there any way to access a column of a data frame by its label (title)
 rather than its column index? For example, I'd like to be able to select
 animals[,weight] rather than animals[,3], if the third column of the
 animals data frame has the label weight.
 
 Thank you!

You answered your own question...animals[,weight]

You can also do:

animals$weight

or

animals[[weight]]

or 

subset(animals, select = weight)


See ?Extract and ?subset for more information.

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ggplot facet label font size

2006-08-03 Thread Walker, Sam


This works OK, but there is some extra spacing between the panels, the
top axis and the strip on the top, and the left labels and panel.

How can I remove these extra spaces?

I've tried changing various layout.widths settings with no luck.  It
seems the spaces are calculated based on the number of conditioning
variables, in this case 2 (sex+smoker).


Thanks in advance...
-Sam 


-Original Message-
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, August 02, 2006 6:04 PM
To: Walker, Sam
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] ggplot facet label font size

On 8/2/06, Walker, Sam [EMAIL PROTECTED] wrote:
 How do I change the font size in the facet labels along the edges of
the
 plot?

 For example (from the ggplot help file):
 p-ggplot(tips, sex ~ smoker, aesthetics=list(x=tip/total_bill))
 gghistogram(p)

 In this plot, the facet labels are smoker: No, smoker: Yes, sex:
 Female, sex: Male.  What command can I use to reduce the font size
of
 these labels?

 In lattice terminology, cex is used to scale these strip labels.  But
I
 couldn't find the equivalent in ggplot.

 The reason I'm asking is I have a 9x7 array of plots which I've been
 plotting with lattice.  I wanted to use ggplot because I like having
the
 labels on the edge of the plots

Note that lattice can do that by using custom strip functions:

library(ggplot) # data resides here
library(lattice)

my.strip - function(which.given, which.panel, ...)
   if (which.given == 1  which.panel[2] == 2)
  strip.default(which.given, which.panel, ...)

my.strip.left - function(which.given, which.panel, ..., horizontal)
   if (which.given == 2  which.panel[1] == 1)
  strip.default(which.given, which.panel, horizontal = FALSE, ...)

histogram(~ tip/total_bill | sex + smoker, tips, strip = my.strip,
 strip.left = my.strip.left, par.settings = list(add.text =
list(cex = 0.7)))

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Looking for transformation to overcome heterogeneity of variances

2006-08-03 Thread Brian S Cade

Paul:  It is too bad that most peoples statistical thought processes lead 
them to thinking heterogeneity is something to overcome so that a simple 
test of differences in means with ANOVA can be made.  If you have that 
much heterogeneity among 96 groups (hard for me to imagine otherwise), 
perhaps the distributional heterogeneity rather than simple shifts in 
means is the more important effect.   You might try using omnibus tests of 
distributional differences (eg., MRPP, coverage tests, etc.) or compare 
multiple quantiles (e.g., with quantile regression) since you've already 
admitted that the group distributions differ by more than just a shift in 
means.  Heterogeneous variances among groups immediately implies that 
there is not a single parameter describing changes in distributions among 
groups.   Focusing on just a comparison of means, while traditional and 
analytically expedient, is unlikely to be very enlightening.   You could 
of course, weight each group inversely by its variance to achieve a 
weighted comparison of means.  But doing this just makes it so that you've 
made a valid test on only one of the parameters characterizing 
distributional differences.  A better analysis but still not as 
enlightening as possible.

My 2 pence.

Brian

Brian S. Cade

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  [EMAIL PROTECTED]
tel:  970 226-9326



Paul Smith [EMAIL PROTECTED] 
Sent by: [EMAIL PROTECTED]
08/03/2006 07:33 AM

To
r-help@stat.math.ethz.ch
cc

Subject
[R] Looking for transformation to overcome heterogeneity of variances






Dear All

My data consists in 96 groups, each one with 10 observations. Levene's
test suggests that the variances are not equal, and therefore I have
tried to apply the classical transformations to have homocedasticity
in order to be able to use ANOVA. Unfortunately, no transformation
that I have used transforms my data into data with homocedasticity.
The histogram of variances is at

http://phhs80.googlepages.com/hist1.png

Is someone able to suggest to me a transformation to overcome the
problem of heterocedasticity?

Thanks in advance,

Paul

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] meta characters in file path

2006-08-03 Thread Li,Qinghong,ST.LOUIS,Molecular Biology

Hi,

I need to read in some files. The file names contain come meta characters such 
as @, #, and white spaces etc, In read.csv, file= option, is there any way that 
one can make the function to recognize a file path with those characters?

Thanks
Johnny

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] meta characters in file path

2006-08-03 Thread Tony Plate

What is the problem you are having?  Seems to work fine for me running 
under Windows2000:

  write.table(data.frame(a=1:3,b=4:6), file=@# x.csv, sep=,)
  read.csv(file=@# x.csv)
   a b
1 1 4
2 2 5
3 3 6
  sessionInfo()
Version 2.3.1 (2006-06-01)
i386-pc-mingw32

attached base packages:
[1] methods   stats graphics  grDevices utils datasets
[7] base

other attached packages:
  XML
0.99-8
 

Li,Qinghong,ST.LOUIS,Molecular Biology wrote:
 Hi,
 
 I need to read in some files. The file names contain come meta characters 
 such as @, #, and white spaces etc, In read.csv, file= option, is there any 
 way that one can make the function to recognize a file path with those 
 characters?
 
 Thanks
 Johnny
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] fitting a model with the nlme package

2006-08-03 Thread Doran, Harold

 
  (out - lmer(p ~ f - 1 + (1|h/t) + (1|j), set))

Doug:

It seems the nesting syntax is handled a bit differently. Is (1|h/t)
equivalent to the old lme nesting syntax?

 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Weibull distribution

2006-08-03 Thread Göran Broström

On 7/21/06, Thomas Lumley [EMAIL PROTECTED] wrote:
 On Fri, 21 Jul 2006, Valentin Dimitrov wrote:

  Dear Leaf,
 
  I modified your code as follows:
 
  gamma.fun - function(mu,sd,start=100)
  {
  f.fn - function(alpha)
  {abs(sd^2-mu^2/(gamma(1+1/alpha))^2*(gamma(1+2/alpha)-(gamma(1+1/alpha))^2))}
  alpha - optim(start, f.fn)
  beta - mu/gamma(1+1/alpha$par)
  return(list=c(a=alpha$par,b=beta));
  }
 
  Now it works properly.
 
  First, I added an abs(). You tried to solve an
  equation by means of the R-function optim(), which
  finds a minimum. That's why you can find the solution
  of f(x)=a through minimization of abs(f(x)-a).
  Second, I deleted the optim-method BFGS from the
  optim() function, because it is not appropriate in
  this case.

 optim() is not appropriate at all in this case -- its help page says to
 use optimize() for one-dimensional problems.

Just to clarify: The help page says:

'optim' will work with one-dimensional 'par's, but the default
method does not work well (and will warn).  Use 'optimize' instead.

In other words, if you for instance use the 'BFGS' method, optim is
perfectly OK for one-dimensional problems.

 In fact, in one dimension there isn't any need to resort to optimization
 when you really want root-finding, and uniroot() is more appropriate than
 optimize().

One reason to use optim instead of uniroot or optimize is that you
need not specify a finite interval that covers the solution.

Göran

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] geodesic distance (solved)

2006-08-03 Thread stefano iacus

I found the answer to my problem.
stefano

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ggplot facet label font size

2006-08-03 Thread hadley wickham

Hi Sam,

 How do I change the font size in the facet labels along the edges of the
 plot?

Unfortunately, you can't currently change the size of those fonts.
However, it is on my todo list (as well as completely custom strip
functions) and should be available in the near future.

One thing you could do is have a look at ggopt, where you can at least
change the strip text, if not the size.

Regards,

Hadley

On 8/2/06, Walker, Sam [EMAIL PROTECTED] wrote:

 For example (from the ggplot help file):
  p-ggplot(tips, sex ~ smoker, aesthetics=list(x=tip/total_bill))
  gghistogram(p)

 In this plot, the facet labels are smoker: No, smoker: Yes, sex:
 Female, sex: Male.  What command can I use to reduce the font size of
 these labels?

 In lattice terminology, cex is used to scale these strip labels.  But I
 couldn't find the equivalent in ggplot.

 The reason I'm asking is I have a 9x7 array of plots which I've been
 plotting with lattice.  I wanted to use ggplot because I like having the
 labels on the edge of the plots, but the label font size is too large
 and exceeding the size of the label box.

 Thanks in advance...
 -Sam

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Finding the position of a variable in a data.frame

2006-08-03 Thread Don MacQueen

You don't need to find out the column index. This works:

Df[5,'bat'] - 100

-Don

At 5:01 PM -0400 8/2/06, John Kane wrote:
Simple problem but I don't see the answer. I'm trying
to clean up some data
I have 120 columns in a data.frame.  I have one value
in a column named blaw that I want to change. How do
I find the coordinates. I can find the row by doing a
subset on the data.frame but how do I find out here
blaw  is in columns without manually counting them
or converting names(Df) to a list and reading down the
list.

Simple example

cat - c( 3,5,6,8,0)
dog - c(3,5,3,6, 0)
rat - c (5, 5, 4, 9, 0)
bat - c( 12, 42, 45, 32, 54)

Df - data.frame(cbind(cat, dog, rat, bat))
Df
subset(Df, bat = 50)

results
   cat dog rat bat
5   0   0   0  54


Thus I know that my target is in row 5 but how do I
figure out where 'bat' is? 

All I want to do is be able to say
Df[5,4] - 100

Is there some way to have function(bat) return the
column number: some kind of a colnum() function?  I
had thought that I had found somthing  in
library(gdata) matchcols but no luck.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


-- 
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Tcltk package

2006-08-03 Thread John McHenry

Hi Adrian,

Thanks for the tip. I re-installed and everything seems to work just fine. 

Thanks,

Jack.

Adrian DUSA [EMAIL PROTECTED] wrote: On Tuesday 01 August 2006 19:24, John 
McHenry wrote:
 [...]
 Yes, I built R myself. I couldn't find a debian package for R 2.3.1. The
 latest available is 2.2.1.

Oh, but there is... right on CRAN. For Dapper just add this line to your 
sources.list:

deb http://cran.R-project.org/bin/linux/ubuntu/ dapper/

This repository has lots of other packages compiled for Ubuntu, feel free to 
take a look.

HTH,
Adrian

-- 
Adrian DUSA
Romanian Social Data Archive
1, Schitu Magureanu Bd
050025 Bucharest sector 5
Romania
Tel./Fax: +40 21 3126618 \
  +40 21 3120210 / int.101



-


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Finding the position of a variable in a data.frame

2006-08-03 Thread John Kane


--- Don MacQueen [EMAIL PROTECTED] wrote:

 You don't need to find out the column index. This
 works:
 
 Df[5,'bat'] - 100
 
 -Don
 

Thanks, I'd tried 
Df[5, bat] - 100  :(

I never thought of the ' ' being needed.



 At 5:01 PM -0400 8/2/06, John Kane wrote:
 Simple problem but I don't see the answer. I'm
 trying
 to clean up some data
 I have 120 columns in a data.frame.  I have one
 value
 in a column named blaw that I want to change. How
 do
 I find the coordinates. I can find the row by doing
 a
 subset on the data.frame but how do I find out here
 blaw  is in columns without manually counting
 them
 or converting names(Df) to a list and reading down
 the
 list.
 
 Simple example
 
 cat - c( 3,5,6,8,0)
 dog - c(3,5,3,6, 0)
 rat - c (5, 5, 4, 9, 0)
 bat - c( 12, 42, 45, 32, 54)
 
 Df - data.frame(cbind(cat, dog, rat, bat))
 Df
 subset(Df, bat = 50)
 
 results
cat dog rat bat
 5   0   0   0  54
 
 
 Thus I know that my target is in row 5 but how do I
 figure out where 'bat' is? 
 
 All I want to do is be able to say
 Df[5,4] - 100
 
 Is there some way to have function(bat) return the
 column number: some kind of a colnum() function?  I
 had thought that I had found somthing  in
 library(gdata) matchcols but no luck.
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.
 
 
 -- 
 --
 Don MacQueen
 Environmental Protection Department
 Lawrence Livermore National Laboratory
 Livermore, CA, USA
 --


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Ambitious newbie with some ongoing Q's

2006-08-03 Thread Mitchell Maltenfort

I'm new to the list and I've been playing about with R for some months
now, mostly using the power analysis routines including the pwr
package.

I'm currently looking at a project which will require a
repeated-measures MANCOVA.

I've been reviewing the files available at CRAN, including
http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_repms.html
and http://tolstoy.newcastle.edu.au/R/help/03b/7663.html which
describe repeated-measures ANOVA

Is making this a repeated-measures MANCOVA as simple as adding a
covariate to the fixed model and then making the Y and covariate
variables matrices?

If not, how do I do it?  Can I do it?

And if I can do it, can I also do a power analysis?  (OK, I'm a greedy
little newbie...)

Thanks in advance.



-- 
I can answer any question.
I don't know is an answer.
I don't know yet is a better answer.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Random structure of nested design in lme

2006-08-03 Thread Spencer Graves

  I'm not familiar with 'aov', but I have two observations that might 
help you:

1.  UNESTIMABLE VARIANCE COMPONENT

  The variance component 'soiltype' is not estimable in your 'lme' 
model:

lme(NA.1~soiltype*habitat,random=~1|destination/soiltype)

That's because each level of 'soiltype' occurs only once within each 
level of 'destination' in the self-contained example you provided below.

  To confirm this, I deleted 'soiltype' from this model:

fit.lme - lme(response~soiltype*habitat, random=~1|destination/origin)
fit.lme0 - lme(response~soiltype*habitat, random=~1|destination)

  The answers seemed to be identical except for one thing:

  VarCorr(fit.lme)
   Variance StdDev
destination = pdLogChol(1)
(Intercept)   0.004149471  0.06441639
origin =  pdLogChol(1)
(Intercept)   0.060968550  0.24691810
Residual  0.007265180  0.08523603
  VarCorr(fit.lme0)
destination = pdLogChol(1)
 VarianceStdDev
(Intercept) 0.004149471 0.06441639
Residual0.068233730 0.26121587

  The Residual variance in fit.lme0 equals the sum of origin and 
Residual variances in fit.lme.

  It would help if 'lme' checked for situations like this and either 
refused to run or dropped inestimable variance components.  However, 
it's possible that there are so many ways that variance components can 
be inestimable that it's just not feasible to check for them all.  (The 
function 'varcomp' in S-Plus 6.2 has the same problem.)


2.  CROSSED OR NESTED?

  Are 'destination' and 'origin' crossed or nested in your 'aov' model:

  aov(response~soiltype*habitat+Error(destination+origin))

  I have not used 'aov', and I don't think I should take the time now 
to try to figure this out.  However, this model specification suggests 
to me that 'destination' and 'origin' might be crossed not nested.  (The 
difference is the 'destination:origin' interaction:  If 
'destination+origin' is crossed, their interaction is used as the error 
term;  otherwise, it looks to me like you have a saturated model.)  By 
contrast, 'destination/origin' in lme is 'nested', which means that the 
variance component for 'origin' is in essence the crossed term and the 
interaction combined.

  I believe there is a way to estimate crossed random effects using 
'lme', but I don't understand how.  Fortunately, we can do it using 
'lmer' in the 'lme4' and 'Matrix' packages.

  Because of potential conflicts between 'nlme' and 'lme4', I always 
quit R and restart when I switch from one to another.  The following 
will then fit something using 'lmer' that looks like it might match your 
'aov' fit:

library(lme4)
fit.lme4 - lmer(
   response~soiltype*habitat
  +(1|destination)+(1|origin), Dat0)

where Dat0 is a data.frame with columns 'response', 'soiltype', 
'habitat', 'destination' and 'origin'.

  I don't know 'aov' well enough to determine easily if the results 
from this 'lmer' fit match those from 'aov', but I hope this helps.

  Spencer Graves

ESCHEN Rene wrote:
 Spencer,
 
 Thank you for the kind and elaborate reply to my previous post.
 
 I did consider the option you suggested and many variations. 
Depending on the order of the random factors, lme will either
give the same output as the aov model for soiltype or for habitat,
but not both in the same model.
 
 The closest I came was 
 
   anova(lme(NA.1~soiltype*habitat,random=~1|destination/soiltype))
 
 However, it apppears that in this case the interaction is tested at the same 
 level as soiltype.
 
 In this post, a small sample dataset with a brief explanation of the meaning 
 of the different column titles is included below. Also, I included both the 
 aov model and the lme model.
 
 Hopefully, this will help to get closer to a solution to my problem.
 
 Best regards,
 
 René Eschen.
 
 ___
 
 #Small sample dataset
 #
 data=read.table(Sample dataset.csv,header=T) 
 require(nlme)
 soiltype=factor(soiltype)
 habitat=factor(habitat)
 destination=factor(destination)
 origin=factor(origin)
 summary(aov(response~soiltype*habitat+Error(destination+origin)))
 anova(lme(response~soiltype*habitat,random=~1|destination/origin))
 #
 #habitat type is either 'arable' or 'grassland'
 #destination indicates what site the soil was transplanted into, and is 
 considered a random factor within habitat type
 #soiltype is either 'arable' or 'grassland'
 #origin indicates what site the soil was taken from, and is considered a 
 random factor within soil type
 #response is the response variable, typically some plant parameter such as 
 growth rate or number of leaves, but in this example it is a random number 
 between 0 and 1.
 #
 habitat destination   soiltype  originresponse
 1 1   1   1   0.63
 1 2   1   1   0.76
 1 3   1   1   0.14
 2 4   1   1   0.27
 2 5   1

Re: [R] Finding the position of a variable in a data.frame

2006-08-03 Thread Gabor Grothendieck

On 8/3/06, John Kane [EMAIL PROTECTED] wrote:

 --- Don MacQueen [EMAIL PROTECTED] wrote:

  You don't need to find out the column index. This
  works:
 
  Df[5,'bat'] - 100
 
  -Don
 

 Thanks, I'd tried
 Df[5, bat] - 100  :(

 I never thought of the ' ' being needed.

Right -- the quotes are not needed if you use $ but they are needed if
you use [.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Building a random walk vector

2006-08-03 Thread Matthew Wilson

I'm studying R in my free time.  I want to build a vector where each
element is equal to the element before it in the sequence plus some
random tweak.

In python, I would write:

vec = [100] * 50 # make a 50-element list with each element set to 100
from random import randint
for i, v in enumerate(vec):
if i is not 0: # if we're not on the first element
vec[i] = vec[i-1] + randint(-2, 2)

I suspect R has some fancier way of doing this.  How to?

TIA


-- 
A better way of running series of SAS programs:
http://overlook.homelinux.net/wilsonwiki/SasAndMakefiles

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Questions about sweave...

2006-08-03 Thread Brian Lunergan

Evening all:

I'm taking a little time to experiment with R, Sweave, and Miktex/LaTex but 
I've run up against some problems and -well- I hope that there are some on 
the list who might have some suggestions. This will be kind of wordy as I 
will include the complete files involved as I'm just not sure what I'm 
looking for. Apologies at the outset.

I created the source file (example.Snw):

\documentclass[12pt,letterpaper]{article}
\title{Sweave Example 1}
\author{Friedrich Leisch}
\begin{document}
\maketitle

In this example we embed parts of the examples from the 
\texttt{kruskal.test} help page into a \LaTeX{} document:

=
data(airquality)
library(stats)
kruskal.test(Ozone ~ Month , data=airquality)
@

which shows that the location parameter of the Ozone distribution varies 
significantly from month to month. Finally we include a boxplot of the data:
\begin{center}
fig=TRUE , echo=FALSE=
boxplot(Ozone ~ Month , data=airquality)
@
\end{center}
\end{document}

which R seems to accept gracefully to produce the tex file (example.tex):

\documentclass[12pt,letterpaper]{article}
\title{Sweave Example 1}
\author{Friedrich Leisch}
\usepackage{C:/PROGRA~1/R/R-23~1.1/share/texmf/Sweave}
\begin{document}
\maketitle

In this example we embed parts of the examples from the 
\texttt{kruskal.test} help page into a \LaTeX{} document:

\begin{Schunk}
\begin{Sinput}
  data(airquality)
  library(stats)
  kruskal.test(Ozone ~ Month, data = airquality)
\end{Sinput}
\begin{Soutput}
Kruskal-Wallis rank sum test

data:  Ozone by Month
Kruskal-Wallis chi-squared = 29.2666, df = 4, p-value = 6.901e-06
\end{Soutput}
\end{Schunk}

which shows that the location parameter of the Ozone distribution varies 
significantly from month to month. Finally we include a boxplot of the data:
\begin{center}
\includegraphics{example-002}
\end{center}
\end{document}

but when I try to run Latex on it I get errors (example.log):

This is e-TeX, Version 3.141592-2.2 (MiKTeX 2.4) (preloaded format=latex 
2006.8.3)  3 AUG 2006 20:57
entering extended mode
**example.tex
(example.tex
LaTeX2e 2005/12/01
Babel v3.8g and hyphenation patterns for english, usenglishmax, dumylang, noh
yphenation, french, ukenglish, loaded.
(C:\texmf\tex\latex\base\article.cls
Document Class: article 2005/09/16 v1.4f Standard LaTeX document class
(C:\texmf\tex\latex\base\size12.clo
File: size12.clo 2005/09/16 v1.4f Standard LaTeX file (size option)
)
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]
\abovecaptionskip=\skip41
\belowcaptionskip=\skip42
\bibindent=\dimen102
)
! Missing \endcsname inserted.
to be read again
\protect
l.5 \begin
   {document}
The control sequence marked to be read again should
not appear between \csname and \endcsname.


! LaTeX Error: Missing \begin{document}.

See the LaTeX manual or LaTeX Companion for explanation.
Type  H return  for immediate help.
  ...

l.5 \begin
   {document}
You're in trouble here.  Try typing  return  to proceed.
If that doesn't work, type  X return  to quit.

! Extra \endcsname.
[EMAIL PROTECTED] [EMAIL PROTECTED] -h@@k\endcsname
   [EMAIL PROTECTED] \let 
\CurrentOptio...
l.5 \begin
   {document}
I'm ignoring this, since I wasn't doing a \csname.

! Missing \endcsname inserted.
to be read again
\protect
l.5 \begin
   {document}
The control sequence marked to be read again should
not appear between \csname and \endcsname.

! Extra \endcsname.
[EMAIL PROTECTED]@aded ...er \ifx \csname [EMAIL PROTECTED]
   \relax \expandafter 
[EMAIL PROTECTED]
l.5 \begin
   {document}
I'm ignoring this, since I wasn't doing a \csname.

! Missing \endcsname inserted.
to be read again
\protect
l.5 \begin
   {document}
The control sequence marked to be read again should
not appear between \csname and \endcsname.

! Extra \endcsname.
[EMAIL PROTECTED]@ptions ...xdef \csname [EMAIL PROTECTED]
   [EMAIL PROTECTED] 
[EMAIL PROTECTED]
l.5 \begin
   {document}
I'm ignoring this, since I wasn't doing a \csname.

! Missing \endcsname inserted.
to be read again
\protect
l.5 \begin
   {document}
The control sequence marked to be read again should
not appear between \csname and \endcsname.

! Missing \endcsname inserted.
to be read again
\protect
l.5 \begin
   {document}
The control sequence marked to be read again should
not appear between \csname and \endcsname.

! Extra \endcsname.
argument ...e/texmf/[EMAIL PROTECTED] \endcsname
   ,
l.5 \begin
   {document}
I'm ignoring this, since I wasn't doing a \csname.

! Missing \endcsname inserted.
to be read again

[R] gnlsControl

2006-08-03 Thread Daniel Coleman

When I run gnls I get the error:

 

Error in nls(y ~ cbind(1, 1/(1 + exp((xmid - x)/exp(lscal, data = xy,  :


step factor 0.000488281 reduced below 'minFactor' of 0.000976563

 

My first thought was to decrease minFactor but gnlsControl does not contain
minFactor nor nlsMinFactor (see below).  It does however contain nlsMaxIter
and nlsTol which I assume are the analogs of maxiter and tol in nls.control.
I would be happy to hear from anyone who has an idea on what parameters in
gnlsControl to change to get convergence.  

 

Cheers

 

Dan Coleman

Genentech Inc.

 

 gnlsControl

function (maxIter = 50, nlsMaxIter = 7, msMaxIter = 50, minScale = 0.001, 

tolerance = 1e-06, nlsTol = 0.001, msTol = 1e-07, msScale = lmeScale, 

returnObject = FALSE, msVerbose = FALSE, apVar = TRUE, .relStep =
(.Machine$double.eps)^(1/3), 

nlmStepMax = 100, opt = c(nlminb, optim), optimMethod = BFGS, 

minAbsParApVar = 0.05) 

{

list(maxIter = maxIter, nlsMaxIter = nlsMaxIter, msMaxIter = msMaxIter, 

minScale = minScale, tolerance = tolerance, nlsTol = nlsTol, 

msTol = msTol, msScale = msScale, returnObject = returnObject, 

msVerbose = msVerbose, apVar = apVar, nlmStepMax = nlmStepMax, 

opt = match.arg(opt), optimMethod = optimMethod, .relStep =
.relStep, 

minAbsParApVar = minAbsParApVar)

}

 

 nls.control

$maxiter

[1] 50

 

$tol

[1] 1e-05

 

$minFactor

[1] 0.0009765625


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Building a random walk vector

2006-08-03 Thread Duncan Murdoch

On 8/3/2006 9:17 PM, Matthew Wilson wrote:
 I'm studying R in my free time.  I want to build a vector where each
 element is equal to the element before it in the sequence plus some
 random tweak.
 
 In python, I would write:
 
 vec = [100] * 50 # make a 50-element list with each element set to 100
 from random import randint
 for i, v in enumerate(vec):
 if i is not 0: # if we're not on the first element
 vec[i] = vec[i-1] + randint(-2, 2)
 
 I suspect R has some fancier way of doing this.  How to?

Assuming randint(-2, 2) gives a value uniform on (-2, 2) a quick way to 
do what you want is

vec - 100 + c(0, cumsum(runif(49, -2, 2)))

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Building a random walk vector

2006-08-03 Thread Dirk Eddelbuettel


On 4 August 2006 at 01:17, Matthew Wilson wrote:
| I'm studying R in my free time.  I want to build a vector where each
| element is equal to the element before it in the sequence plus some
| random tweak.
| 
| In python, I would write:
| 
| vec = [100] * 50 # make a 50-element list with each element set to 100
| from random import randint
| for i, v in enumerate(vec):
| if i is not 0: # if we're not on the first element
| vec[i] = vec[i-1] + randint(-2, 2)
| 
| I suspect R has some fancier way of doing this.  How to?

Yup, cumsum() is your friend. You only need the first scalar of 100,
vectorisation does the rest. Try

 set.seed(12345) 
 Z - 100 + cumsum(runif(50,-2,2))
 summary(Z)
   Min. 1st Qu.  MedianMean 3rd Qu.Max. 
  98.54  100.80  102.30  102.10  103.40  105.70 
 plot(Z, type='l')

Lastly, I think N(0, some_sd) is more customary that U(-2,2) but that is easy
to change.

Cheers, Dik

-- 
Hell, there are no rules here - we're trying to accomplish something. 
  -- Thomas A. Edison

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ggplot facet label font size

2006-08-03 Thread Gabor Grothendieck

If you are willing to use grid then you could create only the sex
factor in the left strips since its already in the desired position
but when displaying it output a factor.level, i.e. label of A.
(my.strip.left is modified from the prior post to do that.)

Then after the plot is drawn, looping through all grobs looking for
those with a label component of A producing a list of grob names,
strip.left.names.  We then mapply the real factor levels with
those grobs editing them in reset.levels(), defined below.
(I have used the fact, empirically determined that the stripts
are produced in order of the factor levels.)

Everything is the same as the last post except my.strip.left
which has been modified and everything which comes after the
call to histogram.

Although this seems to work, maybe Deepayan or Paul can think of
something slicker.


library(ggplot) # data resides here
library(lattice)
library(grid)

my.strip - function(which.given, which.panel, ...)
   if (which.given == 1  which.panel[2] == 2)
  strip.default(which.given, which.panel, ...)

my.strip.left - function(which.given, which.panel, ...,
   factor.levels, horizontal)
   if (which.given == 1  which.panel[1] == 1)
  strip.default(which.given, which.panel, factor.levels = LETTERS,
  horizontal = FALSE, ...)


histogram(~ tip/total_bill | sex + smoker, tips, strip = my.strip,
 strip.left = my.strip.left, par.settings = list(add.text =
list(cex = 0.7)))

is.strip.left - function(name) identical(grid.get(name)$label, A)
strip.left.names - getNames()[sapply(getNames(), is.strip.left)]

reset.levels - function(nam, lev) grid.edit(nam, label = lev)
mapply(reset.levels , strip.left.names, levels(tips$smoker))






On 8/3/06, Walker, Sam [EMAIL PROTECTED] wrote:

 This works OK, but there is some extra spacing between the panels, the
 top axis and the strip on the top, and the left labels and panel.

 How can I remove these extra spaces?

 I've tried changing various layout.widths settings with no luck.  It
 seems the spaces are calculated based on the number of conditioning
 variables, in this case 2 (sex+smoker).


 Thanks in advance...
 -Sam


 -Original Message-
 From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
 Sent: Wednesday, August 02, 2006 6:04 PM
 To: Walker, Sam
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] ggplot facet label font size

 On 8/2/06, Walker, Sam [EMAIL PROTECTED] wrote:
  How do I change the font size in the facet labels along the edges of
 the
  plot?
 
  For example (from the ggplot help file):
  p-ggplot(tips, sex ~ smoker, aesthetics=list(x=tip/total_bill))
  gghistogram(p)
 
  In this plot, the facet labels are smoker: No, smoker: Yes, sex:
  Female, sex: Male.  What command can I use to reduce the font size
 of
  these labels?
 
  In lattice terminology, cex is used to scale these strip labels.  But
 I
  couldn't find the equivalent in ggplot.
 
  The reason I'm asking is I have a 9x7 array of plots which I've been
  plotting with lattice.  I wanted to use ggplot because I like having
 the
  labels on the edge of the plots

 Note that lattice can do that by using custom strip functions:

 library(ggplot) # data resides here
 library(lattice)

 my.strip - function(which.given, which.panel, ...)
   if (which.given == 1  which.panel[2] == 2)
  strip.default(which.given, which.panel, ...)

 my.strip.left - function(which.given, which.panel, ..., horizontal)
   if (which.given == 2  which.panel[1] == 1)
  strip.default(which.given, which.panel, horizontal = FALSE, ...)

 histogram(~ tip/total_bill | sex + smoker, tips, strip = my.strip,
 strip.left = my.strip.left, par.settings = list(add.text =
 list(cex = 0.7)))

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ggplot facet label font size

2006-08-03 Thread Gabor Grothendieck

Just sending this to you.  One thing that might be easy to do
yet give a lot of flexibility is to:

1. put meaningful names on the grobs.  Even with just this it would be
possible to do a getNames() in grid and then from inspection grid.edit
the appropriate one(s).
2. create a routine that retrieves grobs so one does not have to use
getNames with grep.

trellis.focus and friends do this in lattice.

Regards.



On 8/3/06, hadley wickham [EMAIL PROTECTED] wrote:
 Hi Sam,

  How do I change the font size in the facet labels along the edges of the
  plot?

 Unfortunately, you can't currently change the size of those fonts.
 However, it is on my todo list (as well as completely custom strip
 functions) and should be available in the near future.

 One thing you could do is have a look at ggopt, where you can at least
 change the strip text, if not the size.

 Regards,

 Hadley

 On 8/2/06, Walker, Sam [EMAIL PROTECTED] wrote:
 
  For example (from the ggplot help file):
   p-ggplot(tips, sex ~ smoker, aesthetics=list(x=tip/total_bill))
   gghistogram(p)
 
  In this plot, the facet labels are smoker: No, smoker: Yes, sex:
  Female, sex: Male.  What command can I use to reduce the font size of
  these labels?
 
  In lattice terminology, cex is used to scale these strip labels.  But I
  couldn't find the equivalent in ggplot.
 
  The reason I'm asking is I have a 9x7 array of plots which I've been
  plotting with lattice.  I wanted to use ggplot because I like having the
  labels on the edge of the plots, but the label font size is too large
  and exceeding the size of the label box.
 
  Thanks in advance...
  -Sam
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

1 2 >

1 - 100 of 103 matches

Mail list logo