date:20070226

[R] Automated figure production

2007-02-26 Thread Sergey Goriatchev

Hello, everybody

Two questions:

1) I am new to mailing lists, and particularly to the r-help mailing list. Where
specifically do I go online to see my question and the answers to it? If
I use the searchable archives, or the archives by Robert King, I see my
question but not the answers, though I know that at least one person
posted an answer to r-help. Why do I not see the answers?

2)

I need to produce 104 figures. In each graphic window I want to plot 8
figures. How do I automate the process so that it opens 13 separate
graphic windows in R Gui and plots the figures? (I then paste them in
Word, one by one). Also, how do I save all 104 figures to one PDF
file?

Here is the code to produce the figures, and at this point I change
the j-index in the for-loop by hand, produce 8 figures, copy them to
Word, then change j to 9:16 etc...

#HOW TO PRODUCE ALL 104 GRAPHS AT ONCE

old.par <- par(no.readonly=TRUE)
par(mfrow=c(4,2))

for(j in 1:8) {
plot(Cleaned[zz[,j],4], type="b", main=long.names[j], xlab="Month",
ylab="MFR", col="blue")
abline(h=0, col="red")
}
par(old.par)


Thanks for your help!
Sergey

-- 
Laziness is nothing more than the habit of resting before you get tired.
- Jules Renard (writer)

Experience is one thing you can't get for nothing.
- Oscar Wilde (writer)

When you are finished changing, you're finished.
- Benjamin Franklin (President)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to fill between 2 stair plots

2007-02-26 Thread Peter Dalgaard

Williams Scott wrote:
> Hi all,
>
> I want to create a simple plot with 2 type='s' lines on it:
>
> plot(a, b, type='s')
> lines(x, y, type='s')
>
> I wish to then fill the area between the curves with a colour to
> accentuate the differences eg col=gray(0.95). I cant seem to come up
> with a simple method for this. Any pointers in the right direction much
> appreciated.

   
I don't think there is a really simple method for this. I'd start with
converting the two 's' lines to ordinary lines along the lines of

N <- length(a)
a1 <- c(a[1],rep(a[-1],each=2),a[N]) # possibly a[N]+a_bit for the final step
b1 <- rep(b,each=2)

x1, y1 similarly, then

polygon(c(a1,rev(x1)),c(b1,rev(y1)), col="grey")

(Did I confuse 's' and 'S'? Anyways, you get the idea)
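
A minimal sketch of the idea above, assuming both curves cover the same x range. The helper name stairs() and the data are made up for illustration; the helper expands each curve into the vertices that type = "s" draws (horizontal move first, then vertical), and polygon() fills the region between the two outlines.

```r
# Hypothetical helper: expand (x, y) into the vertices drawn by type = "s"
# (horizontal move first, then vertical).
stairs <- function(x, y) {
  n <- length(x)
  list(x = c(x[1], rep(x[-1], each = 2)),
       y = c(rep(y[-n], each = 2), y[n]))
}

a <- c(1, 2, 4, 6); b <- c(3, 1, 2, 4)   # made-up data, first curve
x <- c(1, 3, 5, 6); y <- c(1, 2, 1, 3)   # made-up data, second curve
s1 <- stairs(a, b)
s2 <- stairs(x, y)

# Draw to a temporary PDF so the sketch runs without a screen device
pdf(file.path(tempdir(), "stairs_fill.pdf"))
plot(a, b, type = "n", ylim = range(b, y))
# Walk along the first outline, then back along the second one
polygon(c(s1$x, rev(s2$x)), c(s1$y, rev(s2$y)), col = gray(0.95), border = NA)
lines(a, b, type = "s")
lines(x, y, type = "s")
dev.off()
```

Drawing the step lines after the polygon keeps them visible on top of the fill.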

 Cheers

 Scott
 _

 Dr. Scott Williams

 MBBS BScMed FRANZCR

 Peter MacCallum Cancer Centre

 Melbourne, Australia

 [EMAIL PROTECTED]



-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [R] how to fill between 2 stair plots

2007-02-26 Thread Petr Klasterecky

Williams Scott wrote:
> Hi all,
>
> I want to create a simple plot with 2 type='s' lines on it:
>
> plot(a, b, type='s')
> lines(x, y, type='s')
>
> I wish to then fill the area between the curves with a colour to
> accentuate the differences e.g. col=gray(0.95). I can't seem to come up
> with a simple method for this. Any pointers in the right direction much
> appreciated.

?polygon might be useful. See also demo(graphics)
Petr
-- 
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic


Re: [R] Automated figure production

2007-02-26 Thread Petr Pikal

On 26 Feb 2007 at 9:14, Sergey Goriatchev wrote:

Date sent:  Mon, 26 Feb 2007 09:14:17 +0100
From:   Sergey Goriatchev [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject:[R] Automated figure production

 Hello, everybody

 Two questions:

 1) I am new to maillists, and particularly to r-help maillist. Where
 specifically do I go online to see my question and answers to it??? If
 I use the searchable archives, or archives by Robert King, I see my
 question but not the answers, though I know that at least one persion
 posted the answer to r-help. Why do not I see the answers?

You are close. Try clicking on the latest R-help in the searchable R-list archive.

 2)

 I need to produce 104 figures. In each graphic window I want to plot 8
 figures. How do I automate the process so that it opens 13 separate
 graphic windows in R Gui and plots the figures? (I then paste them in
 Word, one by one). Also, how do I save all 104 figures to one PDF
 file?

see ?pdf, ?png 

 Here is the code to produce the figures, and at this point I change
 the j-index in the for-loop by hand, produce 8 figures, copy them to
 Word, then change j to 9:16 etc...

 #HOW TO PRODUCE ALL 104 GRAPHS AT ONCE

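e.g.
for (i in 1:13) {   # one PNG file per page of 8 figures
  # example file name; any name based on the loop index will do
  png(sprintf("figures_%02d.png", i), 800, 800)
  old.par <- par(no.readonly=TRUE)
  par(mfrow=c(4,2))

  for(j in ((i-1)*8 + 1):(i*8)) {
    plot(Cleaned[zz[,j],4], type="b", main=long.names[j], xlab="Month",
         ylab="MFR", col="blue")
    abline(h=0, col="red")
  }
  par(old.par)
  dev.off()
}

or similarly, without the loop over files, using the pdf device; see its onefile=TRUE option.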
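
A minimal sketch of the single-PDF route, assuming the data can be looped over by index. The list `series`, the labels and the output file name are stand-ins invented for this example; in the poster's setting each panel would plot Cleaned[zz[, j], 4] instead. With par(mfrow = c(4, 2)) the pdf device starts a new page after every 8 plots, so 104 plots end up on 13 pages of one file.

```r
# Stand-in data (hypothetical): 104 short series, one per figure
set.seed(1)
n.fig <- 104
per.page <- 8
series <- replicate(n.fig, rnorm(24), simplify = FALSE)
long.names <- paste("Series", seq_len(n.fig))

# One multi-page PDF; onefile = TRUE is the default but is spelled out here
out.file <- file.path(tempdir(), "all_figures.pdf")
pdf(out.file, width = 8.27, height = 11.69, onefile = TRUE)
par(mfrow = c(4, 2))               # 8 panels per page
for (j in seq_len(n.fig)) {
  plot(series[[j]], type = "b", main = long.names[j],
       xlab = "Month", ylab = "MFR", col = "blue")
  abline(h = 0, col = "red")
}
dev.off()

n.pages <- ceiling(n.fig / per.page)   # number of pages written
```

No explicit page handling is needed inside the loop because the layout set by mfrow fills each page in turn.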

HTH
Petr


Petr Pikal
[EMAIL PROTECTED]


[R] Chi Square with two tab-delimited text files

2007-02-26 Thread Carina Brehony

Hi,

I want to do a chi square test and I have two tab delimited text files with
Expected and Observed values to compare.  Each file contains only the values
and are 48 rows by 116 columns.  I have managed to do something with them,
but I don't think it is right as I got a p value of 1.  In this case I used
the read.table() function to read the values from the files.  But I don't
know if this was right.

 

 

> x=read.table(file="C:/Program Files/R/R-2.2.1/Projects/Stats EU/Expected input.txt")
> y=read.table(file="C:/Program Files/R/R-2.2.1/Projects/Stats EU/Observed input.txt")
> chisq.test(x,y)

        Pearson's Chi-squared test

data:  x
X-squared = 4.4602, df = 5405, p-value = 1

Warning message:
Chi-squared approximation may be incorrect in: chisq.test(x, y)

Maybe the scan() function is more correct?  Using this I got:

> x=scan(file="C:/Program Files/R/R-2.2.1/Projects/Stats EU/Observed input.txt")
Read 5568 items
> y=scan(file="C:/Program Files/R/R-2.2.1/Projects/Stats EU/Expected input.txt")
Read 5568 items
> chisq.test(x,y)

        Pearson's Chi-squared test

data:  x and y
X-squared = 172306.4, df = 13880, p-value < 2.2e-16

Warning message:
Chi-squared approximation may be incorrect in: chisq.test(x, y)

 

 

 

Any help would be much appreciated.

Regards,

 

Carina



Re: [R] Double-banger function names: preferences and suggestions

2007-02-26 Thread Barry Rowlingson

hadley wickham wrote:
> What do you prefer/recommend for double-banger function names:
>
>  1 scale.colour
>  2 scale_colour
>  3 scaleColour
>
> 1 is more R-like, but conflicts with S3.  2 is a modern version of
> number 1, but not many packages use it.  Number 3 is more java-like.
> (I like number 2 best)

  Or you can be lisp-ish and use hyphens (or many other symbols) by quoting:

  > "scale-colour"=2
  > ls()
  [1] "scale-colour"

  but that requires further perversions:

  > get("scale-colour")
  [1] 2

  I like (3), aka camelCase - aka about a dozen other names:
http://en.wikipedia.org/wiki/Camel_case - but that's mainly because it's
widely used in Python, and Python syntax is just marvellous. The more R
syntax tends to Python syntax the better. Let's get rid of curly
brackets and make whitespace significant...

  But I digress. As usual.

  ANytHiNg bUt sTUdLY cApS: http://en.wikipedia.org/wiki/StudlyCaps
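
A small sketch of why option 1 "conflicts with S3", using a toy function and a toy class invented for this example. Base R's scale() is an S3 generic, so a function that merely happens to be called scale.colour is picked up as the method for any object of class "colour". The last two lines show the backtick form of the quoting trick.

```r
# Hypothetical function using the dotted style; not meant as an S3 method
scale.colour <- function(x, ...) "scale.colour was called"

x <- structure(1:3, class = "colour")   # toy object of class "colour"
res <- scale(x)                         # the base generic dispatches on class

# Non-syntactic names such as scale-colour need backticks or get()
`scale-colour` <- 2
val <- get("scale-colour")
```

Names with underscores or camelCase avoid the accidental dispatch, since S3 method lookup is based on the dot.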

Barry


Re: [R] some caracter dont work with JGR

2007-02-26 Thread Henric Nilsson (Public)

On Fri, 2007-02-23, 19:21, Ronaldo Reis Junior wrote:
> Hi,
>
> I am testing JGR and I like it, but my ~ character doesn't work. My
> keyboard is Brazilian ABNT2.
>
> The key is OK; only in JGR does it not work.
>
> Does anybody have any idea about this?

Yes, and it's a known problem -- see e.g.
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6371251.

It has been discussed on more than one occasion over at the RoSuDa devel
list (http://www.rosuda.org/lists.shtml) for which questions on JGR and
related projects are more appropriate than R-help.


HTH,
Henric




 Thanks
 Ronaldo
 --
 More varied than a change of trains in Cacequi.
 --
 Prof. Ronaldo Reis Júnior
 |  .''`. UNIMONTES/Depto. Biologia Geral/Lab. Ecologia Evolutiva
 | : :'  : Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia
 | `. `'` CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil
 |   `- Fone: (38) 3229-8190 | [EMAIL PROTECTED] |
 [EMAIL PROTECTED]
 | ICQ#: 5692561 | LinuxUser#: 205366





Re: [R] nested design in lme, need help with specifying model

2007-02-26 Thread Mike Dunbar

Dear Radka

I'm not sure I quite understand your design and quite where the nesting comes 
in.

But a quick suggestion is why are you adding species as random as well as 
fixed? I don't think you can do this or indeed should do it. I think this is 
why you get problems with your fixed effects. If you have 3 species then 
species ought to be fixed. Replicate is more the sort of effect that ought to 
be random, this ought to pick up the fact that prey within one run of the 
experiment won't be independent. But if you only have two replicates per 
treatment (species of prey?), then this will limit your ability to detect 
differences between species of prey, unless your within-replicate variation is 
very low. You can look at this very simply but not quite as powerfully by 
averaging the responses for each replicate and doing a non nested anova.

Re your second analysis, this seems along the right lines. How have you coded
replicate? This may explain your results. Without more details on the plot you
did, it's difficult to help further.
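
A minimal sketch of the simpler route suggested above (average within each replicate, then a non-nested ANOVA), on simulated data. All values, the shot counts and the column name Run are assumptions for illustration. It also shows one way the coding of replicate can matter: if replicates are labelled 1 and 2 within every species, that label alone identifies only two groups, whereas a label that is unique per run identifies all six.

```r
set.seed(42)
# Hypothetical data: 3 species x 2 replicate runs, 11-19 shots per run
n.shots <- c(11, 19, 15, 12, 17, 14)
design <- data.frame(Species = rep(c("Spec1", "Spec2", "Spec3"), each = 2),
                     Replicate = rep(1:2, times = 3))
dat <- design[rep(seq_len(nrow(design)), times = n.shots), ]
dat$Max_speed <- rnorm(nrow(dat), mean = 20, sd = 5)
dat$Species <- factor(dat$Species)

# A unique label per run (species x replicate), rather than just 1/2
dat$Run <- interaction(dat$Species, dat$Replicate, drop = TRUE)
n.runs <- nlevels(dat$Run)

# Average within each run, then a one-way ANOVA on the six run means
run.means <- aggregate(Max_speed ~ Species + Run, data = dat, FUN = mean)
fit <- aov(Max_speed ~ Species, data = run.means)
res.df <- df.residual(fit)   # 6 run means - 3 species
```

With only two runs per species the residual degrees of freedom are small, which is the limitation on power mentioned above.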

regards

Mike



 Radka Ptacnikova [EMAIL PROTECTED] 25/02/2007 23:32 
Hi,

I wonder if anyone can help me with specifying the right model for my analysis. I 
am a beginner with lme methods and, though I have already spent many hours studying 
various books and on-line help, I was unfortunately not able to find a 
solution to my problem on my own. 

Data structure:
I studied escape behavior of three species of a prey to a predator. The prey 
specimens (many) were in a vessel, together with one predator. Escape responses 
were video-recorded when a prey approached the predator close enough and jumped 
consequently away. Each set was run twice with a fresh predator and a fresh set 
of the prey specimens, leading to two replicates per treatment. Unequal number 
of shots (i.e. prey specimens) were analyzed in each of the two replicates for 
each of the three prey species (range 11-19). The data are therefore unbalanced 
and also variance for treatments/replicates is far from being homogeneous, so 
that a nested anova is not a good choice here. As the number of prey specimens 
was rather high, I assume that each shot represents a different prey 
individual. 

My questions:
1) Do the three prey species significantly differ in their escape response?
2) What was variability between replicates within a species and how much did it 
contribute to overall variability?

Now, to my best understanding, the model should be:

mod1 <- lme(Escape.parameter~Species, random=~1|Species/Replicate)

as I am interested in Species as fixed effects and want to know variability 
caused by Replicates nested within Species as random effects. However, when 
running this model, I get 

Random effects:
 Formula: ~1 | Species
        (Intercept)
StdDev:    2.937479

 Formula: ~1 | Replicate %in% Species
        (Intercept) Residual
StdDev:    4.973931 4.266302

Fixed effects: Max_speed ~ Species
                Value Std.Error DF   t-value p-value
(Intercept) 23.792040  4.798143 39  4.958593       0
Spec2       -7.121766  6.747930  0 -1.055400     NaN
Spec3       -9.779830  6.725391  0 -1.454165     NaN

So I get variance within species and within replicates, but what the hell are 
these zero DF's, leading to zero p's and how should I interpret them?


Another model I tried was:
mod2 <- lme(Escape.parameter~Species, random=~1|Replicate)

Random effects:
 Formula: ~1 | Replicate
         (Intercept) Residual
StdDev: 0.0002733313 5.180472

Fixed effects: Max_speed ~ Species
                 Value Std.Error DF   t-value p-value
(Intercept)   26.00364  1.561971 41 16.647963   0e+00
SpeciesSpec2  -7.93297  2.056430 41 -3.857641   4e-04
SpeciesSpec3 -11.81048  1.962713 41 -6.017425   0e+00

Alright, I get the among species differences, but I am confused here with the 
very low StdDev of Replicate as a random effect, since I know f.ex. from a 
plot, that it is relatively high. Which leads me to thinking, that something is 
wrong here. 

I'd appreciate any hints and suggestions.

Radka






 


[R] Add-up duplicates and merge

2007-02-26 Thread Serguei Kaniovski


Hello,

I have two matrices of data as below. I would like to add up the duplicates
in terms of pairs of names in rows, and then merge the values in the second
matrix to the pairs as two new variables x3 and x4.

Input

,x1,x2
jane.mike,31,43
jane.steve,32,2
jane.steve,5,3
jim.mike,76,5
jane.steve,4,4
mike.steve,54,7
mike.steve,5,7
jane.mike,7,8

and

,y
jane,0.3
jim,0.4
mike,0.1
carl,0.5
john,0.9
steve,0.4
dirk,0.2

Output:

,x1,x2,x3,x4
jane.mike,38,51,0.3,0.1
jane.steve,41,9,0.3,0.4
jim.mike,76,5,0.4,0.1
mike.steve,59,14,0.1,0.4

Any help appreciated,
Serguei

[R] PlotAffyRNAdeg on Estrogen Data

2007-02-26 Thread Brooks, Anthony B

Hi everyone,

I'm trying to generate an RNA degradation plot of the Estrogen example
data plot, but seem to get an error. I've tried defining an ylim value,
ylim=c(0,30) , but it doesn't seem to work either.

 

My code is as follows:

 

> RNAdeg <- AffyRNAdeg(Data)
> png(DegLoc, width=720, height=720)
> par(ann=FALSE)
> par(mar=c(3,3,0.1,0.1))
> plotAffyRNAdeg(RNAdeg,col=cols, cex.axis=1.2)
Error in plot.window(xlim, ylim, log, asp, ...) :
        need finite 'ylim' values
> dev.off()
null device
          1

 

Thanks in advance

Tony



Re: [R] Add-up duplicates and merge

2007-02-26 Thread Theis, Winfried

Hello,

 For the first task, change the names into a factor if they are character,
and have a look at ?by.
For the merge you will need strsplit() on character vectors (so be
careful to change the factor back to character) and merge().

Regards,
Winfried 
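
One possible sketch using the data from the question. It swaps in rowsum() and indexing by name for the by()/merge() route mentioned above, which is a different technique, while still using strsplit() to separate the pair names. The object names x, y, sums, pairs and out are choices made for this example.

```r
# Data from the question: rows named by pairs, columns x1 and x2
x <- matrix(c(31, 43,  32, 2,  5, 3,  76, 5,  4, 4,  54, 7,  5, 7,  7, 8),
            ncol = 2, byrow = TRUE,
            dimnames = list(c("jane.mike", "jane.steve", "jane.steve",
                              "jim.mike", "jane.steve", "mike.steve",
                              "mike.steve", "jane.mike"),
                            c("x1", "x2")))
y <- c(jane = 0.3, jim = 0.4, mike = 0.1, carl = 0.5,
       john = 0.9, steve = 0.4, dirk = 0.2)

# Add up rows that share a pair name (result is sorted by name)
sums <- rowsum(x, group = rownames(x))

# Split "first.second" into two columns and look both names up in y
pairs <- do.call(rbind, strsplit(rownames(sums), ".", fixed = TRUE))
out <- cbind(sums, x3 = y[pairs[, 1]], x4 = y[pairs[, 2]])
```

Printing `out` gives one row per distinct pair with the summed x1 and x2 followed by the two looked-up values.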


Re: [R] Chi Square with two tab-delimited text files

2007-02-26 Thread Petr Klasterecky

Carina Brehony wrote:
> Hi,
> I want to do a chi square test and I have two tab delimited text files with
> Expected and Observed values to compare.  Each file contains only the values
<snip>

There are a lot of chi^2 tests, most of them compare OE quantities and 
it is not clear which one you want to use. I'd guess a goodness of fit 
test, but who knows? See ?chisq.test and the examples given there. It 
also tells you that the y-argument is ignored if x is a matrix (that's 
probably the reason why you get different results using read.table and 
scan).
Petr
-- 
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic


Re: [R] Chi Square with two tab-delimited text files

2007-02-26 Thread Carina Brehony

Yes, I would like to do a goodness-of-fit test.


Re: [R] Chi Square with two tab-delimited text files

2007-02-26 Thread David Barron

It's a bit difficult to advise without knowing what the rows and
columns represent, but why not just calculate the statistic yourself,
given that you already have observed and expected values?  For
example:

chi2 <- sum((y-x)^2/x)
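
A small worked sketch of the hand calculation, with made-up counts. The vectors obs and expd and the choice of degrees of freedom (number of cells minus one, as in a simple goodness-of-fit test with all expected values known) are assumptions for illustration; the right degrees of freedom depend on how the expected values were obtained.

```r
obs  <- c(18, 22, 30, 30)   # hypothetical observed counts
expd <- c(25, 25, 25, 25)   # hypothetical expected counts (same total)

# Pearson statistic: sum over cells of (O - E)^2 / E
chi2 <- sum((obs - expd)^2 / expd)
df <- length(obs) - 1
p <- pchisq(chi2, df, lower.tail = FALSE)

# The same test through chisq.test(), passing expected proportions
ref <- chisq.test(obs, p = expd / sum(expd))
```

Cells with an expected value of zero would make the ratio undefined, so they need separate handling before this formula is applied.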






-- 
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP


Re: [R] If you had just one book on R to buy...

2007-02-26 Thread Ramon Diaz-Uriarte

On 2/25/07, Julien Barnier [EMAIL PROTECTED] wrote:
 Hi,

 I am starting a new job as a study analyst for a social science
 research unit. I would really like to use R as my main tool for data
 manipulation and analysis. So I'd like to ask you, if you had just one
 book on R to buy (or to keep), which one would it be ? I already
 bought the Handbook of Statistical Analysis Using R, but I'd like to
 have something more complete, both on the statistical point of view
 and on R usage.

 I thought that Modern applied statistics with S-Plus would be a good
 choice, but maybe some of you could have interesting suggestions ?



Dear Julien,

I'd definitely go for MASS if you already have Handbook. MASS is an
awesome book, but you did not tell us anything about your background
(stats beginners, for instance, sometimes get lost in MASS, because
they are not the target audience). In terms of books of this level,
MASS is unique. (There are more specific books for certain topics,
such as mixed models, etc; but for a wide coverage, I'd go with MASS).

HTH,

R.



 Thanks in advance,

 --
 Julien




-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz


Re: [R] Chi Square with two tab-delimited text files

2007-02-26 Thread Carina Brehony

Hi,
The files look like the sample below; the rows and columns are counts of genetic
types, e.g. row1 is type 4 and column1 is type A. So, in the row1:column1 cell,
there are 78 type 4/type A combinations.  I hope this makes sense!



78  500 18  6   0   4   0   1   6
1   1   0   0   0   1   0   0   0   0
0   1   0   0   0   0   0   2   1   0
0   0   1   0   0   0   0   23  0   0
0   7   0   0   7   0   0   0   6   0
8   0   0   0   0   0   0   14  0   0
0   0   0   0   0   0   0   5   0   0
0   0   0   0   45  0   0   0   0   0
0   0   0   0   0   0   0   3   0   40
0   0   0   0   0   0   0   0   0   0
0   0   0   12  0   0   0   0   8   4
0   0   0   0   0   0   etc...  






[R] Rdonlp2 0.2-1 released

2007-02-26 Thread Ryuich Tamura

Hello R-lists,

I have released the new version of Rdonlp2
(an extension package to solve nonlinear constrained optimization problem).

0.2-1 improves stability and usability, and runs a little faster.

Windows, OSX binary, and source files are available from:
http://arumat.net/Rdonlp2/

Any feedback is highly welcome.

Regards,

Ryuichi Tamura


Re: [R] Chi Square with two tab-delimited text files

2007-02-26 Thread David Barron

In that case, you can just ignore the expected values and use the
observed values in the chisq.test.  The reason you got a p value of 1
before is because the second argument was ignored, and so you did a
chi square test on the expected values alone.

If you have loaded the observed values into a matrix y using read.table
as in your first example, then just use chisq.test(y).  But you should
notice that you have a lot of zero cells and so probably lots of small
expected values, which is a problem for the chi square test.
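
A short sketch, with made-up counts, of the behaviour described above: when the first argument of chisq.test() is a matrix with more than one row and column, the second argument is ignored and the table itself is tested. The matrices obs and other are invented for this example.

```r
obs   <- matrix(c(12, 5, 7, 9, 3, 14), nrow = 2)   # hypothetical counts
other <- matrix(1:6, nrow = 2)                     # any second matrix

# suppressWarnings() only hides the small-expected-count warning
t1 <- suppressWarnings(chisq.test(obs))
t2 <- suppressWarnings(chisq.test(obs, other))     # 'other' is ignored

same <- identical(t1$statistic, t2$statistic)
```

This matches the first result in the thread, where the reported "data:" line named only x.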






-- 
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP


[R] PLotting R graphics/symbols without user x-y scaling

2007-02-26 Thread Jonathan Lees


Is it possible to add lines or other user-defined graphics to a plot
in R that do not depend on the user scale for the plot?

For example, I have a plot
plot(x,y)
and I want to add some graphic that is scaled in inches or cm, but I do
not want the graphic to change when the x-y scales are changed (like a
thermometer, scale bar or other symbol). How does one do this?

I want to build my own library of glyphs to add to plots,
but I do not know how to plot them when their
size is independent of the device/user coordinates.

Is it possible to add to the list
of symbols in the function symbols()
other than:
  _circles_, _squares_, _rectangles_, _stars_, _thermometers_, and
  _boxplots_

Can I make my own symbols and have symbols() call these?
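
One possible approach with base graphics, sketched under assumptions: `xinch()` and `yinch()` convert a length in inches into user coordinates for the current plot, so a glyph sized with them keeps its physical size when the axis limits change. The helper `draw_bar_glyph()` and its arguments are hypothetical names for illustration; `symbols()` itself is not extended here.

```r
# Hypothetical helper: a rectangle of fixed size in inches, centred at (x, y)
draw_bar_glyph <- function(x, y, width_in = 0.5, height_in = 0.1, ...) {
  w <- xinch(width_in)    # inches -> user units along x for the current plot
  h <- yinch(height_in)   # inches -> user units along y for the current plot
  rect(x - w / 2, y - h / 2, x + w / 2, y + h / 2, ...)
  invisible(c(w = w, h = h))
}

pdf(NULL)                                 # null device so the sketch runs non-interactively
plot(1:10, (1:10)^2)
g_narrow <- draw_bar_glyph(5, 50, col = "grey")
plot(1:10, (1:10)^2, xlim = c(0, 100))    # much wider x range, same device size
g_wide <- draw_bar_glyph(5, 50, col = "grey")
invisible(dev.off())
```

The glyph covers more user units on the second plot precisely because its size in inches is held constant.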


Thanks-


-- 
Jonathan M. Lees
Professor
THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL
Department of Geological Sciences
Campus Box #3315
Chapel Hill, NC  27599-3315
TEL: (919) 962-0695
FAX: (919) 966-4519
[EMAIL PROTECTED]
http://www.unc.edu/~leesj


Re: [R] Chi Square with two tab-delimited text files

2007-02-26 Thread Carina Brehony

Hi,
Thanks for the input.  I have tried the test again just using the Observed
values and the read.table() function and get this:

data:  y 
X-squared = NaN, df = 5405, p-value = NA

Warning message:
Chi-squared approximation may be incorrect in: chisq.test(y)


So it doesn't seem to like it!  I guess the zeroes are a problem for it.  Is
there another way around? Do I need to have the totals of each column and
row in the file also?
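
A possible reason for the NaN, sketched on made-up counts: when chisq.test() is given a matrix it runs a contingency-table test, and any row or column that sums to zero has expected counts of zero, so the statistic contains 0/0. The matrix `m` below is invented for illustration, and whether dropping empty rows and columns is appropriate depends on the analysis.

```r
# Made-up 3 x 3 table whose third row is all zeros
m <- matrix(c(10, 5, 0,
               8, 7, 0,
               6, 9, 0), nrow = 3)

bad <- suppressWarnings(chisq.test(m))    # expected counts of 0 lead to 0/0

# Drop rows and columns that sum to zero before testing
m2 <- m[rowSums(m) > 0, colSums(m) > 0, drop = FALSE]
ok <- chisq.test(m2)
```

Row and column totals do not need to be stored in the file, since chisq.test() computes them from the table itself.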

Thanks,
Carina


Re: [R] PlotAffyRNAdeg on Estrogen Data

2007-02-26 Thread James W. MacDonald

Hi Tony,

This question concerns a Bioconductor package, so is best asked on the 
BioC mailing list instead of R-help.

Best,

Jim

Brooks, Anthony B wrote:
 Hi everyone,
 
 I'm trying to generate an RNA degradation plot of the Estrogen example
 data, but seem to get an error. I've tried defining a ylim value,
 ylim=c(0,30), but it doesn't seem to work either.
 
 My code is as follows:
 
 RNAdeg <- AffyRNAdeg(Data)
 png(DegLoc, width=720, height=720)
 par(ann=FALSE)
 par(mar=c(3,3,0.1,0.1))
 plotAffyRNAdeg(RNAdeg, col=cols, cex.axis=1.2)
 Error in plot.window(xlim, ylim, log, asp, ...) : 
   need finite 'ylim' values
 dev.off()
 null device 
           1
 
 Thanks in advance
 
 Tony


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623



[R] Test of Presence Matrix HOWTO?

2007-02-26 Thread Johannes Graumann

Hello,

Imagine 3 lists like so:

 a <- list("A","B","C","D")
 b <- list("A","B","E","F")
 c <- list("A","C","E","G")

What I need (vennDiagram) is a matrix characterizing with 1 or 0 whether any
given member is present or not like so:
 x1 x2 x3
[1,]  1  1  1
[2,]  1  1  0
[3,]  1  0  1
[4,]  1  0  0
[5,]  0  1  1
[6,]  0  1  0
[7,]  0  0  1

(where the rows represent A-G and the columns a-c, respectively).

 table(c(a,b,c))
will give me a quick answer for the 1 1 1 case, but how to deal with the
other cases efficiently without looping over each string and looking for
membership %in% each list?

Thanks for enlightening the learning,

Joh


[R] someattributes

2007-02-26 Thread Thaden, John J

I'd like to use someattributes(), as described 
in documentation for R version 2.4.1 (windows build)  

help(attributes)

however, someattributes() does not seem to exist.

 someattributes()
Error: could not find function "someattributes"

Is this true or am I doing something wrong?

-John


Re: [R] Test of Presence Matrix HOWTO?

2007-02-26 Thread ONKELINX, Thierry

> a <- c("A","B","C","D")
> b <- c("A","B","E","F")
> c <- c("A","C","E","G")
> Df <- cbind(a, b, c)
> apply(Df, 2, function(x)(LETTERS[1:7] %in% x))
         a     b     c
[1,]  TRUE  TRUE  TRUE
[2,]  TRUE  TRUE FALSE
[3,]  TRUE FALSE  TRUE
[4,]  TRUE FALSE FALSE
[5,] FALSE  TRUE  TRUE
[6,] FALSE  TRUE FALSE
[7,] FALSE FALSE  TRUE
> 
> apply(Df, 2, function(x)(as.numeric(LETTERS[1:7] %in% x)))
     a b c
[1,] 1 1 1
[2,] 1 1 0
[3,] 1 0 1
[4,] 1 0 0
[5,] 0 1 1
[6,] 0 1 0
[7,] 0 0 1


Cheers,

Thierry



ir. Thierry Onkelinx

Instituut voor natuur- en bosonderzoek / Reseach Institute for Nature
and Forest

Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance

Gaverstraat 4

9500 Geraardsbergen

Belgium

tel. + 32 54/436 185

[EMAIL PROTECTED]

www.inbo.be 

 

Do not put your faith in what statistics say until you have carefully
considered what they do not say.  ~William W. Watt

A statistical analysis, properly conducted, is a delicate dissection of
uncertainties, a surgery of suppositions. ~M.J.Moroney



Re: [R] Test of Presence Matrix HOWTO?

2007-02-26 Thread Dimitris Rizopoulos

you can use something like the following:

a <- list("A","B","C","D")
b <- list("A","B","E","F")
c <- list("A","C","E","G")

#

abc <- list(a, b, c)
unq.abc <- unique(unlist(abc))

out.lis <- lapply(abc, "%in%", x = unq.abc)
out.lis
lapply(out.lis, as.numeric)
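
The same idea can be sketched so that it returns the 0/1 matrix directly, assuming the sets are held as character vectors in a named list. The names `sets`, `members` and `presence` are illustrative and not from the thread.

```r
# The three sets from the question, stored as character vectors
sets <- list(a = c("A", "B", "C", "D"),
             b = c("A", "B", "E", "F"),
             c = c("A", "C", "E", "G"))

members  <- sort(unique(unlist(sets)))    # all distinct members, A to G
presence <- sapply(sets, function(s) as.integer(members %in% s))
rownames(presence) <- members             # one row per member, one column per set
```

sapply() binds the three equal-length indicator vectors into a 7 x 3 matrix, so no explicit loop over the members is needed.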


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm



[R] LD50 contrasts with lmer/lme4

2007-02-26 Thread Dieter Menne

Dear R-list,

I have a data set from 20 pigs, each of which is tested at crossed 9 doses
(logdose -4:4) and 3 skin treatment substances when exposed to a standard
polluted environment. So there are 27 patches on each pig. The response is
irritation=yes/no.

I want to determine equally effective 50% doses (similar to the old LD50),
and to test the treatments against each other. I am looking for something
like dose.p in MASS generalized to lmer (or glmmPQL or whatever). The direct
estimates as output by lmer are not useful, because saying "30% irritation
with A and 40% with B at dose xx" has less meaning than giving equivalent
effective doses.

Dieter

- Simulated data -
library(lme4)
animal = data.frame(ID = as.factor(1:20), da = rnorm(1:20))
treat = data.frame(treat=c('A','B','C'), treatoff=c(1,2,1.5),
   treatslope = c(0.5,0.6,0.7))

gr = expand.grid(animal=animal$ID,treat=treat$treat,logdose=c(-4:4))
gr$resp = as.integer(treat$treatoff[gr$treat]+
  treat$treatslope[gr$treat]*gr$logdose+
  animal$da[gr$animal] +  rnorm(nrow(gr),0,2) > 0)

gr.lmer = lmer(resp ~ treat*logdose+(1|animal),data=gr,family=binomial)
summary(gr.lmer)

--- Output
Fixed effects:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)      0.9553     0.3074    3.11   0.0019 **
treatB           0.8793     0.3313    2.65   0.0079 **
treatC           0.5516     0.3077    1.79   0.0730 .
logdose          0.3733     0.0774    4.82  1.4e-06 ***
treatB:logdose   0.3081     0.1323    2.33   0.0198 *
treatC:logdose   0.2666     0.1249    2.13   0.0328 *

- Goal
                   Value  SD  p
50% logdose (A-B)  xx     xx  xx
50% logdose (A-C)  yy     yy  yy
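
One way the 50% doses could be derived from the fixed effects, sketched under assumptions: on the logit scale the 50% point is where the linear predictor is zero, i.e. minus the intercept terms divided by the slope terms, and a delta-method standard error follows from the coefficient covariance matrix. The helper `ed50()` is a hypothetical name, and the covariance matrix used below is only a placeholder built from the printed standard errors with all covariances set to zero. A real analysis would need the full matrix from the fitted model (for example via vcov()).

```r
# Fixed-effect estimates copied from the posted summary
beta <- c("(Intercept)" = 0.9553, treatB = 0.8793, treatC = 0.5516,
          logdose = 0.3733, "treatB:logdose" = 0.3081, "treatC:logdose" = 0.2666)

# Hypothetical helper: logdose at which the linear predictor is 0 (p = 0.5),
# computed as -(sum of intercept terms) / (sum of slope terms),
# with a delta-method standard error based on covariance matrix V.
ed50 <- function(beta, V, num, den) {
  N <- sum(beta[num])
  D <- sum(beta[den])
  grad <- setNames(numeric(length(beta)), names(beta))
  grad[num] <- -1 / D        # derivative of -N/D with respect to N
  grad[den] <- N / D^2       # derivative of -N/D with respect to D
  c(estimate = -N / D, se = sqrt(drop(t(grad) %*% V %*% grad)))
}

# PLACEHOLDER covariance: squared standard errors on the diagonal, zero
# covariances. Not valid for inference; shown only so the sketch runs.
se <- c(0.3074, 0.3313, 0.3077, 0.0774, 0.1323, 0.1249)
V  <- diag(se^2)

ed50_A <- ed50(beta, V, num = "(Intercept)", den = "logdose")
ed50_B <- ed50(beta, V, num = c("(Intercept)", "treatB"),
               den = c("logdose", "treatB:logdose"))
```

A contrast such as A-B would need the gradient of the difference of the two ratios, built the same way, and this sketch does not address how the random animal effect should enter the interpretation.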


Re: [R] someattributes

2007-02-26 Thread Charilaos Skiadas


On Feb 26, 2007, at 10:35 AM, Thaden, John J wrote:

 I'd like to use someattributes(), as described
 in documentation for R version 2.4.1 (windows build)

 help(attributes)

 however, someattributes() does not seem to exist.

 someattributes()
 Error: could not find function someattributes

 Is this true or am I doing something wrong?

My help shows it as moreattributes, not someattributes. (MacOSX,  
though doesn't sound like it should be platform-specific).
 -John

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College


Re: [R] Double-banger function names: preferences and suggestions

2007-02-26 Thread hadley wickham

Thanks to every one who contributed - there definitely isn't a
consensus, but perhaps a slight preference towards number 3
(camelCase).  I'm not sure yet if this beats out my personal
preference for number 2.

Hadley

On 2/25/07, hadley wickham [EMAIL PROTECTED] wrote:
 What do you prefer/recommend for double-banger function names:

  1 scale.colour
  2 scale_colour
  3 scaleColour

 1 is more R-like, but conflicts with S3.  2 is a modern version of
 number 1, but not many packages use it.  Number 3 is more java-like.
 (I like number 2 best)

 Any suggestions?

 Thanks,

 Hadley



[R] Barplot Graph

2007-02-26 Thread Mohsen Jafarikia

Hello everyone:



I want to draw a bar plot, but I don't know how to choose different fill
patterns for different bars so that they can be distinguished in a black
and white print. I want some kind of pattern on the bars, such as '///' or
'\\\'.

Suppose I have the following data in the R.dat file:

1.29  22.43  1.92
5.08  18.70  0.00
2.19  33.69  1.92
2.95  20.39  0.00
3.29  36.16  4.48

and I am using the following code:

MP <- read.table(file='R.dat')
names(MP) <- c('BL','LR','Q')
cols <-  (I want white when the 'Q' column is zero, one kind of pattern
when 'Q' is 1.92, and another pattern when 'Q' is 4.48)
Graph <- barplot(MP$LR, col=cols, width=(MP$BL))
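
A minimal sketch of one way to get hatched bars, assuming the `density` and `angle` arguments of barplot() are acceptable in place of colours: a density of 0 leaves a bar unfilled, and the angle sets the direction of the shading lines. The vectors `dens` and `ang` are illustrative names, and the data are typed in rather than read from R.dat.

```r
# Data copied from the message (columns BL, LR, Q)
MP <- data.frame(BL = c(1.29, 5.08, 2.19, 2.95, 3.29),
                 LR = c(22.43, 18.70, 33.69, 20.39, 36.16),
                 Q  = c(1.92, 0.00, 1.92, 0.00, 4.48))

# density = 0 leaves the bar empty; 20 shading lines per inch otherwise
dens <- ifelse(MP$Q == 0, 0, 20)
# 45 degrees draws '///' and 135 degrees draws the mirrored pattern
ang  <- ifelse(MP$Q == 4.48, 135, 45)

pdf(NULL)    # null device so the sketch runs non-interactively
mids <- barplot(MP$LR, width = MP$BL, density = dens, angle = ang, col = "black")
invisible(dev.off())
```

With more than two non-zero values of Q, the angle or density could be varied further to keep the bars distinguishable.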



Thanks


[R] Adding duplicates by rows

2007-02-26 Thread Serguei Kaniovski


Hi,

I am trying to add duplicates of matrix mat by row. Commands

subset(mat,duplicated(rownames(mat)))

or

mat[which(duplicated(rownames(mat))),]

return only half of the required indices. How can I find the remaining
ones, ie the matches, so that I can add them up?

Thanks,
Serguei

___

Austrian Institute of Economic Research (WIFO)

Name: Serguei Kaniovski   P.O.Box 91
Tel.: +43-1-7982601-231 Arsenal Objekt 20
Fax: +43-1-7989386  1103 Vienna, Austria
Mail: [EMAIL PROTECTED]  A-1030 Wien

http://www.wifo.ac.at/Serguei.Kaniovski

[R] survival analysis using rpart

2007-02-26 Thread Walter345


Hello,

I use rpart to predict survival time and have a problem in interpreting the
output of “estimated rate”. Here is an example of what I do:

stagec <- read.table("http://www.stanford.edu/class/stats202/DATA/stagec.data",
                     col.names=c("pgtime", "pgstat", "age", "eet", "g2", "grade",
                                 "gleason", "ploidy"))

fit <- rpart(Surv(pgtime, pgstat) ~ age + eet + g2 + grade + gleason + ploidy,
             data=stagec)


Result:

 1) root 146 195.411600 1.000  
   2) grade< 2.5 61  45.021520 0.3624701  
     4) g2< 11.36 33   9.120116 0.1225562 *
     5) g2>=11.36 28  27.804100 0.7335298  
      10) gleason< 5.5 20  14.376900 0.5292190 *
      11) gleason>=5.5 8  11.201470 1.3083680 *
   3) grade>=2.5 85 125.327400 1.6190620  
     6) age>=56.5 75 104.154700 1.4287310  
      12) gleason< 7.5 50  66.701410 1.1431320 *
      13) gleason>=7.5 25  33.993130 2.0355220  
        26) g2>=15.29 13  16.555970 1.3494740 *
        27) g2< 15.29 12  14.220260 2.9210480 *
     7) age< 56.5 10  15.522810 3.1977430 *

Let’s look at the terminal node 4:

 #  PGTIME    PGSTAGE  AGE  EET  G2     GRADE  GLEASON  PLOIDY
 1   8.657084  0        70   1    4.43   1      3        1
 2  16.70088   0        56   2    5.29   1      3        1
 3   3.162217  1        62   2    3.57   2      4        1
 4  10.20123   0        63   2    5.14   2      5        1
 5   4.479124  0        63   2    5.75   2      5        1
 6   6.516084  0        66   2    5.92   2      5        1
 7   4.936345  0        67   2    6.41   2      5        1
 8  10.79808   0        72   1    6.68   2      NA       1
 9   9.174537  0        62   1    6.74   2      5        1
10  10.87474   0        72   2    6.8    2      5        1
11   7.028062  0        52   2    7.15   2      7        1
12  11.36481   0        59   2    7.61   2      5        1
13  10.17659   0        64   1    7.61   2      NA       1
14   6.96783   0        67   2    7.78   2      6        1
15  10.61738   0        55   2    7.81   2      5        1
16   6.510609  0        70   1    7.88   2      6        1
17  10.36276   0        55   2    8.1    2      5        1
18   6.694045  0        54   2    8.11   2      4        1
19  11.718     0        61   2    8.4    2      5        1
20   7.301847  0        69   2    8.46   2      5        1
21   6.067077  0        69   2    8.58   2      6        1
22   8.353182  0        59   2    8.76   2      6        1
23   5.541409  0        59   1    9.01   2      5        1
24   5.492128  0        61   2    9.42   2      5        1
25   7.208761  0        63   1    9.76   2      5        1
26   6.004106  0        52   2    9.9    2      4        1
27   5.664613  0        71   1   10.16   2      6        1
28   6.130047  0        64   2   10.26   2      4        1
29   9.812457  0        64   1   10.51   2      5        1
30   6.275154  0        62   2   10.82   2      6        1
31   9.253935  0        61   2   11.23   2      5        1
32   5.201916  0        54   2   11.35   2      6        1
33   6.22861   0        65   2   11.35   2      5        1

Here we have 33 observations and 1 event. The “estimated rate” is 0.1225562.
My questions are:

(1) Is the “estimated rate” the estimated hazard rate ratio? 
(2) How does rpart calculate this rate?
(3) Suppose I use xpred.rpart(fit, xval=10) to perform 10-fold
cross-validation using (a) the complete stagec data set and (b) only a
subset of it, say, using the columns Age, EET, and G2 only. For the i-th
patient, I am likely to obtain a different estimated rate. How can I
meaningfully compare both rates? How can say which one is “better”? 

Thanks a lot for all comments!
Walter






[R] Partial whitening of time series?

2007-02-26 Thread Andy Bunn

I have a time series with a one year lag, ar=0.5. The series has some
interesting events that disappear when the series is whitened (i.e.,
fitting an AR process and looking at the residuals). I'd like to remove
the autocorrelation in stages to see the effect on the time series. Is
there a way to specify the autocorrelation term while fitting an AR
process? 

For instance, given the following:

x <- arima.sim(model = list(order = c(1,0,0), ar = 0.5), n = 500,
               sd=0.25)

Can I filter x in a way that the autocorrelation at lag one is 0.4, then
0.3, 0.2, 0.1, until I get to a clean series equivalent to:

y <- arima(x, order = c(1,0,0))$resid

Thanks in advance, 
Andy


Re: [R] Adding duplicates by rows

2007-02-26 Thread Chuck Cleland

Serguei Kaniovski wrote:
 Hi,
 
 I am trying to add duplicates of matrix mat by row. Commands
 
 subset(mat,duplicated(rownames(mat)))
 
 or
 
 mat[which(duplicated(rownames(mat))),]
 
 return only half of the required indices. How can I find the remaining
 ones, ie the matches, so that I can add them up?

mat <- matrix(runif(70), ncol=5)
rownames(mat) <- c("Z", rep(LETTERS[1:6], each=2), "G")

  There is probably a more elegant way, but this seems to do what you want:

mat[rownames(mat) %in% names(which(table(rownames(mat)) > 1)),]

  Also, have you considered aggregate()?

aggregate(mat, list(ROW = rownames(mat)), sum)
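
As a further option beyond the two shown above, base R's rowsum() adds up the rows of a matrix that share a group label. A minimal sketch on a small made-up matrix (the values and row names are invented for illustration):

```r
# Small example: row name "A" occurs twice
mat <- matrix(1:12, ncol = 3,
              dimnames = list(c("A", "B", "A", "C"), NULL))

# One output row per distinct row name, holding the column-wise sums
summed <- rowsum(mat, group = rownames(mat))
```

Unlike aggregate(), the result stays a matrix, with the group labels as row names.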


-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894


Re: [R] Partial whitening of time series?

2007-02-26 Thread Wensui Liu

Andy,

If your model is X[t] = 0.5 * X[t-1] + e, then it can be rewritten as
X[t] = 0.1 * X[t-1] + 0.4 * X[t-1] + e
(X[t] - 0.1 * X[t-1]) = 0.4 * X[t-1] + e

So what you need to do is subtract part of the lagged series from your
series. It is just my $0.02.
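
The suggestion above can be sketched as follows, assuming the AR coefficient is known to be 0.5. The helpers `partial_whiten()` and `lag1()` are hypothetical names. Note that the filtered series is no longer a pure AR(1) process, so its sample lag-1 autocorrelation need not equal the remaining coefficient exactly.

```r
set.seed(42)   # arbitrary seed so the sketch is reproducible
x <- arima.sim(model = list(order = c(1, 0, 0), ar = 0.5), n = 500, sd = 0.25)

# Hypothetical helper: subtract phi times the previous value from each observation
partial_whiten <- function(x, phi) {
  x <- as.numeric(x)
  x[-1] - phi * x[-length(x)]     # result is one observation shorter than x
}

# Sample autocorrelation at lag 1
lag1 <- function(z) acf(z, plot = FALSE)$acf[2]

phis <- c(0, 0.1, 0.2, 0.3, 0.4, 0.5)
r1 <- sapply(phis, function(p) lag1(partial_whiten(x, p)))
```

Plotting `partial_whiten(x, p)` for each value in `phis` would show how the events of interest fade as more of the lag-1 dependence is removed.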




-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)


[R] eigenvalue ordering

2007-02-26 Thread Kaustubh Patil

Hi all,
 
 Is it possible to get unordered eigenvalues and eigenvectors of a symmetric 
matrix in R?
 
 Any help appreciated.
 
 Regards,
 Kaustubh
 
 

Re: [R] eigenvalue ordering

2007-02-26 Thread Alberto Monteiro

Kaustubh Patil wrote:
 
  Is it possible to get unordered eigenvalues and eigenvectors of a 
 symmetric matrix in R?
 
Yes, see help(eigen).

If you are strict about the unordered part, do a sample(set, size)
to randomize the eigenvalues.

Alberto Monteiro


Re: [R] eigenvalue ordering

2007-02-26 Thread Peter Dalgaard

Alberto Monteiro wrote:
 Kaustubh Patil wrote:
   
  Is it possible to get unordered eigenvalues and eigenvectors of a 
 symmetric matrix in R?
 Yes, see help(eigen).
   
Er, where do you see anything about (un)order? As far as I know, there's 
no natural ordering of eigenvalues, and eigenvalue algorithms generally 
find them in either increasing or decreasing order (or closest to a
specified value).
 If you are strict about the unordered part, do a sample(set, size)
 to randomize the eigenvalues.



Re: [R] Macros in R

2007-02-26 Thread Don MacQueen

If I understand the question correctly, I would do this:


for (i in 1:54) assign(paste('input', i, sep=''), matrix(dataset$variable, nrow=1))


You now have 54 matrices, named input1, input2, ... input54, each 
having 1 row and as many columns as dataset$variable is long.
(also, they're identical, since all are created from the same object, 
dataset$variable)

See,  of course, the help page for assign() to see why this works.

However, I do wonder, in the bigger picture of what you're trying to 
do, whether there isn't a better way. For example, why matrices, 
since they all have only one row?
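
One alternative along those lines, sketched under assumptions: keep the 54 matrices in a single named list instead of 54 separate variables. Here `dataset` is a hypothetical stand-in for the poster's data, and `inputs` is an illustrative name.

```r
# Hypothetical stand-in for the poster's data: one variable of length 107
dataset <- data.frame(variable = seq_len(107))

# One 1 x 107 matrix per list element, named input1 ... input54
inputs <- lapply(1:54, function(i) matrix(dataset$variable, nrow = 1))
names(inputs) <- paste("input", 1:54, sep = "")

dim(inputs[["input7"]])   # access by name; inputs[[7]] works as well
```

A list can be looped over with lapply() or indexed by number, which avoids building variable names with paste() and assign().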

-Don

At 5:02 PM +0100 2/25/07, Monika Kerekes wrote:
Dear members,



I have started to work with R recently and there is one thing which I could
not solve so far. I don't know how to define macros in R. The problem at
hand is the following: I want R to go through a list of 1:54 and create the
matrices input1, input2, input3 up to input54. I have tried the following:



for ( i in 1:54) {

   input[i] = matrix(nrow = 1, ncol = 107)

   input[i][1,]=datset$variable

}



However, R never creates the required matrices. I have also tried to type
input'i' and input$i, none of which worked. I would be very grateful for
help as this is a basic question the answer of which is paramount to any
further usage of the software.



Thank you very much



Monika






-- 
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA


Re: [R] eigenvalue ordering

2007-02-26 Thread Alberto Monteiro


Peter Dalgaard wrote:

  Is it possible to get unordered eigenvalues and eigenvectors of a 
 symmetric matrix in R?

 Yes, see help(eigen).
   
 Er, where do you see anything about (un)order? As far as I know, 
 there's no natural ordering of eigenvalues and eigenvalue 
 algorithms generally find them  in  either increasing or decreasing 
 order (or closest to specified value).

eigen orders the values. From help(eigen):

  values: a vector containing the p eigenvalues of 'x', sorted in
  _decreasing_ order, according to 'Mod(values)' in the
  asymmetric case when they might be complex (even for real
  matrices).  For real asymmetric matrices the vector will be
  complex only if complex conjugate pairs of eigenvalues are
  detected. 

So, if you are strict about getting unordered eigenvalues,
you must shuffle them :-)
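
A short sketch of that shuffle, assuming the eigenvectors should stay paired with their eigenvalues: the same permutation is applied to the values and to the columns of the vector matrix. The matrix `S` is a made-up symmetric example.

```r
S <- matrix(c(2, 1, 0,
              1, 3, 1,
              0, 1, 4), nrow = 3)           # small symmetric example matrix
e <- eigen(S, symmetric = TRUE)             # values come back in decreasing order

set.seed(1)                                 # arbitrary seed for a reproducible shuffle
perm <- sample(length(e$values))
vals <- e$values[perm]
vecs <- e$vectors[, perm, drop = FALSE]     # same permutation keeps the pairs together

# Each column v of vecs should still satisfy S v = lambda v
resid <- max(abs(S %*% vecs - vecs %*% diag(vals)))
```

Any other fixed ordering can be imposed the same way by choosing `perm` deliberately instead of sampling it.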

Alberto Monteiro


[R] 2 data frames - list in one out put , matrix in another ??

2007-02-26 Thread John Kane

I have two more or less parallel data frames that are
giving me different results on one subset of
variables.  I know that I assembled the 2 data frames
slightly differently, but I don't see why I am getting
this result just because one set of variables is labelled
and the other is not. Variable names are the same,
etc., as far as I can ascertain.  The only difference
seems to be that the bdata variables are labelled.  

About now I really don't care which I get but I would
like them to be the same.  Can anyone suggest what I
am doing wrong or should be looking at?

Windows XP , R 2.4.1 Using Hmisc and gtools as well as
the basic R installation.  

Problem

load("adata")
fn1 <- function(x) {table(x)}
jj <- apply(adata[,110:127], 2, fn1)

OUTPUT jj is a list of 18 tables
Examine a variable:
typeof(adata$act.toy)
[1] "integer"
 class(adata$act.toy)
[1] "integer"


load("bdata")
fn1 <- function(x) {table(x)}
kk <- apply(bdata[,94:111], 2, fn1)

OUTPUT kk is a matrix 2 X 18
   class(bdata$act.toy)
[1] "labelled"
typeof(bdata$act.toy)
[1] "integer"
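
One possible explanation, not confirmed anywhere in the thread: apply() simplifies its result to a matrix only when every call returns a vector of the same length, and otherwise returns a list. If every variable in bdata happens to take exactly two distinct values while those in adata do not, the outputs would differ regardless of labelling. A sketch on made-up data:

```r
# Made-up data: column b has 3 distinct values in d_list but 2 in d_mat
d_list <- data.frame(a = c(1, 1, 2, 2), b = c(1, 2, 3, 3))
d_mat  <- data.frame(a = c(1, 1, 2, 2), b = c(3, 4, 4, 4))

r_list <- apply(d_list, 2, table)   # tables of length 2 and 3 -> list
r_mat  <- apply(d_mat,  2, table)   # both tables of length 2  -> 2 x 2 matrix

r_same <- lapply(d_mat, table)      # lapply() always returns a list
```

Using lapply() on the data frame columns gives the same kind of result for both data sets.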


Re: [R] someattributes

2007-02-26 Thread Thaden, John J

I had written

 ...someattributes() does not seem to exist.

And Haris Skiadas replied

 My help shows it as moreattributes, not 
 someattributes. (MacOSX), though doesn't 
 sound like it should be platform-specific).

Thanks for correcting me.  Actually, my windows
R documentation says mostattributes(), but it
makes no difference -- none of the three show
up as function names or R objects.

-John 


[R] returns from dnorm and dmvnorm

2007-02-26 Thread A Hailu

Hi All,
Why would calls to dnorm and dmvnorm return values that are above 1? For
example,
 dnorm(0.3,mean=0, sd=0.1)
[1] 3.989423

This is happening on two different installations of R that I have.

Thank you.

Hailu


[R] hierarchical clustering using cutree

2007-02-26 Thread Jun Ding

Hi Everyone, 

I am doing hierarchical clustering analysis and have a
question regarding cutree. 

I am doing things like this:

hc <- hclust(dist(X))
a <- cutree(hc, k=2)

Basically a is a vector containing the assignments
of 1 or 2 for each sample. May I know how cutree
decides to assign 1 and 2's to each sample (in other
words, how clusters 1 and 2 are decided)? I am having
the feeling that the first sample will always be
assigned to Cluster 1, but I am not sure about this. 
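
An empirical check rather than a statement of what the documentation guarantees: in the sketch below, the labels returned by cutree() are numbered in the order in which the clusters first appear among the observations, which would put the first sample in cluster 1. The data `X` are random and made up for illustration.

```r
set.seed(1)                          # arbitrary seed; X is made-up data
X  <- matrix(rnorm(40), ncol = 2)    # 20 observations in 2 dimensions
hc <- hclust(dist(X))
a  <- cutree(hc, k = 3)

first_seen <- unique(a)              # cluster labels in order of first appearance
```

If a different numbering is wanted, for example by cluster size, the labels can be recoded afterwards.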

Thank you!

Best,
Jun 


 


Re: [R] returns from dnorm and dmvnorm

2007-02-26 Thread Benilton Carvalho

well, nobody said that the density must be smaller than 1, right? :-)

it's just the value of the normal density function at the point you  
asked. you may try doing that by hand and, with the correct math,  
you'll get the same thing.
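
The hand calculation mentioned above can be sketched as follows. A density is a height, not a probability, so it can exceed 1 as long as the area under the curve is 1. The value 3.989423 is the height of the normal density with sd = 0.1 at its mean, 1/(0.1 * sqrt(2 * pi)). The variable names are illustrative.

```r
sd0    <- 0.1
peak   <- dnorm(0, mean = 0, sd = sd0)       # density at the mean
byhand <- 1 / (sd0 * sqrt(2 * pi))           # the same value from the formula

# The area under the density is (numerically) 1 even though the peak exceeds 1
area <- integrate(dnorm, lower = -1, upper = 1, mean = 0, sd = sd0)$value

# Probabilities come from pnorm() and never exceed 1
p <- pnorm(0.05, mean = 0, sd = sd0) - pnorm(-0.05, mean = 0, sd = sd0)
```

The smaller the standard deviation, the taller the peak, because the same total area is squeezed into a narrower range.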

b

On Feb 26, 2007, at 3:03 PM, A Hailu wrote:

 Hi All,
 Why would calls to dnorm and dmvnorm return values that are above  
 1? For
 example,
 dnorm(0.3,mean=0, sd=0.1)
 [1] 3.989423

 This is happening on two different installations of R that I have.

 Thank you.

 Hailu


Re: [R] someattributes

2007-02-26 Thread Charilaos Skiadas

On Feb 26, 2007, at 3:00 PM, Thaden, John J wrote:

 Thanks for correcting me.  Actually, my windows
 R documentation says mostattributes(), but it
 makes no difference -- none of the three show
 up as function names or R objects.

That's because there is no mostattributes function, it only works  
as an assignment:

?"mostattributes<-"

Example:

 > x <- c(2,3,4)
 > mostattributes(x) <- list(foo="bar")
 > x
[1] 2 3 4
attr(,"foo")
[1] "bar"

 -John

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College


Re: [R] someattributes

2007-02-26 Thread Gabor Grothendieck

On 2/26/07, Thaden, John J [EMAIL PROTECTED] wrote:
 I had written

  ...someattributes() does not seem to exist.

 And Haris Skiadas replied

  My help shows it as moreattributes, not
  someattributes. (MacOSX), though doesn't
  sound like it should be platform-specific).

 Thanks for correcting me.  Actually, my windows
 R documentation says mostattributes(), but it
 makes no difference -- none of the three show
 up as function names or R objects.

Try:

?"mostattributes<-"


Re: [R] returns from dnorm and dmvnorm

2007-02-26 Thread Charilaos Skiadas

On Feb 26, 2007, at 3:03 PM, A Hailu wrote:
 Hi All,
 Why would calls to dnorm and dmvnorm return values that are above  
 1? For
 example,
 dnorm(0.3,mean=0, sd=0.1)
 [1] 3.989423

Because dnorm gives you the density function, whose integral is the  
distribution function, which is likely what you want. Try:

pnorm(0.3,mean=0, sd=0.1)

 This is happening on two different installations of R that I have.

 Thank you.

 Hailu

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College


Re: [R] returns from dnorm and dmvnorm

2007-02-26 Thread Ravi Varadhan

I guarantee that it would also happen on all future versions of R.

Why would you expect density to be smaller than 1?  The only constraints on
density are that (a) it is non-negative and (b) it integrates to one. The
smaller the variance, the greater the density is around its center.  Density
can be made arbitrarily large by letting the variance get close
to zero, and in the limit you will obtain Dirac's delta function.  
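
The point can be checked numerically. The sketch below is an illustration only: it evaluates the normal density formula by hand at the mean with sd = 0.1 (which is where the value 3.989423 arises), confirms that the density still integrates to one, and shows that pnorm, being a probability, stays below 1. The variable names and the finite integration range of plus or minus 10 standard deviations are assumptions made for the example.

```r
sigma <- 0.1

# Height of the density at its centre: 1 / (sigma * sqrt(2 * pi))
peak_by_hand <- 1 / (sigma * sqrt(2 * pi))
peak_dnorm   <- dnorm(0, mean = 0, sd = sigma)

# The area under the curve is still (numerically) one
area <- integrate(dnorm, lower = -1, upper = 1, mean = 0, sd = sigma)$value

# A probability, by contrast, cannot exceed 1
prob <- pnorm(0.3, mean = 0, sd = sigma)

c(peak_by_hand = peak_by_hand, peak_dnorm = peak_dnorm, area = area, prob = prob)
```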


Ravi. 


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of A Hailu
Sent: Monday, February 26, 2007 3:04 PM
To: r-help@stat.math.ethz.ch
Subject: [R] returns from dnorm and dmvnorm

Hi All,
Why would calls to dnorm and dmvnorm return values that are above 1? For
example,
 dnorm(0.3,mean=0, sd=0.1)
[1] 3.989423

This is happening on two different installations of R that I have.

Thank you.

Hailu


Re: [R] someattributes (actually, mostattributes)

2007-02-26 Thread Thaden, John J

Haris Skiadas replied

 Thanks for correcting me.  Actually, my windows
 R documentation says mostattributes(), but it
 makes no difference -- none of the three show
 up as function names or R objects.

 That's because there is no mostattributes function,
 it only works as an assignment:
 
 ?"mostattributes<-"

Thanks. Obviously I need to learn about assignments that
are not R objects.
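
For what it is worth, the assignment form is itself backed by an ordinary R object: a function whose name ends in `<-`. The sketch below illustrates that with `mostattributes<-` and with a made-up replacement function (`second<-` is hypothetical, defined here only for illustration).

```r
x <- c(2, 3, 4)
mostattributes(x) <- list(foo = "bar")          # usual replacement-call form

y <- c(2, 3, 4)
y <- `mostattributes<-`(y, list(foo = "bar"))   # same thing as an explicit call

is_fun <- is.function(`mostattributes<-`)       # the replacement function exists

# Hypothetical user-defined replacement function: sets the second element
`second<-` <- function(x, value) {
  x[2] <- value
  x
}
z <- 1:3
second(z) <- 10L
z
```

So `mostattributes(x) <- value` is evaluated roughly as ``x <- `mostattributes<-`(x, value)``, which is why a lookup under the plain name `mostattributes` finds nothing.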

-John






Re: [R] returns from dnorm and dmvnorm

2007-02-26 Thread A Hailu

Yes, you are right. Thanks.

On 2/27/07, Benilton Carvalho [EMAIL PROTECTED] wrote:

 well, nobody said that the density must be smaller than 1, right? :-)

 it's just the value of the normal density function at the point you
 asked. you may try doing that by hand and, with the correct math,
 you'll get the same thing.

 b

 On Feb 26, 2007, at 3:03 PM, A Hailu wrote:

  Hi All,
  Why would calls to dnorm and dmvnorm return values that are above
  1? For
  example,
  dnorm(0.3,mean=0, sd=0.1)
  [1] 3.989423
 
  This is happening on two different installations of R that I have.
 
  Thank you.
 
  Hailu



Re: [R] Partial whitening of time series?

2007-02-26 Thread Andy Bunn

Thanks, I wasn't thinking real clearly when I pressed 'send'. All
figured out now. -A

-Original Message-
From: Wensui Liu [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 26, 2007 10:15 AM
To: Andy Bunn
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Partial whitening of time series?

andy,

if your model is Xt = 0.5 * Xt-1 + e, then it should have
Xt = 0.1 * Xt-1 + 0.4 * Xt-1 + e
(Xt - 0.1*Xt-1) = 0.4 * Xt-1 + e

so what you need to do is to subtract part of the lag from your series.
it is just my $0.02.
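
A minimal sketch of that idea, under stated assumptions: the helper name `partial_whiten`, the seed and the grid of fractions are made up for illustration, and the true coefficient 0.5 is taken as known (in practice it would be estimated, e.g. with arima). Note that the filtered series is no longer a pure AR(1), so its lag-one autocorrelation is only roughly, not exactly, 0.5 minus the fraction removed.

```r
set.seed(42)
x <- arima.sim(model = list(order = c(1, 0, 0), ar = 0.5), n = 500, sd = 0.25)

# Subtract a fraction `a` of the lagged series: y_t = x_t - a * x_{t-1}
partial_whiten <- function(x, a) {
  x <- as.numeric(x)
  n <- length(x)
  x[-1] - a * x[-n]
}

# Remove the autocorrelation in stages and record the lag-one autocorrelation
a_values <- c(0, 0.1, 0.2, 0.3, 0.4, 0.5)
lag1 <- sapply(a_values, function(a) {
  y <- partial_whiten(x, a)
  acf(y, lag.max = 1, plot = FALSE)$acf[2]
})
round(setNames(lag1, a_values), 2)
```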

On 2/26/07, Andy Bunn [EMAIL PROTECTED] wrote:
 I have a time series with a one year lag, ar=0.5. The series has some
 interesting events that disappear when the series is whitened (i.e.,
 fitting an AR process and looking at the residuals). I'd like to remove
 the autocorrelation in stages to see the effect on the time series. Is
 there a way to specify the autocorrelation term while fitting an AR
 process?

 For instance, given the following:

 x <- arima.sim(model = list(order = c(1,0,0), ar = 0.5), n = 500,
                sd=0.25)

 Can I filter x in a way that the autocorrelation at lag one is 0.4, then
 0.3, 0.2, 0.1, until I get to a clean series equivalent to:

 y <- arima(x, order = c(1,0,0))$resid

 Thanks in advance,
 Andy


-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)


Re: [R] returns from dnorm and dmvnorm

2007-02-26 Thread A Hailu

Thanks everyone. I should have thought of dnorm as a straight return from the
normal density formula.

Hailu

On 2/27/07, Charilaos Skiadas [EMAIL PROTECTED] wrote:

 On Feb 26, 2007, at 3:03 PM, A Hailu wrote:
  Hi All,
  Why would calls to dnorm and dmvnorm return values that are above
  1? For
  example,
  dnorm(0.3,mean=0, sd=0.1)
  [1] 3.989423

 Because dnorm gives you the density function, whose integral is the
 distribution function, which is likely what you want. Try:

 pnorm(0.3,mean=0, sd=0.1)

  This is happening on two different installations of R that I have.
 
  Thank you.
 
  Hailu

 Haris Skiadas
 Department of Mathematics and Computer Science
 Hanover College







Re: [R] Double-banger function names: preferences and suggestions

2007-02-26 Thread Steven McKinney


The underscore versus left-arrow 
conundrum has its roots in the evolution
of ASCII during the middle of the last
century. 

Some Teletype machines in the 1970s (when the
S language was being developed) still had a
left arrow, and its ASCII code was used in S
as a one keystroke convenience for the assignment
operator.  The left arrow symbol was then
removed from most keyboards/printers/fontsets
and replaced by the underscore.  Thus the
underscore remained as a one keystroke assignment
operator.


See e.g.

http://www.wps.com/projects/codes/index.html#GRPH


LEFT-ARROW, ←
UNDERSCORE, _

One of the graphical codes, left-arrow mutated to the 
underscore of ASCII-1967. It may have had earlier, 
or other, meanings, but for some early programming 
languages it was assignment, eg.

  c ← b + a

C is assigned the sum of B and A.




Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: [EMAIL PROTECTED]

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: [EMAIL PROTECTED] on behalf of Marc Schwartz
Sent: Sun 2/25/2007 8:28 AM
To: Alberto Vieira Ferreira Monteiro
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Double-banger function names: preferences and suggestions
 
On Sun, 2007-02-25 at 15:56 +, Alberto Vieira Ferreira Monteiro
wrote:
 hadley wickham wrote:
 
  What do you prefer/recommend for double-banger function names:
 
   1 scale.colour
   2 scale_colour
   3 scaleColour
 
  1 is more R-like, but conflicts with S3.  2 is a modern version of
  number 1, but not many packages use it.  Number 3 is more java-like.
  (I like number 2 best)
 
  Any suggestions?
 
 I always prefer 2, but this would make it non-portable to S-Plus. S-Plus
 has a bug, where _ is the equivalent to <- (why would they do this? I
 prefer to think it's stupidity and not villainy)

That's not a bug. If you search the archives of both the S-PLUS list and
the R lists, you will see highly energized discussion on the use of the
underscore operator.

In R, the use of '_' was allowed for assignment up until version 1.8.0
when:

DEPRECATED & DEFUNCT

o   The assignment operator `_' has been removed.


and subsequently allowed in names in version 1.9.0 when:

 o  Underscore '_' is now allowed in syntactically valid names, and
make.names() no longer changes underscores.  Very old code
that makes use of underscore for assignment may now give
confusing error messages.


Not to further contribute to the dialog on 'style', but to further
contribute ;-), for those who have coded in the Windows environment (ie.
C, VBA, etc.) the extension of sorts to number 3 is of course Hungarian
Notation, named after Charles Simonyi, originally at Xerox PARC and
later senior developer/architect at MS. The extension was the inclusion
of the data type prefix, such as fnScaleColour to indicate that this was
a function, with the name using caps to make words more distinct.

And no, I'm not advocating that use...I have been guilty myself of using
variants of 1 and 3, perhaps driven by my circulating caffeine levels as
much as anything else.

HTH,

Marc Schwartz
Off to go remove 12 inches of snow from the driveway and sidewalk...oy


[R] training svm

2007-02-26 Thread Sixtease


Hello. I'm new to R and I'm trying to solve a classification problem. I have
a training dataset of about 40,000 rows and 50 columns. When I try to train
support vector machine, it gives me this error after a few seconds:

 Error in predict.svm(ret, xhold) : Model is empty!

This is the code I use:

 ne_span_data <- as.matrix(read.table('ne_span.data.R.txt', header=TRUE,
                                      row.names='id'))
 library('e1071')
 svm_ne_span_model <- svm(NE_type ~ . , ne_span_data)

it gives me:
Error in predict.svm(ret, xhold) : Model is empty!

A line from the ne_span.data.R.txt file:
 svt OTHER N N I S 2 NA NA NA NA NA A NA NA 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 train-s1m2

Any idea what's wrong here?

[R] match() function with a little enhancement

2007-02-26 Thread Nicolas Prune

Dear R users,

I was wondering if R has a built-in function doing the following :

my_match(values_vector,lookup_vector)
{
for each value of values_vector :

if value %in% lookup_vector, then value is unchanged
else, value is changed to the closest element of lookup_vector, closest
meaning the one that would come just after if we sorted them using order()
}

For example :

values <- c("Kiwis", "Bananas", "Ananas", "Cherries", "Peer")
vector <- c("Oranges", "Bananas", "Apples", "Cherries", "Lemons")

my_match(values, vector) should return :

c("Lemons","Bananas","Apples","Cherries",NA)

I currently use a home-made function for this, but it is quite slow on large
sets, mostly because I did not manage to avoid using a loop.

Many thanks for your ideas,
Nicolas


Re: [R] match() function with a little enhancement

2007-02-26 Thread jim holtman

try this:

> values <- c("Kiwis", "Bananas", "Ananas", "Cherries", "Peer")
> vector <- c("Oranges", "Bananas", "Apples", "Cherries", "Lemons")
> vector <- sort(vector)
> vector
[1] "Apples"   "Bananas"  "Cherries" "Lemons"   "Oranges"
> x <- sapply(values, function(x) ifelse(x <= vector, -1, 1))
> x
     Kiwis Bananas Ananas Cherries Peer
[1,]     1       1     -1        1    1
[2,]     1      -1     -1        1    1
[3,]     1      -1     -1       -1    1
[4,]    -1      -1     -1       -1    1
[5,]    -1      -1     -1       -1    1
> vector[apply(x, 2, function(z) which(z < 0)[1])]
[1] "Lemons"   "Bananas"  "Apples"   "Cherries" NA
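
For large inputs, a variant that avoids building the full comparison matrix may be worth trying. The following is a sketch, not a tested replacement: the function name `next_match` is hypothetical, it maps the strings to integer ranks so that findInterval() can do the lookup, and it relies on the `left.open` argument, which as far as I know needs R 3.3.0 or later. Ordering follows sort(), so results depend on the locale's collation.

```r
next_match <- function(values, lookup) {
  sorted     <- sort(unique(lookup))
  all_sorted <- sort(unique(c(values, lookup)))
  v_code <- match(values, all_sorted)   # integer rank of each value
  l_code <- match(sorted, all_sorted)   # integer ranks of the lookup elements
  # number of lookup elements strictly below each value, plus one:
  # that is the position of the value itself (if present) or of its successor
  idx <- findInterval(v_code, l_code, left.open = TRUE) + 1L
  sorted[idx]                           # positions past the end give NA
}

values <- c("Kiwis", "Bananas", "Ananas", "Cherries", "Peer")
lookup <- c("Oranges", "Bananas", "Apples", "Cherries", "Lemons")
result <- next_match(values, lookup)
result
```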




On 2/26/07, Nicolas Prune [EMAIL PROTECTED] wrote:

 Dear R users,

 I was wondering if R has a built-in function doing the following :

 my_match(values_vector,lookup_vector)
 {
 for each value of values_vector :

 if value %in% lookup_vector, then value is unchanged
 else, value is changed to the closest element of lookup_vector, closest
 meaning the one that would come just after if we sorted them using
 order()
 }

 For example :

 values <- c("Kiwis", "Bananas", "Ananas", "Cherries", "Peer")
 vector <- c("Oranges", "Bananas", "Apples", "Cherries", "Lemons")

 my_match(values, vector) should return :

 c("Lemons","Bananas","Apples","Cherries",NA)

 I currently use a home-made function for this, but it is quite slow on
 large
 sets, mostly because I did not manage to avoid using a loop.

 Many thanks for your ideas,
 Nicolas





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


Re: [R] PLotting R graphics/symbols without user x-y scaling

2007-02-26 Thread Paul Murrell

Hi


Jonathan Lees wrote:
 Is it possible to add lines or other
 user defined graphics
 to a plot in R that does not depend on
 the user scale for the plot?
 
 For example I have a plot
 plot(x,y)
 and I want to add some graphic that is
 scaled in inches or cm but I do not want the
 graphic to change when the x-y scales are
 changed - like a thermometer, scale bar or
 other symbol -
 How does one do this?
 
 I want to build my own library of glyphs to add to plots
 but I do not know how to plot them when their
 size is independent of the device/user coordinates.
 
 Is it possible to add to the list
 of symbols in the function symbols()
 other than:
   _circles_, _squares_, _rectangles_, _stars_, _thermometers_, and
   _boxplots_
 
 can I make my own symbols and have symbols call these?


There is currently no mechanism for defining your own additions to
symbols(), but this sort of thing is easily doable using the grid
graphics system, and the resulting symbols would be easy to add to
lattice plots.  See ...
http://www.stat.auckland.ac.nz/~paul/RGraphics/chapter4.pdf
http://www.stat.auckland.ac.nz/~paul/RGraphics/chapter5.pdf

There is also an example of how to do this sort of thing using the
grImport package (and grid and lattice) in
http://www.stat.auckland.ac.nz/~paul/Talks/import.pdf
The complete code for the example is ...

library(grImport)
library(lattice)  # provides dotplot() and the barley data
hourglass <-
  new("Picture",
      paths=
      list(new("PictureFill",
               x=c(0, 1, 0, 1),
               y=c(0, 0, 1, 1),
               rgb="black"),
           new("PictureStroke",
               x=c(0, 1, 0, 1, 0),
               y=c(0, 0, 1, 1, 0),
               rgb="grey")),
      summary=
      new("PictureSummary",
          numPaths=1,
          xscale=c(0, 1),
          yscale=c(0, 1)))
dotplot(variety ~ yield | year, data=barley,
        panel=function(x, y, type, ...) {
            panel.dotplot(x, y, type="n", ...)
            grid.symbols(hourglass,
                         x=unit(as.numeric(x), "native"),
                         y=unit(as.numeric(y), "native"),
                         size=unit(5, "mm"))
        })

Paul


 Thanks-
 
 

-- 
Dr Paul Murrell
Department of Statistics
The University of Auckland
Private Bag 92019
Auckland
New Zealand
64 9 3737599 x85392
[EMAIL PROTECTED]
http://www.stat.auckland.ac.nz/~paul/


Re: [R] Crosstabbing multiple response data

2007-02-26 Thread Michael Wexler


Thanks to Charles, Gabor, and a private message from Frank E Harrell with some 
good ideas and help.  This crossprod approach was very clever, I would never 
have thought of it.

Best, Michael


- Original Message 
From: Charles C. Berry [EMAIL PROTECTED]
To: Michael Wexler [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Sent: Thursday, February 22, 2007 1:17:44 PM
Subject: Re: [R] Crosstabbing multiple response data


> res <- crossprod( as.matrix( ratings[ , -1] ) )
> diag(res) <- ""
> print(res, quote=F)
     att1 att2 att3
att1      2    1
att2 2         2
att3 1    2

> res2 <- crossprod(as.matrix( ratings[ , -1])) * 100 / nrow( ratings )
> res2[] <- paste( res2, "%", sep="" )
> diag(res2) <- ""
> print(res2, quote=F)
     att1 att2 att3
att1      50%  25%
att2 50%       50%
att3 25%  50%


Be sure to bone up on format and sprintf before taking this into 
production.
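
As a sketch of what the sprintf route might look like: the `ratings` data frame below is rebuilt from the four-row table in the question (an assumption about the intended data), and the object names `counts`, `pct` and `pct_txt` are made up for the example.

```r
# Data rebuilt from the table in the question (assumed to be the intended values)
ratings <- data.frame(id   = 1:4,
                      att1 = c(1, 1, 0, 1),
                      att2 = c(1, 0, 1, 1),
                      att3 = c(0, 0, 1, 1))

m      <- as.matrix(ratings[, -1])
counts <- crossprod(m)                  # t(m) %*% m: pairwise co-occurrence counts
pct    <- counts * 100 / nrow(ratings)  # as a percentage of all respondents

# Format as text with a literal percent sign, keeping the row and column names
pct_txt <- matrix(sprintf("%.0f%%", pct), nrow = nrow(pct),
                  dimnames = dimnames(pct))
diag(pct_txt) <- ""                     # blank out the diagonal
print(pct_txt, quote = FALSE)
```

The diagonal of `counts` holds the plain per-attribute totals, which is why it is blanked only in the formatted copy.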

On Thu, 22 Feb 2007, Michael Wexler wrote:

 Using R version 2.4.1 (2006-12-18) on Windows, I have a dataset which 
 resembles this:

 id  att1  att2  att3
 1   1     1     0
 2   1     0     0
 3   0     1     1
 4   1     1     1

 ratings <- data.frame(id = c(1,2,3,4), att1 = c(1,1,0,1), att2 = c(1,0,1,1), 
 att3 = c(0,0,1,1))

 I would like to get a cross tab of counts of co-ocurrence, which might 
 resemble this:

        att1  att2  att3
 att1         2     1
 att2   2           2
 att3   1     2

 with the hope of understanding, at least pairwise, what things hang 
 together.   (Yes, there are much, much better ways to do this statistically 
 including clustering and binary corrected correlation, but the audience I am 
 working with asked for this version for a specific reason.)

 (Later on, I would also like to convert to percentages of the total unique 
 pop, so the final version of the table would be


        att1  att2  att3
 att1         50%   25%
 att2   50%         50%
 att3   25%   50%


 But I can do this in excel if I can get the first table out.)

 I have tried the reshape library, but could not get anything resembling this 
 (both on its own, as well as feeding in to table()).  (I have also played 
 with transposing and using some comments from this list from 2002 and 2004, 
 but the questioners appear to assume more knowledge than I have in use of R; 
 the example in the posting guide was also more complex than I was ready for, 
 I'm afraid.)

 Sample of some of my efforts:
 library(reshape)
 melt(ratings,id=c("id"))

 ds1 <- melt(ratings,id=c("id"))
 table(ds1$variable, ds1$variable) # returns only rowcounts, 3 along diagonal
 xtabs(formula = value ~ ds1$variable + ds1$variable , data=ds1) # returns 
 only a single row of collapsed counts, appears to not allow 1 variable in 
 multiple uses

 I suspect I am close, so any nudges in the right direction would be helpful.

 Thanks much, Michael

 PS: www.rseek.org is very impressive, I heartily encourage its use.




Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED] UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0901








[R] looping

2007-02-26 Thread Neil Hepburn


Greetings:

I am looking for some help (probably really basic) with looping. What I want
to do is repeatedly sample observations (about 100 per sample) from a large
dataset (100,000 observations).  I would like the samples labelled sample.1,
sample.2, and so on (or some other suitably simple naming scheme).  To do
this manually I would 

smp.1 <- sample(100000, 100)
sample.1 <- dataset[smp.1,]
smp.2 <- sample(100000, 100)
sample.2 <- dataset[smp.2,]
.
.
.
smp.50 <- sample(100000, 100)
sample.50 <- dataset[smp.50,]

and so on.

I tried the following loop code to generate 50 samples:

for (i in 1:50){
+ smp.[i] <- sample(100000, 100)
+ sample.[i] <- dataset[smp.[i],]}

Unfortunately, that does not work -- specifying the looping variable i in
the way that I have does not work since R uses that to reference places in a
vector (x[i] would be the ith element in the vector x)

Is it possible to assign the value of the looping variable in a name within
the loop structure?

Cheers,
Neil Hepburn

===
Neil Hepburn, Economics Instructor
Social Sciences Department,
The University of Alberta Augustana Campus
4901 - 46 Avenue 
Camrose, Alberta
T4V 2R3

Phone (780) 697-1588
email [EMAIL PROTECTED]


[R] RDA and trend surface regression

2007-02-26 Thread MORLON

Dear all,

 

I'm performing RDA on plant presence/absence data, constrained by
geographical locations. I'd like to constrain the RDA by the extended
matrix of geographical coordinates -ie the matrix of geographical
coordinates completed by adding all terms of a cubic trend surface
regression- . 

This is the command I use (package vegan):

 

rda(Helling ~ x+y+x*y+x^2+y^2+x*y^2+y*x^2+x^3+y^3) 

 

where Helling is the matrix of Hellinger-transformed presence/absence data

The result returned by R is exactly the same as the one given by:

 

anova(rda(Helling ~ x+y)

 

Ie the quadratic and cubic terms are not taken into account

 

I hope you can help me with that: how can I perform a RDA on an extended
matrix of geographical coordinates in R?.

 

Thank you very much in advance,

 

Helene Morlon

University of California, Merced

[EMAIL PROTECTED]

 



Re: [R] RDA and trend surface regression

2007-02-26 Thread Kuhn, Max

Helene,

You will have to give us more information, such as your system/versions
and a small reproducible example. We try to stress that questions are
more easily answered when there are a lot of specific details given and
a reproducible case can be tested.

Here are two comments though:

 1. The quadratic terms probably are not showing up because you are not
using a proper model formula for the task. See:

http://cran.r-project.org/doc/manuals/R-intro.html#Formulae-for-statistical-models

Specifically, the part that says

I(M): Insulate M. Inside M all operators have their normal arithmetic
meaning, and that term appears in the model matrix. 

is important. So, as an example from ?rda:

 x <- rda(Species ~ (Sepal.Length+Sepal.Width)^2 + Sepal.Width^2, data =
          iris)

would not work for the squared term, but

 x <- rda(Species ~ (Sepal.Length+Sepal.Width)^2 + I(Sepal.Width^2),
          data = iris)

would.

2. RDA is fitting models at or between LDA and QDA. So a QDA model with
quadratic terms would be quartic discriminant analysis. Of course, there
are no rules against this, but high order polynomials can do weird
things in the tail (which would be the edges of the space defined by
your training data). If your data are that nonlinear, there are much
better ways of classifying data. I'd suggest getting a copy of Hastie
et al. (2001) or MASS.
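
Regarding the first comment, the effect of I() can be seen without fitting anything, by looking at the columns of the model matrix in base R. This is a sketch on made-up coordinates (`x`, `y` and the seed are assumptions); it does not call rda() itself, it only shows which terms each formula produces, plus one possible way to generate all nine terms of a cubic trend surface with poly().

```r
set.seed(1)
x <- runif(20)
y <- runif(20)

# Without I(), ^ is formula syntax, so x^2 collapses to plain x
plain_cols <- colnames(model.matrix(~ x + y + x^2 + y^2))

# With I(), the squares are computed arithmetically and appear as columns
insulated_cols <- colnames(model.matrix(~ x + y + I(x^2) + I(y^2)))

# All terms of a cubic surface in two variables (raw, non-orthogonal polynomials)
cubic <- poly(x, y, degree = 3, raw = TRUE)
n_cubic_terms <- ncol(cubic)

plain_cols
insulated_cols
n_cubic_terms
```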

Max



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of MORLON
Sent: Monday, February 26, 2007 7:14 PM
To: r-help@stat.math.ethz.ch
Subject: [R] RDA and trend surface regression

Dear all,

 

I'm performing RDA on plant presence/absence data, constrained by
geographical locations. I'd like to constrain the RDA by the extended
matrix of geographical coordinates -ie the matrix of geographical
coordinates completed by adding all terms of a cubic trend surface
regression- . 

This is the command I use (package vegan):

 

rda(Helling ~ x+y+x*y+x^2+y^2+x*y^2+y*x^2+x^3+y^3) 

 

where Helling is the matrix of Hellinger-transformed presence/absence
data

The result returned by R is exactly the same as the one given by:

 

anova(rda(Helling ~ x+y)

 

Ie the quadratic and cubic terms are not taken into account

 

I hope you can help me with that: how can I perform a RDA on an
extended
matrix of geographical coordinates in R?.

 

Thank you very much in advance,

 

Helene Morlon

University of California, Merced

[EMAIL PROTECTED]

 



[R] prop.test or chisq.test ..?

2007-02-26 Thread Dylan Beaudette

Hi everyone,

Suppose I have a count of the occurrences of positive results, and the total 
number of occurrences:


pos <- 14
total <- 15

testing that the proportion of positive occurrences is greater than 0.5 gives 
a p-value and confidence interval:

prop.test( pos, total, p=0.5, alternative='greater')

1-sample proportions test with continuity correction

data:  14 out of 15, null probability 0.5 
X-squared = 9.6, df = 1, p-value = 0.0009729
alternative hypothesis: true p is greater than 0.5 
95 percent confidence interval:
 0.706632 1.00 
sample estimates:
p 
0.933


My question is how does the use of chisq.test() differ from the above 
operation. For example:

chisq.test(table( c(rep('pos', 14), rep('neg', 1)) ))

Chi-squared test for given probabilities

data:  table(c(rep("pos", 14), rep("neg", 1))) 
X-squared = 11.2667, df = 1, p-value = 0.0007891

... gives slightly different results. Am I correct in interpreting that the 
chisq.test() function in this case is giving me a p-value associated with the 
test that the probabilities of pos are *different* than the probabilities of 
neg -- and thus a different p-value than the prop.test(... , p=0.5, 
alternative='greater') ? 
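
One way to see where the two outputs part company is to line the calls up side by side. The sketch below is an illustration rather than an authoritative answer: it suggests that the difference comes from the continuity correction and the one-sided alternative used in the prop.test() call, since with `correct = FALSE` and the default two-sided alternative the two functions agree. The object names are made up, and binom.test() is added only as an exact cross-check.

```r
pos   <- 14
total <- 15

# Two-sided, no continuity correction: same statistic and p-value as chisq.test
pt_nocorr <- prop.test(pos, total, p = 0.5, correct = FALSE)
cs        <- chisq.test(c(pos, total - pos))

# One-sided with continuity correction, as in the original call
pt_one <- prop.test(pos, total, p = 0.5, alternative = "greater")

# Exact binomial test of the same one-sided hypothesis
exact <- binom.test(pos, total, p = 0.5, alternative = "greater")

c(prop_nocorr = unname(pt_nocorr$statistic),
  chisq       = unname(cs$statistic),
  prop_corr   = unname(pt_one$statistic))
c(two_sided = cs$p.value, one_sided = pt_one$p.value, exact = exact$p.value)
```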

I realize that this is a rather elementary question, and references to a text 
would be just as helpful. Ideally, I would like a measure of how much I 
can 'trust' that a larger proportion is also statistically meaningful. Thus 
far the results from prop.test() match my intuition, but affirmation would be 
great.

Cheers,


-- 
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341


Re: [R] PLotting R graphics/symbols without user x-y scaling

2007-02-26 Thread Gabor Grothendieck

You can use par("usr") to get the min/max coords of the plot and then
use that.  For example, this will plot a red dot in the middle of the
plot area regardless of the coordinates:

plot(1:10)  # sample plot
usr <- par("usr")
points(mean(usr[1:2]), mean(usr[3:4]), pch = 20, col = "red") # red dot

See ?par
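
For sizes given in inches, base graphics also has xinch() and yinch(), which convert a length in inches into user coordinates for the current plot. The sketch below draws a one-inch bar near the bottom of the plot; the placement and variable names are assumptions made for illustration, and pdf(NULL) is used only so that the example runs without opening a window.

```r
pdf(NULL)                      # off-screen device; nothing is written to disk
plot(1:10)                     # sample plot

usr <- par("usr")              # c(xmin, xmax, ymin, ymax) in user coordinates
one_inch_x <- xinch(1)         # width of one inch, in user x-units

# One-inch horizontal bar, a quarter inch above the bottom edge
x0 <- mean(usr[1:2])
y0 <- usr[3] + yinch(0.25)
segments(x0, y0, x0 + one_inch_x, y0, lwd = 3)

# The same conversion computed by hand from the plot size in inches
expected <- diff(usr[1:2]) / par("pin")[1]
dev.off()
```

The bar stays one inch long on the page if the axis limits change, provided the conversion is recomputed after the plot is drawn.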

On 2/26/07, Jonathan Lees [EMAIL PROTECTED] wrote:

 Is it possible to add lines or other
 user defined graphics
 to a plot in R that does not depend on
 the user scale for the plot?

 For example I have a plot
 plot(x,y)
 and I want to add some graphic that is
 scaled in inches or cm but I do not want the
 graphic to change when the x-y scales are
 changed - like a thermometer, scale bar or
 other symbol -
 How does one do this?

 I want to build my own library of glyphs to add to plots
 but I do not know how to plot them when their
 size is independent of the device/user coordinates.

 Is it possible to add to the list
 of symbols in the function symbols()
 other than:
  _circles_, _squares_, _rectangles_, _stars_, _thermometers_, and
  _boxplots_

 can I make my own symbols and have symbols call these?


 Thanks-


 --
 Jonathan M. Lees
 Professor
 THE UNIVERSITY OF NORTH CAROLINA AT CHAPEL HILL
 Department of Geological Sciences
 Campus Box #3315
 Chapel Hill, NC  27599-3315
 TEL: (919) 962-0695
 FAX: (919) 966-4519
 [EMAIL PROTECTED]
 http://www.unc.edu/~leesj


Re: [R] looping

2007-02-26 Thread Bert Gunter

You do not say -- and I am unable to divine -- whether you wish to sample
with or without replacement: each time or as a whole.

In general, when you want to do this sort of thing, the fastest way to do it
is just to sample everything you need at once and then form it into a list
or matrix or whatever. For example, for sampling 100 each time without
replacement 200 times:

mySamples <- matrix(sample(yourDatavector, 100*200,replace=FALSE),ncol=200)

will give you a 100 row by 200 column matrix of samples without replacement
from yourDatavector. I hope that you can adapt this to suit your needs.
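
When the data are rows of a data frame rather than a plain vector, the same one-shot approach can be applied to row indices. This is a minimal sketch under assumptions: dataset is a made-up stand-in for the real data, and the sample sizes are arbitrary.

```r
set.seed(123)
dataset <- data.frame(id = 1:5000, value = rnorm(5000))  # hypothetical stand-in

n_per  <- 100   # observations per sample
n_samp <- 20    # number of samples

# Draw all row indices at once; each column of idx is one sample
idx <- matrix(sample(nrow(dataset), n_per * n_samp, replace = FALSE),
              ncol = n_samp)

sample.3 <- dataset[idx[, 3], ]      # the rows belonging to the third sample

# A statistic per sample, without an explicit loop
col_means <- apply(idx, 2, function(rows) mean(dataset$value[rows]))
```

Because the indices are drawn without replacement as a whole, no row appears in more than one sample; use replace = TRUE if that is not what is wanted.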

 
Bert Gunter
Nonclinical Statistics
7-7374

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Neil Hepburn
Sent: Monday, February 26, 2007 4:11 PM
To: r-help@stat.math.ethz.ch
Subject: [R] looping


Greetings:

I am looking for some help (probably really basic) with looping. What I want
to do is repeatedly sample observations (about 100 per sample) from a large
dataset (100,000 observations).  I would like the samples labelled sample.1,
sample.2, and so on (or some other suitably simple naming scheme).  To do
this manually I would 

smp.1 <- sample(100000, 100)
sample.1 <- dataset[smp.1,]
smp.2 <- sample(100000, 100)
sample.2 <- dataset[smp.2,]
.
.
.
smp.50 <- sample(100000, 100)
sample.50 <- dataset[smp.50,]

and so on.

I tried the following loop code to generate 50 samples:

for (i in 1:50){
+ smp.[i] <- sample(100000, 100)
+ sample.[i] <- dataset[smp.[i],]}

Unfortunately, that does not work -- specifying the looping variable i in
the way that I have does not work since R uses that to reference places in a
vector (x[i] would be the ith element in the vector x)

Is it possible to assign the value of the looping variable in a name within
the loop structure?

Cheers,
Neil Hepburn

===
Neil Hepburn, Economics Instructor
Social Sciences Department,
The University of Alberta Augustana Campus
4901 - 46 Avenue 
Camrose, Alberta
T4V 2R3

Phone (780) 697-1588
email [EMAIL PROTECTED]


Re: [R] Optimizing the loop for large data

2007-02-26 Thread Takatsugu Kobayashi

Rusers:

I am trying to apply a quadratic discriminant function to find the best 
classification outcomes.
1 is assigned to the values greater than a threshold value; and 0 otherwise.
I would like to see how the apparent error rates and the optimal error 
rate change with increasing threshold values.

I have a 1000*10 data matrix: n=1000 and p=10.

Here is what I have written so far, but it seems to be inefficient. I would 
appreciate it if someone could help me out.

library(foreign)
library(MASS)

D <- read.dbf('data/Indianapolis015.dbf') # import the data

# data looks like this

   LONGLAB                X        Y Perimeter       Area    X_UTM   Y_UTM   F0 F1  F2
1  TAZ 18011:1000 -86.25985 39.95286  2.061630  0.1862549 50600.38 4435792  235  0  35
2  TAZ 18011:1001 -86.31030 39.97591  3.657006  0.7305006 46440.80 4438608    0  0   0
3  TAZ 18011:1002 -86.29542 39.97054  3.516089  0.6408084 47677.31 4437936  155  0  15
4  TAZ 18011:1003 -86.27574 39.97294  5.000185  1.2592142 49374.91 4438102  835  0  55
5  TAZ 18011:1004 -86.25967 39.97197  4.788531  1.1930984 50741.38 4437913  425  0  80
6  TAZ 18011:1005 -86.29245 39.98580  6.189141  1.6734483 48031.44 4439616  185  0  35
7  TAZ 18011:1006 -86.24899 39.98259  7.525633  2.0564466 51723.80 4439040  505  0  45
8  TAZ 18011:1007 -86.30974 39.99014  3.773037  0.7790234 46583.20 4440186   30  0  10
9  TAZ 18011:1008 -86.27151 39.99040  4.589226  1.2212674 49850.92 4440021   40  0   0
10 TAZ 18011:9215 -86.58085 40.13588 37.278521 69.6681954 24438.13 4457794 2095 85 200

thrs <- seq(1000,1,length=50)
ED <- D[,383]/D[,5] # employment density
CBDx <- D[,6]-58277.194363 # convert a coordinate for x
CBDy <- D[,7]-4414486.03135 # convert a coordinate for y

AER <- vector("numeric",length(thrs))
OER <- vector("numeric",length(thrs))
MER <- vector("numeric",length(thrs))

# compute the apparent error rates for each threshold value
for (j in 1:length(thrs)){
ctgy <- ifelse(ED>thrs[j],2,1) # 2 categories are created by the threshold
test1 <- qda(cbind(ED,CBDx,CBDy),ctgy)
est1 <- cbind(ctgy,predict(test1)$class)
AER[j] <- sum(est1[,1]!=est1[,2])/dim(D)[1] # share of misclassified rows
}

# OER computation for ith location taken out for the thresholds
# (each pass over k overwrites OER[j], so only the last k is kept)
for (k in 1:dim(D)[1]){
for (j in 1:length(thrs)){
ctgy <- ifelse(ED>thrs[j],2,1)
test2 <- qda(cbind(ED[-k],CBDx[-k],CBDy[-k]),ctgy[-k])
est2 <- cbind(ctgy[-k],predict(test2)$class)
OER[j] <- sum(est2[,1]!=est2[,2])/(dim(D)[1]-1)
}}
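
One possible way to avoid the double loop is the CV = TRUE argument of qda() in MASS, which returns leave-one-out cross-validated class assignments from a single call per threshold. The sketch below is an assumption-laden illustration, not a drop-in replacement: the data are simulated stand-ins for ED, CBDx and CBDy, and the thresholds are taken at quantiles so that both groups stay large enough for qda().

```r
library(MASS)

set.seed(1)
n    <- 200
ED   <- rexp(n)     # simulated stand-in for employment density
CBDx <- rnorm(n)    # simulated stand-in for the x offset
CBDy <- rnorm(n)    # simulated stand-in for the y offset
X    <- cbind(ED, CBDx, CBDy)

# Thresholds at quantiles keep both classes well populated
thrs <- quantile(ED, probs = seq(0.2, 0.8, length = 7))

rates <- sapply(thrs, function(th) {
  ctgy <- factor(ifelse(ED > th, 2, 1))
  fit  <- qda(X, ctgy)
  # apparent (resubstitution) error rate
  aer  <- mean(predict(fit, X)$class != ctgy)
  # leave-one-out error rate, computed inside qda()
  loo  <- mean(qda(X, ctgy, CV = TRUE)$class != ctgy)
  c(AER = aer, LOO = loo)
})

rates   # one column per threshold, rows AER and LOO
```

This replaces n * length(thrs) model fits with two fits per threshold; whether the leave-one-out rate is the quantity wanted as "OER" is for the original poster to judge.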


[R] exact matching of names in attr

2007-02-26 Thread Michael Toews

In R 2.5.0 (r40806), one of the changes is to allow partial matching of 
the name in the attr function. However, how can I tell if I have an exact 
match or not?

For example, checking to see if an object has a "name" attribute, then 
giving it one if it doesn't:

dat <- data.frame(x=1:10,y=rnorm(10))
if(is.null(attr(dat,"name")))
attr(dat,"name") <- "Site 1"
str(dat)

(This example works in R < 2.5) Although there is no "name" attribute to 
the data.frame, it partially matches to "names", resulting in not 
setting the attribute. (Personally, I think this change in the attr 
function is not desirable, and much prefer exact matches to avoid 
unintentional errors).

How can I tell if this is an exact match? Is there a way to force an 
exact match?

Thanks.

+mt


[R] looping

2007-02-26 Thread Michael Toews

Another way is to use an indexed list, which is far tidier than 
your method. If you mean "about 100" as in an irregular number, then a 
list is your friend (i.e., a ragged array, that can have sometimes 97 
samples, sometimes 105 samples, etc.). Similar to your example:

dat <- runif(100000,0,100) # fake dataset
smp <- list() # need an empty list first
for(i in 1:1000)
smp[[i]] <- sample(dat,100)

However, if you are new to R/S, the best advice is to learn to _not_ use 
the for loop (because it is slow, and there are vectorized ways). For 
example, if we want to find the mean of each sample, then return a tidy 
result:

sapply(smp,mean)

or a crazy new analysis you might be working on:

crazy <- function(x,y) (sum(x>y)^2)/sum(x) # comparison lost in the archive; ">" assumed
sapply(smp,crazy,10)

etc.
+mt
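
For the literal question in this thread (putting the loop index into an object name), base R's assign() and get() together with paste() can do it. The following is a minimal sketch, assuming a made-up data frame called dataset; a named list usually turns out easier to work with afterwards:

```r
set.seed(1)
dataset <- data.frame(id = 1:1000, value = rnorm(1000))  # hypothetical stand-in

# Literal answer: create sample.1, sample.2, ... as separate objects
for (i in 1:5) {
  rows <- sample(nrow(dataset), 100)
  assign(paste("sample", i, sep = "."), dataset[rows, ])
}
nrow(get("sample.3"))     # retrieve an object by its constructed name

# Usually tidier: keep the samples together in one named list
samples <- lapply(1:5, function(i) dataset[sample(nrow(dataset), 100), ])
names(samples) <- paste("sample", 1:5, sep = ".")
sapply(samples, function(s) mean(s$value))
```

With the list, samples$sample.3 or samples[[3]] gives the third sample, and sapply()/lapply() run over all of them without any name construction.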


[R] How to use bash command in R script?

2007-02-26 Thread Guo Wei-Wei

Dear All:

Maybe it is too basic a question, but I don't know how to find the answer.
Sorry for that.

What I want to do is call a shell command, which will provide two
numbers, and assign those numbers to a vector. For example:

The following command:

$ mxresult.sh ABC.mx

mxresult.sh is a script written by myself and ABC.mx is a Mx script.
I can get two numbers, 126.128 and 29, with this command.

Is there any way to do it like this:

c <- somefunction("mxresult.sh ABC.mx")

Or is there any other way to do this?

Thanks in advance!

Best wishes,
Wei-Wei


Re: [R] How to use bash command in R script?

2007-02-26 Thread Peter Dalgaard

Guo Wei-Wei wrote:
 Dear All:

 Maybe it is too basic a question, but I don't know how to find the answer.
 Sorry for that.

 What I want to do is call a shell command, which will provide two
 numbers, and assign those numbers to a vector. For example:

 The following command:

 $mxresult.sh ABC.mx

 mxresult.sh is a script written by myself and ABC.mx is a Mx script.
 I can get two numbers, 126.128 and 29, with this command.

 Is there any way to do it like this:

 c <- somefunction("mxresult.sh ABC.mx")

 Or is there any other way to do this?
   
txt <- system("mxresult.sh ABC.mx", intern=TRUE)

is the first step.  Then you need to get the numbers using either a 
scan() on a textConnection (see its help page) or something like

mynum <- as.numeric(strsplit(txt, "  *")[[1]])
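
Both routes can be sketched as follows. To keep the example runnable without the script, txt is a literal string standing in for what system("mxresult.sh ABC.mx", intern=TRUE) is assumed to return (one line with two numbers separated by spaces):

```r
# Stand-in for: txt <- system("mxresult.sh ABC.mx", intern = TRUE)
txt <- "126.128 29"

# Route 1: scan() on a textConnection
con    <- textConnection(txt)
mynum1 <- scan(con, quiet = TRUE)   # reads whitespace-separated numbers
close(con)

# Route 2: split on runs of spaces, then convert
mynum2 <- as.numeric(strsplit(txt, "  *")[[1]])

mynum1
mynum2
```

If the script prints anything else (labels, several lines), the parsing step would need adjusting to match.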


 Thanks in advance!

 Best wishes,
 Wei-Wei


Re: [R] exact matching of names in attr

2007-02-26 Thread Henrik Bengtsson

if (!"name" %in% names(attributes(dat))) {
  ...
}

/Henrik
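
A small sketch of this test in use. has_attr is a made-up helper name; the second half relies on the exact argument of attr(), which as far as I know was added in an R release later than the one discussed in this thread, so treat it as an assumption about your installation:

```r
dat <- data.frame(x = 1:10, y = rnorm(10))

# Exact test on the attribute names themselves
# (has_attr is a hypothetical helper, not a base R function)
has_attr <- function(obj, which) which %in% names(attributes(obj))

has_attr(dat, "names")   # the data frame does have "names"
has_attr(dat, "name")    # but no attribute called exactly "name"

if (!has_attr(dat, "name")) attr(dat, "name") <- "Site 1"

# Where available, exact = TRUE switches partial matching off
attr(dat, "name", exact = TRUE)
```

The %in% test does not depend on how attr() matches names, so it behaves the same across R versions.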

On 2/26/07, Michael Toews [EMAIL PROTECTED] wrote:
 In R 2.5.0 (r40806), one of the changes is to allow partial matching of
 the name in the attr function. However, how can I tell if I have an exact
 match or not?

 For example, checking to see if an object has a "name" attribute, then
 giving it one if it doesn't:

 dat <- data.frame(x=1:10,y=rnorm(10))
 if(is.null(attr(dat,"name")))
 attr(dat,"name") <- "Site 1"
 str(dat)

 (This example works in R < 2.5) Although there is no "name" attribute to
 the data.frame, it partially matches to "names", resulting in not
 setting the attribute. (Personally, I think this change in the attr
 function is not desirable, and much prefer exact matches to avoid
 unintentional errors).

 How can I tell if this is an exact match? Is there a way to force an
 exact match?

 Thanks.

 +mt


Re: [R] 2 data frames - list in one out put , matrix in another ??

2007-02-26 Thread Petr Pikal

Hi

As I do not know what the class "labelled" is, and know nothing about 
your data, I can only give you some hints.

What do str(adata) and str(bdata) say about the structure of your data?

It is possible that in the second case the resulting tables all have the 
same length and so can be formed into a matrix; apply probably does this 
coercion in accordance with the Details section of its help page.

But this is only a guess.

HTH
Petr
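
The behaviour guessed at above can be reproduced with two small made-up data frames. The sketch assumes nothing about the real adata and bdata; it only shows that apply() returns a list when the per-column tables differ in length and a matrix when they all have the same length:

```r
# Columns with different numbers of distinct values
a <- data.frame(v1 = c(1, 1, 2, 3), v2 = c(1, 2, 2, 2))
# Columns that all have exactly two distinct values
b <- data.frame(v1 = c(0, 1, 1, 0), v2 = c(1, 0, 0, 0))

jj <- apply(a, 2, table)   # tables of length 3 and 2: returned as a list
kk <- apply(b, 2, table)   # tables all of length 2: simplified to a 2 x 2 matrix

is.list(jj)
is.matrix(kk)

# lapply() never simplifies, so the result has the same shape in both cases
ll <- lapply(b, table)
is.list(ll)
```

If a consistent result is wanted for both data frames, lapply() over the columns is one way to get a list every time.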



On 26 Feb 2007 at 14:56, John Kane wrote:

Date sent:  Mon, 26 Feb 2007 14:56:07 -0500 (EST)
From:   John Kane [EMAIL PROTECTED]
To: R R-help r-help@stat.math.ethz.ch
Subject:[R] 2 data frames - list in one out put  , matrix in 
another ??

 I have two more or less parallel dataframes that are
 giving me different results on one subset of
 variables.  I know that I assembled the 2 dataframes
 slightly differently but I don't see why I am getting
 this result because one set of variables are labelled
 and the other is not. Variable names are the same,
 etc. as far as I can ascertain.  The only difference
 seems to be that bdata variables are labelled.  
 
 About now I really don't care which I get but I would
 like them to be the same.  Can anyone suggest what I
 am doing wrong or should be looking at?
 
 Windows XP , R 2.4.1 Using Hmisc and gtools as well as
 the basic R installation.  
 
 Problem
 
 load("adata")
 fn1 <- function(x) {table(x)}
 jj <- apply(adata[,110:127], 2, fn1)
 
 OUTPUT jj is a list of 18 tables
 Examine a variable:
 typeof(adata$act.toy)
 [1] "integer"
 class(adata$act.toy)
 [1] "integer"
 
 
 load("bdata")
 fn1 <- function(x) {table(x)}
 kk <- apply(bdata[,94:111], 2, fn1)
 
 OUTPUT kk is a matrix 2 X 18
 class(bdata$act.toy)
 [1] "labelled"
 typeof(bdata$act.toy)
 [1] "integer"
 

Petr Pikal
[EMAIL PROTECTED]


Re: [R] How to use bash command in R script?

2007-02-26 Thread Andrew Robinson

?system.

Cheers,

Andrew

On Tue, Feb 27, 2007 at 02:05:09PM +0800, Guo Wei-Wei wrote:
 Dear All:
 
 Maybe it is too basic a question, but I don't know how to find the answer.
 Sorry for that.
 
 What I want to do is call a shell command, which will provide two
 numbers, and assign those numbers to a vector. For example:
 
 The following command:
 
 $mxresult.sh ABC.mx
 
 mxresult.sh is a script written by myself and ABC.mx is a Mx script.
 I can get two numbers, 126.128 and 29, with this command.
 
 Is there any way to do it like this:
 
 c <- somefunction("mxresult.sh ABC.mx")
 
 Or is there any other way to do this?
 
 Thanks in advance!
 
 Best wishes,
 Wei-Wei
 

-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/


[R] RDA and trend surface regression

2007-02-26 Thread Jari Oksanen


 I'm performing RDA on plant presence/absence data, constrained by
 geographical locations. I'd like to constrain the RDA by the extended
 matrix of geographical coordinates, i.e. the matrix of geographical
 coordinates completed by adding all terms of a cubic trend surface
 regression.
 
 This is the command I use (package vegan):
 
 rda(Helling ~ x+y+x*y+x^2+y^2+x*y^2+y*x^2+x^3+y^3) 
 
 where Helling is the matrix of Hellinger-transformed presence/absence data
 
 The result returned by R is exactly the same as the one given by:
 
 anova(rda(Helling ~ x+y))
 
 I.e. the quadratic and cubic terms are not taken into account
 
 

You must *I*solate the polynomial terms with function I (AsIs) so that
they are not interpreted as formula operators:

rda(Helling ~ x + y + I(x*y) + I(x^2) + I(y^2) + I(x*y^2) + I(y*x^2) +
I(x^3) + I(y^3))

If you don't have the interaction terms, then it is easier and better
(numerically) to use poly():

rda(Helling ~ poly(x, 3) + poly(y, 3))

Another issue is that in my opinion using polynomial constraints is an
Extremely Bad Idea(TM).
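
The effect of I() can be checked without vegan by looking at the design matrix that a formula produces. This is a small sketch on made-up coordinates, using only model.matrix() from base R:

```r
d <- data.frame(x = c(1, 2, 3, 4, 5), y = c(2, 1, 4, 3, 6))  # made-up coordinates

# Without I(), ^ is a formula operator, so x^2 collapses back to x
m_plain <- model.matrix(~ x + y + x^2 + y^2, d)
colnames(m_plain)

# With I(), the squares are computed arithmetically and kept as terms
m_asis <- model.matrix(~ x + y + I(x^2) + I(y^2), d)
colnames(m_asis)

# poly() builds orthogonal polynomial terms for each variable
m_poly <- model.matrix(~ poly(x, 2) + poly(y, 2), d)
colnames(m_poly)
```

The first matrix contains only the intercept, x and y, which is why the quadratic and cubic terms appeared to be ignored in the original call.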

cheers, Jari Oksanen


[R] fitting of all possible models

2007-02-26 Thread Indermaur Lukas

Hi,
Fitting all possible models (GLM) with 10 predictors will result in a large 
number (2^10 - 1) of models. I want to do that in order to get the importance 
of variables (having an unbalanced variable design) by summing up the 
AIC-weights of models including the same variable, for every variable 
separately. It's time consuming and annoying to define all possible models by 
hand. 
 
Is there a command, or easy solution to let R define the set of all possible 
models itself? I defined models in the following way to process them with a 
batch job:
 
# e.g. model 1
preference <- formula(Y~Lwd + N + Sex + YY)  
  
# e.g. model 2
preference_heterogeneity <- formula(Y~Ri + Lwd + N + Sex + YY)  
etc.
etc.
 
 
I appreciate any hint
Cheers
Lukas
 
 
 
 
 
°°° 
Lukas Indermaur, PhD student 
eawag / Swiss Federal Institute of Aquatic Science and Technology 
ECO - Department of Aquatic Ecology
Überlandstrasse 133
CH-8600 Dübendorf
Switzerland
 
Phone: +41 (0) 71 220 38 25
Fax: +41 (0) 44 823 53 15 
Email: [EMAIL PROTECTED]
www.lukasindermaur.ch
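
One possible way to generate the model set is to enumerate the predictor subsets with combn() and build the formulas with reformulate(), both from base R. The sketch below is illustrative only: the data are simulated, only three of the predictor names from the post are used, and a Gaussian family is assumed.

```r
set.seed(42)
# Simulated stand-in data; only Lwd truly affects Y here
dat   <- data.frame(Ri = rnorm(60), Lwd = rnorm(60), N = rnorm(60))
dat$Y <- 1 + 2 * dat$Lwd + rnorm(60)
preds <- c("Ri", "Lwd", "N")

# All non-empty subsets of the predictors: 2^3 - 1 = 7 of them
subsets <- do.call(c, lapply(seq_along(preds),
                             function(k) combn(preds, k, simplify = FALSE)))

# One formula and one fitted GLM per subset
formulas <- lapply(subsets, reformulate, response = "Y")
fits     <- lapply(formulas, glm, data = dat, family = gaussian)

# Akaike weights from the AIC differences
aic <- sapply(fits, AIC)
w   <- exp(-0.5 * (aic - min(aic)))
w   <- w / sum(w)

# Importance of a variable: sum of the weights of the models containing it
importance <- sapply(preds, function(p)
  sum(w[sapply(subsets, function(s) p %in% s)]))
importance
```

With 10 predictors the same code produces 1023 models; the family and any extra arguments to glm() would need to match the real analysis.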


Re: [R] How to use bash command in R script?

2007-02-26 Thread Guo Wei-Wei

Thank you all! I solved my problem with your help.

Best wishes,
Wei-Wei
