date:20090826

Re: [R] ANCOVA with defined error terms

2009-08-26 Thread hpdutra


Could Richard, or someone else, explain to me what he meant by:

"This also shows a singular Error().  We look at the data and see that
plot is identical to the three-way veget:fruit:block interaction."

It seems to me that I just needed to recode the plots in order to get rid
of the Error message. If that is true, then why can't I go back to the
original model proposed by Richard

track.aov <- aov(mice ~ coon
                 + block * veget * fruit * time - block:veget:fruit:time
                 + Error(block/Plot), data = track)

and calculate the F values myself?

Thank you very much for the feedback. 





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Thank you David : Re : Odp: Re : Odp: Re : table function

2009-08-26 Thread Inchallah Yarab

Thank you very very much David 





From: David Winsemius dwinsem...@comcast.net
To: David Winsemius dwinsem...@comcast.net

Sent: Tuesday, 25 August 2009, 23:32:53
Subject: Re: [R] Re : Odp: Re : Odp: Re : table function


On Aug 25, 2009, at 9:23 AM, David Winsemius wrote:


 On Aug 25, 2009, at 6:25 AM, Inchallah Yarab wrote:

 ?table
 ?xtabs

I read this incorrectly. Try instead:

 by(z, z.c, sum)

z.c: 0 - 1000
[1] 1010

z.c: 1000 - 3000
[1] 6400

z.c: >3000
[1] 14200
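
For readers who want both the per-bin counts and the per-bin sums in one go, here is a minimal sketch. It reuses the example vector from this thread; the use of `useNA = "ifany"` and `tapply` is my own choice of base R tools, not something proposed in the messages.

```r
# Same example vector as in the thread, with two missing values
z <- c(10, NA, 1000, 1200, NA, 2200, 3000, 3200, 5000, 6000)

# Bin the values; NA inputs stay NA in the resulting factor
z.c <- cut(z, breaks = c(-Inf, 1000, 3000, Inf),
           labels = c("0 - 1000", "1000 - 3000", ">3000"))

# Count per bin, with the missing values reported in their own column
counts <- table(z.c, useNA = "ifany")
print(counts)

# Sum of z within each bin (missing values fall in no bin and are dropped)
sums <- tapply(z, z.c, sum)
print(sums)
```

The sums agree with the by() output above, and the count table carries an extra NA column without having to rebuild the factor.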




 
 From: Petr PIKAL petr.pi...@precheza.cz

 Cc: r-help@r-project.org
 Sent: Tuesday, 25 August 2009, 11:53:21
 Subject: Odp: [R] Re : Odp: Re : table function

 Hi

 r-help-boun...@r-project.org wrote on 25.08.2009 11:28:31:

 Thank you Petr,
 my vector z has missing values (NA) and I want to count how many
 there are. I did all of this, and I know the difference between a
 numeric and a factor.


 OK. If you know the difference between factor and numeric, you have
 probably seen
 ?factor
 where there is a note about how to make NA an extra level.

 # Here is z
 z <- c(10, 100, 1000, 1200, 2000, 2200, 3000, 3200, 5000, 6000)
 # let's put some NA values into it
 z[c(2, 5)] <- NA
 # let's make a cut
 z.c <- cut(z, breaks = c(-Inf, 1000, 3000, Inf),
            labels = c("0 - 1000", "1000 - 3000", ">3000"))
 # as you see there are NA values but they are not an extra level
 z.c
  [1] 0 - 1000    <NA>        0 - 1000    1000 - 3000 <NA>
  [6] 1000 - 3000 1000 - 3000 >3000       >3000       >3000
 Levels: 0 - 1000 1000 - 3000 >3000
 is.na(z.c)
  [1] FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE

 # so let's try what the help page says
 factor(z.c, exclude = NULL)
  [1] 0 - 1000    <NA>        0 - 1000    1000 - 3000 <NA>
  [6] 1000 - 3000 1000 - 3000 >3000       >3000       >3000
 Levels: 0 - 1000 1000 - 3000 >3000 <NA>
 # wow, we have NA as an extra level, let's do table
 table(factor(z.c, exclude = NULL))

    0 - 1000 1000 - 3000       >3000        <NA>
           2           3           3           2

 Regards
 Petr


 The goal of this exercise is to count the number of missing values and
 the number of values in 0-1000, 1000-3000 and >3000.

 Thank you again for your help




 
 From: Petr PIKAL petr.pi...@precheza.cz

 Cc: r-help@r-project.org
 Sent: Tuesday, 25 August 2009, 11:15:23
 Subject: Odp: [R] Re : table function

 Hi

 r-help-boun...@r-project.org wrote on 25.08.2009 10:08:36:

 Hi Marc,


 Thank you for your answer! It works, but if I have NA in the vector z,
 what should I do to count how many there are?

 You do not have NA in z; you somehow managed to convert it to a factor.
 Please read about data types and their behaviour. Start with the chapter
 "2.8 Other types of objects"
 in the R intro manual, which I suppose you have in the doc folder of the
 R program directory. You can possibly convert it back to numeric by, e.g.,

 DF$z <- as.numeric(as.character(DF$z))

 but I presume you need to check your original data, maybe with

 str(your.data)

 to see what mode the columns are and why they are factors if you expect
 them to be numeric.

 Regards
 Petr

 x  y  z
 1  0  100
 5  1  1500
 6  1  NA
 2  2  500
 1  1  NA
 5  2  2000
 8  5  4500


 I did the same, but it gives me this message:
  [0 - 1000] [1000 - 3000]         >3000
           0             0             0
 Warning message:
 In inherits(x, "factor") : NAs introduced by coercion


 Thank you


 
 From: Marc Schwartz marc_schwa...@me.com

 Cc: r-help@r-project.org
 Sent: Monday, 24 August 2009, 18:33:52
 Subject: Re: [R] table function

 On Aug 24, 2009, at 10:59 AM, Inchallah Yarab wrote:

 Hi,

 I want to use the function table to build a table, not of frequencies
 (the number of times each value is repeated in a list or a data frame)
 but by class intervals.

 Example:

 x  y  z
 1  0  100
 5  1  1500
 6  1  1200
 2  2  500
 1  1  3500
 5  2  2000
 8  5  4500

 I want a table summarizing the number of observations where z is in
 [0-1000], [1000-3000], [>3000].

 Thank you very much for your help.


 See ?cut, which bins a continuous variable.

 DF
   x y    z
 1 1 0  100
 2 5 1 1500
 3 6 1 1200
 4 2 2  500
 5 1 1 3500
 6 5 2 2000
 7 8 5 4500
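
The archived reply stops at the printed data frame. As a minimal sketch of where the ?cut hint leads, the following bins z and tabulates the bins. The column name `z.class`, the break points and the labels are my assumptions, chosen to match the intervals in the question.

```r
# Data frame as printed in the thread
DF <- data.frame(x = c(1, 5, 6, 2, 1, 5, 8),
                 y = c(0, 1, 1, 2, 1, 2, 5),
                 z = c(100, 1500, 1200, 500, 3500, 2000, 4500))

# Bin z into (-Inf, 1000], (1000, 3000] and (3000, Inf)
DF$z.class <- cut(DF$z, breaks = c(-Inf, 1000, 3000, Inf),
                  labels = c("0 - 1000", "1000 - 3000", ">3000"))

# Frequency of each class
class.counts <- table(DF$z.class)
print(class.counts)
```

Note that cut() closes intervals on the right by default, so a value of exactly 1000 lands in the first class.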

[R] Select top three values from data frame

2009-08-26 Thread Noah Silverman


Hi,

I'm trying to find an easy way to do this.

I want to select the top three values of a specific column in a subset 
of rows in a data.frame.  I'll demonstrate.


A  B  C
x  2  1
x  4  1
x  3  2
y  1  5
y  2  6
y  3  8


I want the top 3 values of B from the data.frame where A=X and C < 2

I could extract all the rows where C < 2, then sort by B, then take the
first 3.  But that seems like the wrong way around, and it also will get
messy with real data of over 100 columns.


Any suggestions?


Re: [R] ptproc package

2009-08-26 Thread Uwe Ligges

If you have read the posting guide and then tell us the website, the 
version, the R version, your OS, and so on, we might finally be able to 
help.


Uwe Ligges


oyelola.adegb...@student.uhasselt.be wrote:

Dear All,

Please can anyone assist with installing 'ptproc'? I downloaded it from the
contributor's website and tried to install it manually, but R won't unzip it.

Thank you



Re: [R] [Rd] Formulas in gam function of mgcv package

2009-08-26 Thread Simon Wood

  I am trying to understand the relationships between:
 
  y~s(x1)+s(x2)+s(x3)+s(x4)
 
  and
 
  y~s(x1,x2,x3,x4)
 
  Does the latter contain the former? what about the smoothers of all
  interaction terms?
The first says that you want a model 
E(y) = f_1(x_1) + f_2(x_2) + f_3(x_3) + f_4(x_4) (1)
where the f_j are smooth functions. The additive decomposition is quite a 
strong assumption, since it assumes that the effect of x_j is not dependent 
on x_k unless j=k. The second model is just
E(y) = f(x_1,x_2,x_3,x_4)  (2)
where f is a smooth function. This looks very general, but actually `s' terms 
assume isotropic smoothness, which is also quite a strong assumption. 

Now if I simply state that f and the f_j are `smooth functions', and leave it 
at that, then (2) would of course contain (1), but to actually estimate the 
models I need to state, mathematically, what I mean by `smooth'. Once I've 
done that I've pretty much determined the function spaces in which f and the 
f_j will lie, and in general (2) will no longer strictly contain (1). mgcv's 
`s' terms use a thin plate spline measure of smoothness for multivariate 
smooths, and this means that (1) will not be strictly nested within (2), 
since e.g. a 4D thin plate spline can not generally represent exactly what 
the sum of 4 1D splines can represent. 

If you want to achieve exact nesting then using tensor product smooths with 
something like 

y~te(x1)+te(x2)+te(x3)+te(x4)   (3)

y~te(x1,x2,x3,x4) (4)

will do the trick (because the function space for (4) is built up from the 
function spaces used in (3)). 

As to where all the 2 and 3 way interactions have gone in (4)... it's just 
like ANOVA - if you put in a 4 way interaction then the lower order 
interactions are not identifiable, unless you choose to add constraints to 
make them so. `mgcv' will allow you to add main effects and interactions, and 
will handle the constraints automatically, but if this sort of functional 
ANOVA is a major component of what you want to do, then it is probably worth 
checking out the gss package and Chong Gu's book on smoothing spline ANOVA.
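
To make the comparison between (2), (3) and (4) concrete, here is a minimal sketch with two covariates instead of four, to keep it quick to run. The simulated data, the object names and the use of AIC for comparison are my assumptions; the te() and s() terms are left at their mgcv defaults.

```r
library(mgcv)

# Simulated data (hypothetical): an additive truth in two covariates
set.seed(1)
n  <- 200
x1 <- runif(n)
x2 <- runif(n)
y  <- sin(2 * pi * x1) + x2^2 + rnorm(n, sd = 0.3)
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Additive model built from one-dimensional tensor product terms, as in (3)
fit.add <- gam(y ~ te(x1) + te(x2), data = dat)

# Single tensor product smooth of both covariates, as in (4)
fit.te <- gam(y ~ te(x1, x2), data = dat)

# Isotropic thin plate spline smooth of both covariates, as in (2)
fit.tp <- gam(y ~ s(x1, x2), data = dat)

# Compare the three fits, e.g. by AIC
print(AIC(fit.add, fit.te, fit.tp))
```

Since the simulated truth is additive, one would expect the additive fit to do at least as well as the two bivariate smooths here, but that is a property of this toy data and not a general result.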

best,
Simon






-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283


[R] Odp: Select top three values from data frame

2009-08-26 Thread Petr PIKAL

Hi


One way is to use subset, order and head:

head(subset(your.data[order(your.data$B, decreasing = TRUE), ],
            subset = C < 2 & A == "x"), 3)
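
Here is a minimal sketch of the same idea on the sample data from the question, filtering first so that only the matching rows are sorted. The comparison operator was lost in the archive, so the condition C < 2 is an assumption, as are the names `sel` and `top3`.

```r
# Sample data from the question
your.data <- data.frame(A = c("x", "x", "x", "y", "y", "y"),
                        B = c(2, 4, 3, 1, 2, 3),
                        C = c(1, 1, 2, 5, 6, 8))

# Keep the rows of interest first, then sort only those by B
# (the condition C < 2 is an assumption about the original post)
sel  <- your.data[your.data$A == "x" & your.data$C < 2, ]
top3 <- head(sel[order(sel$B, decreasing = TRUE), ], 3)
print(top3)
```

With this tiny data set only two rows match, so head() returns both; on real data with more matches it returns the three rows with the largest B.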

Regards
Petr


 

Re: [R] Select top three values from data frame

2009-08-26 Thread Mohamed Lajnef


Noah Silverman wrote:

Hi,

I'm trying to find an easy way to do this.

I want to select the top three values of a specific column in a subset 
of rows in a data.frame.  I'll demonstrate.




Hi,
Did you try this?

data[data$A == 'x' & data$C < 2, ]$B  # data = your data frame



regards

ML






--
Mohamed Lajnef
INSERM Unité 955. 
40 rue de Mesly. 94000 Créteil.
Courriel : mohamed.laj...@inserm.fr 
tel. : 01 49 81 31 31 (poste 18470)

Sec : 01 49 81 32 90
fax : 01 49 81 30 99 



[R] Odp: draw n-1 lines with n X and n Y

2009-08-26 Thread Petr PIKAL

Hi

r-help-boun...@r-project.org wrote on 25.08.2009 18:15:25:

 
 I want to draw lines using a matrix with the X coordinates in the first
 column and the Y coordinates in the second column.
 These lines have to link the n points of a graph, so there should be n-1
 of them, but the result of these two commands:
   X <- matrix(scan("MaxInterEdges.R", 0), ncol = 2)
   lines(X[,1], X[,2], type = "l", col = "red")
 
 is wrong, because there are many more lines.
 
 What can I do? 

Well, reproducible code would help to identify your problem.
Here is a guess:

X[,1] is not ordered, and the lines connect the points in the order they
appear in your data. You could order X by its first column and then plot
the lines.

X.ord <- X[order(X[,1]), ]
lines(X.ord[,1], X.ord[,2], col = "red")
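
A minimal self-contained sketch of this guess follows. The matrix is a hypothetical stand-in for the contents of the poster's file, and the null PDF device is only there so the example runs without opening a window.

```r
# Hypothetical stand-in for the poster's file: five points in shuffled x order
X <- cbind(c(3, 1, 5, 2, 4), c(9, 1, 25, 4, 16))

# Sort the rows by the first column so consecutive rows are neighbours in x
X.ord <- X[order(X[, 1]), ]

pdf(NULL)                      # draw to a null device; no file is written
plot(X, xlab = "x", ylab = "y")
lines(X.ord[, 1], X.ord[, 2], col = "red")   # n - 1 = 4 segments, left to right
dev.off()
```

Calling lines() on the unsorted X would still draw n - 1 segments, but they would zigzag back and forth across the plot, which can look like many more lines.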

Regards
Petr



Re: [R] Select top three values from data frame

2009-08-26 Thread Ottorino-Luca Pantani


df.mydata[df.mydata$A == "X" & df.mydata$C < 2, ]
Will this do the job?

8rino


Re: [R] Select top three values from data frame

2009-08-26 Thread Noah Silverman


I only have a few values in my example, but the real data set might have 
20-100 rows with A=X.  So how do I pick just the three highest ones?

-N



[R] Issues with factors with duplicate (empty) levels

2009-08-26 Thread Frederik Elwert

Hello!

I imported a DJI survey[1] from an SPSS file. When looking at some of
the variables, I noticed problems with the `table` function and similar.
It seems to be caused by duplicate levels which are generated from the
value labels. Not all values have labels, so those without one get an
empty string as the level, which leads to duplicates.

I hope the code and output below illustrate the problem. Is it possible
to prevent this? I’d still like to use the labels, so using numeric
vectors instead of factors is not the best solution.

Regards,
Frederik


 library(foreign)
 Data <- read.spss("js2003_16_29_db.sav", to.data.frame = TRUE,
                   reencode = "latin1")
 table(Data$J203_A)

überhaupt nicht wichtig
                     35                    2256                       0

                      0                       0                       0
           sehr wichtig         Mehrfachnennung
                   4660                       0
 table(as.numeric(Data$J203_A))

   1    2    3    4    5    6    7
  35   39   84  227  626 1280 4660
 is.factor(Data$J203_A)
[1] TRUE
 levels(Data$J203_A)
[1] "überhaupt nicht wichtig" ""
[3] ""                        ""
[5] ""                        ""
[7] "sehr wichtig"            "Mehrfachnennung"




[1] http://213.133.108.158/surveys/index.php?m=msw,0sID=54
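
One possible workaround, sketched under assumptions: if the numeric codes and the vector of value labels are available separately (for instance after reading the file with `use.value.labels = FALSE`), the empty labels can be replaced by the code itself before the factor is built. The vectors `codes` and `labs` below are hypothetical stand-ins, not taken from the survey file.

```r
# Hypothetical stand-ins: numeric codes as stored in the SPSS file, and the
# value labels, where codes 2 to 6 have no label
codes <- c(1, 2, 3, 7, 7, 5, 1)
labs  <- c("überhaupt nicht wichtig", "", "", "", "", "", "sehr wichtig",
           "Mehrfachnennung")

# Fall back to the code itself wherever the label is empty, so that
# every level is distinct
labs.filled <- ifelse(labs == "", as.character(seq_along(labs)), labs)

f <- factor(codes, levels = seq_along(labs), labels = labs.filled)
print(table(f))
```

This keeps the labelled end points readable while the unlabelled intermediate values show up under their numeric codes, so table() no longer lumps them together.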



Re: [R] Select top three values from data frame

2009-08-26 Thread Noah Silverman

Colin,

Unless I missed something, the head function doesn't sort.  So if I have 
1000 values that match, head just gives me the first three, not the 
HIGHEST three.


df.mydata[df.mydata$A == "X" & df.mydata$C < 2, ]


On 8/26/09 3:01 AM, Colin Millar wrote:
 Hi,

 This should work - head is quite a useful summary function

 head(df.mydata[df.mydata$A == "X" & df.mydata$C < 2, ], 3)


 Colin.


Re: [R] Select top three values from data frame

2009-08-26 Thread Mohamed Lajnef


Noah Silverman wrote:
I only have a few values in my example, but the real data set might have 
20-100 rows with A=X.  So how do I pick just the three highest ones?


-N

  

Hi,

And now?

sub.B <- df.mydata$B[df.mydata$A == "X" & df.mydata$C < 2]
tail(sort(sub.B), 3)


cheers,
ML







Re: [R] [Rd] Formulas in gam function of mgcv package

2009-08-26 Thread Simon Wood

This will not work...
 2) y~s(x1, ..., x36)
Estimating a 36 dimensional function reasonably well would require a 
tremendous quantity of data, but in any case the 36 dimensional TPS smoothness 
measure will involve such high order derivatives that it will no longer be 
practically useful: in fact you will not have enough data to estimate the 
unpenalized coefficients of the smoother (and if you did R would run out of 
memory first).  

In such a high dimensional situation, I think that GAMs are really only useful 
if you have some prior knowledge of which variables are likely to interact 
(and it's not too many of them). If there's no prior information saying 
roughly what sort of smooth additive structure might be useful then, I'm not 
sure that GAMs are the right way to go, and some sort of machine learning 
approach might be better.

Then again, the real problem with 
y~s(x1, ..., x36)
is that the data just won't contain enough information to estimate s, if all 
you can say is that s is smooth, but this also means that it's very unlikely 
that you really need to estimate s(x1, ..., x36) in order to predict well. 
In that case, starting from 
y ~ s(x1) + ... + s(x36)
and building the model up might result in something that does a reasonable 
predictive job. 

On the subject of tensor product smoothing vs isotropic smoothing. Isotropic 
smooths are really only reasonable if you think  that the smooth should 
display approximately the same amount of wiggliness in all directions. If 
this is not the case then tensor product smoothing is a better bet. Centering 
and scaling alone is not enough to ensure that isotropy is reasonable 
(although in particular cases it may help, of course).

best,
Simon



 I am trying to build a predictive model. Since the variables are
 centred and scaled, I think I need an isotropic smooth. I am also
 interested in having the interactions between the variables included, that
 is not a purely additive model.

 It is not clear to me when should I give preference to tensor smooths,
 possibly because I have not understood well how they work.

 I am reading Wood(2003) as recommended and I have also read rather
 extensively Simon N. Wood. Generalized Additive Models: An Introduction,
 2006, but still I am stuck. Any additional suggestion or reading
 recommendation would be greatly appreciated.

 I have also some difficulties in understanding the values you have chosen
 for k in the first example (why 60?).

 Thanks

 Best,

 On Monday 24 August 2009 17:33:55 Gavin Simpson wrote:
  [Note R-Devel is the wrong list for such questions. R-Help is where this
  should have been directed - redirected there now]
 
  On Mon, 2009-08-24 at 17:02 +0100, Corrado wrote:
   Dear R-experts,
  
   I have a question on the formulas used in the gam function of the mgcv
   package.
  
   I am trying to understand the relationships between:
  
   y~s(x1)+s(x2)+s(x3)+s(x4)
  
   and
  
   y~s(x1,x2,x3,x4)
  
   Does the latter contain the former? what about the smoothers of all
   interaction terms?
 
  I'm not 100% certain how this scales to smooths of more than 2
  variables, but Sections 4.10.2 and 5.2.2 of Simon Wood's book GAM: An
  Introduction with R (2006, Chapman Hall/CRC) discuss this for smooths of
  2 variables.
 
  Strictly y ~ s(x1) + s(x2) is not nested in y ~ s(x1, x2) as the bases
  used to produce the smoothers in the two models may not be the same in
  both models. One option to ensure nestedness is to fit the more
  complicated model as something like this:
 
  ## if simpler model were: y ~ s(x1, k=20) + s(x2, k = 20)
  y ~ s(x1, k=20) + s(x2, k = 20) + s(x1, x2, k = 60)
                                    ^^^^^^^^^^^^^^^^^
  where the last term (^^^ above) has the same k as used in s(x1, x2)
 
  Note that these are isotropic smooths; are x1 and x2 measured in the
  same units etc.? Tensor product smooths may be more appropriate if not,
  and if we specify the bases when fitting models s(x1) + s(x2) *is*
  strictly nested in te(x1, x2), eg.
 
  y ~ s(x1, bs = "cr", k = 10) + s(x2, bs = "cr", k = 10)
 
  is strictly nested within
 
  y ~ te(x1, x2, k = 10)
  ## is the same as y ~ te(x1, x2, bs = "cr", k = 10)
 
  [Note that bs = "cr" is the default basis in te() smooths, hence we
  don't need to specify it, and k = 10 refers to each individual smooth in
  the te().]
 
  HTH
 
  G
 
   I have (tried to) read the manual pages of gam, formula.gam,
   smooth.terms, linear.functional.terms but could not understand
   properly.
  
   Regards

-- 
 Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
 +44 1225 386603  www.maths.bath.ac.uk/~sw283


Re: [R] [Rd] Formulas in gam function of mgcv package

2009-08-26 Thread Corrado

Dear Simon,

thanks for your answer.

I am running the model with both s and te smoothing, to compare.

A few questions on your email:

1) Isotropic smoothness: my variables are centred and scaled. I assumed an 
isotropic smoother (that is, a smoother that treats all the variables in the 
same way) was good. What do you think? Is my understanding of isotropic 
smoothing wrong? 

2) s(x1, ..., xn): it does not contain (1), but I thought it was true that it 
does improve on (1) by being free to include some interaction, albeit not 
explicitly. Is my interpretation wrong?

3) te: I am confused! What does it mean that the function space for (4) is 
built up from the function spaces used in (3)? Does it mean that 
te(x1, ..., xn) is an expansion on the te(xi), including all the terms 
te(x1)*te(x2)*...*te(xj)*...*te(xn) of the different orders?

Example: in the case of 4 variables, including te(x1)*te(x2), te(x2)*te(x3), 
..., te(x1)*te(x2)*te(x3), ..., up to te(x1)*te(x2)*te(x3)*te(x4).

Sorry for being particularly daft 

Regards





-- 
Corrado Topi

Global Climate Change  Biodiversity Indicators
Area 18,Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct...@york.ac.uk


Re: [R] Select top three values from data frame

2009-08-26 Thread Ottorino-Luca Pantani


Noah Silverman wrote:


I only have a few values in my example, but the real data set might 
have 20-100 rows with A=X.  So how do I pick just the three highest 
ones?


-N



my.data <- cbind.data.frame(expand.grid(A = c("X", "Y"), B = 1:100),
                            C = rnorm(100))

myA.data <- my.data[my.data$A == "X", ]
myA.sorted.data <- myA.data[order(myA.data$C, decreasing = TRUE), ][1:3, ]

Does this solve your problem?

--
Ottorino-Luca Pantani, Università di Firenze
Dip. Scienza del Suolo e Nutrizione della Pianta
P.zle Cascine 28 50144 Firenze Italia
Tel 39 055 3288 202 (348 lab) Fax 39 055 333 273 
olpant...@unifi.it  http://www4.unifi.it/dssnp/



Re: [R] simple graph question: manipulating variable names

2009-08-26 Thread Jim Lemon


Donald Braman wrote:

This is a simple problem that has stumped me: I'm trying to loop through a
few dozen variable names in graphs.  I've tried various approaches like
this:
attach(mydata)
ivs <- c("oneiv", "anotheriv", "yetanotheriv")
dvs <- c("onedv", "anotherdv", "yetanotherdv")
for (iv in ivs) {
for (dv in dvs) {
graphname <- paste(iv, dv, ".png", sep = "")
png(file=graphname, width=300, height=300)
plot(dv ~ iv, pch=".")
lines(loess.smooth(iv, dv), lty=1)
dev.off()
}
}

Clearly that doesn't work.  I'm not sure how to make R see the iv and dv
strings as variables.  Advice?

  

Hi Donald,
I think the problem is that you are trying to plot the strings that you 
are using for your filename rather than the elements of mydata. Try this:


for(ivindex in 1:3) {
for(dvindex in 1:3) {
 graphname<-paste(ivs[ivindex],dvs[dvindex],".png",sep="")
 png(graphname,width=300,height=300)
 plot(mydata[,3*iv-dv-1],mydata[,3*iv-dv],pch=".")
 lines(loess.smooth(mydata[,3*iv-dv-1],mydata[,3*iv-dv]),lty=1)
 dev.off()
}
}

remembering that I have made up the indexing of mydata out of thin 
air. You will have to work out how to index the columns or rows of 
mydata to get the right iv and dv for each pass of the loops.
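
One way to do that indexing is to look the columns up by name with mydata[[name]]. This is only a minimal sketch: it assumes the strings in ivs and dvs are column names of mydata, the data frame below is a made-up stand-in, and the files are written to tempdir() purely for illustration.

```r
# Hypothetical stand-in for the poster's data
set.seed(1)
mydata <- data.frame(oneiv = rnorm(50), anotheriv = rnorm(50),
                     onedv = rnorm(50), anotherdv = rnorm(50))
ivs <- c("oneiv", "anotheriv")
dvs <- c("onedv", "anotherdv")

made <- character(0)
for (iv in ivs) {
  for (dv in dvs) {
    # Build the file name from the two variable names
    graphname <- file.path(tempdir(), paste(iv, dv, ".png", sep = ""))
    png(graphname, width = 300, height = 300)
    x <- mydata[[iv]]   # column selected by its name held in iv
    y <- mydata[[dv]]   # column selected by its name held in dv
    plot(x, y, pch = ".", xlab = iv, ylab = dv)
    lines(loess.smooth(x, y), lty = 1)
    dev.off()
    made <- c(made, graphname)
  }
}
```

With this approach attach(mydata) is not needed, since every column is fetched explicitly from the data frame.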


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] how to set crontab for updating the repositories?

2009-08-26 Thread Sukhbir Rattan

Hi,

I have downloaded around 60GB package repositories of bioconductor to use it
locally and to set up mirror at my university site.

I have installed the mirror with rsync command and able to access also.

Now I have to set a cron job for its daily updating from bioconductor
website. How should I do it?

I know rsync have to be used but I don't know the proper syntax.

I request to send proper syntax.

Thanks,

Sukhbir Singh Rattan.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Select top three values from data frame

2009-08-26 Thread Colin Millar

Or perhaps using a temporary vector might be neater?

tmp <- with(df.mydata, B[A=="X" & C < 2])
df.mydata[order(tmp) %in% 1:3,] # gives df with highest three values of B

or

head(df.mydata[order(tmp),],3) # gives first 3 rows of df sorted by B
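
A minimal sketch of the "subset, sort, take three" route on the example data. It assumes the comparison lost in the archive was C < 2 and that A is stored as "X"/"Y"; both are guesses, so adjust them to the real data.

```r
# Example data from the original question (case of A is an assumption)
df.mydata <- data.frame(A = c("X", "X", "X", "Y", "Y", "Y"),
                        B = c(2, 4, 3, 1, 2, 3),
                        C = c(1, 1, 2, 5, 6, 8))

# Keep only the rows of interest (assumed condition: A is "X" and C < 2)
sub <- df.mydata[df.mydata$A == "X" & df.mydata$C < 2, ]

# Sort by B, largest first, and keep at most three rows
top3 <- head(sub[order(sub$B, decreasing = TRUE), ], 3)
top3
```

Because whole rows are subset and sorted, this works unchanged with a data frame of 100 or more columns.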


Colin.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Mohamed Lajnef
Sent: 26 August 2009 11:25
To: Noah Silverman
Cc: r help
Subject: Re: [R] Select top three values from data frame

Noah Silverman a écrit :
 I only have a few values in my example, but the real data set might have 
 20-100 rows with A=X.  So how do I pick just the three highest ones?

 -N

   
Hi,

and now?

df.mydata$B[order(df.mydata[df.mydata$A==X AND df.mydata$C  2, 
]$B)][length(df.mydata$B)-3:length(df.mydata$B)]


cheers,
ML





 On 8/26/09 2:46 AM, Ottorino-Luca Pantani wrote:
   
 df.mydata[df.mydata$A==X AND df.mydata$C  2, ]
 will do the job ?

 8rino

 Noah Silverman ha scritto:
 
 Hi,

 I'm trying to find an easy way to do this.

 I want to select the top three values of a specific column in a 
 subset of rows in a data.frame.  I'll demonstrate.

 A B C
 x 2 1
 x 4 1
 x 3 2
 y 1 5
 y 2 6
 y 3 8


 I want the top 3 values of B from the data.frame where A=X and C 2

 I could extract all the rows where C2, then sort by B, then take the 
 first 3.  But that seems like the wrong way around, and it also will 
 get messy with real data of over 100 columns.

 Any suggestions?

   

   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

   


-- 
Mohamed Lajnef
INSERM Unité 955. 
40 rue de Mesly. 94000 Créteil.
Courriel : mohamed.laj...@inserm.fr 
tel. : 01 49 81 31 31 (poste 18470)
Sec : 01 49 81 32 90
fax : 01 49 81 30 99 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Re : error matrix, cross table,

2009-08-26 Thread Inchallah Yarab

Hi
see ?table
i hope that helps!!

inchallah





De : dalposso gustavodalpo...@hotmail.com
À : r-help@r-project.org
Envoyé le : Mercredi, 26 Août 2009, 1h20mn 31s
Objet : [R] error matrix, cross table,


how to create a cross table to quantify the classes of two thematic maps?

-- 
View this message in context: 
http://www.nabble.com/error-matrix%2C-cross-table%2C-tp25143926p25143926.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] changing the Date format.

2009-08-26 Thread rajclinasia


Hi everyone,

I have a data frame called 'sample'. It contains a variable called
'starts' that looks like this:

starts
37987
37988
37989
37990 
37991

Now I want to change that variable into 'yyyy-mm-dd' format. Can anybody
help with this?

Thanks in Advance.
-- 
View this message in context: 
http://www.nabble.com/changing-the-Date-format.-tp25150494p25150494.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] changing the Date format.

2009-08-26 Thread Patrick Connolly

On Wed, 26-Aug-2009 at 03:48AM -0700, rajclinasia wrote:

| 
| Hi everyone,
| 
| i have a data frame called 'sample'. in that 'sample' data frame variable
| called 'starts' is there like this
| 
| starts
| 37987
| 37988
| 37989
| 37990 
| 37991
| 
| now i want change that variable into 'yyyy-mm-dd' format. can any body help
| in this aspect.

Check out the as.Date function.
?as.Date
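
A minimal sketch, assuming the numbers are Excel-style day serials (an assumption; the original post does not say where they come from). For such serials the usual origin is 1899-12-30.

```r
starts <- c(37987, 37988, 37989, 37990, 37991)

# Assumption: Excel-style serial day numbers, hence origin 1899-12-30
dates <- as.Date(starts, origin = "1899-12-30")

# Date objects already print as yyyy-mm-dd; format() makes it explicit
formatted <- format(dates, "%Y-%m-%d")
formatted
```

Under that assumption 37987 comes out as 2004-01-01. If the serials count days from a different origin, only the origin argument needs to change.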

HTH

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Select top three values from data frame

2009-08-26 Thread Colin Millar


Hi,

This should work - head is quite a useful summary function

head(df.mydata[df.mydata$A=="X" & df.mydata$C < 2, ],3)


Colin.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Noah Silverman
Sent: 26 August 2009 10:54
To: ottorino-luca.pant...@unifi.it
Cc: r help
Subject: Re: [R] Select top three values from data frame


I only have a few values in my example, but the real data set might have

20-100 rows with A=X.  So how do I pick just the three highest ones?

-N


On 8/26/09 2:46 AM, Ottorino-Luca Pantani wrote:
 df.mydata[df.mydata$A==X AND df.mydata$C  2, ]
 will do the job ?

 8rino

 Noah Silverman ha scritto:
 Hi,

 I'm trying to find an easy way to do this.

 I want to select the top three values of a specific column in a 
 subset of rows in a data.frame.  I'll demonstrate.

 A B C
 x 2 1
 x 4 1
 x 3 2
 y 1 5
 y 2 6
 y 3 8


 I want the top 3 values of B from the data.frame where A=X and C 2

 I could extract all the rows where C2, then sort by B, then take the

 first 3.  But that seems like the wrong way around, and it also will 
 get messy with real data of over 100 columns.

 Any suggestions?


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] GLMs

2009-08-26 Thread Letizia Campioni


Hi,

I am starting to work with R. 

I need to fit a general linear model and a generalized mixed model. Which
packages should I use?

What is the difference between them?

thanks

letizia

 

_


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Filtering matrices

2009-08-26 Thread Peter Alspach

Tena koe Ben 

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of bwgoudey
 Sent: Wednesday, 26 August 2009 2:55 p.m.
 To: r-help@r-project.org
 Subject: Re: [R] Filtering matrices

  r <- rcorr(d[[1]]) # d is matrix containing observation
  r[[1]] # r values
            age         sex         BMI
 age  1.0000000 -0.30010322 -0.13702263
 sex -0.3001032  1.00000000  0.06300528
 BMI -0.1370226  0.06300528  1.00000000
  r[[2]] # Number of observations
     age sex BMI
 age 100 100 100
 sex 100 100 100
 BMI 100 100 100
  r[[3]] # P values
             age         sex       BMI
 age          NA 0.002416954 0.1740134
 sex 0.002416954          NA 0.5334484
 BMI 0.174013354 0.533448366        NA

 If I wanted to return a matrix containing all points where 
 correlation was above 0.75 and P-value was below 0.05, how 
 would I do this?

 Cheers
  Ben

It is not clear to me what 'matrix' you wish to have returned since
you'll be keeping an unknown number of values.  However, if you are
happy just to replace the unwanted values with a missing value then
something like

r[[1]][r[[1]] < 0.75 | (r[[3]] > 0.05 & !is.na(r[[3]]))] <- NA

might work.  I say 'might' because I am unfamiliar with rcorr() and thus
am not sure whether the components of r are matrices or dataframes, and
the above is untested.  You might be interested in abs(r[[1]]).
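
A self-contained sketch of the same masking idea on plain matrices, so it does not depend on rcorr(). The numbers are invented for illustration and are not the poster's results.

```r
vars <- c("age", "sex", "BMI")

# Made-up correlation and P-value matrices with the same layout as above
r_vals <- matrix(c( 1.00, -0.30, 0.80,
                   -0.30,  1.00, 0.06,
                    0.80,  0.06, 1.00),
                 nrow = 3, byrow = TRUE, dimnames = list(vars, vars))
p_vals <- matrix(c(   NA, 0.002, 0.001,
                   0.002,    NA, 0.530,
                   0.001, 0.530,    NA),
                 nrow = 3, byrow = TRUE, dimnames = list(vars, vars))

# TRUE where the correlation is above 0.75 and the P value is below 0.05;
# the NA P values on the diagonal are excluded explicitly
keep <- r_vals > 0.75 & !is.na(p_vals) & p_vals < 0.05

filtered <- r_vals
filtered[!keep] <- NA          # blank out everything that fails the test
which(keep, arr.ind = TRUE)    # row/column positions of the survivors
```

Replacing r_vals with abs(r_vals) in the comparison would also keep strong negative correlations.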

HTH 

Peter Alspach

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Help in plotting a legend

2009-08-26 Thread Jim Lemon


Ashutosh Nandeshwar wrote:

Hello, List,

 


I am a new user of the R project, and I need some help in plotting a legend.
I am using the PBSmapping library to plot map of Ohio and heat color it with
the count of employees in each county. As a guide, I am using Data Mashups
in R. I am able to plot the map with the colors; however, I would like to
put a legend with a single box full of these colors, but only show the max
and the min value on the left corner and the right corner respectively. 

 


Here's a  part of the code I am using:

#empdata has the employee data with the latitude and longitude for each
employee

library(PBSmapping)

addressEvents<-as.EventData(empdataRC,projection=NA)

#myShapeFile is a shape file of Ohio imported using importShapefile

addressPolys-findPolys(addressEvents,myShapeFile) 

 


myTrtFC<-
table(factor(addressPolys$PID,levels=levels(as.factor(myShapeFile$PID))))


log(myTrtFC)->lTrt

mapColors<-heat.colors(max(lTrt)+1,alpha=.6)[max(lTrt)-lTrt+1] 


mapColors[ is.na(mapColors) ] <- "white"

pdf("RC-Employees.pdf",version="1.4")

plotPolys(myShapeFile,axes=FALSE,bg="white",main=" Employees"
,xlab="",ylab="",col=mapColors) 


#here is the current legend, but as you can see, the colors and the box
count doesn't match

#i would like to see only one box with the gradients of the colors that I
used, and the max and the min value on top. I even tried a rectangle but I
could not plot it
  

Hi Ashutosh,
Have a look at color.legend in the plotrix package.

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] faulty formatting of toLatex(sessionInfo())

2009-08-26 Thread Liviu Andronic

Dear all
I am writing an Sweave document and have encountered formatting issues
with the locale part of toLatex(sessionInfo()). The fact that there
is no spaces between the various locale variables means that LaTeX
cannot easily find an appropriate place to break the lines, and some
will get printed off screen.
Below is the text output, and this .pdf document [1] shows the
(faulty) tex result. Could anyone suggest how to get around this
issue?
Thank you
Liviu


[1] http://s000.tinyupload.com/index.php?file_id=06259544336960829228

 sessionInfo()
R version 2.9.1 (2009-06-26)
x86_64-pc-linux-gnu

locale:
LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] grDevices utils datasets  stats graphics  methods   base

other attached packages:
[1] fortunes_1.3-6 hints_1.0.1-1  boot_1.2-38relimp_1.0-1   xtable_1.5-5
[6] Hmisc_3.6-1

loaded via a namespace (and not attached):
[1] cluster_1.12.0  grid_2.9.1  lattice_0.17-25 tcltk_2.9.1





-- 
Do you know how to read?
http://www.alienetworks.com/srtest.cfm
Do you know how to write?
http://garbl.home.comcast.net/~garbl/stylemanual/e.htm#e-mail

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Number of CPU's

2009-08-26 Thread Henrique Dallazuanna

For windows you can do:

 Sys.getenv("NUMBER_OF_PROCESSORS")
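
On current R versions, the parallel package that ships with R offers a cross-platform alternative; it was not yet part of R when this thread was written, so treat this as a later addition rather than the poster's suggestion.

```r
# detectCores() reports the number of CPU cores it can find,
# or NA when the number cannot be determined on this platform
n_cores <- parallel::detectCores()
n_cores
```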

On Mon, Aug 24, 2009 at 3:16 PM, Håvard Rue havard@math.ntnu.no wrote:

 Any way to get access to the number of CPU's, optionally their type,
 from within  R?   In linux I can just read /proc/cpuinfo  but for
 win/mac ?

 Thanks!
 Håvard

 --
  Håvard Rue
  Department of Mathematical Sciences
  Norwegian University of Science and Technology
  N-7491 Trondheim, Norway
  Voice: +47-7359-3533URL  : 
 http://www.math.ntnu.no/~hrue
  Fax  : +47-7359-3524Email: havard@math.ntnu.no

  This message was created in a Microsoft-free computing environment.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] faulty formatting of toLatex(sessionInfo())

2009-08-26 Thread G. Jay Kerns

Dear Liviu,


On Wed, Aug 26, 2009 at 7:35 AM, Liviu Androniclandronim...@gmail.com wrote:
 Dear all
 I am writing an Sweave document and have encountered formatting issues
 with the locale part of toLatex(sessionInfo()). The fact that there
 is no spaces between the various locale variables means that LaTeX
 cannot easily find an appropriate place to break the lines, and some
 will get printed off screen.
 Below is the text output, and this .pdf document [1] shows the
 (faulty) tex result. Could anyone suggest how to get around this
 issue?
 Thank you
 Liviu


There was a closely related discussion last April:

https://stat.ethz.ch/pipermail/r-devel/2009-April/053094.html

and IIRC this was fixed for R version 2.10.

Hope this helps,
Jay





-- 
***
G. Jay Kerns, Ph.D.
Associate Professor
Department of Mathematics  Statistics
Youngstown State University
Youngstown, OH 44555-0002 USA
Office: 1035 Cushwa Hall
Phone: (330) 941-3310 Office (voice mail)
-3302 Department
-3170 FAX
VoIP: gjke...@ekiga.net
E-mail: gke...@ysu.edu
http://www.cc.ysu.edu/~gjkerns/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to set crontab for updating the repositories?

2009-08-26 Thread Paul Hiemstra


Sukhbir Rattan wrote:

Hi,

I have downloaded around 60GB package repositories of bioconductor to use it
locally and to set up mirror at my university site.

I have installed the mirror with rsync command and able to access also.

Now I have to set a cron job for its daily updating from bioconductor
website. How should I do it?

I know rsync have to be used but I don't know the proper syntax.

I request to send proper syntax.

Thanks,

Sukhbir Singh Rattan.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
  

Hi,

Typing:

crontab -e

will open the crontab file for the current user. Here you can add 
commands that are to be executed at certain times. The crontab will look 
something like:


# m h  dom mon dow   command
0 0 * * * /foo/bar

where the command /foo/bar will be executed every day (dom, mon and dow 
are a *, meaning 'for all') at 12 am (midnight).


cheers,
Paul


--
Drs. Paul Hiemstra
Department of Physical Geography
Faculty of Geosciences
University of Utrecht
Heidelberglaan 2
P.O. Box 80.115
3508 TC Utrecht
Phone:  +3130 274 3113 Mon-Tue
Phone:  +3130 253 5773 Wed-Fri
http://intamap.geo.uu.nl/~paul

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] attach package

2009-08-26 Thread Jeremy MAZET

Hello,

Is there a way to attach a package without running the hook function 
.onAttach()?

Thanks

Jérémy Mazet

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Within factor random factor

2009-08-26 Thread 細田弘吉

Hi,
I am quite new to R and trying to analyze the following data.  I have 28
controls and 25 patients.  I measured X values of 4 different locations
(A,B,C,D) in the brain image of each subject.  And X ranges from 0 to 1.
I think control or patient is a between subject factor and location is
a within subject factor.  So,

controls: 28
patients: 25 (unbalanced data set)
response measure: X values (ranging 0 to 1)
fixed factor: control vs. patient (between subject factor)
random factor: location (level: A,B,C,D ;no order) (within subject factor)
random factor: subjectID 1-53

My data looks like this;

CorP     X      location  subjectID
control  0.708  A         1
control  0.648  A         2
patient  0.638  C         3
control  0.547  D         4
patient  0.632  B         5
control  0.723  C         6
...

I want to know
(a) if there is a significant difference between controls and patients
in X values.
(b) where (A,B,C,D?) the difference is between controls and patients in
X values.  (There may be an interaction)

I constructed linear mixed models with lme as follows:

(1) model1 <- lme(X ~ CorP*location, random= ~ 1| subjectID, mydata)

(2) model2 <- lme(X ~ CorP*location, random= ~ location| subjectID, mydata)

I am not familiar with lme syntax.  I'm just wondering which formula
[(1) or (2)] is appropriate for my model to answer questions (a) and (b).
Or maybe both of the formulas are wrong.

I would appreciate it very much if somebody could help me.

Sincerely,

Kohkichi Hosoda

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] counting subgroup sums within a data frame

2009-08-26 Thread Shaun Grannis

Hi,

I'm sure there's an easy approach to this issue, I'm just not seeing it.

I have a data frame of the following form:

Date      class  subclass  count
8/1/2009  A      X         1
8/1/2009  B      X         2
8/1/2009  A      Y         9
8/1/2009  B      Y         3
8/2/2009  A      X         1
8/2/2009  B      X         5
8/2/2009  A      Y         4
8/2/2009  B      Y         2
8/3/2009  A      X         6
8/3/2009  B      X         4
8/3/2009  A      Y         3
8/3/2009  B      Y         4
8/4/2009  A      X         1
8/4/2009  B      X         9
8/4/2009  A      Y         3
8/4/2009  B      Y         5
8/5/2009  A      X         3
8/5/2009  B      X         7
8/5/2009  A      Y         2
8/5/2009  B      Y         1

I would like to create a data frame of the sums of the daily counts for,
say, class 'A', like so:


Date      sum_of_counts
8/1/2009  10
8/2/2009  5
8/3/2009  9
8/4/2009  4
8/5/2009  5

I ultimately would like to do sum of counts on all classes and  
subclasses.  It seems that this is equivalent to a GROUP BY query in  
SQL.

I'm sure this is possible in R. Any suggestions?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] teaching R

2009-08-26 Thread Michael Nestrud

Hello all,

I am going to be running a small statistics workshop using R sometime
in November.  I am restricted to R because of the specific libraries I
will be using - a good thing in my book - however the attendees are
unfamiliar with R.  I plan on giving as little R information as
possible - just what is absolute necessary to run the statistics (the
workshop is short, no time to spend hours teaching R).  Has anyone
done anything like this?  Are there any public powerpoint/pdfs/etc.
available for this type of application?

Any advice / what works / what doesn't work is appreciated for those
that have tried this before me.

Sincerely,

-Michael

-- 
Michael A. Nestrud
Cornell U. Sensory Science PhD Student
m...@ataraxis.org
All that you taste... all that you eat.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Appending strings at the beginning of a text file

2009-08-26 Thread Henrique Dallazuanna

It works with paste, but I forgot the close(fConn) after the writeLines.

On Tue, Aug 25, 2009 at 2:48 PM, Paul Smith phh...@gmail.com wrote:

 Thanks, Henrique.

 I think you mean the following:

 fConn <- file('test.txt', 'r+')
 Lines <- readLines(fConn)
 writeLines(c("Text at beginning of file\n", Lines), con = fConn)
 close(fConn)

 (With paste(), it does not work.)

 Paul

 On Tue, Aug 25, 2009 at 5:02 PM, Henrique Dallazuannawww...@gmail.com
 wrote:
  Try this;
 
  fConn <- file('test.txt', 'r+')
  Lines <- readLines(fConn)
  writeLines(paste("Text at beginning of file", Lines, sep = "\n"),
 con = fConn)
 
  On Tue, Aug 25, 2009 at 12:54 PM, Paul Smith phh...@gmail.com wrote:
 
  Dear All,
 
  I have a piece of text that I want to append to a text file at the
  beginning of the text file.
 
  I have thought about using cat() with the option 'append=T', but the
  appending, in this case, is done at the bottom of the text file. Any
  ideas?
 
  Thanks in advance,
 
  Paul
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
  --
  Henrique Dallazuanna
  Curitiba-Paraná-Brasil
  25° 25' 40 S 49° 16' 22 O
 

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] counting subgroup sums within a data frame

2009-08-26 Thread Henrique Dallazuanna

Try this:

with(d, tapply(count, list(Date, class), sum))
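
If a data frame is wanted rather than a matrix, base R's aggregate() is the closest match to a GROUP BY. A minimal sketch, with the example data typed in by hand under the name d (the name used in the call above):

```r
# Example data from the question, entered by hand
d <- data.frame(
  Date     = rep(c("8/1/2009", "8/2/2009", "8/3/2009",
                   "8/4/2009", "8/5/2009"), each = 4),
  class    = rep(c("A", "B"), times = 10),
  subclass = rep(c("X", "X", "Y", "Y"), times = 5),
  count    = c(1, 2, 9, 3,  1, 5, 4, 2,  6, 4, 3, 4,
               1, 9, 3, 5,  3, 7, 2, 1)
)

# Sum of counts for every Date/class combination (like GROUP BY Date, class)
by_class <- aggregate(count ~ Date + class, data = d, FUN = sum)

# Just class 'A', with the column renamed as in the desired output
class_a <- by_class[by_class$class == "A", c("Date", "count")]
names(class_a)[2] <- "sum_of_counts"
class_a
```

Adding subclass to the right-hand side of the formula gives the sums per Date, class and subclass.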

On Wed, Aug 26, 2009 at 10:07 AM, Shaun Grannis sgran...@regenstrief.orgwrote:

 Hi,

 I'm sure there's an easy approach to this issue, I'm just not seeing it.

 I have a data frame of the following form:

  Date classsubclass   count
 8/1/2009AX  1
 8/1/2009BX  2
 8/1/2009AY  9
 8/1/2009BY  3
 8/2/2009AX  1
 8/2/2009BX  5
 8/2/2009AY  4
 8/2/2009BY  2
 8/3/2009AX  6
 8/3/2009BX  4
 8/3/2009AY  3
 8/3/2009BY  4
 8/4/2009AX  1
 8/4/2009BX  9
 8/4/2009AY  3
 8/4/2009BY  5
 8/5/2009AX  3
 8/5/2009BX  7
 8/5/2009AY  2
 8/5/2009BY  1

 I would like to create a data frame of the sum the daily counts for,
 say, class 'A', like so:


  Date sum_of_counts
 8/1/2009   10
 8/2/20095
 8/3/20099
 8/4/20094
 8/5/20095

 I ultimately would like to do sum of counts on all classes and
 subclasses.  It seems that this is equivalent to a GROUP BY query in
 SQL.

 I'm sure this is possible in R. Any suggestions?

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] teaching R

2009-08-26 Thread Barry Rowlingson

On Wed, Aug 26, 2009 at 2:08 PM, Michael Nestrudm...@ataraxis.org wrote:
 Hello all,

 I am going to be running a small statistics workshop using R sometime
 in November.  I am restricted to R because of the specific libraries I
 will be using - a good thing in my book - however the attendees are
 unfamiliar with R.  I plan on giving as little R information as
 possible - just what is absolute necessary to run the statistics (the
 workshop is short, no time to spend hours teaching R).  Has anyone
 done anything like this?  Are there any public powerpoint/pdfs/etc.
 available for this type of application?

 Any advice / what works / what doesn't work is appreciated for those
 that have tried this before me.


 You could try my 'R in one page' document (which will be two pages if
you use a non-double-sided printer).

 Side 1 is a very simple introduction, and side 2 has some more
examples for people to cut and paste.

 OpenOffice source and PDF here:

http://www.maths.lancs.ac.uk/~rowlings/R/Simple/

 Feel free to take these and edit to your needs.

 There's more user-contributed docs here:

http://cran.r-project.org/other-docs.html

although a 360 page PDF called An Introduction to R is probably a
bit verbose for your needs.

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] counting subgroup sums within a data frame

2009-08-26 Thread Shaun Grannis

Wow.

That was fast -- and spot on!

Thanks so much.

Best Regards,

Shaun


On Aug 26, 2009, at 9:11 AM, Henrique Dallazuanna wrote:

 Try this:

 with(d, tapply(count, list(Date, class), sum))

 On Wed, Aug 26, 2009 at 10:07 AM, Shaun Grannis sgran...@regenstrief.org 
  wrote:
 Hi,

 I'm sure there's an easy approach to this issue, I'm just not seeing  
 it.

 I have a data frame of the following form:

  Date classsubclass   count
 8/1/2009AX  1
 8/1/2009BX  2
 8/1/2009AY  9
 8/1/2009BY  3
 8/2/2009AX  1
 8/2/2009BX  5
 8/2/2009AY  4
 8/2/2009BY  2
 8/3/2009AX  6
 8/3/2009BX  4
 8/3/2009AY  3
 8/3/2009BY  4
 8/4/2009AX  1
 8/4/2009BX  9
 8/4/2009AY  3
 8/4/2009BY  5
 8/5/2009AX  3
 8/5/2009BX  7
 8/5/2009AY  2
 8/5/2009BY  1

 I would like to create a data frame of the sum the daily counts for,
 say, class 'A', like so:


  Date sum_of_counts
 8/1/2009   10
 8/2/20095
 8/3/20099
 8/4/20094
 8/5/20095

 I ultimately would like to do sum of counts on all classes and
 subclasses.  It seems that this is equivalent to a GROUP BY query in
 SQL.

 I'm sure this is possible in R. Any suggestions?

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 -- 
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Appending strings at the beginning of a text file

2009-08-26 Thread Paul Smith

Henrique,

With paste(), it works only if the file has only one line; otherwise,

Text at beginning of file

is repeated after each line of the file.
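
A small sketch of the difference, using a throwaway temporary file in place of test.txt (the file contents are made up). It reads the file, then rewrites it by path instead of through an 'r+' connection, which is a simplification of the approach discussed in this thread.

```r
# Throwaway file standing in for test.txt
path <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), path)

old <- readLines(path)

# paste() is vectorised: it returns one string per original line,
# each one starting with the header, hence the repetition
repeated <- paste("Text at beginning of file", old, sep = "\n")
length(repeated)

# c() puts the header in front once; writeLines() then rewrites the file
writeLines(c("Text at beginning of file", old), path)
result <- readLines(path)
result
```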

Paul


On Wed, Aug 26, 2009 at 2:03 PM, Henrique Dallazuannawww...@gmail.com wrote:
 It works with paste, but I forgotten the close(fConn) after the writeLines.

 On Tue, Aug 25, 2009 at 2:48 PM, Paul Smith phh...@gmail.com wrote:

 Thanks, Henrique.

 I think you mean the following:

 fConn <- file('test.txt', 'r+')
 Lines <- readLines(fConn)
 writeLines(c("Text at beginning of file\n", Lines), con = fConn)
 close(fConn)

 (With paste(), it does not work.)

 Paul

 On Tue, Aug 25, 2009 at 5:02 PM, Henrique Dallazuannawww...@gmail.com
 wrote:
  Try this;
 
  fConn <- file('test.txt', 'r+')
  Lines <- readLines(fConn)
  writeLines(paste("Text at beginning of file", Lines, sep = "\n"),
     con = fConn)
 
  On Tue, Aug 25, 2009 at 12:54 PM, Paul Smith phh...@gmail.com wrote:
 
  Dear All,
 
  I have a piece of text that I want to append to a text file at the
  beginning of the text file.
 
  I have thought about using cat() with the option 'append=T', but the
  appending, in this case, is done at the bottom of the text file. Any
  ideas?
 
  Thanks in advance,
 
  Paul
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 
  --
  Henrique Dallazuanna
  Curitiba-Paraná-Brasil
  25° 25' 40 S 49° 16' 22 O
 

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40 S 49° 16' 22 O



Re: [R] counting subgroup sums within a data frame

2009-08-26 Thread ONKELINX, Thierry

Have a look at the reshape package.

Assuming that your data is in a data.frame called dataset.

cast(Date ~ ., data = dataset, value = "count", fun = sum)
cast(Date ~ class, data = dataset, value = "count", fun = sum)
cast(Date + class ~ ., data = dataset, value = "count", fun = sum)

Or the plyr package

ddply(dataset, c("Date"), function(x){c(Sum_of_counts = sum(x$count))})
ddply(dataset, c("Date", "class"),
      function(x){c(Sum_of_counts = sum(x$count))})
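
For readers without reshape or plyr, the same GROUP BY style sums can be sketched in base R with aggregate(). The data frame below is retyped from the table in the original question; the object names by_class and class_a are only illustrative.

```r
# The example data from the original question
dataset <- data.frame(
  Date     = rep(c("8/1/2009", "8/2/2009", "8/3/2009", "8/4/2009", "8/5/2009"),
                 each = 4),
  class    = rep(c("A", "B"), times = 10),
  subclass = rep(c("X", "X", "Y", "Y"), times = 5),
  count    = c(1, 2, 9, 3,  1, 5, 4, 2,  6, 4, 3, 4,  1, 9, 3, 5,  3, 7, 2, 1)
)

# Sum of counts per Date and class (like GROUP BY Date, class)
by_class <- aggregate(count ~ Date + class, data = dataset, FUN = sum)

# Daily sums for class 'A' only
class_a <- subset(by_class, class == "A", select = c(Date, count))
class_a
```

Adding subclass to the right-hand side of the formula gives the sums for every class and subclass combination.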

HTH,

Thierry



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Shaun Grannis
Sent: Wednesday 26 August 2009 15:07
To: r-help@r-project.org
Subject: [R] counting subgroup sums within a data frame

Hi,

I'm sure there's an easy approach to this issue, I'm just not seeing it.

I have a data frame of the following form:

    Date  class  subclass  count
8/1/2009      A         X      1
8/1/2009      B         X      2
8/1/2009      A         Y      9
8/1/2009      B         Y      3
8/2/2009      A         X      1
8/2/2009      B         X      5
8/2/2009      A         Y      4
8/2/2009      B         Y      2
8/3/2009      A         X      6
8/3/2009      B         X      4
8/3/2009      A         Y      3
8/3/2009      B         Y      4
8/4/2009      A         X      1
8/4/2009      B         X      9
8/4/2009      A         Y      3
8/4/2009      B         Y      5
8/5/2009      A         X      3
8/5/2009      B         X      7
8/5/2009      A         Y      2
8/5/2009      B         Y      1

I would like to create a data frame of the sum of the daily counts for,
say, class 'A', like so:


    Date  sum_of_counts
8/1/2009             10
8/2/2009              5
8/3/2009              9
8/4/2009              4
8/5/2009              5

I ultimately would like to do sum of counts on all classes and
subclasses.  It seems that this is equivalent to a GROUP BY query in
SQL.

I'm sure this is possible in R. Any suggestions?


Re: [R] Filtering matrices

2009-08-26 Thread Steve Lianoglou


Hi,

On Aug 25, 2009, at 10:54 PM, bwgoudey wrote:




r <- rcorr(d[[1]]) # d is a matrix containing the observations
r[[1]] # r values

           age         sex         BMI
age  1.0000000 -0.30010322 -0.13702263
sex -0.3001032  1.00000000  0.06300528
BMI -0.1370226  0.06300528  1.00000000

r[[2]] # number of observations

    age sex BMI
age 100 100 100
sex 100 100 100
BMI 100 100 100

r[[3]] # P values

            age         sex       BMI
age          NA 0.002416954 0.1740134
sex 0.002416954          NA 0.5334484
BMI 0.174013354 0.533448366        NA


Just a quick note: please provide data in an easy way for us to enter  
into our R session -- the way you provide the data requires more work  
on someone who is trying to help you in order to enter it in R, for  
instance this might have been better:


rvals <- matrix(c(
   1.0000000, -0.30010322, -0.13702263,
  -0.3001032,  1.00000000,  0.06300528,
  -0.1370226,  0.06300528,  1.00000000), byrow=TRUE, nrow=3)

obs <- matrix(c(
  100, 100, 100,
  100, 100, 100,
  100, 100, 100), byrow=TRUE, nrow=3)

pval <- matrix(c(
           NA, 0.002416954, 0.1740134,
  0.002416954,          NA, 0.5334484,
  0.174013354, 0.533448366,        NA), byrow=TRUE, nrow=3)

Since I can just copy and paste that into my R session and get right  
to answering your question.


If I wanted to return a matrix containing all points where correlation was
above 0.75 and P-value was below 0.05, how would I do this?


Where is your points matrix that you want to return the data from?

Assuming this is also a 3x3 matrix, you just build the indexing  
vectors using the data matrices of interest, and use those to pull the  
points out of your data/points matrix.


Let's assume my data/points matrix is called my.data:

# 1. Get indices of points w/ correlation above 0.75
good.cor <- rvals > .75

# 2. Get indices of points w/ p-value < 0.05. Since you have NA values
# in your pval matrix, you have to *explicitly exclude* them from the
# query
good.p <- !is.na(pval) & pval < 0.05

# Since you want their intersection, take an & of the two vectors,
# but this is empty in this case. Either way, this would get
# the points you're after
my.data[good.cor & good.p]
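
As a self-contained check of the masking idea, the sketch below rebuilds the two matrices with dimnames and uses which(..., arr.ind = TRUE) to list the qualifying pairs. The second, looser threshold of 0.25 on the absolute correlation is an arbitrary assumption, added only so that something is returned.

```r
vars <- c("age", "sex", "BMI")
rvals <- matrix(c(
   1.0000000, -0.30010322, -0.13702263,
  -0.3001032,  1.00000000,  0.06300528,
  -0.1370226,  0.06300528,  1.00000000),
  byrow = TRUE, nrow = 3, dimnames = list(vars, vars))
pval <- matrix(c(
           NA, 0.002416954, 0.1740134,
  0.002416954,          NA, 0.5334484,
  0.174013354, 0.533448366,        NA),
  byrow = TRUE, nrow = 3, dimnames = list(vars, vars))

# Thresholds from the question: nothing qualifies in this data
keep_strict <- !is.na(pval) & rvals > 0.75 & pval < 0.05
sum(keep_strict)

# Illustrative looser threshold on |r| (an assumption, not from the thread)
keep_loose <- !is.na(pval) & abs(rvals) > 0.25 & pval < 0.05
which(keep_loose, arr.ind = TRUE)   # row/column positions of the age-sex pair
```

The diagonal drops out automatically because its p-values are NA.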

Does that make sense?

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] [Rd] Formulas in gam function of mgcv package

2009-08-26 Thread Corrado

Dear Simon,

thanks again.

Concerning the whole 36 variables ... well, I have run a principal components 
analysis, and I am only using part of them (I am running a test with the PCs 
which cover 95% of the variance and then 99%). :) So I will possibly 
end up with s(x1, ..., x8). I wonder if using isotropic smoothers on principal 
components is a good idea ... the variance diminishes from component to 
component, so theoretically the wiggliness of the smoother should also be less 
and less ... what do you think? Am I saying something stupid?

If that is the case, and if I want to include some interactions, then I have to 
include the interaction terms manually ... like s(x1, x2). Is that right?

Sorry for the avalanche of questions, but I am trying to understand the 
principles underlying the working of gam in mgcv. It looks very powerful, 
particularly for exploring dependencies.

I have run te() instead of s(), but the predictive power seems to be less than 
with s() in this particular situation. At the same time, does te() include the 
interaction? I did not understand well your previous point on the interaction 
term in te(): is te(x1, ..., xn) built as an expansion from t(x1), ..., t(xn)? 
Then all the interaction terms should be included ...

Finally, is it possible to incorporate both s() and te() terms in the formula?

Machine learning: I am not too well versed in the area. Did you mean 
regression trees or maximum entropy models?

Best,


On Wednesday 26 August 2009 10:27:08 Simon Wood wrote:
 This will not work...

  2) y ~ s(x1, ..., x36)

 Estimating a 36 dimensional function reasonably well would require a
 tremendous quantity of data, but in any case the 36 dimensional TPS
 smoothness measure will involve such high order derivatives that it will no
 longer be practically useful: in fact you will not have enough data to
 estimate the unpenalized coefficients of the smoother (and if you did R
 would run out of memory first).

 In such a high dimensional situation, I think that GAMs are really only
 useful if you have some prior knowledge of which variables are likely to
 interact (and it's not too many of them). If there's no prior information
 saying roughly what sort of smooth additive structure might be useful then,
 I'm not sure that GAMs are the right way to go, and some sort of machine
 learning approach might be better.

 Then again, the real problem with
 y ~ s(x1, ..., x36)
 is that the data just won't contain enough information to estimate s, if
 all you can say is that s is smooth, but this also means that it's very
 unlikely that you really need to estimate s(x1, ..., x36) in order to
 predict well. In that case, starting from
 y ~ s(x1) + ... + s(x36)
 and building the model up might result in something that does a reasonable
 predictive job.

 On the subject of tensor product smoothing vs isotropic smoothing.
 Isotropic smooths are really only reasonable if you think  that the smooth
 should display approximately the same amount of wiggliness in all
 directions. If this is not the case then tensor product smoothing is a
 better bet. Centering and scaling alone is not enough to ensure that
 isotropy is reasonable (although in particular cases it may help, of
 course).
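
To make the contrast concrete, here is a small simulation sketch with mgcv. The data, variable names and scales are invented for illustration (x2 is deliberately on a much larger scale than x1); it shows an additive fit, an isotropic bivariate smooth, and a formula that mixes a te() term with an s() term, which mgcv accepts.

```r
library(mgcv)

set.seed(1)
n <- 400
# Simulated covariates on deliberately different scales (an assumption)
dat <- data.frame(x1 = runif(n), x2 = runif(n) * 100, x3 = runif(n))
dat$y <- sin(2 * pi * dat$x1) + 0.0002 * (dat$x2 - 50)^2 + dat$x3 +
  rnorm(n, sd = 0.3)

m_add <- gam(y ~ s(x1) + s(x2) + s(x3), data = dat)  # purely additive
m_iso <- gam(y ~ s(x1, x2) + s(x3), data = dat)      # isotropic smooth of x1, x2
m_te  <- gam(y ~ te(x1, x2) + s(x3), data = dat)     # tensor product plus s()

AIC(m_add, m_iso, m_te)
```

Which model scores best depends on the data; the point is only that the three formula styles can be fitted and compared side by side.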

 best,
 Simon

  I am trying to build a predictive model. Since the the variables are
  centred and scaled, I think I need an isotropic smooth. I am also
  interested in having the interactions between the variables included,
  that is not a purely additive model.
 
  It is not clear to me when should I give preference to tensor smooths,
  possibly because I have not understood well how they work.
 
  I am reading Wood(2003) as recommended and I have also read rather
  extensively Simon N. Wood. Generalized Additive Models: An Introduction,
  2006, but still I am stuck. Any additional suggestion or reading
  recommendation would be greatly appreciated.
 
  I have also some difficulties in understanding the values you have chosen
  for k in the first example (why 60?).
 
  Thanks
 
  Best,
 
  On Monday 24 August 2009 17:33:55 Gavin Simpson wrote:
   [Note R-Devel is the wrong list for such questions. R-Help is where
   this should have been directed - redirected there now]
  
   On Mon, 2009-08-24 at 17:02 +0100, Corrado wrote:
Dear R-experts,
   
I have a question on the formulas used in the gam function of the
mgcv package.
   
I am trying to understand the relationships between:
   
y~s(x1)+s(x2)+s(x3)+s(x4)
   
and
   
y~s(x1,x2,x3,x4)
   
Does the latter contain the former? what about the smoothers of all
interaction terms?
  
   I'm not 100% certain how this scales to smooths of more than 2
   variables, but Sections 4.10.2 and 5.2.2 of Simon Wood's book GAM: An
   Introduction with R (2006, Chapman Hall/CRC) discuss this for smooths
   of 2 variables.
  
   Strictly y ~ s(x1) + s(x2) is not nested in y ~ s(x1, x2) as the bases
   used to produce the smoothers in

Re: [R] teaching R

2009-08-26 Thread Liviu Andronic

Hello

On 8/26/09, Michael Nestrud m...@ataraxis.org wrote:
  Any advice / what works / what doesn't work is appreciated for those
  that have tried this before me.

It could prove helpful to forward similar questions to r-sig-teaching.
Also, there was a recent discussion on the topic [1].
Liviu

[1] https://stat.ethz.ch/pipermail/r-sig-teaching/2009q2/000157.html


Re: [R] [Rd] Formulas in gam function of mgcv package

2009-08-26 Thread Gavin Simpson

On Tue, 2009-08-25 at 10:00 +0100, Corrado wrote:
 Dear Gavin / Rlings,
 
 thanks for your kind answer and sorry for posting to the dev mailing list.
 
 Concerning the specific of your answer:
 
 I am working with 6 to 36 covariates, and they are all centred and scaled. I 
 represented the problem with two variables to simplify the question.
 
 So ideally, the situation is:
 
 1) y ~ s(x1) + ... + s(x36)
 
 vs.
 
 2) y ~ s(x1, ..., x36)

I think you are pushing things a bit with such a complicated smooth.
You're unlikely to be able to fit that either due to insufficient data
and / or hardware limits on your machine.

I see that Simon has responded to this as well, in a far more
comprehensive and informed manner than I could manage, so I'll leave
it at that...

 
 I am trying to build a predictive model. Since the variables are centred
 and scaled, I think I need an isotropic smooth. I am also interested in
 having the interactions between the variables included, that is, not a purely
 additive model.

That sounds a bit like data fishing; throw everything into the pot and
see what comes out of it.

<snip />
 I have also some difficulties in understanding the values you have chosen for
 k in the first example (why 60?).

Sorry, that was a complication on my part. The main point was to show
that you need to try to get the same bases used in the s(x1) and s(x2)
parts of the formula; So if you had this model

y ~ s(x1, k = 20) + s(x2, k = 20)

You need something like 

y ~ s(x1, k = 20) + s(x2, k = 20) + s(x1, x2)

[If you wanted the bivariate smooth to be more complicated than the
default in mgcv, then you might have done:

y ~ s(x1, x2, k = 60) ## for example

in which case you could fit that model as 

y ~ s(x1, k = 20) + s(x2, k = 20) + s(x1, x2, k = 60)

] That was where the k = 60 came from, but in simplifying my response I
forgot to remove it.

Simon has since provided a more thorough response (Thanks Simon).

HTH

G

 
 Thanks
 
 Best,
 
 
 
 On Monday 24 August 2009 17:33:55 Gavin Simpson wrote:
  [Note R-Devel is the wrong list for such questions. R-Help is where this
  should have been directed - redirected there now]
 
  On Mon, 2009-08-24 at 17:02 +0100, Corrado wrote:
   Dear R-experts,
  
   I have a question on the formulas used in the gam function of the mgcv
   package.
  
   I am trying to understand the relationships between:
  
   y~s(x1)+s(x2)+s(x3)+s(x4)
  
   and
  
   y~s(x1,x2,x3,x4)
  
   Does the latter contain the former? what about the smoothers of all
   interaction terms?
 
  I'm not 100% certain how this scales to smooths of more than 2
  variables, but Sections 4.10.2 and 5.2.2 of Simon Wood's book GAM: An
  Introduction with R (2006, Chapman Hall/CRC) discuss this for smooths of
  2 variables.
 
  Strictly y ~ s(x1) + s(x2) is not nested in y ~ s(x1, x2) as the bases
  used to produce the smoothers in the two models may not be the same in
  both models. One option to ensure nestedness is to fit the more
  complicated model as something like this:
 
  ## if simpler model were: y ~ s(x1, k=20) + s(x2, k = 20)
  y ~ s(x1, k=20) + s(x2, k = 20) + s(x1, x2, k = 60)
                                    ^^^^^^^^^^^^^^^^^^^
  where the last term (^^^ above) has the same k as used in s(x1, x2)
 
  Note that these are isotropic smooths; are x1 and x2 measured in the
  same units etc.? Tensor product smooths may be more appropriate if not,
  and if we specify the bases when fitting models s(x1) + s(x2) *is*
  strictly nested in te(x1, x2), eg.
 
  y ~ s(x1, bs = "cr", k = 10) + s(x2, bs = "cr", k = 10)
 
  is strictly nested within
 
  y ~ te(x1, x2, k = 10)
  ## is the same as y ~ te(x1, x2, bs = "cr", k = 10)
 
  [Note that bs = "cr" is the default basis in te() smooths, hence we
  don't need to specify it, and k = 10 refers to each individual smooth in
  the te().]
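
The nested pair above can be tried on simulated data. This is only a sketch: the data-generating function, sample size and object names are assumptions made for illustration, and the comparison shown is one of several that could be used.

```r
library(mgcv)

set.seed(42)
n <- 300
dat <- data.frame(x1 = runif(n), x2 = runif(n))
# Truly additive signal, so the interaction should add little
dat$y <- sin(2 * pi * dat$x1) + cos(2 * pi * dat$x2) + rnorm(n, sd = 0.3)

# Additive model with cubic regression spline bases
m_add <- gam(y ~ s(x1, bs = "cr", k = 10) + s(x2, bs = "cr", k = 10),
             data = dat)

# Tensor product smooth built from the same marginal bases
m_te <- gam(y ~ te(x1, x2, bs = "cr", k = 10), data = dat)

AIC(m_add, m_te)
anova(m_add, m_te, test = "F")   # approximate comparison of the two fits
```

Because the simulated truth is additive, one would expect little support for the larger te() model here.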
 
  HTH
 
  G
 
   I have (tried to) read the manual pages of gam, formula.gam,
   smooth.terms, linear.functional.terms but could not understand properly.
  
   Regards
 
 
 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


[R] Statistical question about logistic regression simulation

2009-08-26 Thread Denis Aydin


Hi R help list

I'm simulating logistic regression data with a specified odds ratio 
(beta) and have a problem/unexpected behaviour that occurs.



The datasets includes a lognormal exposure and diseased and healthy 
subjects.


Here is my loop:

ors <- vector()

for(i in 1:200){

# First, I create a vector with a lognormally distributed exposure:

n <- 1 # number of study subjects
mean <- 6
sd <- 1

expo <- rlnorm(n, mean, sd)

# Then I assign each study subject a probability of disease with a
# specified Odds ratio (or beta coefficient) according to a logistic
# model:

inter <- 0.01 # intercept
or <- log(1.5) # an odds ratio of 1.5 or a beta of ln(1.5)

p <- exp(inter + or * expo)/(1 + exp(inter + or * expo))

# Then I use the probability to decide who is having the disease and who 
# is not:

disease <- rbinom(length(p), 1, p) # 1 = disease, 0 = healthy

# Then I calculate the logistic regression and extract the odds ratio

model <- glm(disease ~ expo, family = binomial)

ors[i] <- exp(summary(model)$coef[2]) # exponentiated beta = OR

}


Now to my questions:

1. I was expecting the mean of the odds ratios over all simulations to 
be close to the specified one (1.5 in this case). This is not the case 
if the mean of the lognormal distribution is, say 6.
If I reduce the mean of the exposure distribution to say 3, the mean of 
the simulated ORs is very close to the specified one. So the simulation 
seems to be quite sensitive to the parameters of the exposure distribution.


2. Is it somehow possible to stabilize the simulation so that it is 
not that sensitive to the parameters of the lognormal exposure 
distribution? I can't make up the parameters of the exposure 
distribution, they are estimations from real data.


3. Are there general flaws or errors in my approach?


Thanks a lot for any help on this!

All the best,
Denis

--
Denis Aydin
Institute of Social and Preventive Medicine at Swiss Tropical Institute 
Basel

Associated Institute of the University of Basel
Steinengraben 49 – 4051 Basel – Switzerland
Phone: +41 (0)61 270 22 04
Fax:   +41 (0)61 270 22 25
denis.ay...@unibas.ch
www.ispm-unibasel.ch


[R] lme: how to nest a random factor in a fixed factor?

2009-08-26 Thread Robert Buitenwerf


Dear all,

 

I have an experimental setup in which a random variable is nested within a 
fixed variable; however, I have trouble specifying the correct LMM with lme. I 
have searched the lists but haven't been able to find an example like my setup, 
which I unfortunately need to get this stuff right. Pinheiro & Bates is great, 
but I still can't figure out how to do it. 

 

My experimental setup was as follows:

100 measurements per treatment plot

2 treatment plots per site

4 sites: 2 in one area and 2 in another area

 

Both treatment and area are fixed factors, while site is random. I am interested 
in the significance of the fixed effects, less in the magnitude of the random 
effect. 

I have tried:

mod1 <- lme(response ~ area*treatment, data=data, random= ~1|site)

but now site is not nested in area…

mod2 <- lme(response ~ area*treatment, data=data, random= ~1|area/site)

but now area is both a fixed and a random variable, which doesn't seem to make 
sense, plus I run out of df for treatment

mod3 <- lme(response ~ area*treatment, data=data, random= ~1| plot)

but here plots are not grouped according to site
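
One specification that is often suggested for a layout like this, shown here purely as a sketch on simulated data and not as an answer given on the list, keeps area and treatment as fixed effects and nests plot within site in the random part. It assumes the four sites carry unique labels, so that their nesting in area is implicit; all names, effect sizes and the simulated data are invented.

```r
library(nlme)

set.seed(2009)
# Hypothetical layout: 2 areas, 2 sites per area, 2 treatment plots per site,
# 100 measurements per plot
d <- expand.grid(obs = 1:100,
                 treatment = c("control", "treated"),
                 site = paste0("s", 1:4))
d$area <- factor(ifelse(d$site %in% c("s1", "s2"), "north", "south"))
d$plot <- interaction(d$site, d$treatment, drop = TRUE)  # 8 unique plots

# Simulated site-level and plot-level random effects plus residual noise
site_eff <- rnorm(4, sd = 1)[as.integer(d$site)]
plot_eff <- rnorm(8, sd = 0.5)[as.integer(d$plot)]
d$response <- 10 + 1 * (d$treatment == "treated") + site_eff + plot_eff +
  rnorm(nrow(d))

# Random intercepts for site and for plot within site
fit <- lme(response ~ area * treatment, random = ~ 1 | site/plot, data = d)
anova(fit)
```

With so few sites and plots the denominator degrees of freedom for the fixed effects are very small, so any tests from such a model deserve caution.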

 

I hope someone would be willing to help me,
thank you in advance!

 

Robert Buitenwerf

Ecologist

South African Environmental Observation Network

Re: [R] teaching R

2009-08-26 Thread Robert W. Baer, Ph.D.


Hello all,

I am going to be running a small statistics workshop using R sometime
in November.  I am restricted to R because of the specific libraries I
will be using - a good thing in my book - however the attendees are
unfamiliar with R.  I plan on giving as little R information as
possible - just what is absolutely necessary to run the statistics (the
workshop is short, no time to spend hours teaching R).  Has anyone
done anything like this?  Are there any public powerpoint/pdfs/etc.
available for this type of application?

Any advice / what works / what doesn't work is appreciated for those
that have tried this before me.


R can be very intimidating to an audience that has very little experience 
with command line interfaces.  Unlike a menu driven system there are no 
reminder clues, and it takes a little time to learn how to find help 
productively if you haven't lived in a command line world before.


If your goal is to focus on the task at hand, my recommendation is to 
prepare a short 1-2 page handout that summarizes 1) how to get datasets from 
a standard format into a usable form for your application; 2) how to do each 
of the manipulations you will talk about; and 3) how to print/save the 
output (graphical and data) that you produce.  An optional 4th section, if 
appropriate, might summarize how to learn more if they pursue the 
application on their own (i.e., use of appropriate help facilities).


Your audience can then focus on the big picture of what you are talking 
about with a security blanket of being able to reproduce it later just as if 
they had a menu to help remind them of what you said.  It may still be a 
little bit of an uphill battle with an inexperienced audience.




Sincerely,

-Michael

--
Michael A. Nestrud
Cornell U. Sensory Science PhD Student
m...@ataraxis.org
All that you taste... all that you eat.


[R] Grid lines in cloud plot for lattice

2009-08-26 Thread Chris Jones

Hi all,

I was wondering if there was a way to draw grid lines in the cloud
plots for lattice. I hunted all throughout the panel.cloud and
panel.3dscatter help pages and couldn't find anything to help me move
forward with this. Do I have to get deeper into the drawing commands,
or is there something more obvious I'm just not seeing? My goal would
be to place grid lines on the bottom face of the box and make drop
lines to the grid.


# example data set
library(lattice)
nirk.env.nmds <- list(points = cbind(rnorm(30), rnorm(30), rnorm(30)))

par.set <-
 list(axis.line = list(col = "transparent"), clip = list(panel = "off"))

# par.settings belongs to the cloud() call; print() ignores it
print(cloud(nirk.env.nmds$points[,3] ~
              nirk.env.nmds$points[,1] * nirk.env.nmds$points[,2],
 col = "red", pch = 16, cex = 1.5, type = c("p", "h"),
 strip = strip.custom(strip.names = TRUE),
 scales = list(arrows = FALSE, distance = 2),
 xlab = "NMS 1",
 ylab = "NMS 2",
 zlab = "NMS 3",
 par.settings = par.set),
 split = c(1,1,2,1), more = TRUE)

Many thanks!!

Chris Jones
Ph. D. Student, Department of Microbiology
Swedish University of Agricultural Sciences
chris.jo...@mikrob.slu.se





[R] Installing rJava RJDBC bad interpreter: Permission denied

2009-08-26 Thread Matias Silva

Trying to install the above two packages via the 
install.packages(package_name) command and
the R CMD INSTALL file.tar.gz.

I receive the following error either way sh: ./configure: /bin/sh: bad 
interpreter: Permission denied.
I have tried to chmod and chown permissions and also ran dos2unix in hopes that 
there is a CRLF
in some of the tar.gz files, but that doesn't seem to fix the problem.  I'm 
looking for any help
I can get.  Must I compile this to get this to work because of some 
incompatibility with the rpm packages??

* Installing *source* package ‘rJava’ ...
sh: ./configure: /bin/sh: bad interpreter: Permission denied
ERROR: configuration failed for package ‘rJava’
* Removing ‘/usr/lib64/R/library/rJava’
* Installing *source* package ‘RJDBC’ ...
** R
** preparing package for lazy loading
Error : package 'rJava' required by 'RJDBC' could not be found
ERROR: lazy loading failed for package ‘RJDBC’
* Removing ‘/usr/lib64/R/library/RJDBC’

I'm using the rpm version of R found here 
http://cran.cnr.berkeley.edu/bin/linux/redhat/el5/x86_64
I ran the R CMD javareconf as root and set the JAVA_HOME environment variable.

Here is my system configuration:
OS: CentOS 5.2 x86_64

JDK: java version 1.6.0_12
Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
Java HotSpot(TM) 64-Bit Server VM (build 11.2-b01, mixed mode)

R: 2.9.1
rJava: rJava_0.7-0.tar.gz
RJDBC: RJDBC_0.1-5.tar.gz

Thanks for your time and knowledge.

Best,
Matt


[R] mann whitney u

2009-08-26 Thread Mcdonald, Grant

Dear Sir,

I am comparing two samples using wilcox.test in R.  The literature appears to 
describe the Mann-Whitney U test as the most appropriate test to use on my data.

Is the wilcox.test function equivalent to the Mann-Whitney U test?  Is there a way to 
obtain the U-value as opposed to the W-value in R?
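
According to the note in ?wilcox.test, R subtracts the minimum possible rank sum from the rank sum of the first sample, so the reported W is the Mann-Whitney U statistic for that sample. A small sketch with made-up, tie-free data illustrates this by counting pairs directly:

```r
# Made-up samples without ties
x <- c(1.1, 2.3, 3.8, 4.0, 5.6)
y <- c(2.0, 3.1, 6.2, 7.4, 8.9, 9.5)

w <- unname(wilcox.test(x, y)$statistic)  # W as reported by R

# Mann-Whitney U for x: number of (x, y) pairs with x > y
u_x <- sum(outer(x, y, ">"))

# U for the other sample follows from U_x + U_y = n_x * n_y
u_y <- length(x) * length(y) - u_x

c(W = w, U_x = u_x, U_y = u_y)
```

Some textbooks report the smaller of the two U values, which here would be min(u_x, u_y).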

Thank you

[R] increasing significant digits in smooth.spline function

2009-08-26 Thread Sergii, Ivakhno

Hello All

I have a very long vector of unique predictor values, and the 6 significant
digits setting in smooth.spline rounds them off. Is there any way
of increasing the significant digits without recompiling a lot of code?
(Simply editing and then sourcing the smooth.spline.r function does not
work, probably due to the presence of Fortran function calls.)
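
In current versions of R (this may not apply to the 2.9.1 release reported below), smooth.spline() has a tol argument that controls how close two x values must be to be treated as the same. The sketch below uses invented data in which five x values differ only in the fifth decimal place; the tolerance of 1e-7 is an arbitrary choice for illustration.

```r
set.seed(1)
# 100 well-separated x values plus five that differ only by 1e-5
x <- c(seq(0, 1000, length.out = 100), 500 + (1:5) * 1e-5)
y <- sin(x / 100) + rnorm(length(x), sd = 0.1)

fit_default <- smooth.spline(x, y)              # default tolerance
fit_tight   <- smooth.spline(x, y, tol = 1e-7)  # much smaller tolerance

length(fit_default$x)  # the five close values are merged into one
length(fit_tight$x)    # all x values are kept as distinct
```

Whether keeping such nearly identical x values apart is numerically wise depends on the data.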

Thank you very much in advance

Sergii

 

R version 2.9.1 (2009-06-26)

x86_64-redhat-linux-gnu

 

locale:

LC_CTYPE=en_GB.UTF-8;LC_NUMERIC=C;LC_TIME=en_GB.UTF-8;LC_COLLATE=en_GB.U
TF-8;LC_MONETARY=C;LC_MESSAGES=en_GB.UTF-8;LC_PAPER=en_GB.UTF-8;LC_NAME=
C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATI
ON=C

 

attached base packages:

[1] stats graphics  grDevices utils datasets  methods   base

 

loaded via a namespace (and not attached):

[1] tools_2.9.1

 



[R] Batch replacement, by factor, of values in a data frame

2009-08-26 Thread Gavin Simpson

Dear List,

I'm wondering if there is a better/cleaner/more efficient way of
replacing 0 values in a variable with the minimum of the non-missing and
non-zero values of that same variable, but doing it within the levels of
a factor?

Consider the dummy example data presented at the end of my message.
Within each 'Site' there are some 0 values and possibly some NA's. I can
compute the minimum of the non-missing and non-zero values by 'Site' as
indicated below using aggregate for example. Save for looping over the
'Site's and replacing 0's with the relevant minimum is there a way of
using a vectorised approach to do the replacement?

Thanks in advance,

G

## dummy data
set.seed(123)
D <- data.frame(Site = factor(rep(LETTERS[1:5], times = 10)),
                Var = runif(5*10))
D <- D[with(D, order(Site, Var)), ]
## simulate some 0's
D[c(1,3,11,12,23,27,34,36,41,49), "Var"] <- 0
## just to complicate matters, some NA
D[sample(NROW(D), 3), "Var"] <- NA
head(D)
## Compute minimums per Site
aggregate(D$Var, by = list(Site = D$Site),
          FUN = function(x) min(x[x > 0], na.rm = TRUE))
## How replace the appropriate 0's with the appropriate minimum?
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


Re: [R] attach package

2009-08-26 Thread Duncan Murdoch


On 8/26/2009 8:17 AM, Jeremy MAZET wrote:

Hello,

Is there a way to attach a package without running the hook function 
.onAttach()?


You could modify the source code to remove the hook, but why you'd want 
to do this, I don't know.  Presumably the author of the package had a 
reason to put the hook there, and the package may not work properly 
without it.


Duncan Murdoch


Re: [R] Statistical question about logistic regression simulation

2009-08-26 Thread Ravi Varadhan

Your exposure variable has very large values, so all your probabilities are
1. You also get a bunch of NaN's because the `expit' (inverse logit)
function to calculate the probabilities cannot be evaluated. You need to use
values of exposure that will yield some 0's and 1's so that the binomial
model can be estimated.
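
The effect can be seen directly with a short sketch. The sample size of 1000 is an arbitrary assumption for illustration; the intercept, beta and lognormal parameters are those from the original post. plogis() is the inverse logit in base R and avoids the overflow in exp().

```r
set.seed(1)
expo <- rlnorm(1000, meanlog = 6, sdlog = 1)  # exposures of the order of exp(6)
eta  <- 0.01 + log(1.5) * expo                # linear predictor

p_naive  <- exp(eta) / (1 + exp(eta))  # Inf / Inf gives NaN once exp() overflows
p_stable <- plogis(eta)                # numerically stable inverse logit

sum(is.nan(p_naive))    # number of NaN probabilities
any(is.nan(p_stable))   # FALSE
mean(p_stable > 0.99)   # share of subjects who are almost certainly diseased
```

Even with the stable formula nearly every probability is essentially 1, so the simulated outcome carries almost no information about the slope.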

Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
tml

 





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Denis Aydin
Sent: Wednesday, August 26, 2009 10:18 AM
To: r-help@r-project.org
Subject: [R] Statistical question about logistic regression simulation

Hi R help list

I'm simulating logistic regression data with a specified odds ratio 
(beta) and have a problem/unexpected behaviour that occurs.


The datasets includes a lognormal exposure and diseased and healthy 
subjects.

Here is my loop:

ors <- vector()

for(i in 1:200){

# First, I create a vector with a lognormally distributed exposure:

n <- 1 # number of study subjects
mean <- 6
sd <- 1

expo <- rlnorm(n, mean, sd)

# Then I assign each study subject a probability of disease with a
# specified Odds ratio (or beta coefficient) according to a logistic
# model:

inter <- 0.01 # intercept
or <- log(1.5) # an odds ratio of 1.5 or a beta of ln(1.5)

p <- exp(inter + or * expo)/(1 + exp(inter + or * expo))

# Then I use the probability to decide who is having the disease and who 
# is not:

disease <- rbinom(length(p), 1, p) # 1 = disease, 0 = healthy

# Then I calculate the logistic regression and extract the odds ratio

model <- glm(disease ~ expo, family = binomial)

ors[i] <- exp(summary(model)$coef[2]) # exponentiated beta = OR

}


Now to my questions:

1. I was expecting the mean of the odds ratios over all simulations to 
be close to the specified one (1.5 in this case). This is not the case 
if the mean of the lognormal distribution is, say 6.
If I reduce the mean of the exposure distribution to say 3, the mean of 
the simulated ORs is very close to the specified one. So the simulation 
seems to be quite sensitive to the parameters of the exposure distribution.

2. Is it somehow possible to stabilize the simulation so that it is 
not that sensitive to the parameters of the lognormal exposure 
distribution? I can't make up the parameters of the exposure 
distribution, they are estimations from real data.

3. Are there general flaws or errors in my approach?


Thanks a lot for any help on this!

All the best,
Denis

-- 
Denis Aydin
Institute of Social and Preventive Medicine at Swiss Tropical Institute 
Basel
Associated Institute of the University of Basel
Steinengraben 49 - 4051 Basel - Switzerland
Phone: +41 (0)61 270 22 04
Fax:   +41 (0)61 270 22 25
denis.ay...@unibas.ch
www.ispm-unibasel.ch


Re: [R] glmmPQL model selection

2009-08-26 Thread stephenb


Sorry for the late reply. 

Fit each candidate model to the first 90% of your data, predict the last 10%,
and see which model predicts better.
If the random effects are not good it will become very obvious.

If the concern is with fixed effects then just use gls which puts the random
effects in the error and select model as usual.
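
The 90%/10% check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the built-in mtcars data and two Poisson GLMs stand in for the poster's data and glmmPQL fits, and the helper mspe() is a hypothetical name introduced here.

```r
# Stand-in data and models (assumptions): Poisson GLMs on the built-in
# mtcars data, not the poster's glmmPQL fits.
dat <- mtcars
n_fit <- floor(0.9 * nrow(dat))
fit_rows  <- seq_len(n_fit)                 # first 90% of the rows
test_rows <- seq(n_fit + 1, nrow(dat))      # last 10% of the rows

# Two candidate models, both fitted on the first 90% only
m1 <- glm(carb ~ wt,      family = poisson, data = dat[fit_rows, ])
m2 <- glm(carb ~ wt + hp, family = poisson, data = dat[fit_rows, ])

# Hypothetical helper: mean squared prediction error on the held-out rows
mspe <- function(model) {
  pred <- predict(model, newdata = dat[test_rows, ], type = "response")
  mean((dat$carb[test_rows] - pred)^2)
}

errs <- c(m1 = mspe(m1), m2 = mspe(m2))     # smaller is better
errs
```

If the rows have no natural order, a random split is a common variant of the same idea.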


Emmanuelle TASTARD wrote:
 
 Hi,
 I'm sorry, I know that it is a recurrent question but I have not been
 able to find the response in the Rhelp archives.
 I think my data require the use of the glmmPQL function but I do not
 know how to make the model selection. Since the AIC and log-likelihood
 are apparently meaningless, how can we select the parameters for a model
 and compare the models to find which one fits best the data?  
 Thanks a lot
 Emmanuelle Tastard
  
 Emmanuelle TASTARD
 UMR 5174 'Evolution et Diversité Biologique'  
 Université Paul Sabatier Bat 4R3
 31062 TOULOUSE CEDEX 9 France
 tel : 05 61 55 67 59
  
 
 

-- 
View this message in context: 
http://www.nabble.com/glmmPQL-model-selection-tp3027224p25151417.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] lme: how to nest a random factor in a fixed factor?

2009-08-26 Thread ONKELINX, Thierry

Dear Robert,

Since you have only 4 sites, a random effect is not a good choice: you would
need at least 6 sites for a reasonable estimate of the variance. You have
enough data to treat site as a fixed effect. It only costs 2 extra
degrees of freedom. Therefore I would model this as: 

lm(response ~ (area/site)*treatment, data = data)

HTH,

Thierry



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
Namens Robert Buitenwerf
Verzonden: woensdag 26 augustus 2009 16:24
Aan: R Help
Onderwerp: [R] lme: how to nest a random factor in a fixed factor?


Dear all,



I have an experimental setup in which a random variable is nested within
a fixed variable; however I have troubles specifying the correct LMM
with lme. I have searched the lists but haven't been able to find an
example like my setup, which I unfortunately need to get this stuff
right. Pinheiro & Bates is great but I still can't figure out how to do
it. 



My experimental setup was as follows:

100 measurements per treatment plot

2 treatment plots per site

4 sites: 2 in one area and 2 in another area



Both treatment and area are fixed factors, while site is random. I am
interested in the significance of the fixed effects, less in the
magnitude of the random effect. 



I have tried:



mod1 <- lme(response ~ area*treatment, data=data, random= ~1|site)

but now site is not nested in area...



mod2 <- lme(response ~ area*treatment, data=data, random= ~1|area/site)

but now area is both a fixed and a random variable, which doesn't seem
to make sense, plus I run out of df for treatment



mod3 <- lme(response ~ area*treatment, data=data, random= ~1|plot)

but here plots are not grouped according to site



I hope someone would be willing to help me, thank you in advance!



Robert Buitenwerf

Ecologist

South African Environmental Observation Network

Re: [R] faulty formatting of toLatex(sessionInfo())

2009-08-26 Thread Liviu Andronic

On 8/26/09, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote:
 G. Jay Kerns wrote:
  There was a closely related discussion last April:
 
 
 https://stat.ethz.ch/pipermail/r-devel/2009-April/053094.html
 
  and IIRC this was fixed for R version 2.10.
 
  If you still have trouble just do

  s <- toLatex(sessionInfo())
  cat(s[-grep('Locale',s)], sep='\n')

This works great. Thank you both for the info.
Liviu


Re: [R] mann whitney u

2009-08-26 Thread David Winsemius



On Aug 26, 2009, at 7:18 AM, Mcdonald, Grant wrote:


Dear Sir,

I am comparing two samples using wilcox.test in R.  Literature  
appears to describe mann whitney u test as the most appropriate test  
to use on my data.


is the wilcox.test function equivalent to mann-whitney u?


When used in its two-sample mode the answer is yes.


Is there a way to gain the U-value as opposed to the W-value in R?


Answering that question was as simple as going to :
http://search.r-project.org/nmz.html

 typing in mann-whitney u

... and a few clicks.

Try it yourself.


--


David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Help regarding frequency distribution Graphs

2009-08-26 Thread David Winsemius



On Aug 26, 2009, at 7:32 AM, anupam sinha wrote:


Hi all,
   I am trying to construct a frequency distribution graph, i.e.
suppose there is a variable *k* and it takes a range of values. What I
want to do is to plot *P(k)* (probability/frequency of finding a specific
value of *k*) vs *k*. I would like to get a smooth curve between P(k) and k.
I have already tried the *hist* function in R but want something more
specific. Can anyone help me out? Thanks in advance.



?density
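
As a rough illustration of what ?density leads to, here is a minimal sketch. The sample k is simulated (an assumption, since the poster's data were not shown), and the plot labels are invented for the example.

```r
set.seed(1)
k <- rpois(500, lambda = 4)     # hypothetical sample of k

# Kernel density estimate: a smooth curve for P(k) against k
d <- density(k)
plot(d, main = "Estimated density of k", xlab = "k", ylab = "P(k)")

# If k is discrete, the relative frequency of each value may be closer
# to what is wanted than a smoothed curve
pk <- table(k) / length(k)
pk
```

The bandwidth that density() picks by default controls how smooth the curve looks; its bw and adjust arguments change it.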

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Installing rJava RJDBC bad interpreter: Permission denied

2009-08-26 Thread William Dunlap

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Matias Silva
 Sent: Wednesday, August 26, 2009 7:37 AM
 To: r-help@r-project.org
 Subject: [R] Installing rJava RJDBC bad interpreter: Permission denied

 Trying to install the above two packages via the 
 install.packages(package_name) command and
 the R CMD INSTALL file.tar.gz.

 I receive the following error either way sh: ./configure: 
 /bin/sh: bad interpreter: Permission denied.

This may mean that your /bin/sh does not have execution
permission.  (It is similar to the message you get when
you start a script with #!/bin/nosuchshell and run it with
'sh -c script'.)

What does
ls -l /bin/sh /bin/bash
show?

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com  

 I have tried to chmod and chown permissions and also ran 
 dos2unix in hopes that there is a CRLF
 in some of the tar.gz files, but that doesn't seem to fix the 
 problem.  I'm looking for any help
 I can get.  Must I compile this to get this to work because 
 of some incompatibility with the rpm packages??

 * Installing *source* package ‘rJava’ ...
 sh: ./configure: /bin/sh: bad interpreter: Permission denied
 ERROR: configuration failed for package ‘rJava’
 * Removing ‘/usr/lib64/R/library/rJava’
 * Installing *source* package ‘RJDBC’ ...
 ** R
 ** preparing package for lazy loading
 Error : package 'rJava' required by 'RJDBC' could not be found
 ERROR: lazy loading failed for package ‘RJDBC’
 * Removing ‘/usr/lib64/R/library/RJDBC’

 I'm using the rpm version of R found here 
 http://cran.cnr.berkeley.edu/bin/linux/redhat/el5/x86_64
 I ran the R CMD javareconf as root and set the JAVA_HOME 
 environment variable.

 Here is my system configuration:
 OS: CentOS 5.2 x86_64

 JDK: java version 1.6.0_12
 Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
 Java HotSpot(TM) 64-Bit Server VM (build 11.2-b01, mixed mode)

 R: 2.9.1
 rJava: rJava_0.7-0.tar.gz
 RJDBC: RJDBC_0.1-5.tar.gz

 Thanks for your time and knowledge.

 Best,
 Matt


Re: [R] increasing significant digits in smooth.spline function

2009-08-26 Thread David Winsemius



On Aug 26, 2009, at 9:26 AM, Sergii, Ivakhno wrote:


Hello All

I have a very long vector of unique predictor values and 6 significant
digits setting for the smooth.spline rounds them off. Is there any way
of increasing the significant digits without recompiling a lot of code
(simply editing and then sourcing the smooth.spline.r function does not
work, probably due to the presence of Fortran function calls)?


?options

options(digits=12)

What you see on the screen is not necessarily what is happening  
inside. There is no rounding unless you force such.
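
The difference between what is printed and what is stored can be seen with a small sketch. The value of x is invented for illustration.

```r
x <- 123456.789012345          # hypothetical predictor value

print(x)                       # printed with the current 'digits' setting
old <- options(digits = 12)    # ask for more digits on screen
print(x)
options(old)                   # restore the previous setting

# The stored value never changed; only its printed form did
stored <- sprintf("%.9f", x)
stored
```

Whether smooth.spline() itself treats very close x values as identical is a separate matter from printing. Recent versions document a tol argument for that, but whether it applies to the version discussed here is an assumption.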


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Statistical question about logistic regression simulatio

2009-08-26 Thread Ted Harding

On 26-Aug-09 14:17:40, Denis Aydin wrote:
 Hi R help list
 I'm simulating logistic regression data with a specified odds ratio 
 (beta) and have a problem/unexpected behaviour that occurs.
 
 The datasets includes a lognormal exposure and diseased and healthy 
 subjects.
 
 Here is my loop:
 
 ors <- vector()
 for(i in 1:200){
 
# First, I create a vector with a lognormally distributed exposure:
 n <- 10000 # number of study subjects
 mean <- 6
 sd <- 1
 expo <- rlnorm(n, mean, sd)
 
# Then I assign each study subject a probability of disease with a
# specified Odds ratio (or beta coefficient) according to a logistic
# model:
 inter <- 0.01 # intercept
 or <- log(1.5) # an odds ratio of 1.5 or a beta of ln(1.5)
 p <- exp(inter + or * expo)/(1 + exp(inter + or * expo))
 
# Then I use the probability to decide who is having the disease and who
# is not:
 disease <- rbinom(length(p), 1, p) # 1 = disease, 0 = healthy
 
# Then I calculate the logistic regression and extract the odds ratio
 model <- glm(disease ~ expo, family = binomial)
 ors[i] <- exp(summary(model)$coef[2]) # exponentiated beta = OR
 }
 
 Now to my questions:
 
 1. I was expecting the mean of the odds ratios over all simulations to 
 be close to the specified one (1.5 in this case). This is not the case 
 if the mean of the lognormal distribution is, say 6.
 If I reduce the mean of the exposure distribution to say 3, the mean of
 the simulated ORs is very close to the specified one. So the simulation
 seems to be quite sensitive to the parameters of the exposure
 distribution.
 
 2. Is it somehow possible to stabilize the simulation so that it is 
 not that sensitive to the parameters of the lognormal exposure 
 distribution? I can't make up the parameters of the exposure 
 distribution, they are estimations from real data.
 
 3. Are there general flaws or errors in my approach?
 
 
 Thanks a lot for any help on this!
 
 All the best,
 Denis
 
 -- 
 Denis Aydin

You need to look at the probabilities 'p' being generated by your code.

Taking first your case mean <- 6 (and sorting 'expo', and using
whatever seed my system had current at the time):

  n <- 10000 # number of study subjects
  mean <- 6
  sd <- 1
  expo <- sort(rlnorm(n, mean, sd))
  p <- exp(inter + or * expo)/(1 + exp(inter + or * expo))

  p[1:20]
  #  [1] 0.9763438 0.9918962 0.9924002 0.9965314 0.9980887 0.9984698
  #  [7] 0.9993116 0.9993167 0.9994007 0.9994243 0.9996288 0.9997037
  # [13] 0.9998728 0.9998832 0.9999284 0.9999346 0.9999446 0.9999528
  # [19] 0.9999561 0.9999645

so that almost all of your 'p's are very close to 1.0, which means
that almost all (or even all) of your responses will be 1. Indeed,
continuing from the above:

  disease <- rbinom(length(p), 1, p)
  disease[1:20]
  #  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
  sum(disease)
  # [1] NaN
  sum(is.nan(disease))
  # [1] 710

What has happened here is that the higher values of 'expo' are so
large (in the 1000s) that the calculation of 'p' gives NaN (which
is.na() also flags as missing), because the value of
exp(inter + or * expo) is +Inf, so the calculation of
'p' is in terms of (+Inf)/(+Inf), which is NaN.
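
The overflow described above can be reproduced in a few lines, together with one way of avoiding the NaN itself. This is a minimal sketch with three invented exposure values. plogis() is the logistic distribution function from the stats package and computes the inverse logit without forming exp() of a large number.

```r
inter <- 0.01
or    <- log(1.5)
expo  <- c(10, 100, 2000)     # hypothetical exposures; the last is "in the 1000s"
eta   <- inter + or * expo    # linear predictor

p_naive  <- exp(eta) / (1 + exp(eta))   # Inf/Inf for the largest exposure
p_stable <- plogis(eta)                 # inverse logit without the overflow

p_naive
p_stable
```

This only removes the NaN. The probabilities are still essentially 1, so the simulated outcomes would still be all 1 and the model could not be estimated any better.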

Now compare with what happens when mean <- 3:

  mean <- 3
  sd <- 1
  expo <- sort(rlnorm(n, mean, sd))
  p <- exp(inter + or * expo)/(1 + exp(inter + or * expo))
  p[1:20]
  #  [1] 0.5514112 0.5543155 0.5702318 0.5830025 0.5885994 0.5889078
  #  [7] 0.5908860 0.6004657 0.6029123 0.6042805 0.6048688 0.6122290
  # [13] 0.6123407 0.6135233 0.6137499 0.6139299 0.6153900 0.6181017
  # [19] 0.6184093 0.6203757
  sum(is.na(p))
  #  [1] 0
  max(expo)
  #  [1] 728.0519

So now no NAs (max(expo), though large, is now not large enough to
make the calculation of 'p' yield NA).

These smaller probabilities are now well away from 1.0, so a good mix
of 0 and 1 responses can be expected, although a good number of
the 'p's will still be very close to 1 or will be set equal to 1.

  disease <- rbinom(length(p), 1, p)
  disease[1:20]
  #  [1] 0 1 0 0 1 0 1 1 1 0 1 0 0 1 0 1 1 0 0 1

(as expected), and

  sum(disease)
  # [1] 9740

As well as the problem with p = NA ---> disease = NaN, when you
have all the probabilities close to 1 and (as above) get all the
'disease' outcomes = 1, the resulting attempt to fit the glm will
yield nonsense.

In summary: do not use silly parameter values for the model you
are simulating. It will almost always not work (for reasons
illustrated above), and even if it appears to work the result
will be highly unreliable. If in doubt, have a look at what you
are getting, along the line, as illustrated above!

The above reasons almost certainly underlie your finding that the
mean of simulated OR estimates is markedly different from the value
which you set when you run the case mean <- 6, and the much
better finding when you run the case mean <- 3.

Hoping this helps,
Ted.


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 26-Aug-09

Re: [R] mann whitney u

2009-08-26 Thread Stefan Grosse

On Wed, 26 Aug 2009 12:18:53 +0100 Mcdonald, Grant
grant.mcdonal...@imperial.ac.uk wrote:

MG> is the wilcox.test function equivalent to mann-whitney u?  Is there

Yes, the test is the same. It is also called wilcoxon mann whitney test
because the authors created the test independently.

MG> a way to gain the U-value as opposed to the W-value in R?

What would be the purpose of this when you already have the p-value
given? As far as I know W and U are related. Look at
http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U
and calculate the U1 U2 with the values of the example(wilcox.test). 
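
As a rough sketch of that relation: as far as the wilcox.test documentation describes it, the W it reports in two-sample mode is the rank sum of the first sample minus n1(n1+1)/2, which is the Mann-Whitney U of the first sample. The two samples below are invented for illustration and contain no ties.

```r
# Hypothetical samples with no ties (values invented for illustration)
x <- c(1.1, 2.3, 3.5, 4.7)
y <- c(0.5, 1.9, 2.8, 3.9, 5.2)
n1 <- length(x)
n2 <- length(y)

# W as reported by wilcox.test()
W <- unname(wilcox.test(x, y)$statistic)

# The same quantity from the rank-sum definition of U for the first sample
R1 <- sum(rank(c(x, y))[seq_len(n1)])
U1 <- R1 - n1 * (n1 + 1) / 2

# U for the second sample follows from U1 + U2 = n1 * n2
U2 <- n1 * n2 - U1

c(W = W, U1 = U1, U2 = U2)
```

Some textbooks report the smaller of U1 and U2 as "the" U statistic, so it is worth checking which convention the literature in question uses.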

You have not described what your data look like. In case of ties
(equal values) use wilcox_test of the coin package.

hth
Stefan


Re: [R] glmmPQL and variance structure

2009-08-26 Thread stephenb


this is very late, but I saw this now as I am dealing with it now:

I think varPower should not be needed here. The family should be one of the
quasi families eg quasibinomial and that will automatically allow
variance/dispersion to become a function of the fit. This is a feature of
glm inherently, which does not exist in lme, so it is called with varPower
(in lme).

If you see that, pls post whether it works as expected on the original
dataset.

regards
Stephen




Spencer Graves wrote:
 
 Thanks for providing a partially reproducible example.  I believe the 
 error message you cite came from lme.  I say this, because I modified 
 your call to glmmPQL2 to call lme and got the following:
 
   library(nlme)
   fit.lme <- lme(y ~ trt + I(week > 2), random = ~ 1 | ID,
 +  data = bacteria, weights=varPower(~1))
 Error in unlist(x, recursive, use.names) :
   argument not a list
 
 I consulted Pinheiro and Bates (2000) Mixed-Effects Models in S and 
 S-Plus (Springer, sec. 5.2, p.211) to see that the syntax for varPower 
 appears to be correct.  I removed ~ and it worked, mostly:
 
   fit.lme <- lme(y ~ trt + I(week > 2), random = ~ 1 | ID,
 +  data = bacteria, weights=varPower(1))
 Warning message:
 - not meaningful for factors in: Ops.factor(y[revOrder], Fitted)
 
 I got an answer, though with a warning and not for the problem you 
 want to solve.  However, I then made this modification to a call to my 
 own modification of Venables and Ripley's glmmPQL and it worked:
 
   fit. <- glmmPQL.(y ~ trt + I(week > 2), random = ~ 1 | ID,
 +  family = binomial, data = bacteria,
 +  weights.lme=varPower(1))
 iteration 1
 iteration 2
 iteration 3
   fit.
 Linear mixed-effects model fit by maximum likelihood
   Data: bacteria
   Log-likelihood: -541.0882
   Fixed: y ~ trt + I(week > 2)
       (Intercept)         trtdrug        trtdrug+ I(week > 2)TRUE
         2.7742329      -1.0852566      -0.5896635      -1.2682626
 
 Random effects:
   Formula: ~1 | ID
   (Intercept) Residual
 StdDev: 4.940885e-05 2.519018
 
 Variance function:
   Structure: Power of variance covariate
   Formula: ~fitted(.)
   Parameter estimates:
  power
 0.3926788
 Number of Observations: 220
 Number of Groups: 50
  
 This function glmmPQL. adds an argument weights.lme to Venables 
 and Ripley's glmmPQL and uses that in place of 
 'quote(varFixed(~invwt))' when provided;  see below.
 
 hope this helps.
 spencer graves
 glmmPQL. <-
 function (fixed, random, family, data, correlation, weights,
     weights.lme, control, niter = 10, verbose = TRUE, ...)
 {
     if (!require(nlme))
         stop("package 'nlme' is essential")
     if (is.character(family))
         family <- get(family)
     if (is.function(family))
         family <- family()
     if (is.null(family$family)) {
         print(family)
         stop("'family' not recognized")
     }
     m <- mcall <- Call <- match.call()
     nm <- names(m)[-1]
     wts.lme <- mcall$weights.lme
     keep <- is.element(nm, c("weights", "data", "subset", "na.action"))
     for (i in nm[!keep]) m[[i]] <- NULL
     allvars <- if (is.list(random))
         allvars <- c(all.vars(fixed), names(random),
             unlist(lapply(random,
                 function(x) all.vars(formula(x)))))
     else c(all.vars(fixed), all.vars(random))
     Terms <- if (missing(data))
         terms(fixed)
     else terms(fixed, data = data)
     off <- attr(Terms, "offset")
     if (length(off <- attr(Terms, "offset")))
         allvars <- c(allvars, as.character(attr(Terms, "variables"))[off + 1])
     m$formula <- as.formula(paste("~", paste(allvars, collapse = "+")))
     environment(m$formula) <- environment(fixed)
     m$drop.unused.levels <- TRUE
     m[[1]] <- as.name("model.frame")
     mf <- eval.parent(m)
     off <- model.offset(mf)
     if (is.null(off))
         off <- 0
     w <- model.weights(mf)
     if (is.null(w))
         w <- rep(1, nrow(mf))
     mf$wts <- w
     fit0 <- glm(formula = fixed, family = family, data = mf,
         weights = wts, ...)
     w <- fit0$prior.weights
     eta <- fit0$linear.predictor
     zz <- eta + fit0$residuals - off
     wz <- fit0$weights
     fam <- family
     nm <- names(mcall)[-1]
     keep <- is.element(nm, c("fixed", "random", "data", "subset",
         "na.action", "control"))
     for (i in nm[!keep]) mcall[[i]] <- NULL
     fixed[[2]] <- quote(zz)
     mcall[["fixed"]] <- fixed
     mcall[[1]] <- as.name("lme")
     mcall$random <- random
     mcall$method <- "ML"
     if (!missing(correlation))
         mcall$correlation <- correlation
     #   weights.lme
     {
         if (is.null(wts.lme))
             mcall$weights <- quote(varFixed(~invwt))
         else
             mcall$weights <- wts.lme
     }
     mf$zz <- zz
     mf$invwt <- 1/wz
     mcall$data <- mf
     for (i in 1:niter) {
         if (verbose)
             cat("iteration", i, "\n")
         fit <- eval(mcall)
         etaold <- eta
         eta <-

Re: [R] Batch replacement, by factor, of values in a data frame

2009-08-26 Thread Phil Spector


The ave function is very handy for things like this:

mins = ave(D$Var,D$Site,FUN=function(x)min(x[x>0],na.rm=TRUE))
D$Var = ifelse(is.na(D$Var) | D$Var == 0,mins,D$Var)

should do the required replacements.

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Wed, 26 Aug 2009, Gavin Simpson wrote:


Dear List,

I'm wondering if there is a better/cleaner/more efficient way of
replacing 0 values in a variable with the minimum of the non-missing and
non-zero values of that same variable, but doing it within the levels of
a factor?

Consider the dummy example data presented at the end of my message.
Within each 'Site' there are some 0 values and possibly some NA's. I can
compute the minimum of the non-missing and non-zero values by 'Site' as
indicated below using aggregate for example. Save for looping over the
'Site's and replacing 0's with the relevant minimum is there a way of
using a vectorised approach to do the replacement?

Thanks in advance,

G

## dummy data
set.seed(123)
D <- data.frame(Site = factor(rep(LETTERS[1:5], times = 10)),
                Var = runif(5*10))
D <- D[with(D, order(Site, Var)), ]
## simulate some 0's
D[c(1,3,11,12,23,27,34,36,41,49), "Var"] <- 0
## just to complicate matters, some NA
D[sample(NROW(D), 3), "Var"] <- NA
head(D)
## Compute minimums per Site
aggregate(D$Var, by = list(Site = D$Site),
          FUN = function(x) min(x[x > 0], na.rm = TRUE))
## How replace the appropriate 0's with the appropriate minimum?
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


Re: [R] Select top three values from data frame

2009-08-26 Thread Don MacQueen

Do you want just the values (i.e., a vector), or do you also want the 
corresponding rows of the data frame?


What if there is a tie, or do you know in advance that within any 
particular subset the values of B are unique?


What if the subset that meets the constraints has fewer than 3 unique 
values? (which I think is the case in your example)



   tail(  unique( sort( df$B[  df$A=='x'  df$C  2 ] ) ) ,3 )

Should do it (but I haven't tested).
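
A worked version on the small example from the question might look like this. It is a sketch under assumptions: the comparison operator on C did not survive in the archived message, so "C < 2" below is a guess made purely for illustration, and top_rows is a hypothetical extra showing how to keep the whole rows rather than just the values.

```r
# The example table from the question, rebuilt as a data frame
df <- data.frame(A = c("x", "x", "x", "y", "y", "y"),
                 B = c(2, 4, 3, 1, 2, 3),
                 C = c(1, 1, 2, 5, 6, 8))

# Assumption: the constraint on C is "less than 2" (operator lost in the archive)
keep <- df$A == "x" & df$C < 2

# Top three distinct values of B within the subset (fewer if fewer exist)
top_vals <- tail(unique(sort(df$B[keep])), 3)
top_vals

# The corresponding rows, largest B first
sub <- df[keep, ]
top_rows <- head(sub[order(sub$B, decreasing = TRUE), ], 3)
top_rows
```

With these data the subset holds only two rows, so only two values come back, which is the "fewer than 3" case raised above.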

Why does it get messy with over 100 columns?
I'll pretend for the moment that you have exactly 100 columns:
   1)  you will be doing this many times, each time with a different 
sets of 3 columns?
   2)  you want the three highest values in each of 98 columns based 
on constraints on the other two?
   3)  you want the three highest values of B based on constraints on 
all of the other 99 columns?


Depending on what changes when more columns are involved, you might 
be able to loop over columns with syntax like,


   for (nm in c('B','D','E') )   tail(  unique( sort( df[[nm]][ 
df$A=='x'  df$C  2 ] ) ) ,3 )


-Don

At 1:36 AM -0700 8/26/09, Noah Silverman wrote:

Hi,

I'm trying to find an easy way to do this.

I want to select the top three values of a specific column in a 
subset of rows in a data.frame.  I'll demonstrate.


A  B  C
x  2  1
x  4  1
x  3  2
y  1  5
y  2  6
y  3  8


I want the top 3 values of B from the data.frame where A=X and C 2

I could extract all the rows where C2, then sort by B, then take 
the first 3.  But that seems like the wrong way around, and it also 
will get messy with real data of over 100 columns.


Any suggestions?




--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062


Re: [R] Help regarding frequency distribution Graphs

2009-08-26 Thread milton ruser

Google R gallery :-)

On Wed, Aug 26, 2009 at 7:32 AM, anupam sinha anupam.cont...@gmail.com wrote:

 Hi all,
I am trying to construct a frequency distribution graph i.e.
 suppose there is a variable *k* and it takes a range of values. What I
 want to do is to plot *P(k) *(probability/frequency of finding a specific
 value of *k*) vs *k*.I
 would like to get a smooth curve  between P(k) vs k. I  have already
 tried
 *hist* function in R but want something more specific. Can anyone help me
 out ?  Thanks in advance.


 Regards,


 Anupam Sinha


Re: [R] GLMs

2009-08-26 Thread Steve Lianoglou


Hi,

On Aug 26, 2009, at 6:53 AM, Letizia Campioni wrote:



Hi,

I am starting to work with R.

I need to fit a general linear model and a generalized mixed  
model. Which packages should I use?



R> RSiteSearch("general linear model")
R> RSiteSearch("Generalized mixed model")

Will provide many answers.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] Problem with RHEL 5 repo and the latest R RPMs

2009-08-26 Thread Hugh Brown

Martyn Plummer wrote:
 It turns out that the default checksum for createrepo, the command
 that creates repository metadata, has changed from sha1 to sha256,
 whereas Enterprise Linux 5 still requires sha1. I have modified the
 scripts to use the older checksum for EL4 and EL5.

 Sorry for the slow reply. Lance Brown from Duke University already
 raised this problem with Bob Kinney and me.  When I saw a query from
 Hugh Brown I was not paying enough attention to see that you were
 somebody else.

This appears to have solved the problem.  Thanks very much for your help!

--
Hugh Brown, Systems Manager
The Centre for High-Throughput Biology
hbr...@chibi.ubc.ca


[R] Scripting - sort of

2009-08-26 Thread Charles Annis, P.E.

Dear R-ians:

 

I'm running R 2.9.2 on a 6-year-old Windows XP Dell with 2 GB RAM and a 3 GHz
Pentium 4 chip.

 

I've written a package using the User Menus Under Windows commands
(winMenuAdd, etc).  It works very well.  

 

I have 6 test cases and running any one of them requires many selections
form the menus and some keyboard entry, and it takes me hours to exercise
them all to see that my changes to the R code produced no unexpected
results.

 

Is there a way to record my mouse movements and mouse clicks, as well as
keyboard entry, in a script, so that I can just watch to see if all unfolds
correctly?

 

Many thanks for any suggestions.

 

 

Charles Annis, P.E.

 mailto:charles.an...@statisticalengineering.com
charles.an...@statisticalengineering.com
phone: 561-352-9699
eFax:  614-455-3265
 http://www.StatisticalEngineering.com
http://www.StatisticalEngineering.com

 

 


[[alternative HTML version deleted]]


[R] tweedie and lmer

2009-08-26 Thread Mohammad AlMarzouq


Hello all,

I have count data with about 36% of observations being zeros. I found
in some of the examples in the r-help mail archives that a Tweedie
family of distributions could be used to fit a model with random
effects. After installing the tweedie package, I attempted to fit the
following model:

lmer(SUS ~ 1 + (1 | GRP), REML = FALSE, data = mydata,
     family = tweedie(var.power = 1.55, link.power = 0))

and got the following error:

Error in famType(glmFit$family) : unknown GLM family: ‘Tweedie’

If it helps, I'm on a Mac with R 2.9.1, lme4 0.999375-31, and tweedie
2.0.


Thanks,

Mohammad AlMarzouq


Re: [R] Scripting - sort of

2009-08-26 Thread Gabor Grothendieck

AutoIt and AutoHotkey are two free Windows utilities, based on a
BASIC-like language, that can be used for GUI scripting.


Re: [R] Applying do.call to a data.frame using function arguments

2009-08-26 Thread miller_2555



miller_2555 wrote:
>
> I'm trying to convert a data.frame to a series of strings (row-wise).
> There was a very good discussion a while back (2002) entitled "[R] string
> concatenate across rows of a matrix??" where Tony Plate recommended the
> following two alternatives (x2 is an R object of type data frame -- a
> matrix also works for solution #1):
> 1) apply(format(x2), 1, paste, collapse=" ");
> 2) do.call(paste,x2)
>

Never mind. Stupid question. The solution is:
do.call(paste,c(x2,sep=',')); 

Hope this helps somebody.
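
For readers who want to try this, here is a minimal sketch of why the call works. The data frame x2 below is a made-up stand-in (its column names and values are assumptions, not the poster's data), and the second variant, which wraps each value in single quotes, is only one guess at what SQL-bound strings might need.

```r
# Hypothetical data frame standing in for x2 from the thread.
x2 <- data.frame(id = 1:3,
                 name = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# A data frame is a list of columns, so c(x2, sep = ",") builds the
# argument list for paste(id, name, sep = ","), which do.call() then invokes.
rows_plain <- do.call(paste, c(x2, sep = ","))
rows_plain
# [1] "1,a" "2,b" "3,c"

# A separator that closes and reopens single quotes; the outer quotes are
# added afterwards (for example, for values inside an SQL statement).
rows_quoted <- paste0("'", do.call(paste, c(x2, sep = "','")), "'")
rows_quoted
# [1] "'1','a'" "'2','b'" "'3','c'"
```

Passing sep through c() is what lets do.call() hand a named argument to paste alongside the columns.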
-- 
View this message in context: 
http://www.nabble.com/Applying-do.call-to-a-data.frame-using-function-arguments-tp25151441p25151445.html
Sent from the R help mailing list archive at Nabble.com.


[R] access to source code of a website ..?

2009-08-26 Thread Martin Batholdy


hi,

is it possible to read the source code of a website within R?


Re: [R] Applying do.call to a data.frame using function arguments

2009-08-26 Thread hadley wickham

On Wed, Aug 26, 2009 at 11:31 AM,
miller_2555 nabble.30.miller_2...@spamgourmet.com wrote:
>
> miller_2555 wrote:
>>
>> I'm trying to convert a data.frame to a series of strings (row-wise).
>> There was a very good discussion a while back (2002) entitled "[R] string
>> concatenate across rows of a matrix??" where Tony Plate recommended the
>> following two alternatives (x2 is an R object of type data frame -- a
>> matrix also works for solution #1):
>>     1) apply(format(x2), 1, paste, collapse=" ");
>>     2) do.call(paste,x2)
>
> Never mind. Stupid question. The solution is:
> do.call(paste,c(x2,sep=','));

I think you're missing some quotes:
cat(do.call(paste,c(x2,sep=','))[1], "\n")

Hadley

-- 
http://had.co.nz/


Re: [R] GLMs

2009-08-26 Thread Ben Bolker




Letizia Campioni wrote:
>
> Hi,
>
> I am starting to work with R.
>
> I need to perform a General linear model and a Generalized mixed model,
> what are the packages I have to use?
>
> what is the difference between them?
>

General linear models (called GLM mostly by SAS users) are implemented by lm
(no need to install or load additional packages).
General*ized* linear models (called GLM by everyone else): use glm (still no
need for extra packages).
Linear mixed models (LMMs): nlme (library(nlme)), no need to install
extras.
Generalized linear mixed models (GLMMs): lme4 -- install.packages("lme4");
library(lme4), ?glmer
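
As a rough illustration of the first two cases, the sketch below fits one model of each kind on the built-in mtcars data. The choice of data set and variables is an assumption made only for the example; the mixed-model calls are left as comments with placeholder names because they need grouped data.

```r
# General linear model: ordinary least squares via lm() (base R).
fit_lm <- lm(mpg ~ wt + hp, data = mtcars)

# Generalized linear model: logistic regression via glm() with a binomial family.
fit_glm <- glm(am ~ wt, data = mtcars, family = binomial)

coef(fit_lm)
coef(fit_glm)

# Mixed models use the same formula style with random effects added, e.g.
#   nlme::lme(y ~ x, random = ~ 1 | group, data = d)               # LMM
#   lme4::glmer(y ~ x + (1 | group), data = d, family = binomial)  # GLMM
# (y, x, group and d are placeholder names, not objects defined here.)
```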

  This list is not generally for general statistics questions -- you're
expected to know *what* you want to do, and we will help you with *how* to
do it in R (provided you have read the posting guide and formulated a useful
question).  I would suggest a book -- check out the R books page
( http://www.r-project.org/doc/bib/R-books.html ) -- maybe Faraway?

  

-- 
View this message in context: 
http://www.nabble.com/GLMs-tp25150823p25151455.html
Sent from the R help mailing list archive at Nabble.com.


[R] changing equal values on matrix by same random number

2009-08-26 Thread milton ruser

Dear all,

I have about 30,000 matrices (512x512), with values from 1 to N.
Each value in a matrix represents a habitat patch in my
matrix (i.e. my landscape). Non-habitat cells are stored as ZERO.
Now I need to replace each of the 1-to-N values with a single random
number per value.

Just suppose my matrix is:
mymat <- matrix(c(1,1,1,0,0,0,0,0,0,0,0,
0,0,0,0,2,2,2,0,0,0,0,
0,0,0,0,2,2,2,0,0,0,0,
3,3,0,0,0,0,0,0,0,4,4,
3,3,0,0,0,0,0,0,0,0,0), nrow=5)

I would like all cells with 1 to become
runif(1,min=0.4, max=0.7), and cells with 2
to be replaced by another runif(...).

I can do it using for(), but it is very time-consuming.
Any help is welcome.

cheers

milton

[[alternative HTML version deleted]]


Re: [R] changing equal values on matrix by same random number

2009-08-26 Thread David Huffer

You want to replace all 1s with one random number from the uniform
distribution, then all 2s with another random number from the uniform
distribution, and so forth?  Are the arguments (i.e., the min and max of the
distribution) to each call to runif identical?

--
 David
 
 -
 David Huffer, Ph.D.   Senior Statistician
 CSOSA/Washington, DC   david.huf...@csosa.gov
 -


Re: [R] changing equal values on matrix by same random number

2009-08-26 Thread David Winsemius





First the wrong way and then the right way:

> mymat[mymat==1] <- runif(1,min=0.4,max=0.7)
> mymat
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
[1,] 0.4573161    0    0    2    0    0    0    0    0     3     0
[2,] 0.4573161    0    0    2    0    2    0    0    0     0     0
[3,] 0.4573161    0    0    2    0    2    0    0    4     0     0
[4,] 0.0000000    0    0    0    0    2    3    0    4     0     0
[5,] 0.0000000    0    0    0    0    0    3    0    3     0     0

All the values are the same, clearly not what was desired.

Put it back to your starting point:

> mymat <- matrix(c(1,1,1,0,0,0,0,0,0,0,0,
+ 0,0,0,0,2,2,2,0,0,0,0,
+ 0,0,0,0,2,2,2,0,0,0,0,
+ 3,3,0,0,0,0,0,0,0,4,4,
+ 3,3,0,0,0,0,0,0,0,0,0), nrow=5)

# So supply the proper number of random realizations:

> mymat[mymat==1] <- runif(sum(mymat==1),min=0.4,max=0.7)
> mymat
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
[1,] 0.5745665    0    0    2    0    0    0    0    0     3     0
[2,] 0.6956418    0    0    2    0    2    0    0    0     0     0
[3,] 0.6935466    0    0    2    0    2    0    0    4     0     0
[4,] 0.0000000    0    0    0    0    2    3    0    4     0     0
[5,] 0.0000000    0    0    0    0    0    3    0    3     0     0

If you want to supply a matrix of max and min values for the other  
integers there would probably be an *apply approach that could be used.
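
One possible way to handle per-integer bounds without an *apply call is to rely on runif() being vectorised over min and max. The sketch below assumes the patch ids are the consecutive integers 1..N; the vectors lo and hi are invented bounds for illustration only.

```r
set.seed(1)  # only to make the sketch reproducible

mymat <- matrix(c(1,1,1,0,0,0,0,0,0,0,0,
                  0,0,0,0,2,2,2,0,0,0,0,
                  0,0,0,0,2,2,2,0,0,0,0,
                  3,3,0,0,0,0,0,0,0,4,4,
                  3,3,0,0,0,0,0,0,0,0,0), nrow = 5)

# Assumed bounds: element k applies to patch id k (ids 1..4 here).
lo <- c(0.4, 0.1, 0.5, 0.2)
hi <- c(0.7, 0.3, 0.9, 0.6)

patch <- mymat > 0        # habitat cells only; zeros stay untouched
id    <- mymat[patch]     # patch id of every habitat cell

out <- mymat
# One draw per habitat cell, each with the bounds belonging to its id.
out[patch] <- runif(length(id), min = lo[id], max = hi[id])
```

Indexing lo and hi by id expands the bounds to one pair per cell, so a single runif() call covers every patch.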







David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] changing equal values on matrix by same random number

2009-08-26 Thread milton ruser

Hi David,
Thanks for the reply. This is what I need:

> mymat[mymat==1] <- runif(1,min=0.4,max=0.7)
> mymat
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
[1,] 0.4573161    0    0    2    0    0    0    0    0     3     0
[2,] 0.4573161    0    0    2    0    2    0    0    0     0     0
[3,] 0.4573161    0    0    2    0    2    0    0    4     0     0
[4,] 0.0000000    0    0    0    0    2    3    0    4     0     0
[5,] 0.0000000    0    0    0    0    0    3    0    3     0     0

But my real landscapes have values from 1 to a large number (~10,),
so I think that putting this in a for() loop will be very time-consuming,
and as I have a lot of landscapes, I need to speed it up.

Any suggestions?

bests

milton




Re: [R] access to source code of a website ..?

2009-08-26 Thread Duncan Murdoch


On 8/26/2009 12:44 PM, Martin Batholdy wrote:
> hi,
>
> is it possible to read the source code of a website within R?

url("http://example.com") creates a connection that you can do what you
like with.  For example,

readLines(url("http://www.r-project.org"))
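
A slightly more defensive sketch of the same idea wraps the connection so that it is always closed. The helper name read_page is an assumption for this example, and the demonstration reads a temporary local file through a file:// URL so that it runs without network access; an http:// address would be passed in the same way.

```r
# Hypothetical helper: read every line behind a URL and always close the connection.
read_page <- function(address) {
  con <- url(address)
  on.exit(close(con))            # release the connection even if reading fails
  readLines(con, warn = FALSE)
}

# Offline demonstration: write a small HTML file and read it back via file://
tmp <- tempfile(fileext = ".html")
writeLines(c("<html>", "<body>Hello</body>", "</html>"), tmp)
page <- read_page(paste0("file://", normalizePath(tmp, winslash = "/")))
page
# [1] "<html>"             "<body>Hello</body>" "</html>"
```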

Duncan Murdoch


Re: [R] Applying do.call to a data.frame using function arguments (nabble: message 8 of 20)

2009-08-26 Thread nabble . 30 . miller_2555

On Wed, Aug 26, 2009 at 12:47 PM, hadley wickham wrote:
> I think you're missing some quotes:
> cat(do.call(paste,c(x2,sep=','))[1], "\n")

Thanks - the strings are actually substrings of larger strings
(specifically, SQL statements), which will wrap with the leading and
trailing quotes (though I should have pointed this out in my original
post).


Re: [R] changing equal values on matrix by same random number

2009-08-26 Thread David Huffer

You could try either of the two examples below. They both assume that min and
max are the same for every patch:

mymat <- matrix (
  c (
    1 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 2
    , 2 , 2 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 2 , 2 , 2 , 0 , 0
    , 0 , 0 , 3 , 3 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 4 , 4 , 3 , 3
    , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0
  ) , nrow = 5
)

##  this way gives the same random value to
##  every cell sharing an integer between
##  1 and max (mymat):

milton <- function ( x , min = 0.4 , max = 0.7 ) {
  ##  one draw per patch id
  .rep <- runif ( n = max ( x ) , min = min , max = max )
  for ( i in seq_len ( max ( x ) ) ) {
    x [ x == i ] <- .rep [ i ]
  }
  x
}
milton ( x = mymat )

##  this way gives a different random value
##  to each cell holding an integer between
##  1 and max (mymat):

milton <- function ( x , min = 0.4 , max = 0.7 ) {
  for ( i in seq_len ( max ( x ) ) ) {
    x [ x == i ] <- runif (
      sum ( x == i )
      , min = min
      , max = max
    )
  }
  x
}
milton ( x = mymat )
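
Since the concern in this thread is the cost of looping over many patch ids, here is a hedged sketch of a loop-free variant of the first function. It treats the vector of draws as a lookup table indexed by patch id. The name milton_lookup is made up for this example, and the approach assumes the matrix holds only 0 and the integers 1..N.

```r
mymat <- matrix(c(1,1,1,0,0,0,0,0,0,0,0,
                  0,0,0,0,2,2,2,0,0,0,0,
                  0,0,0,0,2,2,2,0,0,0,0,
                  3,3,0,0,0,0,0,0,0,4,4,
                  3,3,0,0,0,0,0,0,0,0,0), nrow = 5)

# Hypothetical loop-free version: same random value for all cells of a patch.
milton_lookup <- function(x, min = 0.4, max = 0.7) {
  n <- max(x)
  if (n < 1) return(x)                      # matrix holds non-habitat only
  vals  <- runif(n, min = min, max = max)   # one draw per patch id 1..n
  patch <- x > 0                            # zeros (non-habitat) are left alone
  x[patch] <- vals[x[patch]]                # look up each cell's draw by its id
  x
}

set.seed(42)  # only to make the sketch reproducible
res <- milton_lookup(mymat)
```

The replacement is a single indexed assignment, so its cost should not grow with one pass over the matrix per patch id as the loop does.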
 
--
 David
 
 -
 David Huffer, Ph.D.   Senior Statistician
 CSOSA/Washington, DC   david.huf...@csosa.gov
 -



[R] rJava error for large XML object return in StatET plugin

2009-08-26 Thread Harsh

Hi R List,
I get this error using StatET R plugin in Eclipse.

sessionInfo()
R version 2.9.0 (2009-04-17)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

I am running Eclipse Platform Version: 3.4.1 Build id: M20080911-1700 on
Windows XP.

My R process creates an XML object which, when I cat it, gives the error below.
The XML object is large, containing 1000+ rows like the one below.

  <data name="subset" y="1.0121137732" x="0.8018022766" ypred="0.86" yloess="1.0472"/>

The XML object does get created when it has fewer than 800 rows.

Does it have anything to do with the size of the XML?


org.eclipse.core.runtime.CoreException: Communication error
    at de.walware.statet.r.nico.impl.RjsController.rjsRunMainLoop(RjsController.java:191)
    at de.walware.statet.r.nico.impl.RjsController.doSubmit(RjsController.java:410)
    at de.walware.statet.nico.core.runtime.ToolController.submitToConsole(ToolController.java:1009)
    at de.walware.statet.nico.core.runtime.ToolController$ConsoleCommandRunnable.run(ToolController.java:124)
    at de.walware.statet.nico.core.runtime.ToolController.loopRunTask(ToolController.java:822)
    at de.walware.statet.nico.core.runtime.ToolController.loop(ToolController.java:767)
    at de.walware.statet.nico.core.runtime.ToolController.run(ToolController.java:293)
    at de.walware.statet.nico.core.runtime.ToolRunner.run(ToolRunner.java:37)
    at de.walware.statet.nico.core.runtime.ToolRunner.access$0(ToolRunner.java:35)
    at de.walware.statet.nico.core.runtime.ToolRunner$1.run(ToolRunner.java:47)
Caused by: java.rmi.UnmarshalException: error unmarshalling return; nested exception is:
    java.io.UTFDataFormatException
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at java.rmi.server.RemoteObjectInvocationHandler.invokeRemoteMethod(Unknown Source)
    at java.rmi.server.RemoteObjectInvocationHandler.invoke(Unknown Source)
    at $Proxy0.runMainLoop(Unknown Source)
    at de.walware.statet.r.nico.impl.RjsController.rjsRunMainLoop(RjsController.java:172)
    ... 9 more
Caused by: java.io.UTFDataFormatException
    at java.io.ObjectInputStream$BlockDataInputStream.readUTFSpan(Unknown Source)
    at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(Unknown Source)
    at java.io.ObjectInputStream$BlockDataInputStream.readUTF(Unknown Source)
    at java.io.ObjectInputStream.readUTF(Unknown Source)
    at de.walware.rj.server.ConsoleCmdItem.<init>(ConsoleCmdItem.java:58)
    at de.walware.rj.server.MainCmdList.readExternal(MainCmdList.java:70)
    at java.io.ObjectInputStream.readExternalData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.readObject(Unknown Source)
    at sun.rmi.server.UnicastRef.unmarshalValue(Unknown Source)
    ... 14 more


Thanks and regards,
Harsh

[[alternative HTML version deleted]]


Re: [R] Batch replacement, by factor, of values in a data frame

2009-08-26 Thread Gavin Simpson

On Wed, 2009-08-26 at 08:06 -0700, Phil Spector wrote:
> The ave function is very handy for things like this:
>
> mins = ave(D$Var,D$Site,FUN=function(x)min(x[x>0],na.rm=TRUE))
> D$Var = ifelse(is.na(D$Var) | D$Var == 0,mins,D$Var)
>
> should do the required replacements.

Thanks Phil, that's great. I hadn't come across ave() before.

Shortly after I received your email, it dawned on me to just rep the
minimums the required number of times (by the number of non-missing 0's)
for each site, and then insert this vector into the data frame whilst
subsetting on whether it was zero or not.
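
For anyone following along, the idea can be sketched on a tiny made-up data frame (the values below are invented for illustration and are not the dummy data from the original post). This variant replaces only the zeros and leaves the NAs in place, which is an assumption about what is wanted; Phil's ifelse() call fills the NAs as well.

```r
# Small invented data set: two sites, some zeros and some NAs.
D <- data.frame(Site = factor(rep(c("A", "B"), each = 5)),
                Var  = c(0, 0.3, NA, 0.1, 0.5,
                         0.8, 0, 0.2, 0, NA))

# Per-site minimum of the non-missing, non-zero values, repeated for every row.
site_min <- ave(D$Var, D$Site,
                FUN = function(x) min(x[x > 0], na.rm = TRUE))

# Replace only the zeros; NAs are left untouched.
is_zero <- !is.na(D$Var) & D$Var == 0
D$Var[is_zero] <- site_min[is_zero]
D$Var
# [1] 0.1 0.3  NA 0.1 0.5 0.8 0.2 0.2 0.2  NA
```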

Cheers,

G

 
>   - Phil Spector
>     Statistical Computing Facility
>     Department of Statistics
>     UC Berkeley
>     spec...@stat.berkeley.edu
>
> On Wed, 26 Aug 2009, Gavin Simpson wrote:
>
>> Dear List,
>>
>> I'm wondering if there is a better/cleaner/more efficient way of
>> replacing 0 values in a variable with the minimum of the non-missing and
>> non-zero values of that same variable, but doing it within the levels of
>> a factor?
>>
>> Consider the dummy example data presented at the end of my message.
>> Within each 'Site' there are some 0 values and possibly some NA's. I can
>> compute the minimum of the non-missing and non-zero values by 'Site' as
>> indicated below using aggregate for example. Save for looping over the
>> 'Site's and replacing 0's with the relevant minimum is there a way of
>> using a vectorised approach to do the replacement?
>>
>> Thanks in advance,
>>
>> G
>>
>> ## dummy data
>> set.seed(123)
>> D <- data.frame(Site = factor(rep(LETTERS[1:5], times = 10)),
>>                 Var = runif(5*10))
>> D <- D[with(D, order(Site, Var)), ]
>> ## simulate some 0's
>> D[c(1,3,11,12,23,27,34,36,41,49), "Var"] <- 0
>> ## just to complicate matters, some NA
>> D[sample(NROW(D), 3), "Var"] <- NA
>> head(D)
>> ## Compute minimums per Site
>> aggregate(D$Var, by = list(Site = D$Site),
>>           FUN = function(x) min(x[x > 0], na.rm = TRUE))
>> ## How replace the appropriate 0's with the appropriate minimum?
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


Re: [R] changing equal values on matrix by same random number

2009-08-26 Thread milton ruser

Hi David,

Now is perfect!!

Thanks a lot

milton


Re: [R] rJava error for large XML object return in StatET plugin

2009-08-26 Thread Steve Lianoglou


Hi,

On Aug 26, 2009, at 1:31 PM, Harsh wrote:

> Hi R List,
> I get this error using StatET R plugin in Eclipse.


You might have more luck asking the StatET user mailing list:

https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/statet-user

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


[R] specify a model in differential equations (nlme)

2009-08-26 Thread Jun Shen

Dear all,

I wonder if there is a way to specify a model in differential equations for
nlme(). Or in other packages? Appreciate any comment. Thanks.

Jun Shen


Re: [R] ggplot2: geom_smooth and legend

2009-08-26 Thread hadley wickham

Hi Benoit,

You could turn the standard errors off with se = F.  Then they'll be
removed from the legend as well.

Hadley

On Tue, Aug 18, 2009 at 7:43 AM, Benoit Boulinguiez
<benoit.boulingu...@ensc-rennes.fr> wrote:
 Sorry I forgot the code that goes with

 **CODE
 desorb_plot <- ggplot() +
        geom_smooth(data=DATA.B1_SA_N2,
                        aes(Temp,DrTGA*100,colour="B1"),span=0.1,size=1.6) +
        geom_smooth(data=DATA.FM30K_SA_N2,
                        aes(Temp,DrTGA*100,colour="FM30K"),span=0.2,size=1.6) +
        geom_smooth(data=DATA.NC60_SA_N2,
                        aes(Temp,-DrTGA*100,colour="NC60"),span=0.1,size=1.6) +
        geom_smooth(data=DATA.THC515_SA_N2,
                        aes(Temp,DrTGA*100,colour="THC515"),span=0.2,size=1.6) +
        scale_colour_hue(name="Adsorbent") +
        labs(x=Temp~(degree*C),y=Weight~Derivative~("%/"*degree*C)) +
        opts(panel.grid.minor = theme_line(colour = "grey94"))

 print(desorb_plot)


 Cordialement / Regards

 ---
 Benoit Boulinguiez
 Ecole de Chimie de Rennes (ENSCR) Bureau 1.20
 Equipe CIP UMR CNRS 6226 Sciences Chimiques de Rennes
 Avenue du Général Leclerc
 CS 50837
 35708 Rennes CEDEX 7
 Tel 33 (0)2 23 23 80 83
 Fax 33 (0)2 23 23 81 20
 http://www.ensc-rennes.fr/


 Quoting Benoit Boulinguiez benoit.boulingu...@ensc-rennes.fr:

 Hi all,

 Is it possible to remove the grey colour in the legend key that goes
 with geom_smooth? In my case it makes the legend harder to read.

 http://www.4shared.com/file/125864977/e10644f8/desorb.html


 Cordialement / Regards

 ---
 Benoit Boulinguiez
 Ecole de Chimie de Rennes (ENSCR) Bureau 1.20
 Equipe CIP UMR CNRS 6226 Sciences Chimiques de Rennes
 Avenue du Général Leclerc
 CS 50837
 35708 Rennes CEDEX 7
 Tel 33 (0)2 23 23 80 83
 Fax 33 (0)2 23 23 81 20
 http://www.ensc-rennes.fr/





-- 
http://had.co.nz/


Re: [R] changing equal values on matrix by same random number

2009-08-26 Thread milton ruser

Hi David and all,

It is me again. When I try it with a sample matrix (10x10) it appears to
work well. But when I apply it to a real 512x512 landscape, with values
ranging from 1 to 10,000, it is still very slow (please see below):

mymat <- matrix (
 c (
   1 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 2
   , 2 , 2 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 2 , 2 , 2 , 0 , 0
   , 0 , 0 , 3 , 3 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 4 , 4 , 3 , 3
   , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0
 ) , nrow = 5, byrow=T
)

##  this way gives same random values for
##  each integer between 0 and
##  max (mymat):
milton <- function ( x , min = 0.4 , max = 0.7 ) {
 .rep <- runif ( n = max ( x ) , min = min , max = max )
 for ( i in 1:max ( x ) ) {
 x[x == i] <- .rep[i]
 }
 x
}
milton ( x = mymat )

mymat <- matrix(sample(0:1,size=512*512, replace=T), ncol=512)
milton ( x = mymat )

Maybe using the apply family could be more efficient? I think I need to
avoid the for() loop to speed it up.

cheers

milton
On Wed, Aug 26, 2009 at 1:30 PM, David Huffer <david.huf...@csosa.gov> wrote:

 You could try either of the two examples below. They both assume that min
 and max are invariant:

 mymat <- matrix (
  c (
    1 , 1 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 2
    , 2 , 2 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 2 , 2 , 2 , 0 , 0
    , 0 , 0 , 3 , 3 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 4 , 4 , 3 , 3
    , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0
  ) , nrow = 5
 )

 ##  this way gives same random values for
 ##  each integer between 0 and
 ##  max (mymat):

 milton <- function ( x , min = 0.4 , max = 0.7 ) {
  .rep <- runif ( n = max ( x ) , min = min , max = max )
  for ( i in 1:max ( x ) ) {
    x[x == i] <- .rep[i]
  }
  x
 }
 milton ( x = mymat )

 ##  this way gives different random values
 ##  for each integer between
 ##  0 and max (mymat):

 milton <- function ( x , min = 0.4 , max = 0.7 ) {
  for ( i in 1:max ( x ) ) {
    x [ x == i ] <- runif (
      sum ( x == i )
      , min = min
      , max = max
    )
  }
  x
 }
 milton ( x = mymat )

 --
  David

  -
  David Huffer, Ph.D.   Senior Statistician
  CSOSA/Washington, DC   david.huf...@csosa.gov
  -


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of milton ruser
  Sent: Wednesday, August 26, 2009 1:18 PM
 To: David Winsemius
 Cc: r-help@r-project.org
 Subject: Re: [R] changing equal values on matrix by same random number

 Hi David,
 Thanks for the reply. This is what I need:

 > mymat[mymat==1] <- runif(1,min=0.4,max=0.7)
 > mymat
           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
 [1,] 0.4573161    0    0    2    0    0    0    0    0     3     0
 [2,] 0.4573161    0    0    2    0    2    0    0    0     0     0
 [3,] 0.4573161    0    0    2    0    2    0    0    4     0     0
 [4,] 0.0000000    0    0    0    0    2    3    0    4     0     0
 [5,] 0.0000000    0    0    0    0    0    3    0    3     0     0

 But as my real landscapes have values from 1 to a large number (~10,),
 I think that putting this in a for() loop will be very time-consuming,
 and as I have a lot of landscapes, I need to speed it up.

 Any suggestion?

 bests

 milton



 On Wed, Aug 26, 2009 at 1:12 PM, David Winsemius <dwinsem...@comcast.net>
 wrote:

 
  On Aug 26, 2009, at 12:53 PM, milton ruser wrote:
 
  Dear all,
 
  I have about 30,000 matrices (512x512), with values from 1 to N.
  Each value in a matrix represents a habitat patch in my
  matrix (i.e. my landscape). Non-habitat cells are stored as zero.
  Now I need to replace each of the values 1 to N with its own random
  number, the same for every cell that shares that value.
 
  Just suppose my matrix is:
  mymat <- matrix(c(1,1,1,0,0,0,0,0,0,0,0,
  0,0,0,0,2,2,2,0,0,0,0,
  0,0,0,0,2,2,2,0,0,0,0,
  3,3,0,0,0,0,0,0,0,4,4,
  3,3,0,0,0,0,0,0,0,0,0), nrow=5)
 
  I would like all cells with 1 to become
  runif(1,min=0.4, max=0.7), and cells with 2
  to be replaced by another runif(...).
 
 
  First the wrong way and then the right way:
 
  > mymat[mymat==1] <- runif(1,min=0.4,max=0.7)
  > mymat
            [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
  [1,] 0.4573161    0    0    2    0    0    0    0    0     3     0
  [2,] 0.4573161    0    0    2    0    2    0    0    0     0     0
  [3,] 0.4573161    0    0    2    0    2    0    0    4     0     0
  [4,] 0.0000000    0    0    0    0    2    3    0    4     0     0
  [5,] 0.0000000    0    0    0    0    0    3    0    3     0     0
 
  All the values are the same, clearly not what was desired.
 
  Put it back to your starting point:
 
  > mymat <- matrix(c(1,1,1,0,0,0,0,0,0,0,0,
  + 0,0,0,0,2,2,2,0,0,0,0,
  + 0,0,0,0,2,2,2,0,0,0,0,
  + 3,3,0,0,0,0,0,0,0,4,4,
  + 3,3,0,0,0,0,0,0,0,0,0), nrow=5)
 
  # So supply the proper number of random realizations:
 
  > mymat[mymat==1] <- runif(sum(mymat==1),min=0.4,max=0.7)
  > mymat
   [,1] [,2] [,3] [,4] [,5]

Re: [R] specify a model in differential equations (nlme)

2009-08-26 Thread Bert Gunter

For nlme, no. However, take a look at the CRAN Task View for Pharmacokinetics
and the packages recommended there, especially nlmeODE.

You might also try R's search capabilities:

RSiteSearch("differential equations")
?RSiteSearch

or other non-R search engines.

Bert Gunter
Genentech Nonclinical Biostatisics


[R] Trying to make Nas 0

2009-08-26 Thread Dumblauskas, Jerry

I have an lm object called mro

A summary gives
> summary(mro)

Call:
lm(formula = REGRESSIONSTRING, data = wData)

Residuals:
     Min       1Q   Median       3Q      Max
-8.18077 -1.06867 -0.09387  1.03153 11.20201

Coefficients: (1 not defined because of singularities)
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)        7.2096     1.0345   6.969 5.37e-11 ***
log(H2403_P1)     -0.3113     0.1305  -2.386   0.0180 *
I(LEVDSCRT/100)    3.4425     0.7818   4.403 1.79e-05 ***
REG_UTIL_FLAG_P1       NA         NA      NA       NA
REALESTATE_ROI     0.2413     0.6211   0.389   0.6980
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.116 on 186 degrees of freedom
Multiple R-squared: 0.1455, Adjusted R-squared: 0.1317 
F-statistic: 10.56 on 3 and 186 DF,  p-value: 1.917e-06 

Now, I know my REG_UTIL_FLAG_P1 var is NA because it is missing/constant
in the data frame.

What I am trying to do is this

tring(summary(mro)$coef[,3])
[1] 6.96900610837729, -2.38577388980627, 4.403441093625,
0.388556365853674
 

I'd really like the NA to not be omitted, but cast to 0 (I am going to
load this into a DB)

So I'd like to see
[1] 6.96900610837729, -2.38577388980627, 4.403441093625, 0,
0.388556365853674


I was able to do that with the coefficients via

mro$coef[is.na(mro$coef)] <- 0
mro$coef

I can do this in a brute-force way by getting a vector of aliased
columns and iterating through it, but was hoping for a more elegant
solution.
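
One way this could be done without iterating, as a sketch under assumptions: the coefficient matrix returned by summary() drops aliased terms, while coef() on the fitted model keeps them as NA, so the names of coef() can serve as a template. The data below are made up purely so the example runs (x1, x2, flag and y are hypothetical columns, with flag constant so that its coefficient is aliased); t_kept and t_all are illustrative names.

```r
set.seed(42)                         # only so the made-up data are repeatable

# Made-up data standing in for wData; `flag` is constant, hence aliased.
wData <- data.frame(x1 = rnorm(30), x2 = rnorm(30))
wData$flag <- 1
wData$y <- 1 + 2 * wData$x1 - wData$x2 + rnorm(30)

mro <- lm(y ~ x1 + flag + x2, data = wData)

# t values for the terms that could be estimated (aliased rows are omitted).
t_kept <- summary(mro)$coefficients[, "t value"]

# Zero-filled vector with one slot per model term, in model order,
# then fill in the t values that exist, matching by name.
t_all <- setNames(numeric(length(coef(mro))), names(coef(mro)))
t_all[names(t_kept)] <- t_kept

t_all                                # `flag` stays at 0, the others are filled in
```

Matching by name rather than by position keeps the zero in the slot of the aliased term, wherever that term sits in the formula.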

Thx!



[R] Managing output

2009-08-26 Thread Noah Silverman


Hi,


Is there a way to build up a vector, item by item?  In Perl, we can
push an item onto an array.  How can we do this in R?
I have a loop that generates values as it goes.  I want to end up with a
vector of all the loop results.


In Perl it would be:

for(item in list){
result <- 2*item^2 (Or whatever formula, this is just a pseudo example)
Push(@result_list, result)  (This is the step I can't do in R)
}


Thanks!


Re: [R] Managing output

2009-08-26 Thread Erik Iverson

How about ?append? But R is vectorized, so why not just

result_list <- 2*item^2 , or for more complicated tasks, the
apply/sapply/lapply/mapply family of functions?

In general, the for loop construct can be avoided so you don't have to think
about messy indexing.  What exactly are you trying to do?
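
A minimal sketch of the two suggestions above, assuming the inputs are already in a numeric vector. The names items and score_one are hypothetical, and score_one only stands in for whatever per-item calculation is really needed.

```r
items <- c(1, 2, 3, 4)               # stand-in for the poster's `list`

# Vectorised: the arithmetic applies to every element at once, no loop.
result_list <- 2 * items^2

# For logic that does not vectorise easily, wrap one item's work in a
# function and let sapply() collect the results into a vector.
score_one <- function(item) {
  if (item %% 2 == 0) 2 * item^2 else item
}
scores <- sapply(items, score_one)
```

In both cases the result vector is built in one step, so there is nothing to push onto it.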


Re: [R] Managing output

2009-08-26 Thread Noah Silverman

The actual process is REALLY complicated; I just gave a simple example
for the list.

I have a  lot of steps to process the data before I get a final 
score.  (nested loops, conditional statements, etc.)

Right now, I'm just printing the scores to the screen.  I'd like to 
accumulate them in some kind of data structure so I can either write 
them to disk or graph them.

-N


Re: [R] changing equal values on matrix by same random number

2009-08-26 Thread William Dunlap



Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com  

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of milton ruser
 Sent: Wednesday, August 26, 2009 9:54 AM
 To: r-help@r-project.org
 Subject: [R] changing equal values on matrix by same random number
 
 Dear all,
 
 I have about 30,000 matrices (512x512), with values from 1 to N.
 Each value in a matrix represents a habitat patch in my
 matrix (i.e. my landscape). Non-habitat cells are stored as zero.
 Now I need to replace each of the values 1 to N with its own random
 number, the same for every cell that shares that value.
 
 Just suppose my matrix is:
 mymat <- matrix(c(1,1,1,0,0,0,0,0,0,0,0,
 0,0,0,0,2,2,2,0,0,0,0,
 0,0,0,0,2,2,2,0,0,0,0,
 3,3,0,0,0,0,0,0,0,4,4,
 3,3,0,0,0,0,0,0,0,0,0), nrow=5)
 
 I would like all cells with 1 to become
 runif(1,min=0.4, max=0.7), and cells with 2
 to be replaced by another runif(...).
 
 I can do it using for(), but it is very time-consuming.
 Any help is welcome.

Is the following what you want?  It uses the small
integers in mymat as indices into a vector of N random
numbers.

> mymat
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
[1,]    1    0    0    2    0    0    0    0    0     3     0
[2,]    1    0    0    2    0    2    0    0    0     0     0
[3,]    1    0    0    2    0    2    0    0    4     0     0
[4,]    0    0    0    0    0    2    3    0    4     0     0
[5,]    0    0    0    0    0    0    3    0    3     0     0
> range(unique(mymat))
[1] 0 4
> f <- function(mat){
    N <- max(mat)
    tmp <- mat[mat>0]
    tmp <- runif(N, 0.4, 0.7)[tmp]
    mat[mat>0] <- tmp
    mat
  }
> f(mymat)
          [,1] [,2] [,3]      [,4] [,5]      [,6]      [,7] [,8]      [,9]
[1,] 0.5361427    0    0 0.6727491    0 0.0000000 0.0000000    0 0.0000000
[2,] 0.5361427    0    0 0.6727491    0 0.6727491 0.0000000    0 0.0000000
[3,] 0.5361427    0    0 0.6727491    0 0.6727491 0.0000000    0 0.5357566
[4,] 0.0000000    0    0 0.0000000    0 0.6727491 0.6417832    0 0.5357566
[5,] 0.0000000    0    0 0.0000000    0 0.0000000 0.6417832    0 0.6417832
         [,10] [,11]
[1,] 0.6417832     0
[2,] 0.0000000     0
[3,] 0.0000000     0
[4,] 0.0000000     0
[5,] 0.0000000     0

The 3 lines involving tmp could be collapsed into one, making
things more obscure:
mat[mat>0] <- runif(N, 0.4, 0.7)[mat[mat>0]]
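
The lookup-by-index idea can be written out as a standalone script, sketched here with a few assumptions: the function name relabel_patches and the arguments lo and hi are hypothetical, the labels are taken to be the integers 1..N, and set.seed() is there only so the run is repeatable.

```r
# The 5 x 11 example matrix from the thread (0 = non-habitat, 1..4 = patch labels).
mymat <- matrix(c(1,1,1,0,0,0,0,0,0,0,0,
                  0,0,0,0,2,2,2,0,0,0,0,
                  0,0,0,0,2,2,2,0,0,0,0,
                  3,3,0,0,0,0,0,0,0,4,4,
                  3,3,0,0,0,0,0,0,0,0,0), nrow = 5)

# Replace every patch label k with one shared random value for that label.
relabel_patches <- function(mat, lo = 0.4, hi = 0.7) {
  n_labels <- max(mat)                  # labels are assumed to be 1..N
  draws    <- runif(n_labels, lo, hi)   # one draw per label
  habitat  <- mat > 0                   # zeros stay untouched
  mat[habitat] <- draws[mat[habitat]]   # label k picks up draws[k]
  mat
}

set.seed(1)
out <- relabel_patches(mymat)
```

The work is a single indexing operation over the whole matrix, so the cost does not grow with a loop over the number of labels.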

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com 

 cheers
 
 milton
 

Re: [R] Managing output

2009-08-26 Thread Duncan Murdoch


On 26/08/2009 3:20 PM, Noah Silverman wrote:

Hi,


Is there a way to build up a vector, item by item?  In Perl, we can
push an item onto an array.  How can we do this in R?
I have a loop that generates values as it goes.  I want to end up with a
vector of all the loop results.


In Perl it would be:

for(item in list){
 result <- 2*item^2 (Or whatever formula, this is just a pseudo example)
 Push(@result_list, result)  (This is the step I can't do in R)
}


Thanks!


Use the c() function, e.g.

result <- numeric(0)
for (item in list) {
  result <- 2*item^2
  result_list <- c(result_list, result)
}

If the list is long, this is an extremely inefficient style of R 
programming, but for a short list it should be fine.
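
For longer inputs, one common alternative is to allocate the result vector once and fill it by position. This is a sketch, assuming the number of results is known before the loop starts; items is a hypothetical stand-in for the poster's list, and grown is included only to show that both styles give the same answer.

```r
items <- c(1, 2, 3, 4)               # stand-in for the poster's `list`

# Preallocate the full result vector, then assign into each slot.
result_list <- numeric(length(items))
for (i in seq_along(items)) {
  result_list[i] <- 2 * items[i]^2
}

# The growing style, for comparison: c() copies the vector on every pass.
grown <- numeric(0)
for (item in items) {
  grown <- c(grown, 2 * item^2)
}

identical(result_list, grown)        # TRUE: both hold 2, 8, 18, 32
```

Preallocating avoids the repeated copying that makes the growing style slow when the loop runs many times.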


Duncan Murdoch


[R] contourLines() documentation

2009-08-26 Thread Derek Lacoursiere


Hello, 

I have searched for documentation on the function contourLines's algorithm
but cannot find a thing.  I am about to submit a paper to a journal but
cannot yet do so because I need to provide some reference for this function. 
Does anyone know what algorithm is used for this function?

Thanks,

Derek Lacoursiere
-- 
View this message in context: 
http://www.nabble.com/contourLines%28%29-documentation-tp25151467p25151467.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Managing output

2009-08-26 Thread Rolf Turner



On 27/08/2009, at 8:00 AM, Duncan Murdoch wrote:


On 26/08/2009 3:20 PM, Noah Silverman wrote:

Hi,


Is there a way to build up a vector, item by item?  In Perl, we can
push an item onto an array.  How can we do this in R?
I have a loop that generates values as it goes.  I want to end up with a
vector of all the loop results.

In Perl it would be:

for(item in list){
 result <- 2*item^2 (Or whatever formula, this is just a pseudo example)
 Push(@result_list, result)  (This is the step I can't do in R)
}


Thanks!


Use the c() function, e.g.

result <- numeric(0)


That should be result_list <- numeric(0)!!! :-)

cheers,

Rolf


for (item in list) {
   result <- 2*item^2
   result_list <- c(result_list, result)
}

If the list is long, this is an extremely inefficient style of R
programming, but for a short list it should be fine.

Duncan Murdoch


[R] Plotting to stdout

2009-08-26 Thread Oliver Bandel

Hello,

is there a way to write the result of a plot
to stdout? I mean even a binary thingy like
a png-file, written to stdout?!
I tried with the file-argument of png() and jpeg(),
but did not get working results.


Ciao,
   Oliver
