Re: [R] how to analysisi spectrum of a dataset with NA value

2013-12-18 Thread Jeff Newmiller
This behavior is typical for fast Fourier Transform algorithms. You need to 
decide what theory you want to use in dealing with your missing data, and apply 
that yourself. There are some packages that can help you (e.g. Lomb-Scargle or 
discrete slow Fourier transform). Search with package sos or use the 
RSiteSearch function.

Please read the Posting Guide regarding not using HTML formatted email... your 
code fragment was unreadable as is common with non-plain text.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Jie Tang totang...@gmail.com wrote:
hi R users

I have a large 1D dataset and some of them is NA value .

I found I cound get the spectrum by such a command.

ua=c��10��30 ��40��50��NA��
spectrum(ua)
 and I could not use na.rm just like mean or sd function

How could I get the spectrum of ua ?

 thank you .

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] multicore and mclapply problem in calculation server

2013-12-18 Thread Juan Antonio Balbuena

   Hello
   I am using package multicore for parallel computing in a Altix UltraViolet
   1000 server with 64 CPUs and 960 GB of RAM memory. Access is managed by
   means of a SGE queue system. This is the first time I am using parallel
   computing and my experience with supercomputers is quite limited. So any
   help will be much, much appreciated.
   My experiment consists of a number of runs (N.runs) each involving a number
   of permutations (N. perms). (An excerpt of the code is included below.) The
   permutations are very time consuming and I am using mclapply to distribute
   the job among a given number of cpus (usually 12 to 24). The problem is that
   the system administrators notice that threads keep increasing as the program
   is executed to the point that they compromise the functioning of the whole
   system and have to abort the job.
   I have tried to specify in the bash file sent to the queue RAM limits (using
   ulimit) and the number of cpus to be used but it doesn't help.
   An example of the code I am using may be
   #LOAD LIBRARIES NEEDED:
   library(ape)
   library(phytools)
   library(phangorn)
   library(multicore)
   #
    SOME 100 LINES HERE DEVOTED TO DEFINE FUNCTIONS -- OMITTED FOR BREVITY
   ###
   body  -  function  (N.perm) {   #MAIN BODY = 1 RUN -- IT PUTS
   FUNCTIONS TOGETHER
 HP - HPrandomizer(NH,NP,N.assoc)
 linH = readLines(conH, n= N.perm)
 linP = readLines(conP, n= N.perm)
 stat.matrix - matrix((rep(NA, 6*N.perm)), ,6)
 #
 wrapper - function (x) {   # THIS FUCNTION IS SUPPOSED TO BE
   PARALLELIZED (SEE BELOW)
   treeH - read.tree(text=linH[x])
   treeP - read.tree(text=linP[x])
   mrcaL - MRCALink.simul (treeH, treeP, HP)
   stat.matrix[x,] - three.stat(mrcaL)
   }
 x - c(1: length(linH))#NOTE THAT linH IS SUPPOSED TO BE =
   N.perm
 stat.matrix - do.call(rbind, mclapply(x, wrapper, mc.cores= 6)) # USE OF
   MCLAPPLY
 Pstat - apply(stat.matrix, 2, rank)[1,]/length(linH)
 write(c(stat.matrix[1,], Pstat), file =
   /scratch/ba/balbuena/30H30P40.txt, sep =\t, append =TRUE,ncolumns=12)
   }
   ptm - proc.time()  # THE PROGRAM STARTS HERE
   NH= 30
   NP= 30
   N.assoc= 40
   N.runs = 1000
   N.perm = 999
   conH = file(/scratch/ba/balbuena/1MH_30.tre, open=rt) # READS TEXT DATA
   FROM EXTERNAL FILE
   conP  = file(/scratch/ba/balbuena/1MP_30.tre, open=rt) #
   
   replicate (N.runs, body(N.perm))  # LOOPING body NUMBER OF RUNS
   close(conH)
   close(conP)
   proc.time() - ptm
   Than you very much for your attention
   Juan A. Balbuena

   --

   Dr. Juan A. Balbuena
   Marine Zoology Unit
   Cavanilles Institute of Biodiversity and Evolutionary Biology
   University of
   Valencia
   [1]http://www.uv.es/~balbuena
   P.O. Box 22085
   [2]http://www.uv.es/cavanilles/zoomarin/index.htm
   46071 Valencia, Spain
   [3]http://cetus.uv.es/mullpardb/index.html
   e-mail: [4]j.a.balbu...@uv.estel. +34 963 543 658fax +34 963 543 733
   
   NOTE! For shipments by EXPRESS COURIER use the following street address:
   C/ Catedrático José Beltrán 2, 46980 Paterna (Valencia), Spain.
   

References

   1. http://www.uv.es/%7Ebalbuena
   2. http://www.uv.es/cavanilles/zoomarin/index.htm
   3. http://cetus.uv.es/mullpardb/index.html
   4. mailto:j.a.balbu...@uv.es
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Tinn-R user list on Google groups

2013-12-18 Thread Jose Claudio Faria
Dears Tinn-R users,

Below a link to the new user list on Google groups:
https://groups.google.com/forum/#!forum/tinn-r

All the best,
-- 
///\\\///\\\///\\\///\\\///\\\///\\\///\\\///\\\
Jose Claudio Faria
Estatistica
UESC/DCET/Brasil
joseclaudio.faria at gmail.com
Telefones:
55(73)3680.5545 - UESC
55(73)9100.7351 - TIM
55(73)8817.6159 - OI
///\\\///\\\///\\\///\\\///\\\///\\\///\\\///\\\

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ifelse statement with two vectors of different length

2013-12-18 Thread arun
Hi,
Please show a reproducible example.

countrydiff - c(Albania, Algeria, Belarus, Canada, Germany)
long_df - data.frame(country_name = c(Algeria, Guyana, Hungary, 
Algeria, Canada, Iran, Iran, Norway,Uruguay, Zimbabwe) )
 ifelse(long_df$country_name %in% countrydiff,1,0)
# [1] 1 0 0 1 1 0 0 0 0 0
#or
1*(long_df$country_name %in% countrydiff)
# [1] 1 0 0 1 1 0 0 0 0 0
A.K.




Dear list-members, 

I have the following problem: I have a vector (countrydiff) with
 length 72 and another vector (long_df$country_name) which is about 
12000 long. Basically what I want to do is to if the factor level (or 
string name) in long_df$country_name appears on the countrydiff, then 
long_df$povdat should be equal to 1, if it does not appear on the 
countrydiff vector then long_df$povdat should be equal to zero. I have 
tried different combinations and read some. The following code should in
 my mind do it, but it doesn’t: 

long_df$povdat-ifelse(long_df$country_name == countrydiff, 1, 0) 

long_df$povdat-ifelse(long_df$country_name %in% countrydiff, 1, 0) 

Additional information: the factor vector countrydiff contains 
unique country names (Albania, Zimbabwe etc.), whereas 
long_df$country_name also contains country names albeit not unique since
 it is in longform. The unique names that appear in long_df$country_name
 is around 200. 


Any suggestions? 
Thanks in advance. 

Best 
Adel

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ifelse statement with two vectors of different length

2013-12-18 Thread Adel

Dear list-members,

I have the following problem: I have a vector (countrydiff) with length 72
and another vector (long_df$country_name) which is about 12000 long.
Basically what I want to do is to if the factor level (or string name) in
long_df$country_name appears on the countrydiff, then long_df$povdat should
be equal to 1, if it does not appear on the countrydiff vector then
long_df$povdat should be equal to zero. I have tried different combinations
and read some. The following code should in my mind do it, but it doesn’t:

long_df$povdat-ifelse(long_df$country_name == countrydiff, 1, 0)

long_df$povdat-ifelse(long_df$country_name %in% countrydiff, 1, 0)

Additional information: the factor vector countrydiff contains unique
country names (Albania, Zimbabwe etc.), whereas long_df$country_name also
contains country names albeit not unique since it is in longform. The unique
names that appear in long_df$country_name is around 200.


Any suggestions?
Thanks in advance.

Best
Adel




--
View this message in context: 
http://r.789695.n4.nabble.com/ifelse-statement-with-two-vectors-of-different-length-tp4682401.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] calculating power function

2013-12-18 Thread mb
Hello,

i know there are several functions on r to calculate the power of a test.
f.e the functions of pwr packet. In this packet the use the power function
(1-\beta) to calculate. My question is how we get this function for anova
(pwr.anova.test)?
I know that 
1-\beta=P(H1|H1) = P(FF_[1-\alpha,k-1,N-k]) = 1-P(FF_[1-\alpha,k-1,N-k]) =
1-P(MQE/MQRF_[1-\alpha,k-1,N-k]) = ???? = 
1-P(F[k-1,N-k]F_[1-\alpha,k-1,N-k]-k*n*f^2) = 
p.body = pf(qf(\alpha,k-1,(n-1)k,low=F),k-1,(n-1)k,k*n*f^2,low=F)
With f^2 the effect
and k*n*f^2 ncp
but how does the function which has a F[k-1,N-k] distribution look like? It
has to be the (variance between groups/df)/(variance within
groups/df)-k*n*f^2, but how can I write these so that I see the
dirstribution.
I have never found the exact way how to get to this formula. So may you can
help.

Thanks a lot.



--
View this message in context: 
http://r.789695.n4.nabble.com/calculating-power-function-tp4682396.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 3D Surface plot

2013-12-18 Thread Simon Delay-Fortier
Hi everyone,

I am a very new user of r (doing most of my previous stuff in vba). I am now 
mandated to draw a 3-d surface of a mine pit hole. I have all the location 
points (around 5000 points) of the pit in a CSV file under 3 column X, Y  Z. 
However, going from page to page on the web, I could not figure out how to code 
the figure. Something I noted is that most of the time X and Y  have to be in 
ascending order but the data I have are not (since they are finite points of 
the pit, if X gets  in ascending order, Y and Z are not). Also with the data, 
sometimes 2 rows fallowing each ohter have the same X and Y value with a 
different Z. That would be nice if things could be modeled with rgl function in 
order to have a rotating pit, but I would be glad to only have a 3d surface to 
start!

Hope you can help.
Sincerly,

Simon Delay-Fortier
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA repeated mesures

2013-12-18 Thread Mª Teresa Martinez Soriano
Hi to everyone, I am tring to make a Anova with repeated measures,my data set 
looks like:
participantes - c(1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4,  
 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8)
grupo - factor(c(rep(A, 8), rep(B, 8), rep(C, 8)))valor - c(1, 2, 4, 1, 
1, 2, 2, 3, 3, 4, 4, 2, 3, 4, 4, 3, 4, 5, 3, 5, 5, 3,  4, 6)df2 - 
data.frame(participantes, grupo, valor)
I want to find if there is differences in the grupo  in each participant. I 
don't know how to find this p-value for each participant, could you help me, 
please??Thanks a lot   
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read .slk file

2013-12-18 Thread PIKAL Petr
Hi

Extension does not specify file format. You can rename any file with any 
extension without changing its nature. However slk stays for symbolic link and 
therefore it just brings actual file to Excel. 

Maybe you could start to play with

http://stat.ethz.ch/R-manual/R-devel/library/base/html/files.html

Regards
Petr

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Santosh
 Sent: Tuesday, December 17, 2013 12:11 AM
 To: r-help
 Subject: [R] read .slk file
 
 Dear Rxperts..
 
 I recently received a data file with the extension .slk. If I save
 the file as MS Excel file, I am able to read in R without issues.  Is
 it possible to read this .slk file without converting into another R-
 readable data format?
 
 Regards,
 Santosh
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ifelse statement with two vectors of different length

2013-12-18 Thread Adel

Dear Arun

Thanks for your reply, it made me realize that the problem was not in the
code but in the levels() of the factors. Some countries had some extra
spacing which made the ifelse() function not work. So if I modify your code
(added space to countrydiff), it will then look something like this:

countrydiff - c(Albania, Algeria, Belarus, Canada   ,
Germany   ) 
long_df - data.frame(country_name = c(Algeria, Guyana, Hungary,
Algeria, Canada, Iran, Iran, Norway,Uruguay, Zimbabwe) ) 

I had to use the gsub to fix this first.


Interestingly, the setdiff() function did not react on spacing difference
which I used before coming to the ifelse statement and therefore I did not
react on this in the first place

#no reaction from R on spacing diff.
setdiff(countrydiff, long_df$country_name)


Nevertheless, thanks again for being helpful!
Adel




--
View this message in context: 
http://r.789695.n4.nabble.com/ifelse-statement-with-two-vectors-of-different-length-tp4682401p4682403.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ifelse statement with two vectors of different length

2013-12-18 Thread arun
Hi Adel,

If the problem is the spacing, then
library(stringr)
1*(long_df$country_name %in% str_trim(countrydiff))
# [1] 1 0 0 1 1 0 0 0 0 0
A.K.


Dear Arun 

Thanks for your reply, it made me realize that the problem was 
not in the code but in the levels() of the factors. Some countries had 
some extra spacing which made the ifelse() function not work. So if I 
modify your code (added space to countrydiff), it will then look 
something like this: 

countrydiff - c(Albania    , Algeria    , Belarus    , Canada   , 
Germany   ) 
long_df - data.frame(country_name = c(Algeria, Guyana, 
Hungary, Algeria, Canada, Iran, Iran, Norway,Uruguay, 
Zimbabwe) ) 

I had to use the gsub to fix this first. 


Interestingly, the setdiff() function did not react on 
spacing difference which I used before coming to the ifelse statement 
and therefore I did not react on this in the first place 

#no reaction from R on spacing diff. 
setdiff(countrydiff, long_df$country_name) 


Nevertheless, thanks again for being helpful! 
Adel 


On Wednesday, December 18, 2013 9:58 AM, Adel adel.da...@sociology.gu.se 
wrote:

Dear list-members,

I have the following problem: I have a vector (countrydiff) with length 72
and another vector (long_df$country_name) which is about 12000 long.
Basically what I want to do is to if the factor level (or string name) in
long_df$country_name appears on the countrydiff, then long_df$povdat should
be equal to 1, if it does not appear on the countrydiff vector then
long_df$povdat should be equal to zero. I have tried different combinations
and read some. The following code should in my mind do it, but it doesn’t:

long_df$povdat-ifelse(long_df$country_name == countrydiff, 1, 0)

long_df$povdat-ifelse(long_df$country_name %in% countrydiff, 1, 0)

Additional information: the factor vector countrydiff contains unique
country names (Albania, Zimbabwe etc.), whereas long_df$country_name also
contains country names albeit not unique since it is in longform. The unique
names that appear in long_df$country_name is around 200.


Any suggestions?
Thanks in advance.

Best
Adel




--
View this message in context: 
http://r.789695.n4.nabble.com/ifelse-statement-with-two-vectors-of-different-length-tp4682401.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 3D Surface Plot

2013-12-18 Thread Simon Delay-Fortier
Hi everyone,I am a very new user of r. I am now mandated to draw a 3-d surface 
(and possibly rotating) of a mine pit hole. I have all the location points 
(around 5000 points) of the pit in a CSV file under 3 column X, Y  Z. The 
fllowing gives a 3d scatter but i would like to have a surface.

attach(EASTPITCREST)
par(bg=grey6, col.lab=white, col.axis=white, col.main=white, 
col.sub=white)
scatterplot3d(X, Y, Z, color = red, pch=19, main=Mine and Drilling Map)

I noted is that most of the time X and Y  have to be in ascending order but the 
data I have are not (since they are finite points of the pit, if X gets  in 
ascending order, Y and Z are not). Also with the data, sometimes 2 rows 
fallowing each ohter have the same X and Y value with a different Z.

Hope you can help.
Sincerly,

Simon Delay-Fortier
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] 3D Surface plot

2013-12-18 Thread Barry Rowlingson
On Wed, Dec 18, 2013 at 2:52 PM, Simon Delay-Fortier
simon.delay-fort...@mail.mcgill.ca wrote:
 Hi everyone,

 I am a very new user of r (doing most of my previous stuff in vba). I am now 
 mandated to draw a 3-d surface of a mine pit hole. I have all the location 
 points (around 5000 points) of the pit in a CSV file under 3 column X, Y  Z. 
 However, going from page to page on the web, I could not figure out how to 
 code the figure. Something I noted is that most of the time X and Y  have to 
 be in ascending order but the data I have are not (since they are finite 
 points of the pit, if X gets  in ascending order, Y and Z are not). Also with 
 the data, sometimes 2 rows fallowing each ohter have the same X and Y value 
 with a different Z. That would be nice if things could be modeled with rgl 
 function in order to have a rotating pit, but I would be glad to only have a 
 3d surface to start!

 How 3d is your surface here? Because there's 3d and there's
what's known as 2.5d

 A 3d surface could be something like the surface of a sphere, or a
cave, or an overhanging cliff, whereas a 2.5d surface is a
single-valued function of x and y.

 If you have a true 3d surface then that's beyond me and involves
working out surface normals and all sorts of other clever stuff which
I don't know about.

 If its really 2.5d then yes, you do need to compute the values of
your surface on a grid in order to display it using rgl's surface3d
function. You can do this with some 2d interpolation code, such as
kriging or inverse distance weighting, using assorted packages such as
automap (which makes it easy but beware) or gstat.

 Since this seems to be a real-world geography problem, you might haul
over to the r-sig-geo mailing list and as there where the spatial guys
hang out.

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help using mapply to run multiple models

2013-12-18 Thread Simon Kiss
Thanks! that works, more or less. Although the wonky behaviour of mapply that 
David pointed out is irritating. I tried deleting the $call item from the 
models produced and passing them to stargazer for reporting the results, but 
stargazer won't recognize the results even though the class is explicitly glm 
lm.  
Does anyone know why mapply produces such weird results?
On 2013-12-18, at 3:29 AM, Dennis Murphy djmu...@gmail.com wrote:

 Hi:
 
 Here's a way to generate a list of model objects. Once you have the
 list, you can write or call functions to extract useful pieces of
 information from each model object and use lapply() to call each list
 component recursively.
 
 sample.df-data.frame(var1=rbinom(50, size=1, prob=0.5),
  var2=rbinom(50, size=2, prob=0.5),
  var3=rbinom(50, size=3, prob=0.5),
  var4=rbinom(50, size=2, prob=0.5),
  var5=rbinom(50, size=2, prob=0.5))
 
 # vector of x-variable names
 xvars - names(sample.df)[-1]
 
 # function to paste a variable x into a formula object and
 # then pass it to glm()
 f - function(x)
 {
form - as.formula(paste(var1, x, sep =  ~ ))
glm(form, data = sample.df)
 }
 
 # Apply the function f to each variable in xvars
 modlist - lapply(xvars, f)
 
 To give you an idea of some of the things you can do with the list:
 
 sapply(modlist, class)# return class of each component
 lapply(modlist, summary)   # return the summary of each model
 
 # combine the model coefficients into a two-column matrix
 do.call(rbind, lapply(modlist, coef))
 
 
 You'd probably want to rename the second column since the slopes are
 associated with different x variables.
 
 Dennis
 
 On Tue, Dec 17, 2013 at 5:53 PM, Simon Kiss sjk...@gmail.com wrote:
 I think I'm missing something.  I have a data frame that looks below.
 sample.df-data.frame(var1=rbinom(50, size=1, prob=0.5), var2=rbinom(50, 
 size=2, prob=0.5), var3=rbinom(50, size=3, prob=0.5), var4=rbinom(50, 
 size=2, prob=0.5), var5=rbinom(50, size=2, prob=0.5))
 
 I'd like to run a series of univariate general linear models where var1 is 
 always the dependent variable and each of the other variables is the 
 independent. Then I'd like to summarize each in a table.
 I've tried :
 
 sample.formula=list(var1~var2, var1 ~var3, var1 ~var4, var1~var5)
 mapply(glm, formula=sample.formula, data=list(sample.df), family='binomial')
 
 And that works pretty well, except, I'm left with a matrix that contains all 
 the information I need. I can't figure out how to use summary() properly on 
 this information to usefully report that information.
 
 Thank you for any suggestions.
 
 *
 Simon J. Kiss, PhD
 Assistant Professor, Wilfrid Laurier University
 73 George Street
 Brantford, Ontario, Canada
 N3T 2C9
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

*
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help using mapply to run multiple models

2013-12-18 Thread Simon Kiss
Dennis, how would your function be modified to allow it to be more flexible in 
future. 
I'm thinking like:
 f - function(x='Dependent variable', y='List of Independent Variables', 
 z='Data Frame')
 {
form - as.formula(paste(x, y, sep =  ~ ))
glm(form, data =z)
 }

I tried that then using 
modlist - lapply(xvars, f), but it didn't work. 

On 2013-12-18, at 3:29 AM, Dennis Murphy djmu...@gmail.com wrote:

 Hi:
 
 Here's a way to generate a list of model objects. Once you have the
 list, you can write or call functions to extract useful pieces of
 information from each model object and use lapply() to call each list
 component recursively.
 
 sample.df-data.frame(var1=rbinom(50, size=1, prob=0.5),
  var2=rbinom(50, size=2, prob=0.5),
  var3=rbinom(50, size=3, prob=0.5),
  var4=rbinom(50, size=2, prob=0.5),
  var5=rbinom(50, size=2, prob=0.5))
 
 # vector of x-variable names
 xvars - names(sample.df)[-1]
 
 # function to paste a variable x into a formula object and
 # then pass it to glm()
 f - function(x)
 {
form - as.formula(paste(var1, x, sep =  ~ ))
glm(form, data = sample.df)
 }
 
 # Apply the function f to each variable in xvars
 modlist - lapply(xvars, f)
 
 To give you an idea of some of the things you can do with the list:
 
 sapply(modlist, class)# return class of each component
 lapply(modlist, summary)   # return the summary of each model
 
 # combine the model coefficients into a two-column matrix
 do.call(rbind, lapply(modlist, coef))
 
 
 You'd probably want to rename the second column since the slopes are
 associated with different x variables.
 
 Dennis
 
 On Tue, Dec 17, 2013 at 5:53 PM, Simon Kiss sjk...@gmail.com wrote:
 I think I'm missing something.  I have a data frame that looks below.
 sample.df-data.frame(var1=rbinom(50, size=1, prob=0.5), var2=rbinom(50, 
 size=2, prob=0.5), var3=rbinom(50, size=3, prob=0.5), var4=rbinom(50, 
 size=2, prob=0.5), var5=rbinom(50, size=2, prob=0.5))
 
 I'd like to run a series of univariate general linear models where var1 is 
 always the dependent variable and each of the other variables is the 
 independent. Then I'd like to summarize each in a table.
 I've tried :
 
 sample.formula=list(var1~var2, var1 ~var3, var1 ~var4, var1~var5)
 mapply(glm, formula=sample.formula, data=list(sample.df), family='binomial')
 
 And that works pretty well, except, I'm left with a matrix that contains all 
 the information I need. I can't figure out how to use summary() properly on 
 this information to usefully report that information.
 
 Thank you for any suggestions.
 
 *
 Simon J. Kiss, PhD
 Assistant Professor, Wilfrid Laurier University
 73 George Street
 Brantford, Ontario, Canada
 N3T 2C9
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

*
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 905 746 7606

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help using mapply to run multiple models

2013-12-18 Thread Bert Gunter
Folks:

1. Haven't closely followed the thread. I'm responding only to Simon's post.

2. ?formula ## Especially note the use of .

So just make an appropriately constructed data frame for the data
argument of glm:

## example

 df - data.frame(y=rnorm(9),x1=runif(9), x2=1:9)
 glm(y~.,data=df)
## y does not need to be in the data frame.

Another way to handle the OP is via substitute() or bquote(). I'll
leave that to thers to explain.

-- Bert




On Wed, Dec 18, 2013 at 9:11 AM, Simon Kiss sjk...@gmail.com wrote:
 Dennis, how would your function be modified to allow it to be more flexible 
 in future.
 I'm thinking like:
 f - function(x='Dependent variable', y='List of Independent Variables', 
 z='Data Frame')
 {
form - as.formula(paste(x, y, sep =  ~ ))
glm(form, data =z)
 }

 I tried that then using
 modlist - lapply(xvars, f), but it didn't work.

 On 2013-12-18, at 3:29 AM, Dennis Murphy djmu...@gmail.com wrote:

 Hi:

 Here's a way to generate a list of model objects. Once you have the
 list, you can write or call functions to extract useful pieces of
 information from each model object and use lapply() to call each list
 component recursively.

 sample.df-data.frame(var1=rbinom(50, size=1, prob=0.5),
  var2=rbinom(50, size=2, prob=0.5),
  var3=rbinom(50, size=3, prob=0.5),
  var4=rbinom(50, size=2, prob=0.5),
  var5=rbinom(50, size=2, prob=0.5))

 # vector of x-variable names
 xvars - names(sample.df)[-1]

 # function to paste a variable x into a formula object and
 # then pass it to glm()
 f - function(x)
 {
form - as.formula(paste(var1, x, sep =  ~ ))
glm(form, data = sample.df)
 }

 # Apply the function f to each variable in xvars
 modlist - lapply(xvars, f)

 To give you an idea of some of the things you can do with the list:

 sapply(modlist, class)# return class of each component
 lapply(modlist, summary)   # return the summary of each model

 # combine the model coefficients into a two-column matrix
 do.call(rbind, lapply(modlist, coef))


 You'd probably want to rename the second column since the slopes are
 associated with different x variables.

 Dennis

 On Tue, Dec 17, 2013 at 5:53 PM, Simon Kiss sjk...@gmail.com wrote:
 I think I'm missing something.  I have a data frame that looks below.
 sample.df-data.frame(var1=rbinom(50, size=1, prob=0.5), var2=rbinom(50, 
 size=2, prob=0.5), var3=rbinom(50, size=3, prob=0.5), var4=rbinom(50, 
 size=2, prob=0.5), var5=rbinom(50, size=2, prob=0.5))

 I'd like to run a series of univariate general linear models where var1 is 
 always the dependent variable and each of the other variables is the 
 independent. Then I'd like to summarize each in a table.
 I've tried :

 sample.formula=list(var1~var2, var1 ~var3, var1 ~var4, var1~var5)
 mapply(glm, formula=sample.formula, data=list(sample.df), family='binomial')

 And that works pretty well, except, I'm left with a matrix that contains 
 all the information I need. I can't figure out how to use summary() 
 properly on this information to usefully report that information.

 Thank you for any suggestions.

 *
 Simon J. Kiss, PhD
 Assistant Professor, Wilfrid Laurier University
 73 George Street
 Brantford, Ontario, Canada
 N3T 2C9

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 *
 Simon J. Kiss, PhD
 Assistant Professor, Wilfrid Laurier University
 73 George Street
 Brantford, Ontario, Canada
 N3T 2C9
 Cell: +1 905 746 7606

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to manipulate this data?

2013-12-18 Thread Nitisha jha
Hi,

I want my output to be like this:
 Value
BXR
abc

 DHH abc

 DHK abc def
 DSL def ghi
 DSM abc def ghi  DSS def ghi
 DST ghi

 DIW abc

 DIL abc ghi

My input dataset is this with colnames name and Value:

 name Value  abc BXR  abc DHH  abc DHK  def DHK  def DSL  ghi DSL  abc DSM
def DSM  ghi DSM  def DSS  ghi DSS  ghi DST  abc DIW  abc DIL  ghi DIL

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] estimating survival function from Cox model

2013-12-18 Thread Terry Therneau

The standard error of the curve cannot be extracted from the summary 
information you have.
The variance is made up of two terms, one of which is a sum over all the death times, of a 
quadratic term per death time.  That term involves the variance matrix of the Cox model 
coefficients, the target value for x (the curve you want to calculate) and the average 
value of x at that time in the data set from which the Cox model was created.
  Just like linear regression, the se are higher when you predict far from the center 
of the original data set.


Terry Therneau


On 12/18/2013 05:00 AM, r-help-requ...@r-project.org wrote:

Hi, I have some questions on how to estimate the survival function from a Cox 
model. I know how to do this in R using survfit().


But let's say the model was done is another software, and I was only given the estimate of baseline cumulative hazard 
A0(t=10) at the specified time t=10 (baseline cumulative hazard refers to when covariate 
X=0)and the beta estimate b for the covariate used in Cox model X.


So the survival function at time 10 for a given covariate value x can be 
calculated as:

A(t=10|X=x) = exp(b*x)*A0(t=10) where A is cumulative hazard when X=x
S(t=10|X=x) = exp(-A(t=10|X=x)) where S is survival function to be calculated

Now I want to calculate confidence interval for S(t=10|X=x). I think I need to 
calculate the CI for cumulative hazard A(t=10|X=x) first and then exponentiate 
it to get CI for S, based on the relationship S = exp(-A).

To get CI for A, I need to calculate the estimate of standard error of A. I 
know the other software can give me the standard error of A0, the baseline 
cumulative hazard. Based on the relationship A = exp(b*x)*A0, I guess I'll need 
the standard error for b. But how do I calculate the standard error for A based 
on standard errors for A0 and b?

Any insights would be greatly appreciated!

John


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to manipulate this data?

2013-12-18 Thread arun
Hi,

Try:
dat1- read.table(text=name Value
  abc BXR
  abc DHH
  abc DHK
  def DHK
  def DSL
  ghi DSL
  abc DSM
  def DSM
  ghi DSM
  def DSS
  ghi DSS
  ghi DST
  abc DIW
  abc DIL
  ghi DIL,sep=,header=TRUE,stringsAsFactors=FALSE) 


aggregate(name~Value,data=dat1,paste,collapse= )
#or
library(plyr)
 ddply(dat1,.(Value),summarize, name=lapply(list(Value),paste,collapse= ))
A.K.


Hi, 

I want my output to be like this: 
 Value 
BXR 
abc 

 DHH abc 

 DHK abc def 
 DSL def ghi 
 DSM abc def ghi  DSS def ghi 
 DST ghi 

 DIW abc 

 DIL abc ghi 

My input dataset is this with colnames name and Value: 

 name Value  abc BXR  abc DHH  abc DHK  def DHK  def DSL  ghi DSL  abc DSM 
def DSM  ghi DSM  def DSS  ghi DSS  ghi DST  abc DIW  abc DIL  ghi DIL 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Predicting response from fitted linear model with incomplete new sample data

2013-12-18 Thread Chris Wilkinson
I would like to predict a new response from a fitted linear model where the
new data is a single case with a missing value. My reading of the help on
predict() is inconclusive on whether this is possible.

Leaving out the missing value or setting it to NA both fail but differently,
see example code below.

 y - runif(50)
 x1 - rnorm(50)
 x2 - rnorm(50)
 dat - data.frame(y, x1, x2)
 mod - lm(y~.,data=dat)
 summary(mod)

Call:
lm(formula = y ~ ., data = dat)
Residuals:
 Min   1Q   Median   3Q  Max 
-0.50467 -0.28997  0.01457  0.27970  0.47791 
Coefficients:
Estimate Std. Error t value Pr(|t|)
(Intercept)  0.500980.04577  10.945  1.6e-14 ***
x1  -0.017620.04172  -0.4220.675
x2  -0.027530.04920  -0.5600.578
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3177 on 47 degrees of freedom
Multiple R-squared:  0.009301,  Adjusted R-squared:  -0.03286 
F-statistic: 0.2206 on 2 and 47 DF,  p-value: 0.8028

 predict(mod, newdata=data.frame(x1=0.1, x2=0.3))   #OK as expected
1 
0.4909624 

 predict(mod, newdata=data.frame(x1=0.1))  # x2 missing
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
object$xlevels) : 
  variable lengths differ (found for 'x2')
In addition: Warning message:
'newdata' had 1 row but variables found have 50 rows 
 predict(mod, newdata=data.frame(x1=0.1, x2=NA))   #x2=NA
Error: variable 'x2' was fitted with type numeric but type logical was
supplied


Thanks
Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plot() function: color transparency

2013-12-18 Thread capricy gao
I found all the color transparency was defined with character color, or rgb 
color. What if I have number code and still try to modify the transparency?

For example:

x=c(1:5)
 color=c(2,2,3,4,5)
 plot(x, col=color)
 plot(x, col=color,pch=20)

here I defined color by numbers, how can I modify the transparency?

Thanks a lot for any input!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Coxph convergence

2013-12-18 Thread Terry Therneau

I'll re-enter the fray.
The data set is an example where coxph is incorrect; due to round off error it 
is treating
a 5 column rank 3 matrix as if it were rank 4.  This of course results in 0 digits of 
precision.

  Immediate fix, for the user, is to add toler.chol= 1e-10 to their coxph 
call. It is
very likely that they will never ever have to change this value.

I will look into changing the default from its current value of 
.Machine$double.eps^.75.
However, I first have to check that this does not break anything else.  All 
epsilon
constants are a delicate dance between mutiple data sets, and anyone with long 
experience
in numerical anlysis will tell you that it is impossible to find constants that 
will work
for every data set.  This is true for linear models, logistic, Cox, ... you 
name it.

In summary:
I appreciate the example.
I'll add to my list of nasty problems.
I may be able to fix long term, and maybe not.  Changing the constant may break something 
else.

I've given a good short term fix.

Terry T.



On 12/18/2013 05:00 AM, r-help-requ...@r-project.org wrote:

Your comprehension of the issue seem to be entirely wrong. Between r11513 and r11516, 
some tuning of internal parmeters were done, so the process of finding the rank of a 
singular matrix no longer converges (within the time/tolerance implicitly specified). 
There are warnings issued, but then there are misc warnings before and after (and one 
gets desensitised about them). Also the nature of the problem, which is to 
test for possibility of interactions - or lacking thereof -

outcome ~ factor A + factor B + factor A x factor B

or just extra terms in outcome ~ factor A + factor B + ... as an exploration 
of auxiliary effects, more often than not extra terms won't make
any difference and the matrix involved just isn't the nicest to manipulate; it 
is in the nature of that kind of exploratory work.

Professor Therneau replied that it is possible to get the older convergent 
behaviour by manual tuning of some of the convergence criteria parameters; I 
have responded that while that is possible, often one is simultaneously 
exploring many models with many possible auxiliary effects (and lacking 
thereof), manual tuning for each is neither feasible nor appropriate; and we 
sort of left it at that.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot() function: color transparency

2013-12-18 Thread Duncan Murdoch

On 13-12-18 2:08 PM, capricy gao wrote:

I found all the color transparency was defined with character color, or rgb 
color. What if I have number code and still try to modify the transparency?

For example:


x=c(1:5)
color=c(2,2,3,4,5)
plot(x, col=color)
plot(x, col=color,pch=20)


here I defined color by numbers, how can I modify the transparency?


See the examples for ?palette.

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Predicting response from fitted linear model with incomplete new sample data

2013-12-18 Thread Rolf Turner



As far as I can discern, your question makes no sense at all.

Suppose you *know* that y = 2 + 3*x1 + 4*x2.

Now what should you predict when x1 = 6 (with x2 missing/unknown)?

See fortune(magic).

On 19/12/13 07:18, Chris Wilkinson wrote:

I would like to predict a new response from a fitted linear model where the
new data is a single case with a missing value. My reading of the help on
predict() is inconclusive on whether this is possible.

Leaving out the missing value or setting it to NA both fail but differently,
see example code below.


y - runif(50)
x1 - rnorm(50)
x2 - rnorm(50)
dat - data.frame(y, x1, x2)
mod - lm(y~.,data=dat)
summary(mod)

Call:
lm(formula = y ~ ., data = dat)
Residuals:
  Min   1Q   Median   3Q  Max
-0.50467 -0.28997  0.01457  0.27970  0.47791
Coefficients:
 Estimate Std. Error t value Pr(|t|)
(Intercept)  0.500980.04577  10.945  1.6e-14 ***
x1  -0.017620.04172  -0.4220.675
x2  -0.027530.04920  -0.5600.578
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.3177 on 47 degrees of freedom
Multiple R-squared:  0.009301,  Adjusted R-squared:  -0.03286
F-statistic: 0.2206 on 2 and 47 DF,  p-value: 0.8028


predict(mod, newdata=data.frame(x1=0.1, x2=0.3))   #OK as expected

 1
0.4909624


predict(mod, newdata=data.frame(x1=0.1))  # x2 missing

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev =
object$xlevels) :
   variable lengths differ (found for 'x2')
In addition: Warning message:
'newdata' had 1 row but variables found have 50 rows

predict(mod, newdata=data.frame(x1=0.1, x2=NA))   #x2=NA

Error: variable 'x2' was fitted with type numeric but type logical was
supplied


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using assign with mapply

2013-12-18 Thread Greg Snow
The take home message that you should be learning from your struggles
is to Not Use The 'assign' Function! and Do Not Use Global
Variables Like This.

R has lists (and environments) that make working with objects that are
associated with each other much simpler and fits better with the
functional programming style of R.

For example you can create a list from your data frame quickly and
easily with code like:

mydata - as.list(kkk$vals)
names(mydata) - kkk$vars

or

mydata - setNames( as.list(kkk$vals), kkk$vars )

then you will have you variables (with names) inside the list (mydata
in this example, but name it whatever you want)

This list can then be passed out of a function or otherwise used.

To access a specific variable by name you can do:

mydata$var1

or

mydata[['var2']]

or

with(mydata, var3)

but you can also do things like (and this is often a follow-up
question to questions like yours):

varname - 'var3'
mydata[[ varname ]]

and you can also use lapply and sapply to do the same action on every
variable in your list:

sapply( mydata, function(x) x + 5 )

instead of having to loop through a bunch of global variables.

And if you want to save or delete these, now you just save or delete
the entire list rather than having to loop through the set of global
variables.

If you tell us more about how you want to use these variables we can
give more suggestions, but the main point is that in the long run you
will be happier learning to use lists (and possibly environments) in
place of trying to create and work with global variables like you
asked about.




On Mon, Dec 16, 2013 at 8:55 AM, Julio Sergio Santana
julioser...@gmail.com wrote:
 Julio Sergio Santana juliosergio at gmail.com writes:


 I have a data frame whose first colum contains the names of the variables
 and whose second colum contains the values to assign to them:

: kkk - data.frame(vars=c(var1, var2, var3),
  vals=c(10, 20, 30), stringsAsFactors=F)


 For those interested in the problem this is how I solved the problem:


 I want to have something similar to:
 #
 #   var1 - 10
 #   var2 - 20
 #   var3 - 30

 my first trial was:

mapply(assign,  kkk$vars, kkk$vals)
 ## var1 var2 var3
 ## 10   20   30
 #

 This is, however, what I got:

var1
 ## Error: object 'var1' not found

 David Winsemius suggested me something similar to


mapply(assign,  kkk$vars, kkk$vals, MoreArgs = list(pos = 1))
 # or:
mapply(assign,  kkk$vars, kkk$vals, MoreArgs = list(envir = .GlobalEnv))

 var1
 ## [1] 10

 This almost works, but what if this construction is used inside a function?

example - function () {
   var1 - 250
   kkk - data.frame(vars=c(var1, var2, var3),
 vals=c(10, 20, 30), stringsAsFactors=F)
  mapply(assign,  kkk$vars, kkk$vals, MoreArgs = list(pos = 1))
  print (var2)
  print (var1)
}

example()
 ## [1] 20
 ## [1] 250

 var1, which was defined inside the function, isn't modified

 To fix this, I defined the function as follows:

example - function () {
  var1 - 250
  kkk - data.frame(vars=c(var1, var2, var3),
vals=c(10, 20, 30), stringsAsFactors=F)
  mapply(assign,  kkk$vars, kkk$vals,
 MoreArgs = list(pos = sys.frame(sys.nframe(
  # sys.nframe() is the number of the frame created inside the function
  # and sys.frame() establishes it as the one assign uses to set values
  print (var2)
  print (var1)
}

example()
 ## [1] 20
 ## [1] 10

 And the purpose is got

 Thanks,

   -Sergio.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help using mapply to run multiple models

2013-12-18 Thread William Dunlap
Try something like the following.  Because lm() evaluates many
of its arguments in nonstandard ways, f() manipulates the call
and then evaluates it in the frame from which f() was called.
It also puts that environment on the formula that it creates so
it can refer to variables in that environment.
f - function (responseName, predictorNames, data, ..., envir = 
parent.frame())
{
call - match.call()
call$formula - formula(envir = envir, paste(responseName, sep =  ~ ,
paste0(`, predictorNames, `, collapse =  + )))
call[[1]] - quote(glm) # 'f' - 'glm'
call$responseName - NULL # omit responseName=
call$predictorNames - NULL # omit 'predictorNames='
eval(call, envir = envir)
}
as in
z - lapply(list(c(hp,drat), c(cyl), c(am,gear)), 
FUN=function(preds)f(carb, preds, data=mtcars, family=poisson))
lapply(z, summary)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of Simon Kiss
 Sent: Wednesday, December 18, 2013 9:11 AM
 To: Dennis Murphy
 Cc: r-help@r-project.org
 Subject: Re: [R] Help using mapply to run multiple models
 
 Dennis, how would your function be modified to allow it to be more flexible 
 in future.
 I'm thinking like:
  f - function(x='Dependent variable', y='List of Independent Variables', 
  z='Data Frame')
  {
 form - as.formula(paste(x, y, sep =  ~ ))
 glm(form, data =z)
  }
 
 I tried that then using
 modlist - lapply(xvars, f), but it didn't work.
 
 On 2013-12-18, at 3:29 AM, Dennis Murphy djmu...@gmail.com wrote:
 
  Hi:
 
  Here's a way to generate a list of model objects. Once you have the
  list, you can write or call functions to extract useful pieces of
  information from each model object and use lapply() to call each list
  component recursively.
 
  sample.df-data.frame(var1=rbinom(50, size=1, prob=0.5),
   var2=rbinom(50, size=2, prob=0.5),
   var3=rbinom(50, size=3, prob=0.5),
   var4=rbinom(50, size=2, prob=0.5),
   var5=rbinom(50, size=2, prob=0.5))
 
  # vector of x-variable names
  xvars - names(sample.df)[-1]
 
  # function to paste a variable x into a formula object and
  # then pass it to glm()
  f - function(x)
  {
 form - as.formula(paste(var1, x, sep =  ~ ))
 glm(form, data = sample.df)
  }
 
  # Apply the function f to each variable in xvars
  modlist - lapply(xvars, f)
 
  To give you an idea of some of the things you can do with the list:
 
  sapply(modlist, class)# return class of each component
  lapply(modlist, summary)   # return the summary of each model
 
  # combine the model coefficients into a two-column matrix
  do.call(rbind, lapply(modlist, coef))
 
 
  You'd probably want to rename the second column since the slopes are
  associated with different x variables.
 
  Dennis
 
  On Tue, Dec 17, 2013 at 5:53 PM, Simon Kiss sjk...@gmail.com wrote:
  I think I'm missing something.  I have a data frame that looks below.
  sample.df-data.frame(var1=rbinom(50, size=1, prob=0.5), var2=rbinom(50, 
  size=2,
 prob=0.5), var3=rbinom(50, size=3, prob=0.5), var4=rbinom(50, size=2, 
 prob=0.5),
 var5=rbinom(50, size=2, prob=0.5))
 
  I'd like to run a series of univariate general linear models where var1 is 
  always the
 dependent variable and each of the other variables is the independent. Then 
 I'd like to
 summarize each in a table.
  I've tried :
 
  sample.formula=list(var1~var2, var1 ~var3, var1 ~var4, var1~var5)
  mapply(glm, formula=sample.formula, data=list(sample.df), 
  family='binomial')
 
  And that works pretty well, except, I'm left with a matrix that contains 
  all the
 information I need. I can't figure out how to use summary() properly on this 
 information
 to usefully report that information.
 
  Thank you for any suggestions.
 
  *
  Simon J. Kiss, PhD
  Assistant Professor, Wilfrid Laurier University
  73 George Street
  Brantford, Ontario, Canada
  N3T 2C9
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 *
 Simon J. Kiss, PhD
 Assistant Professor, Wilfrid Laurier University
 73 George Street
 Brantford, Ontario, Canada
 N3T 2C9
 Cell: +1 905 746 7606
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list

Re: [R] Exporting R graphics into Word without losing graph quality

2013-12-18 Thread Greg Snow
Another option to consider if your goal is to create a word file with
1 or more plots in it (possibly intermingled with text and other
output) is to use the knitr or pander packages (or odfWeave or sweave
or ...).  This way you can create a script (or template file) that
sets a couple of options up front (width, height, resolution, file
type) and creates the graphs (and possibly other output to be
included), run the script (and maybe the external program pandoc) and
you have a word document with the plots without needing to copy/paste
or import.

This becomes a real time saver when the client comes back to you and
says there was a typo in the data, the 13 on line 27 needs to be an 18
and can you rerun everything with that change?  (if something along
those lines has not happened to you yet, it will).

On Mon, Dec 16, 2013 at 9:33 PM, david hamer j.david.ha...@gmail.com wrote:
 Thanks to everyone for the helpful suggestions.   --   David.


 On Mon, Dec 16, 2013 at 7:23 PM, Steve Taylor steve.tay...@aut.ac.nzwrote:

  From: Duncan Murdoch...

  Don't use a bitmap format (png).
 I disagree.  Each vector format comes with its own problems.

  Don't produce your graph in one format (screen display), then convert to
  another (png).  Open the device in the format you want for the final
 file.
 Agreed.

  Use a vector format for output.
 Why?  Sure, that's good advice in the ideal (pdflatex) world, but not
 necessarily the best of advice for Word users.

  I don't know what kinds Word supports, but
  EPS or PDF would likely be best; if it can't read those, then Windows
 metafile
  (via windows() to open the device) would be best.
 None of these works well, if at all, in my experience with Word.

  Don't use Word.
 Some of us don't really have a choice.




 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problems with Installing R ('failure to expand shell folder constant userdocs)

2013-12-18 Thread Barry DeCicco
 
Hello,
 
I’m trying to install R 3.02 for windows on a Windows 7 machine (we were just 
upgraded from Windows XP).
It’s a networked machine in a work environment.  I have administrator rights.
Program files go on C; data is stored on D (it's a partitioned drive).
 
I get the message 'failure to expand shell folder constant userdocs
 
I searched, and found some stuff on the Microsoft website, which made me 
believe that  I needed to unlock a folder with my user name.  There are two, 
one in ‘C:\Users’, which has a padlock symbol on it, and another in 
‘D:\’m which doesn’t have a padlock icon.  Both had been set to read 
only; I removed that.  
 
I’ve not been able to find anything on the CRAN website about this.
 
Has anybody run into this problem before?
 
 
Thank you very much, 
 
Barry DeCicco
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot() function: color transparency

2013-12-18 Thread capricy gao
I checked as you suggested. However, I found that the number in those functions 
are the number of colors. In contrast, my number here means a specific color, 
for example, 2 in my code means red, 3 in my code means green





On Wednesday, December 18, 2013 1:18 PM, Duncan Murdoch 
murdoch.dun...@gmail.com wrote:
 
On 13-12-18 2:08 PM, capricy gao wrote:

 I found all the color transparency was defined with character color, or rgb 
 color. What if I have number code and still try to modify the transparency?

 For example:

 x=c(1:5)
 color=c(2,2,3,4,5)
 plot(x, col=color)
 plot(x, col=color,pch=20)

 here I defined color by numbers, how can I modify the transparency?

See the examples for ?palette.

Duncan Murdoch
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot() function: color transparency

2013-12-18 Thread Christoph Scherber
you will need to specify colours as RGB values and then set transparency 
via the alpha argument.


e.g.: color=rgb(0,0,0,alpha=0.3)

# will  give black (0,0,0) and a transparency of 30%.

Best wishes
Christoph


On 18/12/2013 23:23, capricy gao wrote:

I checked as you suggested. However, I found that the number in those functions are the number of 
colors. In contrast, my number here means a specific color, for example, 2 in my code means 
red, 3 in my code means green





On Wednesday, December 18, 2013 1:18 PM, Duncan Murdoch 
murdoch.dun...@gmail.com wrote:
  
On 13-12-18 2:08 PM, capricy gao wrote:



I found all the color transparency was defined with character color, or rgb 
color. What if I have number code and still try to modify the transparency?

For example:


x=c(1:5)
color=c(2,2,3,4,5)
plot(x, col=color)
plot(x, col=color,pch=20)

here I defined color by numbers, how can I modify the transparency?

See the examples for ?palette.

Duncan Murdoch
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Exporting R graphics into Word without losing graph quality

2013-12-18 Thread Kevin Wright
Another nice thing about your solution is that circles look like circles,
and not like diamonds (when viewed on screen).

Thanks.

Kevin Wright



On Mon, Dec 16, 2013 at 8:02 PM, Steve Taylor steve.tay...@aut.ac.nzwrote:

 Unfortunately the win.metafile() device does not support semi-transparent
 colours, which I like using.

 In my experience, the best way to get R graphics into Word is to use
 compressed high-resolution tiff, like this:

 word.tif = function(filename=Word_Figure_%03d.tif, zoom=4, width=17,
 height=10, pointsize=10, ...) {
   if (!grepl([.]ti[f]+$, filename, ignore.case=TRUE))
   filename = paste0(filename,.tif)
   tiff(filename=filename, compression=lzw, res=96*zoom,
width=width, height=height, units='cm', pointsize=pointsize, ...)
 }
 word.tif('test')
 plot(rnorm(100))
 dev.off()

 Now drag the file test.tif into your Word document.

 Sure, it's a bitmap format rather than a vector format, but the quality is
 excellent and the file sizes are still quite small.  None of the vector
 formats works as well as this.

 cheers,
 Steve

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Kevin Wright

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot() function: color transparency

2013-12-18 Thread Richard Kwock
As Duncan suggested, this will probably get you what you want.

You can set the transparency using alpha.f in adjustcolor().

x - c(1:5)
color - c(2,2,3,4,5)
color_transparent - adjustcolor(color, alpha.f = 0.3)

plot(x, col = color, pch = 20, cex = 4)
plot(x, col = color_transparent, pch = 20, cex = 4)

?adjustcolor

Richard



On Wed, Dec 18, 2013 at 2:35 PM, Christoph Scherber csche...@gwdg.dewrote:

 you will need to specify colours as RGB values and then set transparency
 via the alpha argument.

 e.g.: color=rgb(0,0,0,alpha=0.3)

 # will  give black (0,0,0) and a transparency of 30%.

 Best wishes
 Christoph



 On 18/12/2013 23:23, capricy gao wrote:

 I checked as you suggested. However, I found that the number in those
 functions are the number of colors. In contrast, my number here means a
 specific color, for example, 2 in my code means red, 3 in my code means
 green





 On Wednesday, December 18, 2013 1:18 PM, Duncan Murdoch 
 murdoch.dun...@gmail.com wrote:
   On 13-12-18 2:08 PM, capricy gao wrote:

  I found all the color transparency was defined with character color, or
 rgb color. What if I have number code and still try to modify the
 transparency?

 For example:

  x=c(1:5)
 color=c(2,2,3,4,5)
 plot(x, col=color)
 plot(x, col=color,pch=20)

 here I defined color by numbers, how can I modify the transparency?

 See the examples for ?palette.

 Duncan Murdoch
 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/
 posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/
 posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot() function: color transparency

2013-12-18 Thread Duncan Murdoch

On 13-12-18 5:23 PM, capricy gao wrote:

I checked as you suggested. However, I found that the number in those
functions are the number of colors. In contrast, my number here means a
specific color, for example, 2 in my code means red, 3 in my code
means green



You didn't read very carefully.

Duncan Murdoch




On Wednesday, December 18, 2013 1:18 PM, Duncan Murdoch
murdoch.dun...@gmail.com wrote:
On 13-12-18 2:08 PM, capricy gao wrote:

  I found all the color transparency was defined with character color,
or rgb color. What if I have number code and still try to modify the
transparency?
 
  For example:
 
  x=c(1:5)
  color=c(2,2,3,4,5)
  plot(x, col=color)
  plot(x, col=color,pch=20)
 
  here I defined color by numbers, how can I modify the transparency?


See the examples for ?palette.

Duncan Murdoch






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] binary symmetric matrix combination

2013-12-18 Thread arun
Hi,
You could try:
Either:

dat1 - read.table(Test.txt,header=TRUE)
dim(dat1)
#[1] 4735 4735
dat2 - read.table(1991res.txt,header=TRUE)
dim(dat2)
#[1] 574 574
m1 - as.matrix(dat1)
m2 - as.matrix(dat2)
library(data.table)
d1 - 
data.table(Name1=as.vector(outer(rownames(m1),colnames(m1),paste0)),value1=as.vector(m1),key='Name1')
d2 - 
data.table(Name1=as.vector(outer(rownames(m2),colnames(m2),paste0)),value2=as.vector(m2),key='Name1')
res - d2[d1]
res1 - as.data.frame(res)
res1[,3][!is.na(res1[,2])] - res1[,2][!is.na(res1[,2])]
vec1 - as.vector(outer(rownames(m1),colnames(m1),paste0))
res2 - res1[match(vec1,res1[,1]),-2]
res3 - matrix(res2[,2],dimnames=list(rownames(m1),colnames(m1)),ncol=ncol(m1))


#or

vec1 - paste0(rownames(m1)[row(m1)],colnames(m1)[col(m1)])
 vec2 - paste0(rownames(m2)[row(m2)],colnames(m2)[col(m2)])
indx - match(vec1,vec2)
indx1 - indx[!is.na(indx)]
 indx2 - match(vec2,vec1)
indx2N - indx2[!is.na(indx2)]
 m1[indx2N] - m2[indx1]



A.K.


On Monday, December 16, 2013 1:54 PM, Elio Shijaku sel...@gmail.com wrote:

Hi Arun,

I hope you remember me. You helped me build several symmetrical matrices in R. 
I have one question though. I have now build all the matrices and I want to 
combine each of them to a larger one. To make the matter simpler to understand, 
I am attaching the text version of both files. What I would like is to merge 
the contents of 1991res (574x574 symmetric matrix) into the Test file 
(4703x4703, symmetric matrix which includes the row names of the 1991res. 

Could you help me by showing the steps I should take to merge the matrices?

Thank you in advance and sorry for disturbing.

Best,

Elio




On Mon, Sep 23, 2013 at 2:19 AM, arun smartpink...@yahoo.com wrote:

Hi Elio,

There was only one matrix with that error.  Glad you were able to correct it.

I sent an linkedin request to you.
Regards
Arun








From: Elio Shijaku sel...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Sunday, September 22, 2013 7:21 PM

Subject: Re: binary symmetric matrix combination



Hi Arun,

You are very right, that mistake which i corrected by removal fixed the issue, 
now the matrix works perfectly.

Many thanks for all your help, please if you are in LinkedIn, I would be 
delighted to add you as a friend.

Here is my profile in case you're interested:
es.linkedin.com/pub/elio-shijaku/11/b7b/147/

Hopefully I can disturb you in case I have further questions, I am just 
learning R and everything is new to me.

All the best,

Elio




On Mon, Sep 23, 2013 at 12:05 AM, arun smartpink...@yahoo.com wrote:

Hi,
Actually, you have duplicate names.


p226 p226 s112 

What do you need to do in those situations?  Looks like it is a mistake in 
the file.





From: Elio Shijaku sel...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Sunday, September 22, 2013 5:41 PM

Subject: Re: binary symmetric matrix combination



Hi Arun,

Here is the file, I tried many options, once I enter the command

lst2- lapply(lst1[lapply(lst1,length)0],function(x) 
as.matrix(read.table(text=x,row.name=1)))


I get:

Error in read.table(text = x, row.name = 1) :  duplicate 'row.names' are not 
allowed


Any idea? Thanks a lot for the help.


Elio





On Sun, Sep 22, 2013 at 5:02 PM, arun smartpink...@yahoo.com wrote:

Hi Elio,
Check the new text file.  I used the first line:

lines1-str_trim(gsub(\t, ,readLines(file.txt)))
because the file was \t separated

In the new file, it could be just space separated.  So, you may only need:
lines1- readLines(file.txt)

If possible, could you email me the file.  I can take a look into it.
A.K.











From: Elio Shijaku sel...@gmail.com
To: arun smartpink...@yahoo.com
Sent: Sunday, September 22, 2013 7:55 AM

Subject: Re: binary symmetric matrix combination




Hi Arun,

I am replying to this e-mail address to not break the forum's rules. 
Everything worked great. However, when I use another text file called 
file, I use the commands you gave me but I get this problem:

library(stringr)

lines1-str_trim(gsub(\t, ,readLines(file.txt)))
 lst1-lapply(split(lines1,
cumsum(lines1==)),function(x) x[x!=])


lst2- lapply(lst1[lapply(lst1,length)0],function(x) 
as.matrix(read.table(text=x,row.names=1)))

Error in read.table(text = x, row.names = 1) :  duplicate 'row.names' are 
not allowed

#I cannot then continue with the rest:


names(lst2)- paste0(m,seq_along(lst2))
dat- do.call(rbind,lapply(names(lst2),function(x) {x1- lst2[[x]]; 
cbind(expand.grid(rep(list(colnames(x1)),2),stringsAsFactors=FALSE),value=as.vector(x1))}))
library(reshape2)
res- dcast(dat,Var1~Var2,value.var=value,sum)
 row.names(res)- res[,1]
 res- as.matrix(res[,-1])
 dim(res)


Any idea on why such problem?

Thanks yet again




On Wed, Sep 18, 2013 at 7:44 PM, smartpink...@yahoo.com wrote:

Hi,
Forgot to comment about the second Error.
The error suggests 'Libro3.txt' is not found in your working directory.  

[R] plot different groups as factors

2013-12-18 Thread Luigi Marongiu
dear all,
i would like to plot the value of different response groups. when i simply
use  plot(y ~ x) i obtain a series of boxplots. i would rather use dots. i
also tried with stripchart(y ~ x), which gives better results but does not
place properly the labels since place them alphabetically.
in addition i actually have 6 response groups: 3 classes (low, medium,
high) and 2 sampling time (12 and 18 months).
how can i generate these individual groups and plot them in the correct
order (low, medium, high and 12, 18)? i believe is something to do with the
factors but i don't know how to implement them.
what i am looking for is to generate a figure such as the one i sketched in
the attached file. i also attached a dataframe version of my data. the
vectors containing the same data are:
time -c(  18,18,18,18,18,18,18,18,18,
18,18,18,18,18,18,18,18,18,18,
18,18,18,18,18,18,18,18,18,12,
12,12,12,12,12,12,12,12,12,12,
12,12,12,12,12,12,12,12,12,12,
12,12,12,12,12,12,12,12,12,12,
12,12,12,12,12,12,12)
class -c(medium,medium,medium,medium,medium,
medium,medium,medium,medium,medium,medium,
medium,medium,medium,high,high,high,high,
high,high,high,low,low,low,low,low,low,
low,medium,medium,medium,medium,medium,medium,
medium,medium,medium,medium,medium,medium,
medium,medium,medium,medium,medium,medium,high,
high,high,high,high,high,high,high,high,
high,low,low,low,low,low,low,low,low,
low,low)
value -c(2.92,0.01,0.36,3.16,0.99,0.38,
0.01,5.1,0.04,0.01,1.33,4.13,0.15,0.15,
14.18,4290.14,26.8,5.33,17.58,14.29,248.5,
0.01,0,0,0,0,0,0,0.151395382,
5.327863403,5.10096383,1.32567787,4.352404124,
0.458606982,2.915908912,0.011996374,0.364710382,
0.033016026,3.161701212,0.381564497,0.010971385,
0.035646472,0.014781805,4.129708296,0.153094117,
0.018497847,15.09178491,17.58393041,14.17643928,
4290.143561,26.79730719,294.6367065,14.2888441,
248.495231,209.3131795,2014.506722,0.010751273,
0.002325138,0.000637473,0.003984336,0.006018154,
0.003620907,0.745936,0.000142311,0.002460417,
0.001280189)

Thank you very much for any help you could provide!
regards
Luigi
timeclass   value
1   18  medium  2.92
2   18  medium  0.01
3   18  medium  0.36
4   18  medium  3.16
5   18  medium  0.99
6   18  medium  0.38
7   18  medium  0.01
8   18  medium  5.1
9   18  medium  0.04
10  18  medium  0.01
11  18  medium  1.33
12  18  medium  4.13
13  18  medium  0.15
14  18  medium  0.15
15  18  high14.18
16  18  high4,290.14
17  18  high26.8
18  18  high5.33
19  18  high17.58
20  18  high14.29
21  18  high248.5
22  18  low 0.01
23  18  low 0
24  18  low 0
25  18  low 0
26  18  low 0
27  18  low 0
28  18  low 0
29  12  medium  0.151395382
30  12  medium  5.327863403
31  12  medium  5.10096383
32  12  medium  1.32567787
33  12  medium  4.352404124
34  12  medium  0.458606982
35  12  medium  2.915908912
36  12  medium  0.011996374
37  12  medium  0.364710382
38  12  medium  0.033016026
39  12  medium  3.161701212
40  12  medium  0.381564497
41  12  medium  0.010971385
42  12  medium  0.035646472
43  12  medium  0.014781805
44  12  medium  4.129708296
45  12  medium  0.153094117
46  12  medium  0.018497847
47  12  high15.09178491
48  12  high17.58393041
49  12  high14.17643928
50  12  high4290.143561
51  12  high26.79730719
52  12  high294.6367065
53  12  high14.2888441
54  12  high248.495231
55  12  high209.3131795
56  12  high2014.506722
57  12  low 0.010751273
58  12  low 0.002325138
59  12  low 0.000637473
60  12  low 0.003984336
61  12  low 0.006018154
62  12  low 0.003620907
63  12  low 7.46E-05
64  12  low 0.000142311
65  12  low 0.002460417
66  12  low 0.001280189
__

[R] assessment of validity of PLS predictions

2013-12-18 Thread dx 5212
Hello,

I've built a PLSR model to predict the concentrations of mixture components
from experimental data using the 'pls' library. I calculated the Q residual
(or lack of fit) and T squared value for each of the samples used to build
the model in order to assess how well each sample is described by the
model. This is straightforward to do for these data because their X scores
are returned by the 'plsr' function in addition to the X loadings of the
model- both of these are needed to calculate the Q residual and T squared
value.

I'd like to calculate values for Q residual and T squared for predictions
for samples for which the concentrations aren't known. However, the 'plsr'
function doesn't calculate the X scores for predictions. I can solve for
them if I solve the equation (in R code):

X = T%*%trans(P)

for T:

trans(T) = (P*)%*%trans(X)

where X is the data matrix from prediction samples, T is X scores matrix
for prediction samples, P is X loadings matrix from the model and P* is the
pseudo-inverse of this matrix. From these calculated X scores of the
prediction samples and the X loadings of the model, I can calculate T
squared and the Q residual values.

My main question: Is my approach a reasonable one to identify samples that
may not be well described by a given model? If not, can anyone direct me to
a resource that describes better methods?

Thanks

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read .slk file

2013-12-18 Thread Jeff Newmiller
I suspect this could be SYLK format, a very old spreadsheet exchange format. 
The only hit that I get from RSiteSearch(sylk) is read.gnumeric.sheet, which 
depends on an external program ssconvert to extract CSV. In the plus 
department, it is a text-based format that has been implemented numerous times 
so you could roll your own reader based on open source code. In the minus 
department, there is no published specification, so if your files have quirks 
then you will have to work around them one by one, and it really is like 
Latin... a dead language.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

PIKAL Petr petr.pi...@precheza.cz wrote:
Hi

Extension does not specify file format. You can rename any file with
any extension without changing its nature. However slk stays for
symbolic link and therefore it just brings actual file to Excel. 

Maybe you could start to play with

http://stat.ethz.ch/R-manual/R-devel/library/base/html/files.html

Regards
Petr

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Santosh
 Sent: Tuesday, December 17, 2013 12:11 AM
 To: r-help
 Subject: [R] read .slk file
 
 Dear Rxperts..
 
 I recently received a data file with the extension .slk. If I save
 the file as MS Excel file, I am able to read in R without issues.  Is
 it possible to read this .slk file without converting into another
R-
 readable data format?
 
 Regards,
 Santosh
 
  [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot different groups as factors

2013-12-18 Thread Jim Lemon

On 12/19/2013 11:18 AM, Luigi Marongiu wrote:

dear all,
i would like to plot the value of different response groups. when i simply
use  plot(y ~ x) i obtain a series of boxplots. i would rather use dots. i
also tried with stripchart(y ~ x), which gives better results but does not
place properly the labels since place them alphabetically.
in addition i actually have 6 response groups: 3 classes (low, medium,
high) and 2 sampling time (12 and 18 months).
how can i generate these individual groups and plot them in the correct
order (low, medium, high and 12, 18)? i believe is something to do with the
factors but i don't know how to implement them.


Hi Luigi,
Perhaps you want something like this:

lmdf-read.table(text=time class value
18 medium 2.92
18 medium 0.01
18 medium 0.36
18 medium 3.16
18 medium 0.99
18 medium 0.38
18 medium 0.01
18 medium 5.1
18 medium 0.04
18 medium 0.01
18 medium 1.33
18 medium 4.13
18 medium 0.15
18 medium 0.15
18 high 14.18
18 high 4290.14
18 high 26.8
18 high 5.33
18 high 17.58
18 high 14.29
18 high 248.5
18 low 0.01
18 low 0
18 low 0
18 low 0
18 low 0
18 low 0
18 low 0
12 medium 0.151395382
12 medium 5.327863403
12 medium 5.10096383
12 medium 1.32567787
12 medium 4.352404124
12 medium 0.458606982
12 medium 2.915908912
12 medium 0.011996374
12 medium 0.364710382
12 medium 0.033016026
12 medium 3.161701212
12 medium 0.381564497
12 medium 0.010971385
12 medium 0.035646472
12 medium 0.014781805
12 medium 4.129708296
12 medium 0.153094117
12 medium 0.018497847
12 high 15.09178491
12 high 17.58393041
12 high 14.17643928
12 high 4290.143561
12 high 26.79730719
12 high 294.6367065
12 high 14.2888441
12 high 248.495231
12 high 209.3131795
12 high 2014.506722
12 low 0.010751273
12 low 0.002325138
12 low 0.000637473
12 low 0.003984336
12 low 0.006018154
12 low 0.003620907
12 low 7.46E-05
12 low 0.000142311
12 low 0.002460417
12 low 0.001280189,header=TRUE)

# get the factor levels in the correct order
lmdf$class-factor(lmdf$class,levels=c(low,medium,high))
library(plotrix)
brkdn.plot(value,class,time,lmdf,col=2:4)
legend(14,1200,c(low,medium,high),col=2:4,lty=1)

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Welcome to the R-help mailing list

2013-12-18 Thread Lydia Keppler
Hi there,

I am new to R and programming in general and am looking for help with
writing a function with dates and times. I have checked around but am still
a bit stuck.

Basically, I have data in the format dd/mm/ HH:MM and I have to
calculate how much time has passed between various events.

I have given the following command ( where date is the column in my data
frame that indicates the date and time)
date=as.Date.factor(date,format=%d/%m/%Y %H:%M)
However, this displays only the date, without the time.

I also have tried:
date=substr(argo1$date,1,907)
And it shows the date and time.

However, when I try to find the difference between two dates i.e.the time
that has passed with the command: difftime(date[2],date[3],unit=secs), it
returns that 0 seconds have passed.

When I try to find the difference with the command:
date[3]-date[2]
it tells me Error in date[3] - date[2] : non-numeric argument to binary
operator

The class(date) is character.

Any idea what I am doing wrong?
Thank you very much in advance!


On 19 December 2013 15:22, r-help-requ...@r-project.org wrote:

 Welcome to the R-help@r-project.org mailing list!

 To post to this list, send your message to:

   r-help@r-project.org

 General information about the mailing list is at:

   https://stat.ethz.ch/mailman/listinfo/r-help

 If you ever want to unsubscribe or change your options (eg, switch to
 or from digest mode, change your password, etc.), visit your
 subscription page at:

   https://stat.ethz.ch/mailman/options/r-help/lydiakeppler%40gmail.com

 You can also make such adjustments via email by sending a message to:

   r-help-requ...@r-project.org

 with the word `help' in the subject or body (don't include the
 quotes), and you will get back a message with instructions.

 You must know your password to change your options (including changing
 the password, itself) or to unsubscribe without confirmation.  It is:

   whaletail27

 Normally, Mailman will remind you of your r-project.org mailing list
 passwords once every month, although you can disable this if you
 prefer.  This reminder will also include instructions on how to
 unsubscribe or change your account options.  There is also a button on
 your options page that will email your current password to you.




-- 
*Protect the environment: think before you print*
º

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Welcome to the R-help mailing list

2013-12-18 Thread jim holtman
use POSIXct instead of Date:


 x - c(2013-12-12 12:00:00, 2013-12-15 03:15:23)
 # convert using as.POSIXct

 times - as.POSIXct(x, format = %Y-%m-%d %H:%M:%S)
 times
[1] 2013-12-12 12:00:00 EST 2013-12-15 03:15:23 EST

 difftime(times[2], times[1], units = 'secs')
Time difference of 227723 secs



Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Wed, Dec 18, 2013 at 9:24 PM, Lydia Keppler lydiakepp...@gmail.com wrote:
 Hi there,

 I am new to R and programming in general and am looking for help with
 writing a function with dates and times. I have checked around but am still
 a bit stuck.

 Basically, I have data in the format dd/mm/ HH:MM and I have to
 calculate how much time has passed between various events.

 I have given the following command ( where date is the column in my data
 frame that indicates the date and time)
 date=as.Date.factor(date,format=%d/%m/%Y %H:%M)
 However, this displays only the date, without the time.

 I also have tried:
 date=substr(argo1$date,1,907)
 And it shows the date and time.

 However, when I try to find the difference between two dates i.e.the time
 that has passed with the command: difftime(date[2],date[3],unit=secs), it
 returns that 0 seconds have passed.

 When I try to find the difference with the command:
 date[3]-date[2]
 it tells me Error in date[3] - date[2] : non-numeric argument to binary
 operator

 The class(date) is character.

 Any idea what I am doing wrong?
 Thank you very much in advance!


 On 19 December 2013 15:22, r-help-requ...@r-project.org wrote:

 Welcome to the R-help@r-project.org mailing list!

 To post to this list, send your message to:

   r-help@r-project.org

 General information about the mailing list is at:

   https://stat.ethz.ch/mailman/listinfo/r-help

 If you ever want to unsubscribe or change your options (eg, switch to
 or from digest mode, change your password, etc.), visit your
 subscription page at:

   https://stat.ethz.ch/mailman/options/r-help/lydiakeppler%40gmail.com

 You can also make such adjustments via email by sending a message to:

   r-help-requ...@r-project.org

 with the word `help' in the subject or body (don't include the
 quotes), and you will get back a message with instructions.

 You must know your password to change your options (including changing
 the password, itself) or to unsubscribe without confirmation.  It is:

   whaletail27

 Normally, Mailman will remind you of your r-project.org mailing list
 passwords once every month, although you can disable this if you
 prefer.  This reminder will also include instructions on how to
 unsubscribe or change your account options.  There is also a button on
 your options page that will email your current password to you.




 --
 *Protect the environment: think before you print*
º

 [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] GLMM parameter estimates giving opposite trends

2013-12-18 Thread Diana Virkki
I apologize if this is a simple question.

I am running GLMM's using glmmML and model averaging with MuMIn. One of the
parameter estimates for a parameter (firefreq) in the best model is giving
a positive number, where in reality I know this to be a negative
correlation.
I have checked and double checked the data that has gone in and this is not
the issue. This is occurring for numerous variables in my models.

As far as I was aware the parameter estimate is indicative of the direction
of the relationship? Is there any reason why this model would give me
opposite trends?

Let me know if any other information would be useful in answering this
question.
Thank you in advance for any input.


This is the best model:
M17-
glmmML(ldeli~bare+firefreq+canopy+treatment,cluster=season,family=poisson,data=ldeli)

Model-averaged coefficients:
Estimate Std. Error z value Pr(|z|)
(Intercept)   -84.248439  30.376197   2.774  0.00555 **
bare   -0.102111   0.023231   4.396 1.11e-05 ***
firefreq3.370089   1.183091   2.849  0.00439 **
canopy -0.013598   0.007420   1.832  0.06688 .
treatmentLU87.939276  30.376750   2.895  0.00379 **
treatmentSM01  67.595184  24.477350   2.762  0.00575 **
treatmentSMWF  64.612540  23.322285   2.770  0.00560 **
treatmentT01   80.142838  28.030787   2.859  0.00425 **
treatmentT03   77.088813  26.836430   2.873  0.00407 **
treatmentTB56.163472  19.744217   2.845  0.00445 **
treatmentWF84.313036  29.214505   2.886  0.00390 **
-- 
_ _ _ _
Diana

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problems with Installing R ('failure to expand shell folder constant userdocs)

2013-12-18 Thread Prof Brian Ripley
You didn't say what gave that message (and there is no 'R 3.02': did you 
mean  3.0.2?).


If this came from the installer, that is not part of R: look for info on 
'Inno Setup'.


On 18/12/2013 21:43, Barry DeCicco wrote:

Â
Hello,
Â
I’m trying to install R 3.02 for windows on a Windows 7 machine (we were just 
upgraded from Windows XP).
It’s a networked machine in a work environment.  I have administrator rights.
Program files go on C; data is stored on D (it's a partitioned drive).
Â
I get the message 'failure to expand shell folder constant userdocs
Â
I searched, and found some stuff on the Microsoft website, which made me 
believe that  I needed to unlock a folder with my user name.  There are two, 
one in ‘C:\Users’, which has a padlock symbol on it, and another in 
‘D:\’m which doesn’t have a padlock icon.  Both had been set to read 
only; I removed that.Â
Â
I’ve not been able to find anything on the CRAN website about this.
Â
Has anybody run into this problem before?
Â
Â
Thank you very much,
Â
Barry DeCicco
[[alternative HTML version deleted]]



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Please do.

--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.