Re: [R] URL error when trying to use help function in R [Sec: UNOFFICIAL]

2010-09-10 Thread Peter Dalgaard
On 09/10/2010 01:03 AM, David Winsemius wrote:
 
 On Sep 9, 2010, at 6:34 PM, Gosse, Michelle wrote:
 
 Greetings,

 I am using R version 2.11.1 on a Dell computer, via a VMware  
 connection to a remote server. My browser version is IE  
 8.0.6001.18702 and the OS is some corporate version of Microsoft XP.

 I'm trying to learn more about the tapply function, so I typed ?tapply
 at the command line. This opened a browser window with the URL
 http://127.0.0.1:28138/library/base/html/tapply.html, which gives me an
 error message.
 I get the same problem when asking for help on other commands, e.g.
 ?table (http://127.0.0.1:28138/library/base/html/table.html) and
 ?log (http://127.0.0.1:28138/library/base/html/Log.html).

 I did a whois on 127.0.0.1
 
 That should always be your own computer. The browser is trying to  
 reach a server on itself over port 28138 and either the port is  
 blocked or you don't have the documentation at that location.

The _real_ problem is likely that the help server is actually running on
the remote computer. Substituting the remote server's name for 127.0.0.1
may well make things work (firewalls and the like notwithstanding).

It is a generic weakness of our current dynamic HTML setup, or of
current browser technology if you like. Same thing with file:// URLs --
if you try to view them in a browser and you already have a browser on
your display, but running on a different machine than the one with the
file, you get a "file not found" error. So when R on machine B wants to
display a help page, it sends a message to the browser to connect to R's
own server on B  by specifying a port on localhost (127.0.0.1), but if
this request gets forwarded to a browser on machine A, then it goes
looking for a server on _its_ localhost, i.e. machine A, and it isn't
there...

I suppose we could do somewhat better, but I don't feel too confident
about the various platform issues. As far as I can see, we currently
hardcode "http://127.0.0.1" inside the help print method in
utils:::print.help_file_with_topics(), and I suspect we could make that
a user option, or try to be more intelligent about finding the machine's
own IP address.

A pragmatic way out is always options(help_type = "text").
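For readers hitting the same problem, a minimal sketch of the text-help workaround Peter mentions (nothing here beyond base R):

```r
## Switch help rendering to plain text for the rest of the session,
## so no browser (and no localhost HTTP server) is involved:
options(help_type = "text")
?tapply   # now displayed as text in the R console
```

To make the change permanent, the options() line can go in a startup file such as .Rprofile.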


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] covariance matrix structure for random effect in glmmPQL

2010-09-10 Thread Qiu, Weiyu
 Dear all,

I'm using the R function glmmPQL from the MASS package to fit a generalized 
linear mixed model allowing for temporal correlation in the random effect. 
There are 1825 observations in my data; the random effect is called Date, 
and there are five levels of Date, each repeated 365 times.

When I tried

fit.model1 <- glmmPQL(y ~ f1 + f2 + f3 + f4 + f5 + f6 + f7 + f8 + f9 + f10 +
                          f11 + f12 + f13 + f14 + f15 + f16 + f17 + f18 +
                          f19 + f20 + f21 + f22 + f23 + f24,
                      family = poisson, random = ~1 | Date, data = mydata,
                      correlation = corCompSymm(value = 0.2, form = ~1 | Date))

the model was fitted fine. But because of my particular interest, I need to
specify the correlation structure myself, so I tried

fit.model2 <- glmmPQL(y ~ f1 + f2 + f3 + f4 + f5 + f6 + f7 + f8 + f9 + f10 +
                          f11 + f12 + f13 + f14 + f15 + f16 + f17 + f18 +
                          f19 + f20 + f21 + f22 + f23 + f24,
                      family = poisson, random = ~1 | Date, data = mydata,
                      correlation = corSymm(value = B[lower.tri(B)],
                                            form = ~1 | Date))

where B is a 365*365 correlation matrix that I specify. Then there's an
error message:

Error in vector("double", length) : vector size specified is too large

Even when I set B exactly equal to the one used in model1 (diagonal elements
1, off-diagonal elements 0.2), the same error message shows.

Is this error inherent to the glmmPQL function, so that I can't change it, 
or did I do something wrong that I could fix?
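A plausible explanation -- my assumption, not something confirmed in the thread -- is that an unstructured corSymm estimates one free parameter per pair of observations within a group, which for 365 observations per level of Date is enormous:

```r
## Free correlation parameters in an unstructured corSymm with 365
## observations per level of the grouping factor:
choose(365, 2)   # = 66430
```

At that size a structured class from nlme (corCompSymm, corAR1, corARMA) is usually the only feasible route.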

Thanks so much in advance for any kind help!

--
Best,
Vicky



Re: [R] Help on simple problem with optim

2010-09-10 Thread Berend Hasselman

It is indeed a negative value for sigma that causes the issue.
You can check this by inserting the line

if (sigma <= 0) cat("Negative sigma =", sigma, "\n")

after the line

mu <- x %*% beta

in the function llk.mar.

Negative values for sigma can be avoided with the use of a transformation
for sigma, forcing it to be always positive.

Make optim use log(sigma) as the parameter and transform it back to sigma
with sigma <- exp(parm[l]) in llk.mar, like this:

# define the log likelihood to be maximized over
llk.mar <- function(parm, y, x){
    # parm is the vector of parameters
    # the last element is log(sigma)
    # y is the response
    # x is the design matrix
    l <- length(parm)
    beta <- parm[-l]
    sigma <- exp(parm[l])  # <=== transform
    x <- as.matrix(x)
    mu <- x %*% beta
    if (sigma <= 0) cat("Negative sigma =", sigma, "\n")  # should no longer trigger
    llk <- sum(dnorm(y, mu, sigma, log = TRUE))
    return(llk)
}

# initial values: use log(sigma) as the independent parameter
parm <- c(as.vector(coef(fit)), log(summary(fit)$sigma))

Caveat: transformations often help in situations like this but can lead to
badly scaled problems and are not a universal remedy.
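Putting the pieces together, a self-contained sketch on simulated data (the data, sample size, and the lm() fit used for start values are my own placeholders, not the original poster's):

```r
set.seed(1)
n <- 50
x <- cbind(1, rnorm(n))                  # design matrix with intercept
y <- as.vector(x %*% c(2, 3) + rnorm(n, sd = 0.5))
fit <- lm(y ~ x - 1)                     # OLS fit, used only for start values

llk.mar <- function(parm, y, x) {
  l <- length(parm)
  beta <- parm[-l]
  sigma <- exp(parm[l])                  # optim works on log(sigma)
  mu <- as.matrix(x) %*% beta
  sum(dnorm(y, mu, sigma, log = TRUE))
}

parm <- c(as.vector(coef(fit)), log(summary(fit)$sigma))
res <- optim(parm, llk.mar, y = y, x = x,
             control = list(fnscale = -1))   # fnscale = -1 => maximize
exp(res$par[length(res$par)])            # back-transform to sigma
```

The back-transform at the end recovers sigma on its natural scale.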

/Berend 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Help-on-simple-problem-with-optim-tp2533420p2533939.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simulation

2010-09-10 Thread Jim Silverton
I have two questions:
(1) How do you 'create' a 2 x 2 table in R with, say, an odds ratio of 3 or
even 0.5?

(2) If I have several 2 x 2 tables, how can I 'implement' dependence in the
tables, with say 25 of the tables having an odds ratio of 1 and 75 of the
tables having an odds ratio of 4?
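One hedged reading of question (1): simulate a 2 x 2 table whose underlying odds ratio (row 2 versus row 1) is a chosen value. The function sim2x2 and all its arguments are names I made up for illustration:

```r
sim2x2 <- function(n1, n2, p1, or) {
  ## derive row 2's success probability from row 1's and the target OR
  odds2 <- or * p1 / (1 - p1)
  p2 <- odds2 / (1 + odds2)
  x1 <- rbinom(1, n1, p1)
  x2 <- rbinom(1, n2, p2)
  matrix(c(x1, n1 - x1, x2, n2 - x2), nrow = 2, byrow = TRUE,
         dimnames = list(c("row1", "row2"), c("yes", "no")))
}

set.seed(42)
sim2x2(100, 100, p1 = 0.3, or = 3)   # one table with true OR = 3

## For question (2): 25 tables with or = 1 and 75 with or = 4
tabs <- c(replicate(25, sim2x2(100, 100, 0.3, 1), simplify = FALSE),
          replicate(75, sim2x2(100, 100, 0.3, 4), simplify = FALSE))
```

The sampled tables vary around the target odds ratio; only the underlying probabilities are fixed by it.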

Jim



Re: [R] Help request: highlighting R code on WordPress.com blogs

2010-09-10 Thread Tal Galili
Hello D,
Thanks for sharing your technique -- nice work :)

I hope the solution people here are helping with will make it both
cheaper and simpler for people with less CSS experience.

P.S.: thank you for the kind words regarding R-bloggers.com.

Best,
Tal


Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Fri, Sep 10, 2010 at 6:40 AM, D Kelly O'Day ko...@processtrends.com wrote:


 Tal

 I am one of your estimated 29 Wordpress bloggers. Thanks for your RBloggers
 site!!

 I use Wordpress.com's site for my blog.

 I use a simple method to highlight my R script in WordPress; example here:

 http://chartsgraphs.wordpress.com/2010/07/17/time-series-regression-of-temperature-anomaly-data-1-%E2%80%93-don%E2%80%99t-use-ols/#more-3390

 I use <pre> ... </pre> tags to set up my R script blocks. I purchased
 WordPress' CSS service and customized the <pre> </pre> tags to add a text
 box and a pale yellow color scheme.

 I use SnagIt to make images of the console results.


 --
 View this message in context:
 http://r.789695.n4.nabble.com/Help-request-highlighting-R-code-on-WordPress-com-blogs-tp2532433p2533842.html
 Sent from the R help mailing list archive at Nabble.com.





Re: [R] URL error when trying to use help function in R [Sec: UNOFFICIAL]

2010-09-10 Thread Petr PIKAL
Hi

I had a similar issue, and it was solved by setting the proxy preference 
in the browser to

"do not use proxy server for 127.0.0.1"

Regards
Petr
 




[R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?

2010-09-10 Thread omerle
Hi,

I have two questions:

1st question:
            a <- "S"
            b <- data.frame(S = 3)
            do.call(`<-`, list(do.call(`$`, list(b, "S")), 5))
=> How can I assign new values to the S column when the column name is held
in a variable?

2nd question:
       a <- "S"
       b <- data.frame(S = 3)
       b[, "S"] <- list(1:10)  # doesn't work
       b$S <- list(1:10)       # works
=> Isn't this the same thing? What is the difference between the two?
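For what it's worth, a sketch of the usual idioms (assuming the goal is assignment through a name held in a variable): `$` does not evaluate its argument, but `[[` and `[` do.

```r
a <- "S"
b <- data.frame(S = 3)

b[[a]] <- 5    # uses the string stored in a; same effect as b$S <- 5
b[, a] <- 7    # also works; a is evaluated, so column "S" is targeted
## b$a would instead look up a column literally named "a"

b2 <- data.frame(S = numeric(10))
b2[[a]] <- 1:10   # replacement length must match the number of rows
```

This is also why `b[, "S"] <- list(1:10)` fails on a one-row data frame: `[<-` checks that the replacement length matches the number of rows, while `$<-` simply installs the new column.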


Thanks,



[R] importing third party hierarchical clustering data into R

2010-09-10 Thread socter

hi,

I have written a hierarchical clustering program which reads in data and
clusters it.
The next part of the problem is to visualize the results.
I would like to know how I could import my cluster results into the R hclust
object so that I can visualise them using the ggobi library. Any other,
easier way would be helpful too.
My program is able to generate the point labels that are in each cluster.
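For reference, a minimal sketch of hand-building an hclust object; the merge/height encoding is documented in ?hclust, and the toy numbers here are mine:

```r
## hclust objects are lists with 'merge', 'height', 'order' and 'labels'.
## Negative entries in 'merge' are singleton observations; positive ones
## refer to clusters formed in earlier merge steps.
my_hclust <- structure(
  list(merge  = matrix(c(-1, -2,    # step 1: join leaves 1 and 2
                         -3,  1),   # step 2: join leaf 3 with step-1 cluster
                       ncol = 2, byrow = TRUE),
       height = c(0.5, 1.2),        # non-decreasing merge heights
       order  = c(1, 2, 3),         # leaf order for a crossing-free plot
       labels = c("a", "b", "c")),
  class = "hclust")

plot(my_hclust)   # draws the dendrogram
```

Once your program's output is translated into these four components, the object behaves like any hclust result (plot, cutree, as.dendrogram).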

Thanks a bunch!
-- 
View this message in context: 
http://r.789695.n4.nabble.com/importing-third-party-hierarchical-clustering-data-into-R-tp2533961p2533961.html
Sent from the R help mailing list archive at Nabble.com.



[R] Odp: Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?

2010-09-10 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org wrote on 10.09.2010 10:05:37:

 Hi,
 
 I have two questions:
 
 1st question:
             a <- "S"
             b <- data.frame(S = 3)
             do.call(`<-`, list(do.call(`$`, list(b, "S")), 5))
 => How can I assign new values to the S column when the column name is held
 in a variable?
 
 2nd question:
        a <- "S"
        b <- data.frame(S = 3)
        b[, "S"] <- list(1:10)  # doesn't work
        b$S <- list(1:10)       # works
 => Isn't this the same thing? What is the difference between the two?

Did you look at b, e.g. with str(b)? I believe you expected something 
different.

Maybe you wanted rbind:
rbind(b[, "S"], data.frame(S = 1:10))

Regards
Petr

 
 
 Thanks,
 


Re: [R] lmer fixed effects, SE, t . . . and p

2010-09-10 Thread Gavin Simpson
On Thu, 2010-09-09 at 23:40 -0400, John Sorkin wrote:
 Bert,
 I appreciate you comments, and I have read Doug Bates writing about p
 values in mixed effects regression. It is precisely because I read
 Doug's material that I asked how are we to interpret the estimates
 rather than how can we compute a p value. My question is a simple
 question whose answer is undoubtedly complex, but one that needs an
 answer. Without p values, or confidence intervals, I am not certain
 what to make of the results of my analysis. Does my analysis suggest,
 or does it not suggest that there is a relation between time and y? If
 I can't answer this question after running the analysis, I don't have
 any more information than I did before I ran the analysis, and a fair
 question would be why did I run the analysis? I am asking for help not
 in calculation a p value or a CI, but rather to know what I can and
 can't say about the results of the analysis. If this basic question
 can not be answered, I am at a loss to interpret my results. 
 Thank you,
 John

Doug talks quite a lot about profiling lmer fits using 'profile
deviance' to investigate variability in fixed effects. For example, see
section 1.5 in the draft of chapter 1 of Doug's book on mixed models:

http://lme4.r-forge.r-project.org/book/
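In later versions of lme4 (this postdates the 2010 thread, so treat it as a forward-looking note rather than advice that applied at the time), the profiling Doug describes is exposed directly:

```r
library(lme4)
fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

## profile-likelihood confidence intervals, fixed effects included
confint(fm, method = "profile")
```

These intervals come from inverting the profile deviance, so they do not rely on the contested denominator-degrees-of-freedom approximations.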

HTH

G

 John David Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 University of Maryland School of Medicine Division of Gerontology
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

 Bert Gunter gunter.ber...@gene.com 9/9/2010 11:21 PM wrote:
 John:
 
 Search on this issue in the list archives. Doug Bates has addressed it
 at length. Basically, he does not calculate CI's or p-values because
 he does not know how to reliably do so.
 
 However, the key remark in your query was:
 
  (2) lmer does not give p values or confidence intervals for the fixed 
  effects. How are we to interpret the estimates given that no p value or CI 
  is given for the estimates?
 
 Think about it. A statistical analysis -- ANY statistical analysis --
 treats the data in isolation: it is not informed by physics,
 thermodynamics, biology,  other similar data, prior experience, or,
 indeed, any part of the body of relevant scientific knowledge. Do you
 really think that any such analysis, especially when predicated upon
 often tenuous or even (necessarily) unverifiable assumptions and
 simplifications should be considered authoritative? Classical
 statistical inference is just another piece of the puzzle, and not
 even particularly useful when, as is typically the case, hypotheses
 are formulated AFTER seeing the data (this invalidates the probability
 calculations -- hypotheses must be formulated before seeing the data
 to be meaningfully assessed). Leo Breiman called this statistics'
 "quiet scandal" something like 20 years ago, and he was no dummy.
 
 It is comforting, perhaps, but illusory to believe that statistical
 inference can be relied on to give sound, objective scientific
 results. True, without such a framework, science seems rather
 subjective, perhaps closer to religion and arbitrary cultural
 archetypes than we care to admit. But see Thomas Kuhn and Paul
 Feyerabend for why this is neither surprising nor necessarily a bad
 thing.
 
 Cheers,
 Bert Gunter
 
 
 
 
 On Thu, Sep 9, 2010 at 8:00 PM, John Sorkin jsor...@grecc.umaryland.edu 
 wrote:
  windows Vista
  R 2.10.1
 
 
  (1) How can I get the complete table for the fixed effects from lmer? As 
  can be seen from the example below, fixef(fit3) only gives the estimates and 
  not the SE or t value.
 
  fit3 <- lmer(y ~ time + (1|Subject) + (time|Subject), data = data.frame(data))
  summary(fit3)
  Linear mixed model fit by REML
  Formula: y ~ time + (1 | Subject) + (time | Subject)
Data: data.frame(data)
       AIC    BIC logLik deviance REMLdev
    -126.2 -116.4   70.1   -152.5  -140.2
  Random effects:
   Groups   Name        Variance    Std.Dev.    Corr
   Subject  (Intercept) 2.9311e+01  5.41396385
   Subject  (Intercept) 0.0000e+00  0.00000000
            time        0.0000e+00  0.00000000  NaN
   Residual             8.1591e-07  0.00090328
  Number of obs: 30, groups: Subject, 10
 
  Fixed effects:
  Estimate Std. Error t value
  (Intercept) 14.998216   1.712046   9
  time-0.999779   0.000202   -4950
 
  Correlation of Fixed Effects:
  (Intr)
  time -0.001
  fixef(fit3)
  (Intercept)time
   14.9982158  -0.9997793
 
  (2) lmer does not give p values or confidence intervals for the fixed 
  effects. How are we to interpret the estimates given that no p value or CI 
  is given for the estimates?
 
 
 
 
  John David Sorkin M.D., Ph.D.
  Chief, Biostatistics and Informatics
  University of Maryland School of Medicine Division of Gerontology
  Baltimore VA Medical Center
  10 North Greene Street
  GRECC (BT/18/GR)
  Baltimore, 

[R] (no subject)

2010-09-10 Thread Tanvir Khan
Hello,
I'm trying to do bar plot where 'sex' will be the category axis and
'occupation' will represent the bars and the clusters will represent
the mean 'income'.

      sex occupation income
1  female          j     12
2    male          b     34
3    male          j     22
4  female          j     54
5    male          b     33
6  female          b     67
7    male          j     89
8    male          b     65
9  female          j     45
10   male          j     32

I can do bar plot where sex is the category axis and the clusters
represent 'occupation'.
the code is-

 t <- table(data$sex, data$occupation)
 barplot(t)

and the barplot where the category axis is 'sex' and the clusters
represent the mean income and median income. The code is:

 mean <- tapply(data$income, data$sex, mean)
 mean
 female   male
   38.7   46.5
 median <- tapply(data$income, data$sex, median)
 median
 female   male
   22.5   49.5
 r <- rbind(mean, median)
 r
        female male
 mean     38.7 46.5
 median   22.5 49.5
 par(fg = 'red', cex = 1.2)
 barplot(r, col = c('green','yellow'), cex.axis = 1.2, col.axis = 'red',
         ylim = c(0, 120))

But how can I make 'occupation'' to nest inside 'sex' and then the
cluster to represent the mean income?
For example I am attaching a pdf plot that is produced by SPSS.

Thank you.


-- 
Tanvir Khan
MS Student
Institute Of Statistical Research  Training
University Of Dhaka
tkh...@isrt.ac.bd
khan.tanvir_...@ymail.com


OUTPUT.pdf
Description: Adobe PDF document


Re: [R] rgl and lighting

2010-09-10 Thread james.foadi
Yes. A white cube and all lights off to start with give me what I want.

Many thanks!

J

Dr James Foadi PhD
Membrane Protein Laboratory (MPL)
Diamond Light Source Ltd
Diamond House
Harewell Science and Innovation Campus
Chilton, Didcot
Oxfordshire OX11 0DE

Email:  james.fo...@diamond.ac.uk
Alt Email:  j.fo...@imperial.ac.uk



-Original Message-
From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
Sent: Thu 09/09/2010 18:00
To: Foadi, James (Imperial Coll.,RAL,DIA)
Cc: r-help@r-project.org
Subject: Re: [R] rgl and lighting
 

  On 09/09/2010 12:02 PM, james.fo...@diamond.ac.uk wrote:
 Dear R community (and Duncan more specifically),
 I can't work out how to make additional light sources work in rgl.
 Here is the example.

 First I create a cube and visualize it:

   cubo <- cube3d(col = "black")
   shade3d(cubo)

 Next I position the viewpoint at theta=0 and phi=30:

   view3d(theta = 0, phi = 30)

 Next, I want to create a 2nd light source which diffuses red light from the 
 front face. I thought I could do:

 light3d(diffuse = "red", theta = 0, phi = 0)

 but... the front face doesn't show any redness. The same goes for specular
 and ambient.
 What am I doing wrong here? How can the front face be shown in red?

Black doesn't reflect anything, so that's why you're not seeing the 
red.  Colour the cube white, and you'll see it turn pink when you turn 
the red light on, or red if you turn off the default light first (using 
rgl.pop("lights")).

Be aware that OpenGL (underlying rgl) has a fairly complicated lighting 
model.  When you say col = "black", you're only setting the ambient 
colour, i.e. the colour that appears the same in all directions. (It 
won't be the same on all faces of the cube, because the intensity 
depends on the incoming light.) There is also a specular component, 
which makes things appear shiny, because it's brighter from some 
viewpoints than others.  It is normally white.  Finally there's an 
emission component, which doesn't care about lighting, but is normally 
turned off.

Lights also have 3 components, ambient (non-directional), diffuse 
(somewhat directional), and specular (highly directional).
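Putting the advice above together, a sketch of the full sequence (assuming rgl is installed and an OpenGL display is available):

```r
library(rgl)

cubo <- cube3d(col = "white")   # white reflects the coloured light
shade3d(cubo)
view3d(theta = 0, phi = 30)

rgl.pop("lights")                              # remove the default white light
light3d(diffuse = "red", theta = 0, phi = 0)   # cube now renders red
```

Skipping the rgl.pop() step leaves the default white light in place, which is why the cube then looks pink rather than red.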

Duncan Murdoch









Re: [R] faster unlist,strsplit,gsub,for

2010-09-10 Thread Romain Francois

Hi,

You can leverage read.table using a textConnection:

> txt <- "x,y,z,a,b,c,d<d>a,b,c,d,e,f,g<d>"
> con <- textConnection( gsub( "<d>", "\n", txt ) )
> read.table( con, sep = "," )
  V1 V2 V3 V4 V5 V6 V7
1  x  y  z  a  b  c  d
2  a  b  c  d  e  f  g
> close( con )
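For comparison, a sketch that avoids both the for loop and the row-by-row data.frame assignment described below -- the row-wise df[i,] <- assignment is the usual bottleneck. The "<d>" delimiter and the sample strings are my reading of the garbled original:

```r
strings <- c("x,y,z,a,b,c,d<d>a,b,c,d,e,f,g<d>",
             "1,2,3,4,5,6,7<d>8,9,10,11,12,13,14<d>")  # made-up sample input

rows <- unlist(strsplit(strings, "<d>", fixed = TRUE))  # one element per row
df <- as.data.frame(do.call(rbind, strsplit(rows, ",", fixed = TRUE)),
                    stringsAsFactors = FALSE)           # build in one shot
```

Building the whole matrix first and converting once is typically orders of magnitude faster than growing a data frame inside a loop.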

Romain

Le 10/09/10 06:41, rajesh j a écrit :


OK. These operations are on a string and the result is added to a
data.frame. I have strings of the form

x,y,z,a,b,c,d<d>a,b,c,d,e,f,g<d>

essentially comma-separated values delimited by a <d>.
I first do

unlist(strsplit(string, split = "<d>"))

and then

strsplit(string, split = ",")

The list of vectors I end up with is added row by row to a preallocated
data.frame, like

df[i,] <- list[[i]]

All of this is in a for loop that runs at least 1000 times, and the
strings are 7000 to 8000 characters in length.



On Fri, Sep 10, 2010 at 9:14 AM, jim holtman jholt...@gmail.com wrote:


First thing to do is to use Rprof to profile your code to see where
the time is being spent, then you can make a decision as to what to
change.  Are you carrying out the operations on a dataframe, if so can
you change it to a matrix for some of the operations?  You have
provided no idea of what your code or data looks like, or how often
each of the operations is being done.

There are probably many ways of speeding up the code, but with no idea
of what the code is, no solutions can be specified.

On Thu, Sep 9, 2010 at 11:09 PM, rajesh j akshay.raj...@gmail.com wrote:

 Hi,

 I perform the operations unlist, strsplit, gsub and a for loop on a lot of
 strings, and it heavily slows down the overall system. Is there some way
 for me to speed up these operations -- maybe alternate versions exist that
 use multiple processors, etc.?

--
Rajesh.J



--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/bzoWrs : Rcpp svn revision 2000
|- http://bit.ly/b8VNE2 : Rcpp at LondonR, oct 5th
`- http://bit.ly/aAyra4 : highlight 0.2-2



Re: [R] (no subject)

2010-09-10 Thread ONKELINX, Thierry
Have a look at the ggplot2 package:

install.packages("ggplot2")
library(ggplot2)
ggplot(data, aes(x = sex, y = income, fill = occupation)) +
    geom_bar(position = "dodge")


Have a look at http://had.co.nz/ggplot2/ for more information and
examples.

HTH,

Thierry




ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey
  





Re: [R] Alignment of lines within barplot bars

2010-09-10 Thread Jim Lemon

On 09/10/2010 01:35 AM, Steve Murray wrote:


Dear all,

I have a barplot upon which I hope to superimpose horizontal lines extending 
across the width of each bar. I am able to partly achieve this through the 
following set of commands:

positions <- barplot(bar_values, col = "grey")
par(new = TRUE)
plot(positions, horiz_values, col = "red", pch = "_",
     ylim = c(min(bar_values), max(bar_values)))


...however this results in small, off-centred lines, which don't extend across 
the width of each bar. I've tried using 'cex' to increase the width, but of 
course this also increases the height of the line and results in it spanning a 
large range of y-axis values.


I'm sure this shouldn't be too tricky to achieve, nor that uncommon a problem! 
It may be that I'm taking the wrong approach.


Hi Steve,
The barp function in the plotrix package centers the bars on integer 
values and allows you to control the width of the bars (0.4 on each side 
is the default). This makes it easier to calculate the values for 
the segments function to superimpose the lines.
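Staying with base graphics is also possible, because barplot() returns the bar midpoints and the default bars have width 1. Here bar_values and horiz_values are placeholders for the poster's data:

```r
bar_values   <- c(3, 5, 2, 6)          # hypothetical bar heights
horiz_values <- c(2.5, 4.0, 3.0, 5.0)  # hypothetical line levels

positions <- barplot(bar_values, col = "grey")
halfwidth <- 0.5                       # default barplot bars have width 1
segments(positions - halfwidth, horiz_values,
         positions + halfwidth, horiz_values, col = "red", lwd = 2)
```

segments() draws in the same coordinate system as the barplot, so no par(new = TRUE) overlay (and no second set of axis limits) is needed.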


Jim



[R] Barplots, was Re: (no subject)

2010-09-10 Thread peter dalgaard

On Sep 10, 2010, at 11:38 , Tanvir Khan wrote:

 Hello,
 I'm trying to do bar plot where 'sex' will be the category axis and
 'occupation' will represent the bars and the clusters will represent
 the mean 'income'.
 
   sex occupation   income
 1  female  j 12
 2male  b34
 3male  j 22
 4  female  j54
 5male  b   33
 6  female  b   67
 7male  j89
 8male  b  65
 9  female  j  45
 10   male  j  32
 
 I can do bar plot where sex is the category axis and the clusters
 represent 'occupation'.
 the code is-
 
 t <- table(data$sex, data$occupation)
 barplot(t)
 
 and the barplot where the category axis is 'sex' and the cluster
 represent the mean income and median income. The code is -
 mean=tapply(data$income,data$sex,mean)
 mean
  female male
 38.7 46.5
 median=tapply(data$income,data$sex,median)
 median
 female   male
  22.5   49.5
 r=rbind(mean,median)
 r
   female male
 mean38.7 46.5
 median 22.549.5
 par(fg='red',cex=1.2)
 barplot(r,col=c('green','yellow'),cex.axis=1.2,col.axis='red',ylim=c(0,120))
 
 But how can I make 'occupation'' to nest inside 'sex' and then the
 cluster to represent the mean income?
 For example I am attaching a pdf plot that is produced by SPSS.


(Please use sensible Subject:, those of us with threading mail programs see it 
mixed in with oodles of old and irrelevant posts.)

I think you are just looking for some of this

inc.by.sex.occ <- with(data, tapply(income, list(sex, occupation), mean))
barplot(inc.by.sex.occ)
barplot(inc.by.sex.occ, beside=T)
barplot(t(inc.by.sex.occ))
barplot(t(inc.by.sex.occ),beside=T)

(or, of course, reverse the order in the list() rather than transpose the 
result)




-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Axis break with gap.plot()

2010-09-10 Thread Jim Lemon

On 09/10/2010 01:07 AM, Filoche wrote:


Hi everyone.

I'm trying to break the y axis on a plot. For instance, I have 2 series
(points and a loess). Since the loess is a continuous set of points, it
passes in the break section. However, with gap.plot I can't plot the loess
because of this (I got the message "some values of y will not be
displayed").

Here's my code:

library(plotrix);


#generate some data
x = seq(-pi,pi,0.1);
sinx = sin(x);

#add leverage value
sinx = c(sinx,10);
xx = c(x,max(x) + 0.1);

#Loess
yy = loess(sinx ~ xx, span = 0.1);
yy = predict(yy);

#Add break between 2 and 8
gap.plot(xx,sinx,c(2,8)); # This line works fine
gap.plot(xx,yy,c(2,8), add = TRUE); # This won't plot the loess

I did the graphic I would like to produce in Sigmaplot.

http://img830.imageshack.us/img830/5206/breakaxis.jpg


Hi Phil,
The loess is being displayed, but because it is just reproducing the 
points already there, except for one or two, you don't see it.

If you try this:

gap.plot(xx,yy,c(2,8), add = TRUE, type="l");

you'll see the line, although you won't get the uptick at the end 
because it passes through the gap. It would require a bit of manual 
labor to get the same plot as your example. If you have to do just one 
of these, I would probably recalculate the loess fit to account for the 
gap and display it with lines. If you have to do lots, I would think 
about writing a function to do this that you could call instead of the 
second call to gap.plot.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (no subject)

2010-09-10 Thread Jim Lemon

On 09/10/2010 07:38 PM, Tanvir Khan wrote:

Hello,
I'm trying to do bar plot where 'sex' will be the category axis and
'occupation' will represent the bars and the clusters will represent
the mean 'income'.

sex occupation   income
1  female  j 12
2male  b34
3male  j 22
4  female  j54
5male  b   33
6  female  b   67
7male  j89
8male  b  65
9  female  j  45
10   male  j  32

I can do bar plot where sex is the category axis and the clusters
represent 'occupation'.
the code is-


t <- table(data$sex,data$occupation)
barplot(t)


and the barplot where the category axis is 'sex' and the cluster
represent the mean income and median income. The code is -

mean=tapply(data$income,data$sex,mean)
mean

   female male
38.7 46.5

median=tapply(data$income,data$sex,median)
median

female   male
   22.5   49.5

r=rbind(mean,median)
r

female male
mean38.7 46.5
median 22.549.5

par(fg='red',cex=1.2)
barplot(r,col=c('green','yellow'),cex.axis=1.2,col.axis='red',ylim=c(0,120))


But how can I make 'occupation'' to nest inside 'sex' and then the
cluster to represent the mean income?
For example I am attaching a pdf plot that is produced by SPSS.


Hi Tanvir,
I think you may be looking for barNest:

library(plotrix)  # barNest is in the plotrix package
examp <- data.frame(sex=sample(c("Female","Male"),100,TRUE),
 position=sample(c("Clerical","Custodial","Manager"),100,TRUE),
 income=rnorm(100,3,5000))
# just show the final bars
barNest(income~sex+position,examp,
 col=list("gray",c("pink","lightblue"),c("blue","green","tan")))
# show the entire nest of bars
barNest(income~sex+position,examp,showall=TRUE,
 col=list("gray",c("pink","lightblue"),c("blue","green","tan")))

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Standardized logistic regression coefficients

2010-09-10 Thread Iasonas Lamprianou
Dear all,

I am looking for ways to compute standardized logistic regression coefficients. 
I found papers describing at least 6 different ways to standardize logistic 
regression coefficients. I also found a very old (Thu May 12 21:50:36 CEST 
2005) suggestion by Frank E Harrell (one of the colleagues who frequently 
contribute on this list) saying...

Design doesn't implement those because they have terrible properties. 
Instead consider interquartile-range odds ratios (done by summary.Design 
by typing summary(. . .)).


1. Is this still the case, or is there any package today in R which computes 
some sort of standardized logistic regression coefficients widely accepted by 
the community? 

2. Also, if anyone knows, how can I implement these interquartile-range odds 
ratios from the Design package? I checked the manual and found no reference to 
interquartile-range odds ratios.
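On question 2: the Design package has since been superseded by rms, where summary() on an lrm fit reports, for each continuous predictor, the effect (and odds ratio) of moving from the first to the third quartile by default. A hedged sketch with simulated data (variable names and the simulated relationship are illustrative, not from the post):

```r
library(rms)
set.seed(1)
mydata <- data.frame(age = rnorm(200, 50, 10),
                     sex = factor(sample(c("f", "m"), 200, TRUE)))
mydata$y <- rbinom(200, 1, plogis(-2 + 0.04 * mydata$age))
dd <- datadist(mydata); options(datadist = "dd")  # summary() needs a datadist
fit <- lrm(y ~ age + sex, data = mydata)
summary(fit)  # the 'age' effect is reported over its interquartile range
```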

Thanks



Dr. Iasonas Lamprianou


Assistant Professor (Educational Research and Evaluation)
Department of Education Sciences
European University-Cyprus
P.O. Box 22006
1516 Nicosia
Cyprus 
Tel.: +357-22-713178
Fax: +357-22-590539


Honorary Research Fellow
Department of Education
The University of Manchester
Oxford Road, Manchester M13 9PL, UK
Tel. 0044  161 275 3485
iasonas.lampria...@manchester.ac.uk




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Standardized Logistic Regression coefficients

2010-09-10 Thread Iasonas Lamprianou
Dear all,

I am looking for ways to compute standardized logistic regression coefficients. 
Before asking for your time I did some homework. I found papers describing at 
least 6 different ways to standardize logistic regression coefficients. I also 
found a very old (Thu May 12 21:50:36 CEST 2005) suggestion by Frank E Harrell 
(one of the colleagues who frequently contribute on this list) saying...

Design doesn't implement those because they have terrible properties.
Instead consider interquartile-range odds ratios (done by summary.Design
by typing summary(. . .)).


1. Is this still the case, or is there any package today in R which computes 
some sort of standardized logistic regression coefficients widely accepted by 
the community?

2. Also, if anyone knows, how can I implement these interquartile-range odds 
ratios from the Design package? I checked the manual and found no reference to 
interquartile-range odds ratios.

Thanks
Dr. Iasonas Lamprianou


Assistant Professor (Educational Research and Evaluation)
Department of Education Sciences
European University-Cyprus
P.O. Box 22006
1516 Nicosia
Cyprus 
Tel.: +357-22-713178
Fax: +357-22-590539


Honorary Research Fellow
Department of Education
The University of Manchester
Oxford Road, Manchester M13 9PL, UK
Tel. 0044  161 275 3485
iasonas.lampria...@manchester.ac.uk




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gee p values

2010-09-10 Thread Peng, C

There are two z-scores reported in the summary: Naive z and Robust z.

pvalue = 2*min(pnorm(z), 1 - pnorm(z))   # two-sided test, z = Naive or Robust z
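A hedged, self-contained sketch of doing this from a gee fit (the data are simulated here; the "Robust z" column name is as printed by summary.gee in the gee package):

```r
library(gee)
set.seed(1)
# simulated longitudinal data: 30 subjects, 4 observations each
mydata <- data.frame(subject = rep(1:30, each = 4), x = rnorm(120))
mydata$y <- 0.5 * mydata$x + rnorm(120)
fit <- gee(y ~ x, id = subject, data = mydata, family = gaussian,
           corstr = "exchangeable")
ctab <- summary(fit)$coefficients
# two-sided p-values from the robust z-statistics
2 * pnorm(-abs(ctab[, "Robust z"]))
```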

-- 
View this message in context: 
http://r.789695.n4.nabble.com/gee-p-values-tp2533835p2534302.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lmer output

2010-09-10 Thread Denis . Aydin
Hi
I have a question regarding an output of a binomial lmer-model.
The model is as follows:
lmer(y~diet * day * female + (day|female),family=binomial)

The corresponding output is:
Generalized linear mixed model fit by the Laplace approximation 
Formula: y ~ diet * day * female + (day | female) 
  AIC  BIC logLik deviance
 1084 1136 -531.1 1062
Random effects:
 Groups NameVariance Std.Dev. Corr 
 female (Intercept) 1.403060 1.18451 
day 0.012044 0.10975  -0.674 
Number of obs: 809, groups: female, 53

Fixed effects:
Estimate Std. Error z value Pr(>|z|) 
(Intercept) 0.996444   0.713703   1.396   0.1627 
dietNAA 1.194581   0.862294   1.385   0.1659 
day 0.142026   0.074270   1.912   0.0558 .
female  0.015629   0.019156   0.816   0.4146 
dietNAA:day-0.124755   0.088684  -1.407   0.1595 
dietNAA:female -0.024733   0.026947  -0.918   0.3587 
day:female -0.001535   0.001966  -0.781   0.4348 
dietNAA:day:female  0.001543   0.002720   0.568   0.5704 

Now from my understanding, the estimates represent differences in slopes 
and intercepts between different levels of diet and so on.

My questions:

1. Is there a way to display the coefficients for all levels of variables 
(e.g., dietAA and dietNAA)? Because it is quite hard to calculate the 
slopes and intercepts for all levels of each variable.

2. Is there a way to get the degrees of freedom?

Thanks for your help.

Regards,
Denis Aydin

--
This email and any files transmitted with it are confide...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gee p values

2010-09-10 Thread John Sorkin
Peng,
If the answer were as simple as you suggest, I would expect that gee would 
automatically produce the p values. Since gee does not produce the values, I 
fear that the computation may be more complex, or perhaps computing p values 
from gee may be controversial. Do you know which, if either, of my speculations 
is true?
Thank you,
John




John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) Peng, 
C cpeng@gmail.com 9/10/2010 8:06 AM 

There are two z-scores reported in the summary: Naive z and Robust z.

pvalue = 2*min(pnorm(z), 1 - pnorm(z))   # two-sided test, z = Naive or Robust z

-- 
View this message in context: 
http://r.789695.n4.nabble.com/gee-p-values-tp2533835p2534302.html 
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:6}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lme vs. lmer, how do they differ?

2010-09-10 Thread John Sorkin
windows Vista
R 2.10.1

What is the difference (or differences) between lme and lmer? Both appear to 
perform mixed effects regression analyses.
Thanks
John 


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:6}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lme, groupedData, random intercept and slope

2010-09-10 Thread John Sorkin
Windows Vista
R 2.10.1




Does the following use of groupedData and lme produce an analysis with both 
random intercept and slope, or only random slope?


zz <- groupedData(y ~ time | Subject, data = data.frame(data),
  labels = list( x = "Time",
y = "y" ),
  units = list( x = "(yr)", y = "(mm)")
)
plot(zz)

fit10 <- lme(zz)
summary(fit10)

Linear mixed-effects model fit by REML
 Data: zz 
AIC   BIC  logLik
  -123.1942 -115.2010 67.5971

Random effects:
 Formula: ~time | Subject
 Structure: General positive-definite
StdDev   Corr  
(Intercept) 6.054897e+00 (Intr)
time4.160662e-05 1 
Residual9.775954e-04   

Fixed effects: y ~ time 
Value Std.Error DF   t-value p-value
(Intercept) 15.000217  1.914727 19 7.834   0
time-1.51  0.000219 19 -4566.598   0
 Correlation: 
 (Intr)
time 0.059 

Standardized Within-Group Residuals:
Min  Q1 Med  Q3 Max 
-1.73706837 -0.36289558  0.06892484  0.59777067  1.69095476 

Number of Observations: 30
Number of Groups: 10 
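The Random effects block above lists standard deviations for both (Intercept) and time, so this fit does include a random intercept as well as a random slope: calling lme on a groupedData object takes the random part from the display formula. A hedged, equivalent explicit call, reusing the poster's zz, would be:

```r
library(nlme)
# explicit random-effects formula: intercept and time slope per Subject
fit10 <- lme(y ~ time, data = zz, random = ~ time | Subject)
summary(fit10)
```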


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:6}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] plyr: version 1.2

2010-09-10 Thread Hadley Wickham
plyr is a set of tools for a common set of problems: you need to
__split__ up a big data structure into homogeneous pieces, __apply__ a
function to each piece and then __combine__ all the results back
together. For example, you might want to:

  * fit the same model to each patient subset of a data frame
  * quickly calculate summary statistics for each group
  * perform group-wise transformations like scaling or standardising

It's already possible to do this with base R functions (like split and
the apply family of functions), but plyr makes it all a bit easier
with:

  * totally consistent names, arguments and outputs
  * convenient parallelisation through the foreach package
  * input from and output to data.frames, matrices and lists
  * progress bars to keep track of long running operations
  * built-in error recovery, and informative error messages
  * labels that are maintained across all transformations

Considerable effort has been put into making plyr fast and memory
efficient, and in many cases plyr is as fast as, or faster than, the
built-in functions.

You can find out more at http://had.co.nz/plyr/, including a 20 page
introductory guide, http://had.co.nz/plyr/plyr-intro.pdf.  You can ask
questions about plyr (and data-manipulation in general) on the plyr
mailing list. Sign up at http://groups.google.com/group/manipulatr

Version 1.2 (2010-09-09)
--

NEW FEATURES

* l*ply, d*ply, a*ply and m*ply all gain a .parallel argument that when TRUE,
  applies functions in parallel using a parallel backend registered with the
  foreach package:

  x <- seq_len(20)
  wait <- function(i) Sys.sleep(0.1)
  system.time(llply(x, wait))
  #  user  system elapsed
  # 0.007   0.005   2.005

  library(doMC)
  registerDoMC(2)
  system.time(llply(x, wait, .parallel = TRUE))
  #  user  system elapsed
  # 0.020   0.011   1.038

  This work has been generously supported by BD (Becton Dickinson).

MINOR CHANGES

* a*ply and m*ply gain an .expand argument that controls whether data frames
  produce a single output dimension (one element for each row), or an output
  dimension for each variable.

* new vaggregate (vector aggregate) function, which is equivalent to tapply,
  but much faster (~ 10x), since it avoids copying the data.

* llply: for simple lists and vectors, with no progress bar, no extra info,
  and no parallelisation, llply calls lapply directly to avoid all the
  overhead associated with those unused extra features.

* llply: in serial case, for loop replaced with custom C function that takes
  about 40% less time (or about 20% less time than lapply). Note that as a
  whole, llply still has much more overhead than lapply.

* round_any now lives in plyr instead of reshape

BUG FIXES

* list_to_array works correct even when there are missing values in the array.
  This is particularly important for daply.


-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] reshape2: a reboot of the reshape package

2010-09-10 Thread Hadley Wickham
Reshape2 is a reboot of the reshape package. It's been over five years
since the first release of the package, and in that time I've learned
a tremendous amount about R programming, and how to work with data in
R. Reshape2 uses that knowledge to make a new package for reshaping
data that is much more focussed and much much faster.

This version improves speed at the cost of functionality, so I have
renamed it to `reshape2` to avoid causing problems for existing users.
 Based on user feedback I may reintroduce some of these features.

What's new in `reshape2`:

 * considerably faster and more memory efficient thanks to a much better
   underlying algorithm that uses the power and speed of subsetting to the
   fullest extent, in most cases only making a single copy of the data.

 * cast is replaced by two functions depending on the output type: `dcast`
   produces data frames, and `acast` produces matrices/arrays.

 * multidimensional margins are now possible: `grand_row` and `grand_col` have
   been dropped: now the name of the margin refers to the variable that has
   its value set to (all).

 * some features have been removed such as the `|` cast operator, and the
   ability to return multiple values from an aggregation function. I'm
   reasonably sure both these operations are better performed by plyr.

 * a new cast syntax which allows you to reshape based on functions
   of variables (based on the same underlying syntax as plyr):

 * better development practices like namespaces and tests.
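As a hedged sketch of the new dcast/acast split (the data frame and column names here are illustrative, not from the announcement):

```r
library(reshape2)
df <- data.frame(sex = c("f","f","m","m"),
                 occ = c("a","b","a","b"),
                 income = c(10, 20, 30, 40))
molten <- melt(df, id.vars = c("sex", "occ"))
dcast(molten, sex ~ occ, value.var = "value")  # data frame output
acast(molten, sex ~ occ, value.var = "value")  # matrix output
```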

This work has been generously supported by BD (Becton Dickinson).


-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Over lay 2 scale in same plot

2010-09-10 Thread mamunbabu2001

Hi Josh,
Thanks for your reply. I gave a reply yesterday but found that it was not
posted.
I managed to plot the bar plot and overlay the points. 

The problem I am facing now is the spread of the Y scale. The values I am
plotting on the Y scale are very close, so they look pretty flat (lowest
value 7.5 and highest value 8.5; if the Y-scale ranges are 6-8, 8-10, the
values look pretty flat). How can I make the spread of the Y scale finer,
i.e. 6.2 - 6.4 - 6.6 - ... 8.8 - 10, so that the values do not look flat?
I have added an image below.

http://r.789695.n4.nabble.com/file/n2534370/_GE_and_CN_Combined_Plot_mod.png 

Thanks in advance.

regards,
Mamun
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Over-lay-2-scale-in-same-plot-tp2528661p2534370.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lme vs. lmer, how do they differ?

2010-09-10 Thread S Ellison
 John Sorkin jsor...@grecc.umaryland.edu 10/09/2010 13:21:09 
What is the difference (or differences) between lme and lmer? Both
appear to perform mixed effects regression analyses.

From a user's point of view:
- lme only accepts nested random effects; lmer handles crossed random
effects
- lme has a convenient method of handling heteroscedasticity; lmer
doesn't.
- lme gives you p-values; lmer doesn't (there is explanation of why at
https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html which, I
guess, should not be all that reassuring for lme users either)


Steve Ellison


***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lme vs. lmer, how do they differ?

2010-09-10 Thread Ben Bolker
John Sorkin jsorkin at grecc.umaryland.edu writes:

 
 windows Vista
 R 2.10.1
 
 What is the difference (or differences) between lme and lmer? Both appear to
perform mixed effects
 regression analyses.
 Thanks
 John 
 

  in a nutshell:

  lmer is newer, much faster, handles crossed random effects well (and
generalized linear mixed models), has some support for producing
likelihood profiles (in the development version), and is under rapid 
development.
It does not attempt to estimate residual degrees of freedom and
hence does not give p-values for significance of effects.
  lme is older, better documented (Pinheiro and Bates 2000), more
stable, and handles 'R-side' structures (heteroscedasticity,
within-group correlations).

  r-sig-mixed-models is a good place for questions about these
packages.

  Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?

2010-09-10 Thread Ista Zahn
Hi,

On Fri, Sep 10, 2010 at 4:05 AM, omerle ome...@laposte.net wrote:
 Hi,

 I got two questions :

 1st Question
              a <- "S"
              b <- data.frame(S=3)
              do.call(`<-`, list(do.call(`$`, list(b, "S")), 5))

I think there is some confusion here. Why are you setting a equal to
"S" but then never using it?

 => How can I put new values in the S column, having the column name in a variable?

I'm having trouble parsing this. What exactly do you want to do?


 2 nd Question
    a <- "S"
        b <- data.frame(S=3)
        b[,"S"]=list(1:10) # Doesn't work
        b$S=list(1:10) # Works
 => Isn't it the same thing? What is the difference between these two?

I believe b[["S"]] is the same as b$S; b[,"S"] is different. But I
have to question your assertion that b$S=list(1:10) "works". This is a
very odd construction (putting a list as an element of a data.frame)
and is almost certainly not what you want.



 Thanks,

 A free mailbox, guaranteed for life, with extra services. Tempted?
 I'm creating my mailbox at www.laposte.net

        [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Which language is faster for numerical computation?

2010-09-10 Thread Prof. John C Nash
Dirk E. has properly focussed the discussion on measurement rather than opinion. I'll add 
the issue of the human time taken to convert, and more importantly debug, interfaced code. 
That too could be measured, but we rarely see human hours to code/debug/test reported.


Moreover, I'll mention the cat among the pigeons of Rcgmin, which I wrote to allow me to 
play with an optimization code more easily to discover where the algorithm might be 
improved. The resulting package on some problems outperforms C equivalents. Now the code 
is quite vectorized, but this has still been a very nice surprise. In fact, I've decided 
to avoid playing around with the interfaces if I can run things well-enough entirely in R.


JN

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SSOAP complex request types

2010-09-10 Thread jonnyboi
Hi, I'm having a bit of trouble using SSOAP to send requests containing
complex types.  It seems as though processWSDL does not generate functions
for converting the generated types into SOAP requests, but it does generate
them for converting **from** SOAP requests to the complex types.

The error that I get when trying to pass an object of type LoginReq (as
generated by processWSDL) is that LoginReq does not inherit method toSOAP
with signature (obj=LoginReq, con=XMLInternalElementNode, type=NULL).

Does anyone with experience of this library know if there's something I'm
doing wrong? Or is it the case that sending complex request types isn't
supported?

Thanks,

Jonny

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?

2010-09-10 Thread omerle
 Message du 10/09/10 14:53
 De : Ista Zahn
 A : omerle
 Copie à : r-help@r-project.org
 Objet : Re: [R] Data.frames : difference between x$a and x[, a] ? - How set 
 new values on x$a with a as variable ?


 Hi,

 On Fri, Sep 10, 2010 at 4:05 AM, omerle wrote:
  Hi,
 
  I got two questions :
 
  1st Question
              a <- "S"
              b <- data.frame(S=3)
              do.call(`<-`, list(do.call(`$`, list(b, "S")), 5))

 I think there is some confusion here. Why are you setting a equal to
 "S" but then never using it?

  => How can I put new values in the S column, having the column name in a
  variable?

 I'm having trouble parsing this. What exactly do you want to do?
1 - Put a list as an element of a data.frame. That's quite convenient for my 
pricing function.
 
 
  2 nd Question
         a <- "S"
         b <- data.frame(S=3)
         b[,"S"]=list(1:10) # Doesn't work
         b$S=list(1:10) # Works
  => Isn't it the same thing? What is the difference between these two?

 I believe b[["S"]] is the same as b$S; b[,"S"] is different. But I
 have to question your assertion that b$S=list(1:10) "works". This is a
 very odd construction (putting a list as an element of a data.frame)
 and is almost certainly not what you want.
2 - That's what I want. I figured out just five minutes ago that b[["S"]] works 
because it's the same thing as b$S.
But I still don't know what b[,"S"] is compared to b[["S"]].
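A minimal sketch of the difference: for a data frame, `[[` and `$` extract or replace a single column directly (so a list value becomes a list column), while `[,]` goes through the data-frame `[<-` method, which unwraps the list and checks its length against the number of rows, hence the failure above:

```r
b <- data.frame(S = 3)
identical(b[["S"]], b$S)      # TRUE: same extraction
b$S <- list(1:10)             # works: S becomes a list column (length 1)
b2 <- data.frame(S = 3)
try(b2[, "S"] <- list(1:10))  # fails: 10 replacement values, 1 row
```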
 
 
  Thanks,
 
 
         [[alternative HTML version deleted]]
 
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 



 --
 Ista Zahn
 Graduate student
 University of Rochester
 Department of Clinical and Social Psychology
 http://yourpsyche.org

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] package gbm C++ code as separate module

2010-09-10 Thread Markus Loecher
Dear all,
I would like to separate the gbm C++ code from any R dependencies, so that
it could be compiled into a standalone module.
I am wondering if anyone has already done this and could provide me with
some pointers/help ?

Thanks!

Markus

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculating with tolerances (error propagation)

2010-09-10 Thread David Winsemius


On Sep 9, 2010, at 10:57 AM, David Winsemius wrote:



On Sep 9, 2010, at 6:50 AM, Jan private wrote:


Hello Bernardo,

-
If I understood  your problem this script solve your problem:

q <- 0.15 + c(-.1, 0, .1)
h <- 10 + c(-.1, 0, .1)
5*q*h
[1]  2.475  7.500 12.625
-

OK, this solves the simple example.
But what if the example is not that simple. E.g.

P = 5 * q/h

Here, to get the maximum tolerance for P, we need to divide the maximum
value for q by the minimum value for h, and vice versa. Is there any way
to do this automatically, without thinking about every single step?

There is a thing called interval arithmetic (I saw it as an Octave
package) which would do something like this.


(Sorry for the blank reply posting. Serum caffeine has not yet  
reached optimal levels.)


Is it possible that interval arithmetic would produce statistically  
incorrect tolerance calculations, and that this is why it has not been  
added to R? Those tolerance intervals are presumably some sort of  
(unspecified) prediction intervals (i.e. contain 95% or 63% or some  
fraction of a large sample) and combinations under mathematical  
operations are not going to be properly derived by c( min(XY),  
max(XY) ) since those are not calculated with any understanding of  
combining variances of functions on random variables.


There is a function, propagate, in the qpcR package that does  
incorporate statistical principles in handling error propagation.  
Thanks to the author, Dr. rer. nat. Andrej-Nikolai Spiess, for drawing  
it to my attention (and of course for writing it.). It appears that it  
should handle the data situation offered with only minor  
modifications. The first example is very similar to your (more  
difficult) ratio problem.

 install.packages(pkgs="qpcR", type="source")
# binary install did not succeed on my Mac, but installing from source
# produced no errors or warnings.

 require(qpcR)
 q <- c(0.15, 0.1)
 h <- c(10, 0.1)
 EXPR <- expression(5*q/h)
 DF <- cbind(q, h)
 res <- propagate(expr = EXPR, data = DF, type = "stat",
+  do.sim = TRUE, verbose = TRUE)
 res$summary
                   Sim  Perm        Prop
Mean        0.07500751   NaN  0.0750
s.d.        0.05001970    NA  0.05000562
Median      0.07445724    NA          NA
MAD         0.04922935    NA          NA
Conf.lower -0.02332498    NA -0.02300922
Conf.upper  0.17475818    NA  0.17300922

(My only suggestion for enhancement would be a print or summary method  
that did not output every single simulated value.)


Three methods of error-propagation estimation are included: (a) Monte  
Carlo simulation, sampling from the data; (b) permutation; and (c)  
Gaussian error propagation via a Taylor series expansion. There is a  
rich set of worked examples. It appears to be capable of meeting a  
wide variety of challenges.
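
The Monte-Carlo flavour is easy to sketch in base R (an illustration of the principle only, not qpcR's propagate() implementation; treating the second element of each pair as a normal standard deviation is my assumption):

```r
set.seed(1)
n  <- 1e5
qs <- rnorm(n, mean = 0.15, sd = 0.1)  # q = 0.15, sd 0.1
hs <- rnorm(n, mean = 10,   sd = 0.1)  # h = 10,   sd 0.1
P  <- 5 * qs / hs                      # the expression of interest

c(mean = mean(P), sd = sd(P))          # close to propagate()'s "Prop" column
quantile(P, c(0.025, 0.975))           # simulated 95% confidence limits
```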


--
David.



--
David.


I would have thought that tracking how a (measurement) error propagates
through a complex calculation would be a standard problem of
statistics?


In probability theory, anyway.


In other words, I am looking for a data type which is a
number with a deviation (+/-) somehow attached to it, with binary  
operators

that automatically know how to handle the deviation.


There is the suite of packages that represent theoretic random  
variables and support mathematical operations on them.


See distrDoc and the rest of that suite.




David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?

2010-09-10 Thread Ista Zahn
On Fri, Sep 10, 2010 at 9:22 AM, omerle ome...@laposte.net wrote:
 Message of 10/09/10 14:53
 From: Ista Zahn
 To: omerle
 Cc: r-help@r-project.org
 Subject: Re: [R] Data.frames : difference between x$a and x[, a] ? - How 
 set new values on x$a with a as variable ?


 Hi,

 On Fri, Sep 10, 2010 at 4:05 AM, omerle wrote:
  Hi,
 
  I got two questions :
 
  1st Question
              a = "S"
              b = data.frame(S=3)
              do.call(`<-`, list(do.call(`$`, list(b, "S")), 5))

 I think there is some confusion here. Why are you setting a equal to
 S but then never using it?

  => How can I put new values on the S column, having the column name as a 
  variable?

 I'm having trouble parsing this. What exactly do you want to do?
 1 - Put a list as an element of a data.frame. That's quite convenient for my 
 pricing function.

I think this is a really bad idea. data.frames are not meant to be
used in this way. Why not use a list of lists?


 
  2nd Question
     a = "S"
         b = data.frame(S=3)
         b[, "S"] = list(1:10)  # Doesn't work
         b$S = list(1:10)       # Works
  => Isn't it the same thing? What is the difference between these two?

 I believe b[["S"]] is the same as b$S; b[, "S"] is different. But I
 have to question your assertion that b$S = list(1:10) "works". This is a
 very odd construction (putting a list as an element of a data.frame)
 and is almost certainly not what you want.
 2 - That's what I want. I figured out just five minutes ago that b[["S"]] 
 works because it's the same thing as b$S.
 But I still don't know what b[, "S"] is compared to b[["S"]]

see ?"["
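
For reference, the three accessors differ in what they return (a quick sketch):

```r
b <- data.frame(S = 1:3)

b$S        # the column as a vector
b[["S"]]   # identical to b$S
b[, "S"]   # a single column drops to a vector by default
b["S"]     # a one-column data.frame (list-style indexing)

identical(b$S, b[["S"]])   # TRUE
identical(b$S, b[, "S"])   # TRUE
is.data.frame(b["S"])      # TRUE
```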

 
 
  Thanks,
 
  Une messagerie gratuite, garantie à vie et des services en plus, ça vous 
  tente ?
  Je crée ma boîte mail www.laposte.net
 
         [[alternative HTML version deleted]]
 
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 



 --
 Ista Zahn
 Graduate student
 University of Rochester
 Department of Clinical and Social Psychology
 http://yourpsyche.org




        [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?

2010-09-10 Thread Hadley Wickham
 I'm having trouble parsing this. What exactly do you want to do?
 1 - Put a list as an element of a data.frame. That's quite convenient for my 
 pricing function.

 I think this is a really bad idea. data.frames are not meant to be
 used in this way. Why not use a list of lists?

It can be very convenient, but I suspect the original poster is
confused about the difference between vectors and lists.

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?

2010-09-10 Thread Ivan Calandra


On 9/10/2010 15:37, Ista Zahn wrote:
 On Fri, Sep 10, 2010 at 9:22 AM, omerleome...@laposte.net  wrote:
 Message of 10/09/10 14:53
 From: Ista Zahn
 To: omerle
 Cc: r-help@r-project.org
 Subject: Re: [R] Data.frames : difference between x$a and x[, a] ? - How 
 set new values on x$a with a as variable ?


 Hi,

 On Fri, Sep 10, 2010 at 4:05 AM, omerle wrote:
 Hi,

 I got two questions :

 1st Question
  a = "S"
  b = data.frame(S=3)
  do.call(`<-`, list(do.call(`$`, list(b, "S")), 5))
 I think there is some confusion here. Why are you setting a equal to
 S but then never using it?

 => How can I put new values on the S column, having the column name as a 
 variable?
 I'm having trouble parsing this. What exactly do you want to do?
 1 - Put a list as an element of a data.frame. That's quite convenient for my 
 pricing function.
 I think this is a really bad idea. data.frames are not meant to be
 used in this way. Why not use a list of lists?
Since data.frames are lists, why would it be a bad practice?
 2 nd Question
 a = "S"
 b = data.frame(S=3)
 b[, "S"] = list(1:10)  # Doesn't work
 b$S = list(1:10)       # Works
 => Isn't it the same thing? What is the difference between these two?
 I believe b[["S"]] is the same as b$S; b[, "S"] is different. But I
 have to question your assertion that b$S = list(1:10) "works". This is a
 very odd construction (putting a list as an element of a data.frame)
 and is almost certainly not what you want.
 2 - That's what I want. I figured out just five minutes ago that b[["S"]] 
 works because it's the same thing as b$S.
 But I still don't know what b[, "S"] is compared to b[["S"]]
 see ?"["
 From the help page, there is not much distinction. Maybe I haven't 
understood all the details...

Ivan

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] pairwise.t.test vs t.test

2010-09-10 Thread Jabez Wilson
Dear all, I am perplexed when trying to get the same results using 
pairwise.t.test and t.test.
I'm using examples in the ISwR library, 
attach(red.cell.folate)
I can get the same result for pairwise.t.test and t.test when I set the 
variances to be non-equal, but not when they are assumed to be equal. Can 
anyone explain the differences, or what I'm doing wrong?
Here's an example where I compare the first two ventilations with 
pairwise.t.test and t.test
 pairwise.t.test(folate, ventilation, p.adj="none", pool.sd=FALSE)
    Pairwise comparisons using t tests with non-pooled SD 
data:  folate and ventilation 
  N2O+O2,24h N2O+O2,op
N2O+O2,op 0.029  -    
O2,24h    0.161  0.298    
P value adjustment method: none 

 t.test(folate[1:8], folate[9:17], var.equal=F)
    Welch Two Sample t-test
data:  folate[1:8] and folate[9:17] 
t = 2.4901, df = 11.579, p-value = 0.02906
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
   7.310453 113.050658 
sample estimates:
mean of x mean of y 
 316.6250  256. 
 
So 0.029 and 0.02906 are identical but if I do the same with pool.sd and 
var.equal = T, I get different results
 pairwise.t.test(folate, ventilation, p.adj="none", pool.sd=TRUE)
    Pairwise comparisons using t tests with pooled SD 
data:  folate and ventilation 
  N2O+O2,24h N2O+O2,op
N2O+O2,op 0.014  -    
O2,24h    0.155  0.408    
P value adjustment method: none 

 t.test(folate[1:8], folate[9:17], var.equal=T)
    Two Sample t-test
data:  folate[1:8] and folate[9:17] 
t = 2.5582, df = 15, p-value = 0.02184
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
  10.03871 110.32240 
sample estimates:
mean of x mean of y 
 316.6250  256. 
 
So 0.014 and 0.02184 are not the same.
 
 


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem importing square character

2010-09-10 Thread Marcelo Estácio



Dear,

When I try to execute the following command, R doesn't read all lines (it reads 
only 57658 lines when the file has 814125 lines):

 

dados2 <- read.table("C:\\Documents and Settings\\mgoncalves\\Desktop\\Tábua IFPD\\200701_02_03_04\\SegurosClube.txt",
  header=FALSE, sep="^",
  colClasses=c("character","character","NULL",NA,"NULL","NULL","NULL",
    "character","character","NULL","NULL","NULL","NULL",NA,"NULL","NULL",
    "NULL","NULL",NA,"NULL","NULL"),
  quote="", comment.char="", skip=1, fill=TRUE)
 
If I exclude fill=TRUE, R gives the message 

 

Warning message:
In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  número de itens não é múltiplo do número de colunas (the number of items is 
not a multiple of the number of columns)

 

I identified that the problem is the following line of my data (line 57659 of 
my file):

 

13850074571^01/01/1940^000^93101104^^^1^01/05/2006^30/06/2006^13479^13479^13479^0^0^0^0^^66214-Previdência
 privada fechada^MARIA^DA CONCEI`O FERREIRA LOBATO^CORPORATE


As you can observe, my data contain a square character like this:  (I don't know 
if you can see the character, but it looks like a white square). It seems that 
R treats this character as the end of the file.

I opened my data in Notepad and copied the character. When I paste this 
character into R, it tries to close, asking if I want to save my work. What is 
happening?

 

Thanks very much.

Marcelo Estácio

  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Over lay 2 scale in same plot

2010-09-10 Thread Joshua Wiley
Hi Mamun,

You can look at the ylim argument to plot().  It lets you control the
limits; however, in your example graph, part of the issue is that the
bars have a much higher value, so you could not change ylim too much
(looks like in your example graph you could set it to something like
ylim = c(0, 7) ? ).

Josh
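
A base-graphics sketch of the suggestion (invented data; ylim and a custom axis control the visible spread):

```r
vals <- c(7.5, 7.9, 8.1, 8.5)                 # hypothetical point values
plot(seq_along(vals), vals, ylim = c(6, 10),  # fix the y range
     yaxt = "n", xlab = "sample", ylab = "value")
axis(2, at = seq(6, 10, by = 0.2))            # finer tick spacing: 6.0, 6.2, ...
```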

On Fri, Sep 10, 2010 at 5:43 AM, mamunbabu2001 mrashi...@hotmail.com wrote:

 Hi Josh,
 Thanks for your reply. I gave a reply yesterday but found that it was not
 posted.
 I managed to plot the bar plot and overlay points.

 The problem I am facing now is the spread of the Y scale. The values I am
 plotting on the Y scale are very close, so they look pretty flat (lowest
 value 7.5 and highest value 8.5; with Y-scale ranges of 6-8 and 8-10, the
 values look pretty flat).
 How can I make the Y scale run 6.2 - 6.4 - 6.6 - ... - 8.8 - 10, so that
 the values do not look flat? I have added an image below.

 http://r.789695.n4.nabble.com/file/n2534370/_GE_and_CN_Combined_Plot_mod.png

 Thanks in advance.

 regards,
 Mamun
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Over-lay-2-scale-in-same-plot-tp2528661p2534370.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?

2010-09-10 Thread David Winsemius


On Sep 10, 2010, at 9:42 AM, Hadley Wickham wrote:


I'm having trouble parsing this. What exactly do you want to do?
1 - Put a list as an element of a data.frame. That's quite  
convenient for my pricing function.


I think this is a really bad idea. data.frames are not meant to be
used in this way. Why not use a list of lists?


It can be very convenient, but I suspect the original poster is
confused about the difference between vectors and lists.


I wouldn't be surprised if someone were confused, since my reading of  
some (but not all) of the help documents has led me to think that  
lists _were_ vectors, just not vectors of atomic mode. And one  
oft-illustrated method for creating a list is: alist <-  
vector(mode="list", length=10). I am perhaps less confused than I was  
two years ago but my confusion about all the possible permutations of  
mode, typeof, expression, formula, and class and the extraction  
methods therefrom definitely persists. I think the authors of the  
documentation are of divided opinion or usage on this topic.


Best;
David.




Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Can I save my console contents automatically?

2010-09-10 Thread Nobuaki Michihata
Dear All,

I'm using R for Mac OS X (Cocoa GUI), R version 2.11.1.
I can save the contents of my console window by using Command+S, but I
would like to do the same thing using R commands.
My question is: can I save the contents automatically, using some R
commands from the R editor?

Thank you.
Nobu
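
One scripted route is sink(), which diverts subsequent console output to a file (a sketch; unlike the GUI's Command+S it only captures output produced after the call, and savehistory() would save the commands but not their output):

```r
sink("console-log.txt", split = TRUE)  # split = TRUE: echo to screen too
print(summary(cars))                   # this output also lands in the file
sink()                                 # stop diverting
```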

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] modifying axis labels in lattice panels

2010-09-10 Thread Markus Loecher
Dear all,
I am struggling to modify the axis labels/ticks in a panel provided to
xyplot.
To begin with, I do not know the panel equivalent of the xaxt="n" directive
that would set the stage for no default x axis being drawn.
My goal is to draw ticks and custom formatted labels at certain hours of the
week.
When I execute the code below, I get an error message in the plot window
that suggests a problem with some args parameter.

A second problem concerns the shaded rectangles I try to draw. Clearly, the
range returned by par('usr') does not correspond to the true y ranges.

Any help would be greatly appreciated,

Thanks,

Markus

PS I am using R version 2.10.0 on MACOS and the lattice package version
0.18-3 (latest)

library(lattice);

#multivariate time series, one row for each hour of the week:
Xwide = cbind.data.frame(time=as.POSIXct("2010-09-06 00:00:00 EDT") +
(0:167)*3600, Comp.1 = sin(2*pi*7*(0:167)/168), Comp.2 =
cos(2*pi*7*(0:167)/168));
#to pass this to xyplot, first need to reshape:
Xlong <- reshape(Xwide, varying = c(2:3), idvar = "time", direction="long",
timevar = "PC");
#get descriptive shingle labels
Xlong[,"PC"] <- factor(Xlong[,"PC"], labels = paste("PC", 1:2));

xyplot(Comp ~ time | PC, data = Xlong, pane1l = WeekhourPanel, scales =
list(x=list(at = Hours24-4*3600,
labels=as.character(format(Hours24-4*3600, "%H")))));

WeekhourPanel <- function(x, y, ...){
  r <- range(x);
  #print(r)
  Hours8 <- seq(r[1], r[2], by=8*3600);
  Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600)
  #axis.POSIXct(1, at= Hours8, format="%H");
  panel.xyplot(x, y, type="l", ...);
  panel.grid(0, 3);
  panel.abline(v= Hours24-4*3600, lty=2, col = rgb(0,0,1,0.5));
  panel.abline(v= Hours24+6*3600, lty=2, col = rgb(0,1,0,0.5));
  bb <- par('usr')
  y0 <- bb[3];
  for (i in seq(r[1], r[2], by=48*3600)) panel.rect(xleft=i, ybottom=y0,
xright=i+24*3600-1, ytop=bb[4], col = rgb(0.75,0.75,0.75,0.3), border = NA);
  panel.axis(1, at= Hours24-4*3600,
labels=as.character(format(Hours24-4*3600, "%H")));
  #panel.axis(1, at= Hours24+6*3600, labels=format(x, "%H"));
  #panel.axis(3, at= Hours24, labels=format(x, "%a"), line = -1, tick =
FALSE);

}

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Save R-Part Tree

2010-09-10 Thread Paul Kramer
Hello,

I have some data that I analyse using a regression tree (built with rpart). 

Now I would like to save this tree to a file for later use, and load it again 
(not necessarily in the same workspace). It would be most preferable if I could 
save just some parameters (later probably to a database) and load them again to 
rebuild the tree. 

Is this possible? And if yes, how?

Thanks in advance

--Paul--
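
Rebuilding a tree from bare parameters is not something rpart directly supports, but the whole fitted object can be serialized with base R (a sketch; the data set and file name are illustrative):

```r
library(rpart)

fit <- rpart(Mileage ~ Weight, data = car.test.frame)  # example tree

saveRDS(fit, "tree.rds")       # write the fitted object to a file
fit2 <- readRDS("tree.rds")    # restore it later, under any name

identical(predict(fit), predict(fit2))  # TRUE: same tree
```

save()/load() also work, but restore the object under its original name; serializing to a database blob would follow the same saveRDS() idea via serialize().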
-- 
GRATIS: Spider-Man 1-3 sowie 300 weitere Videos!




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pairwise.t.test vs t.test

2010-09-10 Thread peter dalgaard

On Sep 10, 2010, at 16:01 , Jabez Wilson wrote:

 Dear all, I am perplexed when trying to get the same results using 
 pairwise.t.test and t.test.
 I'm using examples in the ISwR library, 
 attach(red.cell.folate)
 I can get the same result for pairwise.t.test and t.test when I set the 
 variances to be non-equal, but not when they are assumed to be equal. Can 
 anyone explain the differences, or what I'm doing wrong?
 Here's an example where I compare the first two ventilations with 
 pairwise.t.test and t.test
 pairwise.t.test(folate, ventilation, p.adj=none, pool.sd=F)
 Pairwise comparisons using t tests with non-pooled SD 
 data:  folate and ventilation 
   N2O+O2,24h N2O+O2,op
 N2O+O2,op 0.029  -
 O2,24h0.161  0.298
 P value adjustment method: none 
 
 t.test(folate[1:8], folate[9:17], var.equal=F)
 Welch Two Sample t-test
 data:  folate[1:8] and folate[9:17] 
 t = 2.4901, df = 11.579, p-value = 0.02906
 alternative hypothesis: true difference in means is not equal to 0 
 95 percent confidence interval:
7.310453 113.050658 
 sample estimates:
 mean of x mean of y 
  316.6250  256. 
  
 So 0.029 and 0.02906 are identical but if I do the same with pool.sd and 
 var.equal = T, I get different results
 pairwise.t.test(folate, ventilation, p.adj=none, pool.sd=T)
 Pairwise comparisons using t tests with pooled SD 
 data:  folate and ventilation 
   N2O+O2,24h N2O+O2,op
 N2O+O2,op 0.014  -
 O2,24h0.155  0.408
 P value adjustment method: none 
 
 t.test(folate[1:8], folate[9:17], var.equal=T)
 Two Sample t-test
 data:  folate[1:8] and folate[9:17] 
 t = 2.5582, df = 15, p-value = 0.02184
 alternative hypothesis: true difference in means is not equal to 0 
 95 percent confidence interval:
   10.03871 110.32240 
 sample estimates:
 mean of x mean of y 
  316.6250  256. 
  
 So 0.014 and 0.02184 are not the same.
  
  


The help page says:

The pool.SD switch calculates a common SD for all groups (NB: all)

So the denominator is not the same as when testing each pair separately.

You can in fact do

pairwise.t.test(folate, ventilation, p.adj="none", pool.sd=FALSE, var.equal=TRUE)

 


-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pairwise.t.test vs t.test

2010-09-10 Thread Marc Schwartz

On Sep 10, 2010, at 9:01 AM, Jabez Wilson wrote:

 Dear all, I am perplexed when trying to get the same results using 
 pairwise.t.test and t.test.
 I'm using examples in the ISwR library, 
 attach(red.cell.folate)
 I can get the same result for pairwise.t.test and t.test when I set the 
 variances to be non-equal, but not when they are assumed to be equal. Can 
 anyone explain the differences, or what I'm doing wrong?
 Here's an example where I compare the first two ventilations with 
 pairwise.t.test and t.test
 pairwise.t.test(folate, ventilation, p.adj=none, pool.sd=F)
 Pairwise comparisons using t tests with non-pooled SD 
 data:  folate and ventilation 
   N2O+O2,24h N2O+O2,op
 N2O+O2,op 0.029  -
 O2,24h0.161  0.298
 P value adjustment method: none 
 
 t.test(folate[1:8], folate[9:17], var.equal=F)
 Welch Two Sample t-test
 data:  folate[1:8] and folate[9:17] 
 t = 2.4901, df = 11.579, p-value = 0.02906
 alternative hypothesis: true difference in means is not equal to 0 
 95 percent confidence interval:
7.310453 113.050658 
 sample estimates:
 mean of x mean of y 
  316.6250  256. 
  
 So 0.029 and 0.02906 are identical but if I do the same with pool.sd and 
 var.equal = T, I get different results
 pairwise.t.test(folate, ventilation, p.adj=none, pool.sd=T)
 Pairwise comparisons using t tests with pooled SD 
 data:  folate and ventilation 
   N2O+O2,24h N2O+O2,op
 N2O+O2,op 0.014  -
 O2,24h0.155  0.408
 P value adjustment method: none 
 
 t.test(folate[1:8], folate[9:17], var.equal=T)
 Two Sample t-test
 data:  folate[1:8] and folate[9:17] 
 t = 2.5582, df = 15, p-value = 0.02184
 alternative hypothesis: true difference in means is not equal to 0 
 95 percent confidence interval:
   10.03871 110.32240 
 sample estimates:
 mean of x mean of y 
  316.6250  256. 
  
 So 0.014 and 0.02184 are not the same.
  


require(ISwR)

 with(red.cell.folate[1:17, ], pairwise.t.test(folate, ventilation, pool.sd = 
 TRUE))$p.value
  N2O+O2,24h
N2O+O2,op 0.02184081


NB: The pool.sd switch calculates a common SD for all groups and uses that for 
all comparisons.

See the Details in ?pairwise.t.test

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem importing square character

2010-09-10 Thread Duncan Murdoch

 On 10/09/2010 10:03 AM, Marcelo Estácio wrote:



Dear,

When I try to to execute the following command, R don't read all lines (reads 
only 57658 lines when the file has 814125 lines):



dados2 <- read.table("C:\\Documents and Settings\\mgoncalves\\Desktop\\Tábua IFPD\\200701_02_03_04\\SegurosClube.txt",
  header=FALSE, sep="^",
  colClasses=c("character","character","NULL",NA,"NULL","NULL","NULL",
    "character","character","NULL","NULL","NULL","NULL",NA,"NULL","NULL",
    "NULL","NULL",NA,"NULL","NULL"),
  quote="", comment.char="", skip=1, fill=TRUE)

If I exclude fill=TRUE, R gives the message



Warning message:
In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
   número de itens não é múltiplo do número de colunas (number of itens is not 
multiple of number of columns)



I identified that the problem is the following line of my data (line 57659 of 
my file):



13850074571^01/01/1940^000^93101104^^^1^01/05/2006^30/06/2006^13479^13479^13479^0^0^0^0^^66214-Previdência
 privada fechada^MARIA^DA CONCEI`O FERREIRA LOBATO^CORPORATE


As you can observe, my data have a square string like this:  (i don't know 
if you can see the character, but it looks like a white square). It looks like that R 
understands this character as the end of the archive.

I opened my data on the notepad and copied the character. When I paste this 
character on R, it try to close asking if I want to save my work. What is 
happenning?


That symbol is the way some systems display the hex 1A character, which 
in DOS marked the end of file.  By the pathname it looks as though 
you're working on Windows, which has inherited that behaviour.


The best way to get around it would be to correct those bad characters:  
they are almost certainly errors in the data file.  If you want to keep 
them, then you could try reading the file in binary mode rather than 
text mode.  You do this using


con <- file(filename, open="rb")
read.table(con, header=FALSE, ...)
close(con)

You could also try reading it on a different OS; I don't think Linux 
cares about 1A characters.
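
A sketch of the clean-up route (file names are hypothetical): read the file as raw bytes, drop the 0x1A (SUB) bytes, and write a clean copy that read.table() can then handle.

```r
# Remove DOS end-of-file (0x1A, SUB) bytes from a file.
clean_sub_chars <- function(infile, outfile) {
  raw_bytes <- readBin(infile, what = "raw",
                       n = file.info(infile)$size)
  writeBin(raw_bytes[raw_bytes != as.raw(0x1A)], outfile)
}

# clean_sub_chars("SegurosClube.txt", "SegurosClube-clean.txt")
```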


Duncan Murdoch




Thanks very much.

Marcelo Estácio


[[alternative HTML version deleted]]



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?

2010-09-10 Thread Hadley Wickham
 I think this is a really bad idea. data.frames are not meant to be
 used in this way. Why not use a list of lists?

 It can be very convenient, but I suspect the original poster is
 confused about the different between vectors and lists.

 I wouldn't be surprised if someone were confused, since my reading of some
 (but not all) of the help documents has led me to think that lists _were_
 vectors, just not vectors of atomic mode. And one oft-illustrated method for
 creating a list is: alist <- vector(mode="list", length=10). I am perhaps
 less confused than I was two years ago but my confusion about all the
 possible permutations of mode, typeof, expression, formula, and class and
 the extraction methods therefrom definitely persists. I think the authors of
 the documentation are of divided opinion or usage on this topic.

I probably should have said atomic vectors - it's easy to get
confused when you fail to be specific.  Data frames add even more
confusion:

 is.vector(as.vector(mtcars))
[1] FALSE

(That behaviour matches the documentation, but it's still confusing!)

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data.frames : difference between x$a and x[, a] ? - How set new values on x$a with a as variable ?

2010-09-10 Thread Bert Gunter
Well, let's see if the following helps or just adds to the confusion.

First, lists are vectors of mode "list". But they are general
recursive structures (in fact, completely general).

Second, data frames are lists: each "column" of a data frame is a
component (member) of the list, with the additional requirement that
all the components must be the same length. The reason for the scare
quotes around "column" will become clear shortly.

Now, for examples:

 x <- data.frame(a=1:3, b=list(c=4:6))
 x
  a c
1 1 4
2 2 5
3 3 6

## This is as documented in ?data.frame.


## Now compare:

 y <- data.frame(a=1:3)
 y$b <- list(c=4:6)
 y
  a   b
1 1 4, 5, 6
2 2 4, 5, 6
3 3 4, 5, 6

## A different result that one might think should be the same as the
previous one. What's going on is: y is a data frame, so all components
must have the same length. Because the length of b, a list, is just 1,
it is replicated to be the proper length:

 y$b
$c
[1] 4 5 6

$c
[1] 4 5 6

$c
[1] 4 5 6

##The b component is still a list:

 mode(y$b)
[1] "list"

## of course:

 y$c=7:9
 y
  a   b c
1 1 4, 5, 6 7
2 2 4, 5, 6 8
3 3 4, 5, 6 9

## This is correct, since the c component is a vector of length 3. Note also:

 mode(y[,3])
[1] "numeric"
 mode(y[[3]])
[1] "numeric"
 mode(y[3])
[1] "list"

## All of these are correct, i.e. they agree with the documented behavior
## of "[" and "[[", because a data.frame IS a list.

Cheers,
Bert










On Fri, Sep 10, 2010 at 7:13 AM, David Winsemius dwinsem...@comcast.net wrote:

 On Sep 10, 2010, at 9:42 AM, Hadley Wickham wrote:

 I'm having trouble parsing this. What exactly do you want to do?

 1 - Put a list as an element of a data.frame. That's quite convenient
 for my pricing function.

 I think this is a really bad idea. data.frames are not meant to be
 used in this way. Why not use a list of lists?

 It can be very convenient, but I suspect the original poster is
 confused about the different between vectors and lists.

 I wouldn't be surprised if someone were confused, since my reading of some
 (but not all) of the help documents has led me to think that lists _were_
 vectors, just not vectors of atomic mode. And one oft-illustrated method for
 creating a list is: alist <- vector(mode="list", length=10). I am perhaps
 less confused than I was two years ago but my confusion about all the
 possible permutations of mode, typeof, expression, formula, and class and
 the extraction methods therefrom definitely persists. I think the authors of
 the documentation are of divided opinion or usage on this topic.

 Best;
 David.



 Hadley

 --
 Assistant Professor / Dobelman Family Junior Chair
 Department of Statistics / Rice University
 http://had.co.nz/

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 West Hartford, CT





Re: [R] Which language is faster for numerical computation?

2010-09-10 Thread Henrik Bengtsson
Don't underestimate the importance of the choice of the algorithm you
use.  That often makes a huge difference.  Also, vectorization is key
in R, and when you use that you're really up there among the top
performing languages.  Here is an example from the official R wiki
illustrating my points:

  http://rwiki.sciviews.org/doku.php?id=tips:programming:code_optim2

My rule of thumb is:

"Any piece of code can be made twice as fast."
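To put a concrete, if toy, example behind the vectorization point (the function names f_loop and f_vec are made up for illustration; absolute timings vary by machine, the relative gap is what matters):

```r
x <- rnorm(1e5)

# Element-by-element loop over the vector
f_loop <- function(x) {
  out <- numeric(length(x))
  for (i in seq_along(x)) out[i] <- x[i]^2
  out
}

# Vectorized form: a single call into compiled code
f_vec <- function(x) x^2

stopifnot(all.equal(f_loop(x), f_vec(x)))
# Compare system.time(f_loop(x)) with system.time(f_vec(x)):
# the vectorized version is typically orders of magnitude faster.
```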

/Henrik

On Fri, Sep 10, 2010 at 6:06 AM, Prof. John C Nash nas...@uottawa.ca wrote:
 Dirk E. has properly focussed the discussion on measurement rather than
 opinion. I'll add the issue of the human time taken to convert, and more
 importantly debug, interfaced code. That too could be measured, but we
 rarely see human hours to code/debug/test reported.

 Moreover, I'll mention the cat among the pigeons of Rcgmin, which I wrote to
 allow me to play with an optimization code more easily to discover where the
 algorithm might be improved. The resulting package on some problems
 outperforms C equivalents. Now the code is quite vectorized, but this has
 still been a very nice surprise. In fact, I've decided to avoid playing
 around with the interfaces if I can run things well-enough entirely in R.

 JN





Re: [R] Which language is faster for numerical computation?

2010-09-10 Thread Hadley Wickham
On Fri, Sep 10, 2010 at 10:23 AM, Henrik Bengtsson h...@stat.berkeley.edu 
wrote:
 Don't underestimate the importance of the choice of the algorithm you
 use.  That often makes a huge difference.  Also, vectorization is key
 in R, and when you use that you're really up there among the top
 performing languages.  Here is an example from the official R wiki
 illustrating my points:

  http://rwiki.sciviews.org/doku.php?id=tips:programming:code_optim2

 My rule of thumb is:

 "Any piece of code can be made twice as fast."

I second this point - the latest version of reshape is 100x faster in
some situations because I came up with a better vectorised algorithm -
it's still all in R.

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



Re: [R] Regression using mapply?

2010-09-10 Thread Greg Snow
This really depends on why you want to do this and what results you want.  If 
your main goal is to look at some basic tests, goodness of fit, then the add1 
function may do everything you need.  If you just want coefficient estimates 
then some basic matrix algebra will give those to you.

Another option would be to reshape the data to long format and use lmList from 
the nlme package (the above will be quicker if you do not need everything that 
lm gives you).
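A sketch of the basic matrix-algebra route Greg mentions, for a response in column 1 regressed on each remaining column separately (simulated data; assumes no missing values; for the simple regression y ~ x, the slope is cov(x, y)/var(x) and the intercept is mean(y) - slope * mean(x)):

```r
set.seed(1)
DataMatrix <- cbind(rnorm(100), matrix(rnorm(900), ncol = 9))

y <- DataMatrix[, 1]
X <- DataMatrix[, -1]

# Univariate slope for each regressor column
slopes     <- apply(X, 2, function(x) cov(x, y) / var(x))
intercepts <- mean(y) - slopes * colMeans(X)

# Agrees with lm() run column by column
fit1 <- lm(y ~ X[, 1])
stopifnot(all.equal(unname(coef(fit1)), c(intercepts[1], slopes[1])))
```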

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Philipp Kunze
 Sent: Wednesday, September 08, 2010 5:35 AM
 To: R-help@r-project.org
 Subject: [R] Regression using mapply?
 
 Hi,
 I have huge matrices in which the response variable is in the first
 column and the regressors are in the other columns. What I wanted to do
 now is something like this:
 
 #this is just to get an example-matrix
 DataMatrix <- rep(1,1000);
 Disturbance <- rnorm(900);
 DataMatrix[101:1000] <- DataMatrix[101:1000]+Disturbance;
 DataMatrix <- matrix(DataMatrix,ncol=10,nrow=100);
 
 #estimate univariate linear model with each regressor-column, response
 in the first column
 
 for(i in 2:10){
   result <- lm(DataMatrix[,1]~DataMatrix[,i])
 }
 
 
 Is there any way to get rid of the for-loop using mapply (or some other
 function)?
 
 Thanks!
 Philipp
 



Re: [R] pairwise.t.test vs t.test

2010-09-10 Thread Jabez Wilson
Thanks a lot, Peter. Excellent book btw.
 
Jab

--- On Fri, 10/9/10, peter dalgaard pda...@gmail.com wrote:


From: peter dalgaard pda...@gmail.com
Subject: Re: [R] pairwise.t.test vs t.test
To: Jabez Wilson jabez...@yahoo.co.uk
Cc: R-Help r-h...@stat.math.ethz.ch
Date: Friday, 10 September, 2010, 15:20



On Sep 10, 2010, at 16:01 , Jabez Wilson wrote:

 Dear all, I am perplexed when trying to get the same results using 
 pairwise.t.test and t.test.
 I'm using examples in the ISwR library, 
 attach(red.cell.folate)
 I can get the same result for pairwise.t.test and t.test when I set the 
 variances to be non-equal, but not when they are assumed to be equal. Can 
 anyone explain the differences, or what I'm doing wrong?
 Here's an example where I compare the first two ventilations with 
 pairwise.t.test and t.test
 pairwise.t.test(folate, ventilation, p.adj="none", pool.sd=F)
         Pairwise comparisons using t tests with non-pooled SD 
 data:  folate and ventilation 
           N2O+O2,24h N2O+O2,op
 N2O+O2,op 0.029      -        
 O2,24h    0.161      0.298    
 P value adjustment method: none 
 
 t.test(folate[1:8], folate[9:17], var.equal=F)
         Welch Two Sample t-test
 data:  folate[1:8] and folate[9:17] 
 t = 2.4901, df = 11.579, p-value = 0.02906
 alternative hypothesis: true difference in means is not equal to 0 
 95 percent confidence interval:
    7.310453 113.050658 
 sample estimates:
 mean of x mean of y 
  316.6250  256. 
  
 So 0.029 and 0.02906 are identical but if I do the same with pool.sd and 
 var.equal = T, I get different results
 pairwise.t.test(folate, ventilation, p.adj="none", pool.sd=T)
         Pairwise comparisons using t tests with pooled SD 
 data:  folate and ventilation 
           N2O+O2,24h N2O+O2,op
 N2O+O2,op 0.014      -        
 O2,24h    0.155      0.408    
 P value adjustment method: none 
 
 t.test(folate[1:8], folate[9:17], var.equal=T)
         Two Sample t-test
 data:  folate[1:8] and folate[9:17] 
 t = 2.5582, df = 15, p-value = 0.02184
 alternative hypothesis: true difference in means is not equal to 0 
 95 percent confidence interval:
   10.03871 110.32240 
 sample estimates:
 mean of x mean of y 
  316.6250  256. 
  
 So 0.014 and 0.02184 are not the same.
  
  


The help page says:

The pool.SD switch calculates a common SD for all groups (NB: all)

So the denominator is not the same as when testing each pair separately.

You can in fact do

pairwise.t.test(folate, ventilation, p.adj="none", pool.sd=F, var.eq=T)
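Peter's last line can be checked without the ISwR data; a sketch with simulated groups (the labels a/b/c and the group sizes are arbitrary):

```r
set.seed(42)
g <- factor(rep(c("a", "b", "c"), times = c(8, 9, 10)))
x <- rnorm(27, mean = as.integer(g))

# pool.sd=FALSE passes var.equal through to the pairwise t.test calls
p_pair <- pairwise.t.test(x, g, p.adjust.method = "none",
                          pool.sd = FALSE, var.equal = TRUE)
p_t <- t.test(x[g == "a"], x[g == "b"], var.equal = TRUE)$p.value

# The "a" vs "b" cell now matches the plain two-sample t test exactly
stopifnot(all.equal(unname(p_pair$p.value["b", "a"]), p_t))
```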




-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com




  
[[alternative HTML version deleted]]



[R] adding labels above bars in a barplot

2010-09-10 Thread Antonio Olinto

Hello,

I want to make a general routine to draw barplots with numbers plotted  
above each bar. See the example below.


I could not place the numbers in the middle of each bar because I  
could not calculate the right position of each x-axis tick. axTicks(1)  
indicated a unitary step, but it does not seem to work.


I appreciate any help or suggestions.

Best regards,

Antonio Olinto

==

CAT         VAR1  VAR2
Category 01 17.5     9
Category 02 15.2    20
Category 03 10.3   500
Category 04  8.4   150
Category 05 20.3  5000

# Copying data from a spreadsheet

dat.data <- read.delim("clipboard", header=T)

summary(dat.data)
  CAT VAR1VAR2
 Category 01:1   Min.   : 8.40   Min.   :   9
 Category 02:1   1st Qu.:10.30   1st Qu.:  20
 Category 03:1   Median :15.20   Median : 150
 Category 04:1   Mean   :14.34   Mean   :1136
 Category 05:1   3rd Qu.:17.50   3rd Qu.: 500
 Max.   :20.30   Max.   :5000

dat.bar <- data.frame(dat.data[,c(2)])
row.names(dat.bar) <- dat.data[,1]
names(dat.bar) <- c("VAR1")
dat.bar
VAR1
Category 01 17.5
Category 02 15.2
Category 03 10.3
Category 04  8.4
Category 05 20.3

par(mar=c(12,6,3,2),cex.axis=1.2,cex.lab=1.4)
barplot(t(as.matrix(dat.bar)), ylim=c(0,max(dat.data[,2]*1.1)), las=2,
        ylab="Y label text", col="orange")

box()

up <- max(dat.data$VAR1)*0.1

for (i in c(0:nrow(dat.data))) {
legend(0.25+i, dat.bar[1+i,1]+up, dat.data[i+1,3], col="blue", bty="n")
}
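One way to avoid computing tick positions at all: barplot() returns the x coordinates of the bar midpoints, which text() can use directly. A minimal sketch using made-up heights and labels that mirror the posted data:

```r
h   <- c(17.5, 15.2, 10.3, 8.4, 20.3)   # VAR1: bar heights
lab <- c(9, 20, 500, 150, 5000)         # VAR2: numbers to display

bp <- barplot(h, names.arg = paste("Category", sprintf("%02d", 1:5)),
              ylim = c(0, max(h) * 1.15), las = 2, col = "orange")
# bp holds the x coordinate of each bar's midpoint;
# place each label slightly above its bar
text(bp, h + max(h) * 0.05, labels = lab)
box()
```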



Webmail - iBCMG Internet
http://www.ibcmg.com.br



[R] Can't event start 'rattle' under Ubuntu 64bit

2010-09-10 Thread Kamil Sijko
Dear all,

I've wanted to give a try to the rattle GUI for R. After long struggle
with dependencies I've finally managed to install rattle and all of
its dependencies, although for some of them I've been forced to use
cran2deb. And now all I get is:

> library(rattle)
Loading required package: pmml
Loading required package: XML
Loading required package: RGtk2
Loading required package: colorspace
Rattle: Graphical interface for data mining using R.
Version 2.5.39 Copyright (c) 2006-2010 Togaware Pty Ltd.
Type 'rattle()' to shake, rattle, and roll your data.
> rattle()
Error: attempt to apply non-function
In addition: Warning message:
In method(obj, ...) : Invalid object type `GtkFrame'
Rattle timestamp (for the error above): 2010-09-10 17:24:14


As an addition a blank window appears. And I'm stuck. Can anyone -
please - point me in the right direction?

Best,
Kamil



Re: [R] Axis break with gap.plot()

2010-09-10 Thread Filoche

Thanks, Jim, for your answer.

I actually did my own function to plot with the loess. I just calculated the
intersection between the first and second horizontal gap lines with the line
formed by the 2 points before and after the gap. So I can now plot the loess
from both sides of the gap section.

Thanks again for your help,
Phil
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Axis-break-with-gap-plot-tp2533027p2534658.html
Sent from the R help mailing list archive at Nabble.com.



[R] Data Manipulation

2010-09-10 Thread dfong

Hi,

I just started using R and need some guidance.

I need to create a time series chart in R, but the problem is the data is
not numeric.
The data is in the following format

Study 
A
A
B
B
B
A
C
C
D

Then there is also another column with dates. How can I manipulate this in
order to have something that will count the number of unique entries and
group them.
Say A = 3 B= 3 C=2 D=1

Thanks
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Data-Manipulation-tp2534662p2534662.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Problem importing square character

2010-09-10 Thread Marcelo Estácio

Duncan, thanks for your answer.

 

I tried this:

 

con <- file("C:\\Documents and Settings\\mgoncalves\\Desktop\\Tábua 
IFPD\\200701_02_03_04\\200701_02_03_04.txt", open="rb")

 

dados2 <- read.table(con, header=FALSE, sep="^",
colClasses=c("character","character","NULL",NA,"NULL","NULL","NULL","character","character","NULL","NULL","NULL","NULL",NA,"NULL","NULL","NULL","NULL",NA,"NULL","NULL"),
quote="", comment.char="", skip=1)
Erro em pushBack(c(lines, lines), file) : 
  can only push back on text-mode connections

 

My file is 800 MB. The best way to correct this was to import it into Access and export 
it back to a txt file.

 

Thanks again!

Marcelo Estácio


 
 Date: Fri, 10 Sep 2010 10:35:06 -0400
 From: murdoch.dun...@gmail.com
 To: mes...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R] Problem importing square character
 
 On 10/09/2010 10:03 AM, Marcelo Estácio wrote:
 
 
  Dear,
 
  When I try to to execute the following command, R don't read all lines 
  (reads only 57658 lines when the file has 814125 lines):
 
 
 
  dados2 <- read.table("C:\\Documents and Settings\\mgoncalves\\Desktop\\Tábua 
  IFPD\\200701_02_03_04\\SegurosClube.txt", header=FALSE, sep="^", colClasses=c("character","character","NULL",NA,"NULL","NULL","NULL","character","character","NULL","NULL","NULL","NULL",NA,"NULL","NULL","NULL","NULL",NA,"NULL","NULL"), quote="", comment.char="", skip=1, fill=TRUE)
 
  If I exclude fill=TRUE, R gives the message
 
 
 
  Warning message:
  In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  número de itens não é múltiplo do número de colunas (number of itens is not 
  multiple of number of columns)
 
 
 
  I identified that the problem is the following line of my data (line 57659 
  of my file):
 
 
 
  13850074571^01/01/1940^000^93101104^^^1^01/05/2006^30/06/2006^13479^13479^13479^0^0^0^0^^66214-Previdência
   privada fechada^MARIA^DA CONCEI`O FERREIRA LOBATO^CORPORATE
 
 
  As you can observe, my data have a square string like this:  (i don't 
  know if you can see the character, but it looks like a white square). It 
  looks like that R understands this character as the end of the archive.
 
  I opened my data on the notepad and copied the character. When I paste this 
  character on R, it try to close asking if I want to save my work. What is 
  happenning?
 
 That symbol is the way some systems display the hex 1A character, which 
 in DOS marked the end of file. By the pathname it looks as though 
 you're working on Windows, which has inherited that behaviour.
 
 The best way to get around it would be to correct those bad characters: 
 they are almost certainly errors in the data file. If you want to keep 
 them, then you could try reading the file in binary mode rather than 
 text mode. You do this using
 
 con <- file(filename, open="rb")
 read.table(con, header=FALSE, ...)
 close(con)
 
 You could also try reading it on a different OS; I don't think Linux 
 cares about 1A characters.
 
 Duncan Murdoch
 
 
 
  Thanks very much.
 
  Marcelo Estácio
 
  
  [[alternative HTML version deleted]]
 
 
 
 
  
[[alternative HTML version deleted]]



Re: [R] Data Manipulation

2010-09-10 Thread Joshua Wiley
Hi,

Look at the table() function.  Here is an example with your data:


dat <- read.table(textConnection("
Study
A
A
B
B
B
A
C
C
D"), header = TRUE)
closeAllConnections()

table(dat)


Hope that helps,

Josh

On Fri, Sep 10, 2010 at 8:53 AM, dfong df...@medicine.umaryland.edu wrote:

 Hi,

 I just started using R and need some guidance.

 I need to create a time series chart in R, but the problem is the data is
 not numeric.
 The data is in the following format

 Study
 A
 A
 B
 B
 B
 A
 C
 C
 D

 Then there is also another column with dates. How can I manipulate this in
 order to have something that will count the number of unique entries and
 group them.
 Say A = 3 B= 3 C=2 D=1

 Thanks
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Data-Manipulation-tp2534662p2534662.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] Data Manipulation

2010-09-10 Thread dfong

I'm actually importing it from a CSV, so I already have that in a table. But
I can't make a graph with text. I assume I need to do some counting in order
to draw the graph?
Any example of this?

thanks
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Data-Manipulation-tp2534662p2534690.html
Sent from the R help mailing list archive at Nabble.com.



[R] for loop help please!

2010-09-10 Thread alfredo

Hi Everyone, 

I have a 2-dim data.matrix(e.g., table1) in which row1 specifies a range of
values. row2 - rown specify the number of times I want to replicate each
corresponding value in row1. I can do this with the following function: 

rep(c(table1[1,]),c(table1[X,])) #where X would go from 2 - n. 

Now, I can do this manually by changing the values of X and save each
resulting array/vector in an object, or write a for loop that will iterate
through the rows and output a new data.matrix in which row1 - rown will
correspond to the vectors generated by replicating the values of row1 row2
- rown independent times from the original data.matrix with the rep
function shown above. So far I have been unable to get the for loop right. 

Any help will be most appreciated! Thanks beforehand for your help. 

Best, 

A 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/for-loop-help-please-tp2534666p2534666.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Data Manipulation

2010-09-10 Thread Joshua Wiley
Hi,

Yes, the table() function is not to read the data, but to do the
frequency counts of each level.  I just included the read.table() part
so that you could copy and paste my code, but I did not include the R
output from table(dat).

> table(dat)
dat
A B C D
3 3 2 1

It nicely tallies for you.  Also, you can look at a simple plot:

# you will have to run this in your R
# because I do not know an easy way to include graphs
plot(table(dat))

You can also save the results in a new variable and then access portions of it:

> my.table <- table(dat)
> my.table # the full table
dat
A B C D
3 3 2 1
> my.table[2] # just extract the second element
B
3


Cheers,

Josh

On Fri, Sep 10, 2010 at 9:11 AM, dfong df...@medicine.umaryland.edu wrote:

 I'm actually importing it from a CSV, so I already have that in a table. But
 i Can't make a graph with text. I assume I need to do some counting in order
 to draw the graph?
 Any example of this?

 thanks
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Data-Manipulation-tp2534662p2534690.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] Data Manipulation

2010-09-10 Thread Erik Iverson

Hello,

This is definitely possible with R, there a lots of package to make
good graphics.

However, the easiest way for us to help you is if you give us a small
reproducible example, as you started to with your initial post.

If you have an object in your R session that you'd like help with
you can use ?dput to create a text version of it to share with the list.

The table class in R is separate from a data.frame, which is
probably what you have now...

dfong wrote:

I'm actually importing it from a CSV, so I already have that in a table. But
i Can't make a graph with text. I assume I need to do some counting in order
to draw the graph?
Any example of this?

thanks




Re: [R] for loop help please!

2010-09-10 Thread Joshua Wiley
Hi A,

Here is a little example that I believe does what you want.  I am not
quite sure how you want all the output in a new matrix, because as you
repeat each value is the first row varying numbers of times, you will
not have rows of equal length.  Although perhaps your data is setup so
that you can.

# Sample data
table1 <- matrix(1:16, ncol = 4)
table1 # look at it

# Using your code to create the first example
rep(table1[1,], table1[2,])

# Using apply() to go through every row of the matrix 'table1'
# except the first ([-1, ])
apply(table1[-1, ], 1, function(x) {rep(table1[1,], x)})

# Using a for loop to do the same
for(i in 2:nrow(table1)) {
  print(rep(table1[1, ], table1[i, ]))
}

# In general, I think apply() is prettier
# It is also easier to assign all the output to a list
# Compared to using a for loop

Best regards,

Josh

On Fri, Sep 10, 2010 at 8:54 AM, alfredo alfredote...@gmail.com wrote:

 Hi Everyone,

 I have a 2-dim data.matrix(e.g., table1) in which row1 specifies a range of
 values. row2 - rown specify the number of times I want to replicate each
 corresponding value in row1. I can do this with the following function:

 rep(c(table1[1,]),c(table1[X,])) #where X would go from 2 - n.

 Now, I can do this manually by changing the values of X and save each
 resulting array/vector in an object, or write a for loop that will iterate
 through the rows and output a new data.matrix in which row1 - rown will
 correspond to the vectors generated by replicating the values of row1 row2
 - rown independent times from the original data.matrix with the rep
 function shown above. So far I have been unable to get the for loop right.

 Any help will be most appreciated! Thanks beforehand for your help.

 Best,

 A
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/for-loop-help-please-tp2534666p2534666.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] for loop help please!

2010-09-10 Thread Erik Iverson

I do not follow.  Could you please provide a small reproducible example
of what table1 might look like, and what you want as a result?

Surely you don't need a for loop.

alfredo wrote:
Hi Everyone, 


I have a 2-dim data.matrix(e.g., table1) in which row1 specifies a range of
values. row2 - rown specify the number of times I want to replicate each
corresponding value in row1. I can do this with the following function: 

rep(c(table1[1,]),c(table1[X,])) #where X would go from 2 - n. 


Now, I can do this manually by changing the values of X and save each
resulting array/vector in an object, or write a for loop that will iterate
through the rows and output a new data.matrix in which row1 - rown will
correspond to the vectors generated by replicating the values of row1 row2
- rown independent times from the original data.matrix with the rep
function shown above. So far I have been unable to get the for loop right. 

Any help will be most appreciated! Thanks beforehand for your help. 

Best, 


A




Re: [R] for loop help please!

2010-09-10 Thread jim holtman
try this:

> x <- matrix(sample(25,25), 5)
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]   12   24   14    3   20
[2,]   21    7    5   15   17
[3,]   11   10   22   16    9
[4,]    6   25    4    1   23
[5,]    2   19    8   13   18
> # save result in a list
> result <- lapply(2:nrow(x), function(.row){
+     rep(x[1,], times=x[.row,])
+ })
> result
[[1]]
 [1] 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12 24
24 24 24 24 24 24 14 14 14 14 14  3  3
[36]  3  3  3  3  3  3  3  3  3  3  3  3  3 20 20 20 20 20 20 20 20 20
20 20 20 20 20 20 20 20

[[2]]
 [1] 12 12 12 12 12 12 12 12 12 12 12 24 24 24 24 24 24 24 24 24 24 14
14 14 14 14 14 14 14 14 14 14 14 14 14
[36] 14 14 14 14 14 14 14 14  3  3  3  3  3  3  3  3  3  3  3  3  3  3
 3  3 20 20 20 20 20 20 20 20 20

[[3]]
 [1] 12 12 12 12 12 12 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24
24 24 24 24 24 24 24 24 24 14 14 14 14
[36]  3 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20

[[4]]
 [1] 12 12 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 24 14
14 14 14 14 14 14 14  3  3  3  3  3  3
[36]  3  3  3  3  3  3  3 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20




On Fri, Sep 10, 2010 at 11:54 AM, alfredo alfredote...@gmail.com wrote:

 Hi Everyone,

 I have a 2-dim data.matrix(e.g., table1) in which row1 specifies a range of
 values. row2 - rown specify the number of times I want to replicate each
 corresponding value in row1. I can do this with the following function:

 rep(c(table1[1,]),c(table1[X,])) #where X would go from 2 - n.

 Now, I can do this manually by changing the values of X and save each
 resulting array/vector in an object, or write a for loop that will iterate
 through the rows and output a new data.matrix in which row1 - rown will
 correspond to the vectors generated by replicating the values of row1 row2
 - rown independent times from the original data.matrix with the rep
 function shown above. So far I have been unable to get the for loop right.

 Any help will be most appreciated! Thanks beforehand for your help.

 Best,

 A
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/for-loop-help-please-tp2534666p2534666.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] lme, groupedData, random intercept and slope

2010-09-10 Thread array chip
from the output, I think it's both.
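For readers who would rather not infer this from the summary, the random part can be written out explicitly instead of relying on the groupedData formula; a hedged sketch with simulated data (variable names mirror the post; the numbers are made up):

```r
library(nlme)
set.seed(1)
d <- data.frame(Subject = factor(rep(1:10, each = 3)),
                time    = rep(0:2, 10))
# subject-specific intercepts and slopes plus a little noise
b0 <- rnorm(10, sd = 2)
b1 <- rnorm(10, sd = 0.3)
d$y <- 15 - 1.5 * d$time + b0[d$Subject] + b1[d$Subject] * d$time +
       rnorm(30, sd = 0.1)

# Random intercept AND slope, stated explicitly:
fit <- lme(y ~ time, random = ~ time | Subject, data = d)

# VarCorr(fit) lists (Intercept), time, and Residual rows -- the same
# two random-effect terms that appear in the posted summary of lme(zz).
VarCorr(fit)
```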



- Original Message 
From: John Sorkin jsor...@grecc.umaryland.edu
To: r-help@r-project.org
Sent: Fri, September 10, 2010 5:25:44 AM
Subject: [R] lme, groupedData, random intercept and slope

Windows Vista
R 2.10.1




Does the following use of groupedData and lme produce an analysis with both 
random intercept and slope, or only random slope?


zz <- groupedData(y~time | Subject, data=data.frame(data),
  labels = list( x = "Time",
                 y = "y" ),
  units = list( x = "(yr)", y = "(mm)")
)
plot(zz)

fit10 <- lme(zz)
summary(fit10)

Linear mixed-effects model fit by REML
Data: zz 
AIC   BIC  logLik
  -123.1942 -115.2010 67.5971

Random effects:
Formula: ~time | Subject
Structure: General positive-definite
StdDev   Corr  
(Intercept) 6.054897e+00 (Intr)
time4.160662e-05 1
Residual9.775954e-04  

Fixed effects: y ~ time 
Value Std.Error DF   t-value p-value
(Intercept) 15.000217  1.914727 19 7.834   0
time-1.51  0.000219 19 -4566.598   0
Correlation: 
 (Intr)
time 0.059 

Standardized Within-Group Residuals:
Min  Q1 Med  Q3 Max 
-1.73706837 -0.36289558  0.06892484  0.59777067  1.69095476 

Number of Observations: 30
Number of Groups: 10 


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:12}}



Re: [R] modifying axis labels in lattice panels

2010-09-10 Thread Dennis Murphy
Hi:

On Fri, Sep 10, 2010 at 7:16 AM, Markus Loecher markus.loec...@gmail.comwrote:

 Dear all,
 I am struggling to modify the axis labels/ticks in a panel provided to
 xyplot.
 To begin with, I do not know the equivalent of the xaxt=n directive for
 panels that would set the stage for no default x axis being drawn.
 My goal is to draw ticks and custom formatted labels at certain hours of
 the
 week.
 When I execute the code below, I get an error message in the plot window
 that suggests a problem with some args parameter.

 A second problem concerns the shaded rectangles I try to draw. Clearly, the
 range returned by par('usr') does not correspond to the true y ranges.

 Any help would be greatly appreciated,

 Thanks,

 Markus

 PS I am using R version 2.10.0 on MACOS and the lattice package version
 0.18-3 (latest)

 
 library(lattice);

 #multivariate time series, one row for each hour of the week:
 Xwide = cbind.data.frame(time=as.POSIXct("2010-09-06 00:00:00 EDT") +
 (0:167)*3600, Comp.1 = sin(2*pi*7*(0:167)/168), Comp.2 =
 cos(2*pi*7*(0:167)/168));
 #to pass this to xyplot, first need to reshape:
 Xlong <- reshape(Xwide, varying = c(2:3), idvar = "time", direction = "long",
 timevar = "PC");
 #get descriptive shingle labels
 Xlong[,"PC"] <- factor(Xlong[,"PC"], labels = paste("PC", 1:2));


A less mentally taxing approach :)

library(reshape)
xlong <- melt(Xwide, id = 'time')
names(xlong)[2:3] <- c('PC', 'Comp')


 xyplot(Comp ~ time | PC, data = Xlong, pane1l = WeekhourPanel, scales =
 list(x=list(at = Hours24-4*3600,
 labels=as.character(format(Hours24-4*3600, "%H");


When attempting to run this, I got
Error in xyplot.formula(Comp ~ time | PC, data = Xlong, pane1l =
WeekhourPanel,  :
  object 'Hours24' not found

Attempting to pull Hours24 out of the function didn't work...
 Hours24 - seq(r[1]+12*3600, r[2], by=24*3600)
Error in seq(r[1] + 12 * 3600, r[2], by = 24 * 3600) :
  object 'r' not found

One problem is that to use Hours24 in scales, it has to be defined in the
calling environment of xyplot() - in other words, it has to be defined
outside the panel function and outside of xyplot() if your present code is
to have any hope of working.

I think I got that part figured out: in the console, type
r <- range(Xwide$time)
Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600)

I at least get a plot now by running your xyplot() function with the panel
function, but all the labels are 08 on the x-axis. Here's why:

format(Hours24-4*3600, "%H")
[1] "08" "08" "08" "08" "08" "08" "08"

This comes from the labels = part of your panel function. I got the same
plot with this code (apart from adding the lines):
xyplot(Comp ~ time | PC, data = Xlong, type = 'l',
scales = list(x = list(at = Hours24-4*3600,
labels = as.character(format(Hours24-4*3600, "%H")

which indicates that something in your panel function is awry (note also the
misspelled argument name `pane1l =` in the call; spelled that way, the custom
panel function is never used).

I'd suggest starting out simply. Put both plots on the same panel  using PC
as a grouping variable in the long data frame. It will automatically use
different colors for groups, but you can control the line color with  the
col.lines = argument; e.g., col.lines = c('red', 'blue'). Next, I'd work on
getting the axis ticks and labels the way you want with scales. It also
appears that you want to set a custom grid - my suggestion would be to do
that last, after you've controlled the axis ticks and labels. Once you have
that figured out, you have the kernel of your panel function. In most
applications I've seen in lattice, the idea is to keep the panel function as
simple as possible and pass the 'global' stuff to the function call. There's
something broken in your panel function, but it's a run-time error rather
than a compile-time error, so you can either debug it or try simplifying the
problem (and the panel function) as much as possible.
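As a concrete, self-contained sketch of that first step (toy data standing in for `Xlong`; the column and object names here simply mirror the thread, and `Hours24` is defined in the calling environment rather than inside the panel function):

```r
library(lattice)

# Toy stand-in for the reshaped data: two components over one week of hours
tm <- as.POSIXct("2010-09-06 00:00:00", tz = "GMT") + (0:167) * 3600
Xlong <- data.frame(time = rep(tm, 2),
                    PC   = factor(rep(c("PC 1", "PC 2"), each = 168)),
                    Comp = c(sin(2*pi*7*(0:167)/168),
                             cos(2*pi*7*(0:167)/168)))

# Ticks at noon each day, defined outside xyplot() as discussed above
Hours24 <- seq(min(tm) + 12*3600, max(tm), by = 24*3600)

# One panel, PC as the grouping variable, custom axis ticks via scales
p <- xyplot(Comp ~ time, data = Xlong, groups = PC, type = "l",
            col.lines = c("red", "blue"),
            scales = list(x = list(at = Hours24,
                                   labels = format(Hours24, "%a %H"))))
print(p)
```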

HTH,
Dennis

WeekhourPanel <- function(x, y, ...){
  r <- range(x);
  #print(r)
  Hours8 <- seq(r[1], r[2], by=8*3600);
  Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600)
  #axis.POSIXct(1, at= Hours8, format="%H");
  panel.xyplot(x, y, type="l", ...);
  panel.grid(0, 3);
  panel.abline(v= Hours24-4*3600, lty=2, col = rgb(0,0,1,0.5));
  panel.abline(v= Hours24+6*3600, lty=2, col = rgb(0,1,0,0.5));
  bb <- par('usr')
  y0 <- bb[3];
  for (i in seq(r[1], r[2], by=48*3600)) panel.rect(xleft=i, ybottom=y0,
    xright=i+24*3600-1, ytop=bb[4], col = rgb(0.75,0.75,0.75,0.3), border =
    NA);
  panel.axis(1, at= Hours24-4*3600,
    labels=as.character(format(Hours24-4*3600, "%H")));
  #panel.axis(1, at= Hours24+6*3600, labels=format(x, "%H"));
  #panel.axis(3, at= Hours24, labels=format(x, "%a"), line = -1, tick =
  #  FALSE);
}

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] lmer fixed effects, SE, t . . . and p

2010-09-10 Thread array chip
But as far as I know, profile() seems to be de-activated in the lme4 package.



- Original Message 
From: Gavin Simpson gavin.simp...@ucl.ac.uk
To: John Sorkin jsor...@grecc.umaryland.edu
Cc: r-help@r-project.org; Bert Gunter gunter.ber...@gene.com
Sent: Fri, September 10, 2010 2:05:37 AM
Subject: Re: [R] lmer fixed effects, SE, t . . . and p

On Thu, 2010-09-09 at 23:40 -0400, John Sorkin wrote:
 Bert,
 I appreciate your comments, and I have read Doug Bates' writing about p
 values in mixed effects regression. It is precisely because I read
 Doug's material that I asked how are we to interpret the estimates
 rather than how can we compute a p value. My question is a simple
 question whose answer is undoubtedly complex, but one that needs an
 answer. Without p values, or confidence intervals, I am not certain
 what to make of the results of my analysis. Does my analysis suggest,
 or does it not suggest that there is a relation between time and y? If
 I can't answer this question after running the analysis, I don't have
 any more information than I did before I ran the analysis, and a fair
 question would be why did I run the analysis? I am asking for help not
 in calculation a p value or a CI, but rather to know what I can and
 can't say about the results of the analysis. If this basic question
 can not be answered, I am at a loss to interpret my results. 
 Thank you,
 John

Doug talks quite a lot about profiling lmer fits using 'profile
deviance' to investigate variability in fixed effects. For example, see
section 1.5 in the draft of chapter 1 of Doug's book on mixed models:

http://lme4.r-forge.r-project.org/book/
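In lme4 builds where `profile()` is enabled, the profiled deviance can be turned into likelihood-based confidence intervals for the fixed effects; a hedged sketch using the `sleepstudy` data that ships with lme4 (whether `profile()` actually runs depends on the installed lme4 version, as this subthread notes):

```r
library(lme4)
# Random intercept and slope for Days within Subject
fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
pr  <- profile(fit)    # profile the deviance over each parameter
ci  <- confint(pr)     # likelihood-profile confidence intervals
ci["Days", ]           # interval for the fixed-effect slope
```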

HTH

G

 John David Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 University of Maryland School of Medicine Division of Gerontology
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Bert 
Gunter gunter.ber...@gene.com 9/9/2010 11:21 PM 
 John:
 
 Search on this issue in the list archives. Doug Bates has addressed it
 at length. Basically, he does not calculate CI's or p-values because
 he does not know how to reliably do so.
 
 However, the key remark in your query was:
 
  (2) lmer does not give p values or confidence intervals for the fixed 
effects. How we are to interpret the estimates given that no p value or CI is 
given for the estimates?
 
 Think about it. A statistical analysis -- ANY statistical analysis --
 treats the data in isolation: it is not informed by physics,
 thermodynamics, biology,  other similar data, prior experience, or,
 indeed, any part of the body of relevant scientific knowledge. Do you
 really think that any such analysis, especially when predicated upon
 often tenuous or even (necessarily) unverifiable assumptions and
 simplifications should be considered authoritative? Classical
 statistical inference is just another piece of the puzzle, and not
 even particularly useful when, as is typically the case, hypotheses
 are formulated AFTER seeing the data (this invalidates the probability
 calculations -- hypotheses must be formulated before seeing the data
 to be meaningfully assessed). Leo Breiman called this statistics'
 quiet scandal something like 20 years ago, and he was no dummy.
 
 It is comforting, perhaps, but illusory to believe that statistical
 inference can be relied on to give sound, objective scientific
 results. True, without such a framework, science seems rather
 subjective, perhaps closer to religion and arbitrary cultural
 archetypes than we care to admit. But see Thomas Kuhn and Paul
 Feuerabend for why this is neither surprising nor necessarily a bad
 thing.
 
 Cheers,
 Bert Gunter
 
 
 
 
 On Thu, Sep 9, 2010 at 8:00 PM, John Sorkin jsor...@grecc.umaryland.edu 
wrote:
  windows Vista
  R 2.10.1
 
 
  (1) How can I get the complete table for the fixed effects from lmer? As
 can be seen from the example below, fixef(fit3) only gives the estimates and
 not the SE or t value.
 
  fit3 <- lmer(y ~ time + (1|Subject) + (time|Subject), data=data.frame(data))
  summary(fit3)
  Linear mixed model fit by REML
  Formula: y ~ time + (1 | Subject) + (time | Subject)
Data: data.frame(data)
     AIC    BIC logLik deviance REMLdev
  -126.2 -116.4   70.1   -152.5  -140.2
  Random effects:
   Groups   Name        Variance   Std.Dev.   Corr
   Subject  (Intercept) 2.9311e+01 5.41396385
   Subject  (Intercept) 0.0000e+00 0.0000000
            time        0.0000e+00 0.0000000  NaN
   Residual             8.1591e-07 0.00090328
  Number of obs: 30, groups: Subject, 10
 
  Fixed effects:
  Estimate Std. Error t value
  (Intercept) 14.998216   1.712046       9
  time        -0.999779   0.000202   -4950
 
  Correlation of Fixed Effects:
  (Intr)
  time -0.001
  fixef(fit3)
  (Intercept)time
   14.9982158  -0.9997793
 
  (2) lmer does not give p values or confidence intervals for the 

Re: [R] lmer fixed effects, SE, t . . . and p

2010-09-10 Thread Gavin Simpson
On Fri, 2010-09-10 at 09:51 -0700, array chip wrote:
 But as far as I know, profile() seems to be de-activated in the lme4 package.

It is beta software. The lme4a version of the lme4 package might have
had profile re-enabled, IIRC. 

G

 - Original Message 
 From: Gavin Simpson gavin.simp...@ucl.ac.uk
 To: John Sorkin jsor...@grecc.umaryland.edu
 Cc: r-help@r-project.org; Bert Gunter gunter.ber...@gene.com
 Sent: Fri, September 10, 2010 2:05:37 AM
 Subject: Re: [R] lmer fixed effects, SE, t . . . and p
 
 On Thu, 2010-09-09 at 23:40 -0400, John Sorkin wrote:
  Bert,
  I appreciate you comments, and I have read Doug Bates writing about p
  values in mixed effects regression. It is precisely because I read
  Doug's material that I asked how are we to interpret the estimates
  rather than how can we compute a p value. My question is a simple
  question whose answer is undoubtedly complex, but one that needs an
  answer. Without p values, or confidence intervals, I am not certain
  what to make of the results of my analysis. Does my analysis suggest,
  or does it not suggest that there is a relation between time and y? If
  I can't answer this question after running the analysis, I don't have
  any more information than I did before I ran the analysis, and a fair
  question would be why did I run the analysis? I am asking for help not
  in calculation a p value or a CI, but rather to know what I can and
  can't say about the results of the analysis. If this basic question
  can not be answered, I am at a loss to interpret my results. 
  Thank you,
  John
 
 Doug talks quite a lot about profiling lmer fits using 'profile
 deviance' to investigate variability in fixed effects. For example, see
 section 1.5 in the draft of chapter 1 of Doug's book on mixed models:
 
 http://lme4.r-forge.r-project.org/book/
 
 HTH
 
 G
 
  John David Sorkin M.D., Ph.D.
  Chief, Biostatistics and Informatics
  University of Maryland School of Medicine Division of Gerontology
  Baltimore VA Medical Center
  10 North Greene Street
  GRECC (BT/18/GR)
  Baltimore, MD 21201-1524
  (Phone) 410-605-7119
  (Fax) 410-605-7913 (Please call phone number above prior to faxing) Bert 
 Gunter gunter.ber...@gene.com 9/9/2010 11:21 PM 
  John:
  
  Search on this issue in the list archives. Doug Bates has addressed it
  at length. Basically, he does not calculate CI's or p-values because
  he does not know how to reliably do so.
  
  However, the key remark in your query was:
  
   (2) lmer does not give p values or confidence intervals for the fixed 
 effects. How we are to interpret the estimates given that no p value or CI 
 is 
 given for the estimates?
  
  Think about it. A statistical analysis -- ANY statistical analysis --
  treats the data in isolation: it is not informed by physics,
  thermodynamics, biology,  other similar data, prior experience, or,
  indeed, any part of the body of relevant scientific knowledge. Do you
  really think that any such analysis, especially when predicated upon
  often tenuous or even (necessarily) unverifiable assumptions and
  simplifications should be considered authoritative? Classical
  statistical inference is just another piece of the puzzle, and not
  even particularly useful when, as if typically the case, hypotheses
  are formulated AFTER seeing the data (this invalidates the probability
  calculations -- hypotheses must be formulated before seeing the data
  to be meaningfully assessed). Leo Breiman called this statistics'
  quiet scandal something like 20 years ago, and he was no dummy.
  
  It is comforting, perhaps, but illusory to believe that statistical
  inference can be relied on to give sound, objective scientific
  results. True, without such a framework, science seems rather
  subjective, perhaps closer to religion and arbitrary cultural
  archetypes than we care to admit. But see Thomas Kuhn and Paul
  Feuerabend for why this is neither surprising nor necessarily a bad
  thing.
  
  Cheers,
  Bert Gunter
  
  
  
  
  On Thu, Sep 9, 2010 at 8:00 PM, John Sorkin jsor...@grecc.umaryland.edu 
 wrote:
   windows Vista
   R 2.10.1
  
  
   (1) How can I get the complete table of for the fixed effects from lmer. 
   As 
 can be seen from the example below, fixef(fit2) only give the estimates and 
 not 
 the SE or t value
  
   fit3- lmer(y~time + (1|Subject) + (time|Subject),data=data.frame(data))
   summary(fit3)
   Linear mixed model fit by REML
   Formula: y ~ time + (1 | Subject) + (time | Subject)
 Data: data.frame(data)
  AICBIC logLik deviance REMLdev
-126.2 -116.4   70.1   -152.5  -140.2
   Random effects:
Groups   NameVariance   Std.Dev.   Corr
Subject  (Intercept) 2.9311e+01 5.41396385
Subject  (Intercept) 0.e+00 0.
time0.e+00 0.   NaN
Residual 8.1591e-07 0.00090328
   Number of obs: 30, groups: Subject, 10
  
   Fixed effects:
   Estimate Std. Error t value
   

[R] difference of two RData files/environments

2010-09-10 Thread Barry Rowlingson
I just wrote up some code for differencing two .RData files or
environments (or one of each). Available from source here:

http://www.maths.lancs.ac.uk/~rowlings/R/Ediff/

In its handiest form, running:

ediff()

 will tell you the difference between your working environment and the
.RData file that it probably started from. Useful for those 'What have
I done here?' moments when you discover an R session in a long-lost
terminal window.

It can take two arguments which can be paths to files or environments.
It tells you which objects are in one or the other or both, and does
an identical() check on the ones in both.
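The core of such a comparison can be sketched in a few lines of base R (an illustration of the idea, not Barry's actual implementation; the function name `env_diff` is made up here):

```r
# Report names unique to each environment, plus names present in both
# whose values differ according to identical().
env_diff <- function(e1, e2) {
  n1 <- ls(e1); n2 <- ls(e2)
  both <- intersect(n1, n2)
  differ <- both[!vapply(both, function(nm)
    identical(get(nm, envir = e1), get(nm, envir = e2)), logical(1))]
  list(only_first  = setdiff(n1, n2),
       only_second = setdiff(n2, n1),
       changed     = differ)
}

a <- new.env(); b <- new.env()
a$x <- 1;  a$y <- 1:3
b$y <- 1:4; b$z <- "new"
env_diff(a, b)   # only_first: "x"; only_second: "z"; changed: "y"
```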

If anyone maintains a package that this could go in, get in touch.

Barry


-- 
blog: http://geospaced.blogspot.com/
web: http://www.maths.lancs.ac.uk/~rowlings
web: http://www.rowlingson.com/
twitter: http://twitter.com/geospacedman
pics: http://www.flickr.com/photos/spacedman

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lme4a package loading error

2010-09-10 Thread array chip
Thanks for reminding me of this. So I found the lme4a package in Doug's UseR!2010 
presentation folder: 
http://lme4.r-forge.r-project.org/slides/2010-07-20-Gaithersburg/pkg/

However, after installation, I got the following error message when trying to 
load the library:

library(Matrix)
 library(Rcpp)
 library(minqa)
 library(lme4a)
Error : classes "modelMatrix", "denseModelMatrix", "sparseModelMatrix", 
"ddenseModelMatrix", "dsparseModelMatrix", "predModule", "dPredModule", 
"sPredModule", "respModule", "glmRespMod", "nlsRespMod" are not exported by 
'namespace:Matrix'
Error: package/namespace load failed for 'lme4a'

Here is my sessionInfo()
 sessionInfo()
R version 2.11.1 (2010-05-31) 
i386-pc-mingw32 

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252  
 

[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C 
 

[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] minqa_1.1.9Rcpp_0.8.6 Matrix_0.999375-43 lattice_0.18-8

loaded via a namespace (and not attached):
[1] grid_2.11.1nlme_3.1-96splines_2.11.1 stats4_2.11.1  tools_2.11.1  

Any suggestions would be appreciated.

John





- Original Message 
From: Gavin Simpson gavin.simp...@ucl.ac.uk
To: array chip arrayprof...@yahoo.com
Cc: John Sorkin jsor...@grecc.umaryland.edu; r-help@r-project.org; Bert 
Gunter 
gunter.ber...@gene.com
Sent: Fri, September 10, 2010 10:00:15 AM
Subject: Re: [R] lmer fixed effects, SE, t . . . and p

On Fri, 2010-09-10 at 09:51 -0700, array chip wrote:
 But as far as I know, profile() seems to be de-activated in the lme4 package.

It is beta software. The lme4a version of the lme4 package might have
had profile re-enabled, IIRC. 

G

 - Original Message 
 From: Gavin Simpson gavin.simp...@ucl.ac.uk
 To: John Sorkin jsor...@grecc.umaryland.edu
 Cc: r-help@r-project.org; Bert Gunter gunter.ber...@gene.com
 Sent: Fri, September 10, 2010 2:05:37 AM
 Subject: Re: [R] lmer fixed effects, SE, t . . . and p
 
 On Thu, 2010-09-09 at 23:40 -0400, John Sorkin wrote:
  Bert,
  I appreciate you comments, and I have read Doug Bates writing about p
  values in mixed effects regression. It is precisely because I read
  Doug's material that I asked how are we to interpret the estimates
  rather than how can we compute a p value. My question is a simple
  question whose answer is undoubtedly complex, but one that needs an
  answer. Without p values, or confidence intervals, I am not certain
  what to make of the results of my analysis. Does my analysis suggest,
  or does it not suggest that there is a relation between time and y? If
  I can't answer this question after running the analysis, I don't have
  any more information than I did before I ran the analysis, and a fair
  question would be why did I run the analysis? I am asking for help not
  in calculation a p value or a CI, but rather to know what I can and
  can't say about the results of the analysis. If this basic question
  can not be answered, I am at a loss to interpret my results. 
  Thank you,
  John
 
 Doug talks quite a lot about profiling lmer fits using 'profile
 deviance' to investigate variability in fixed effects. For example, see
 section 1.5 in the draft of chapter 1 of Doug's book on mixed models:
 
 http://lme4.r-forge.r-project.org/book/
 
 HTH
 
 G
 
  John David Sorkin M.D., Ph.D.
  Chief, Biostatistics and Informatics
  University of Maryland School of Medicine Division of Gerontology
  Baltimore VA Medical Center
  10 North Greene Street
  GRECC (BT/18/GR)
  Baltimore, MD 21201-1524
  (Phone) 410-605-7119
  (Fax) 410-605-7913 (Please call phone number above prior to faxing) Bert 
 Gunter gunter.ber...@gene.com 9/9/2010 11:21 PM 
  John:
  
  Search on this issue in the list archives. Doug Bates has addressed it
  at length. Basically, he does not calculate CI's or p-values because
  he does not know how to reliably do so.
  
  However, the key remark in your query was:
  
   (2) lmer does not give p values or confidence intervals for the fixed 
 effects. How we are to interpret the estimates given that no p value or CI 
 is 

 given for the estimates?
  
  Think about it. A statistical analysis -- ANY statistical analysis --
  treats the data in isolation: it is not informed by physics,
  thermodynamics, biology,  other similar data, prior experience, or,
  indeed, any part of the body of relevant scientific knowledge. Do you
  really think that any such analysis, especially when predicated upon
  often tenuous or even (necessarily) unverifiable assumptions and
  simplifications should be considered authoritative? Classical
  statistical inference is just another piece of the puzzle, and not
  even particularly useful when, as if typically the case, hypotheses
  are formulated AFTER seeing the data (this invalidates the probability
  calculations -- hypotheses 

[R] filter a tab delimited text file

2010-09-10 Thread Duke

 Hi all,

I have to filter a tab-delimited text file like below:

GeneNamesvalue1value2log2(Fold_change)
log2(Fold_change) normalized    Signature(abs(log2(Fold_change) 
normalized) > 4)

ENSG0209350435-3.81131293562629-4.14357714689656TRUE
ENSG017713314225.467717200823365.13545298955309FALSE
ENSG01162851151669-4.54130810709955
-4.87357231836982TRUE
ENSG000972410162-4.69995182667858
-5.03221603794886FALSE

ENSG0162460331-4.05126372834704-4.38352793961731TRUE

based on the last column (TRUE), and then write to a new text file, 
meaning I should get something like below:


GeneNamesvalue1value2log2(Fold_change)
log2(Fold_change) normalized    Signature(abs(log2(Fold_change) 
normalized) > 4)

ENSG0209350435-3.81131293562629-4.14357714689656TRUE
ENSG01162851151669-4.54130810709955
-4.87357231836982TRUE

ENSG0162460331-4.05126372834704-4.38352793961731TRUE

I used read.table and write.table but I am still not very satisfied with 
the results. Here is what I did:


expFC <- read.table("test.txt", header=T, sep="\t")
expFC.TRUE <- expFC[expFC[dim(expFC)[2]]==TRUE,]
write.table(expFC.TRUE, file="test_TRUE.txt", row.names=FALSE, sep="\t")

Result:

GeneNamesvalue1value2log2.Fold_change.
log2.Fold_change..normalized
Signature.abs.log2.Fold_change..normalized4.
ENSG0209350435-3.81131293562629
-4.14357714689656TRUE
ENSG01162851151669-4.54130810709955
-4.87357231836982TRUE
ENSG0162460331-4.05126372834704
-4.38352793961731TRUE


As you can see, there are two points:

1. The headers were altered. All the special characters were converted 
to dot (.).
2. The gene names (first column) were quoted (which were not in the 
original file).


The second point is not very annoying, but the first one is. How do I 
get exact the headers like the original file?


Thanks,

D.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] for loop help please!

2010-09-10 Thread Dennis Murphy
Hi:

It's not immediately clear what you have in mind, as others have noted, but
here are a couple of ideas that seem as though they may apply to your
problem, as dangerous as it is to play clairvoyant:

I'm using vectors instead of a matrix, but the first vector, val, contains
the values whereas the second, reps, holds the number of desired
repetitions, which are randomly generated to take values from 1:5.

val <- 1:10
reps <- sample(1:5, 10, replace = TRUE)

(1) Output is a vector:
rep(val, reps)
 [1]  1  1  1  2  2  3  3  3  4  4  4  5  5  5  5  5  6  6  6  6  6  7  7
8  9
[26]  9 10 10 10 10 10

(2) Output is a list:
l <- mapply(rep, val, reps)
l
[[1]]
[1] 1 1 1

[[2]]
[1] 2 2

[[3]]
[1] 3 3 3

[[4]]
[1] 4 4 4

[[5]]
[1] 5 5 5 5 5

[[6]]
[1] 6 6 6 6 6

[[7]]
[1] 7 7

[[8]]
[1] 8

[[9]]
[1] 9 9

[[10]]
[1] 10 10 10 10 10
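And if the starting point really is a matrix whose first row holds the values and later rows hold the counts, the loop version might look like this (toy matrix; the row names are made up here):

```r
# Row 1 = values, rows 2..n = repetition counts for those values
tab <- rbind(val = 1:4,
             r1  = c(2, 1, 3, 1),
             r2  = c(1, 2, 1, 2))

out <- vector("list", nrow(tab) - 1)   # one result vector per count row
for (i in 2:nrow(tab)) {
  out[[i - 1]] <- rep(tab[1, ], tab[i, ])
}
out[[1]]   # 1 1 2 3 3 3 4
out[[2]]   # 1 2 2 3 4 4
```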

Hopefully, one of these answers your question. If not, you'll need to
explain your problem in more detail. The easiest way to do this is to come
up with a toy example, what you tried that didn't work, and the expected
outcome.

Dennis

On Fri, Sep 10, 2010 at 8:54 AM, alfredo alfredote...@gmail.com wrote:


 Hi Everyone,

 I have a 2-dim data.matrix (e.g., table1) in which row1 specifies a range of
 values. row2 - rown specify the number of times I want to replicate each
 corresponding value in row1. I can do this with the following function:

 rep(c(table1[1,]), c(table1[X,])) #where X would go from 2 to n.

 Now, I can do this manually by changing the values of X and save each
 resulting array/vector in an object, or write a for loop that will iterate
 through the rows and output a new data.matrix in which row1 - rown will
 correspond to the vectors generated by replicating the values of row1 row2
 - rown independent times from the original data.matrix with the rep
 function shown above. So far I have been unable to get the for loop right.

 Any help will be most appreciated! Thanks beforehand for your help.

 Best,

 A
 --
 View this message in context:
 http://r.789695.n4.nabble.com/for-loop-help-please-tp2534666p2534666.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] adding labels above bars in a barplot

2010-09-10 Thread Greg Snow
See this message and the replies to it (and the replies to the replies, etc.):

http://tolstoy.newcastle.edu.au/R/e2/help/07/08/22858.html

In there is a discussion of why you don't really want to do that along with 
better alternatives and examples of the improved plots.
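That said, for the purely mechanical question of tick positions: `barplot()` returns the x-coordinates of the bar centres, so the usual idiom is to capture that return value and hand it to `text()`; a minimal sketch using the VAR1 values from the post quoted below:

```r
h  <- c(17.5, 15.2, 10.3, 8.4, 20.3)             # VAR1 from the quoted post
nm <- paste("Category", sprintf("%02d", 1:5))
bp <- barplot(h, names.arg = nm, las = 2,
              ylim = c(0, max(h) * 1.15), col = "orange")
# bp holds the bar midpoints; pos = 3 puts each label just above its bar
text(bp, h, labels = h, pos = 3, col = "blue")
```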

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Antonio Olinto
 Sent: Friday, September 10, 2010 8:59 AM
 To: R-help
 Subject: [R] adding labels above bars in a barplot
 
 Hello,
 
 I want to make a general routine to draw barplots with numbers plotted
 above each bar. See the example below.
 
 I could not place the numbers in the middle of each bar because I
 could not calculate the right position of each x-axis tick. axTicks(1)
 indicated a unitary step, but it does not seem to work.
 
 I appreciate any help or suggestions.
 
 Best regards,
 
 Antonio Olinto
 
 ==
 
 CAT           VAR1    VAR2
 Category 01   17.5    9
 Category 02   15.2    20
 Category 03   10.3    500
 Category 04   8.4     150
 Category 05   20.3    5000
 
 # Coping data from a spreadsheet
 
 dat.data <- read.delim("clipboard", header=T)
 
 summary(dat.data)
    CAT              VAR1            VAR2
   Category 01:1   Min.   : 8.40   Min.   :   9
   Category 02:1   1st Qu.:10.30   1st Qu.:  20
   Category 03:1   Median :15.20   Median : 150
   Category 04:1   Mean   :14.34   Mean   :1136
   Category 05:1   3rd Qu.:17.50   3rd Qu.: 500
   Max.   :20.30   Max.   :5000
 
 dat.bar <- data.frame(dat.data[,c(2)])
 row.names(dat.bar) <- dat.data[,1]
 names(dat.bar) <- c("VAR1")
 dat.bar
  VAR1
 Category 01 17.5
 Category 02 15.2
 Category 03 10.3
 Category 04  8.4
 Category 05 20.3
 
 par(mar=c(12,6,3,2),cex.axis=1.2,cex.lab=1.4)
 barplot(t(as.matrix(dat.bar)), ylim=c(0,max(dat.data[,2]*1.1)), las=2,
 ylab="Y label text", col="orange")
 box()
 
 up <- max(dat.data$VAR1)*0.1
 
 for (i in c(0:nrow(dat.data))) {
 legend(0.25+i, dat.bar[1+i,1]+up, dat.data[i+1,3], col="blue", bty="n")
 }
 
 
 
 Webmail - iBCMG Internet
 http://www.ibcmg.com.br
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] filter a tab delimited text file

2010-09-10 Thread Phil Spector

Duke -
   One possibility is to check the help files for the functions
involved to see if there are options to control this behaviour.
For example, the check.names= argument to read.table, or the 
quote= argument to write.table.  How about


expFC <- read.table("test.txt", header=TRUE, sep="\t", check.names=FALSE)
expFC.TRUE <- expFC[expFC[dim(expFC)[2]]==TRUE,]
write.table(expFC.TRUE, file="test_TRUE.txt", row.names=FALSE, sep="\t", 
quote=FALSE)

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Fri, 10 Sep 2010, Duke wrote:


Hi all,

I have to filter a tab-delimited text file like below:

GeneNamesvalue1value2log2(Fold_change) 
log2(Fold_change) normalized    Signature(abs(log2(Fold_change) 
normalized) > 4)

ENSG0209350435-3.81131293562629-4.14357714689656TRUE
ENSG017713314225.467717200823365.13545298955309FALSE
ENSG01162851151669-4.54130810709955-4.87357231836982 
TRUE
ENSG000972410162-4.69995182667858-5.03221603794886 
FALSE

ENSG0162460331-4.05126372834704-4.38352793961731TRUE

based on the last column (TRUE), and then write to a new text file, meaning I 
should get something like below:


GeneNamesvalue1value2log2(Fold_change) 
log2(Fold_change) normalized    Signature(abs(log2(Fold_change) 
normalized) > 4)

ENSG0209350435-3.81131293562629-4.14357714689656TRUE
ENSG01162851151669-4.54130810709955-4.87357231836982 
TRUE

ENSG0162460331-4.05126372834704-4.38352793961731TRUE

I used read.table and write.table but I am still not very satisfied with the 
results. Here is what I did:


expFC <- read.table("test.txt", header=T, sep="\t")
expFC.TRUE <- expFC[expFC[dim(expFC)[2]]==TRUE,]
write.table(expFC.TRUE, file="test_TRUE.txt", row.names=FALSE, sep="\t")

Result:

GeneNamesvalue1value2log2.Fold_change. 
log2.Fold_change..normalized 
Signature.abs.log2.Fold_change..normalized4.
ENSG0209350435-3.81131293562629-4.14357714689656 
TRUE
ENSG01162851151669-4.54130810709955-4.87357231836982 
TRUE
ENSG0162460331-4.05126372834704-4.38352793961731 
TRUE


As you can see, there are two points:

1. The headers were altered. All the special characters were converted to dot 
(.).
2. The gene names (first column) were quoted (which were not in the original 
file).


The second point is not very annoying, but the first one is. How do I get 
exact the headers like the original file?


Thanks,

D.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lme4a package loading error

2010-09-10 Thread Gavin Simpson
On Fri, 2010-09-10 at 10:23 -0700, array chip wrote:
 Thanks for reminding this. So I found lme4a package from Doug's UserR!2010 
 presentation folder: 
 http://lme4.r-forge.r-project.org/slides/2010-07-20-Gaithersburg/pkg/

What is wrong with the one on the packages tab of the lme4 project page:

https://r-forge.r-project.org/R/?group_id=60

?

You might need to make sure you have the latest Matrix as well to run
lme4a. Update Matrix via update.packages() or install the latest version
from r-forge and see if that helps.

Also, try not to cross-post to multiple lists. Stick with one, or move
the thread onto the new list.

HTH

G

 However, after installation, I got the following error message when trying to 
 load the library:
 
 library(Matrix)
  library(Rcpp)
  library(minqa)
  library(lme4a)
 Error : classes modelMatrix, denseModelMatrix, sparseModelMatrix, 
 ddenseModelMatrix, dsparseModelMatrix, predModule, dPredModule, 
 sPredModule, respModule, glmRespMod, nlsRespMod are not exported by 
 'namespace:Matrix'
 Error: package/namespace load failed for 'lme4a'
 
 Here is my sessionInfo()
  sessionInfo()
 R version 2.11.1 (2010-05-31) 
 i386-pc-mingw32 
 
 locale:
 [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United 
 States.1252   
 
 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C   

 
 [5] LC_TIME=English_United States.1252
 
 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base 
 
 other attached packages:
 [1] minqa_1.1.9    Rcpp_0.8.6    Matrix_0.999375-43    lattice_0.18-8
 
 
 loaded via a namespace (and not attached):
 [1] grid_2.11.1    nlme_3.1-96    splines_2.11.1    stats4_2.11.1    tools_2.11.1
 
 Any suggestions would be appreciated.
 
 John
 
 
 
 
 
 - Original Message 
 From: Gavin Simpson gavin.simp...@ucl.ac.uk
 To: array chip arrayprof...@yahoo.com
 Cc: John Sorkin jsor...@grecc.umaryland.edu; r-help@r-project.org; Bert 
 Gunter 
 gunter.ber...@gene.com
 Sent: Fri, September 10, 2010 10:00:15 AM
 Subject: Re: [R] lmer fixed effects, SE, t . . . and p
 
 On Fri, 2010-09-10 at 09:51 -0700, array chip wrote:
  But as far as I know, profile() seems to be de-activated in the lme4 
  package.
 
 It is beta software. The lme4a version of the lme4 package might have
 had profile re-enabled, IIRC. 
 
 G
 
  - Original Message 
  From: Gavin Simpson gavin.simp...@ucl.ac.uk
  To: John Sorkin jsor...@grecc.umaryland.edu
  Cc: r-help@r-project.org; Bert Gunter gunter.ber...@gene.com
  Sent: Fri, September 10, 2010 2:05:37 AM
  Subject: Re: [R] lmer fixed effects, SE, t . . . and p
  
  On Thu, 2010-09-09 at 23:40 -0400, John Sorkin wrote:
   Bert,
   I appreciate you comments, and I have read Doug Bates writing about p
   values in mixed effects regression. It is precisely because I read
   Doug's material that I asked how are we to interpret the estimates
   rather than how can we compute a p value. My question is a simple
   question whose answer is undoubtedly complex, but one that needs an
   answer. Without p values, or confidence intervals, I am not certain
   what to make of the results of my analysis. Does my analysis suggest,
   or does it not suggest that there is a relation between time and y? If
   I can't answer this question after running the analysis, I don't have
   any more information than I did before I ran the analysis, and a fair
   question would be why did I run the analysis? I am asking for help not
   in calculating a p value or a CI, but rather to know what I can and
   can't say about the results of the analysis. If this basic question
   can not be answered, I am at a loss to interpret my results. 
   Thank you,
   John
  
  Doug talks quite a lot about profiling lmer fits using 'profile
  deviance' to investigate variability in fixed effects. For example, see
  section 1.5 in the draft of chapter 1 of Doug's book on mixed models:
  
  http://lme4.r-forge.r-project.org/book/
  
  HTH
  
  G
  
   John David Sorkin M.D., Ph.D.
   Chief, Biostatistics and Informatics
   University of Maryland School of Medicine Division of Gerontology
   Baltimore VA Medical Center
   10 North Greene Street
   GRECC (BT/18/GR)
   Baltimore, MD 21201-1524
   (Phone) 410-605-7119
   (Fax) 410-605-7913 (Please call phone number above prior to faxing) 
   Bert 
  Gunter gunter.ber...@gene.com 9/9/2010 11:21 PM 
   John:
   
   Search on this issue in the list archives. Doug Bates has addressed it
   at length. Basically, he does not calculate CI's or p-values because
   he does not know how to reliably do so.
   
   However, the key remark in your query was:
   
(2) lmer does not give p values or confidence intervals for the fixed 
  effects. How are we to interpret the estimates given that no p value or CI 
  is given for the estimates?
   
   Think about it. A statistical analysis -- ANY statistical analysis --
   treats the data in isolation: it is not informed 

Re: [R] boxplot knowing Q1, Q3, median, upper and lower whisker value

2010-09-10 Thread Peng, C

> x = 1:16
> S = summary(x)
> S
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00    4.75    8.50    8.50   12.25   16.00 
> S[-4]
   Min. 1st Qu.  Median 3rd Qu.    Max. 
   1.00    4.75    8.50   12.25   16.00 
> par(mfrow=c(1,2))
> boxplot(S[-4])   # based on the summarized stats
> boxplot(x)       # based on the raw data
 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/boxplot-knowing-Q1-Q3-median-upper-and-lower-whisker-value-tp2528571p2534818.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[R] adding zeroes after old zeroes in a vector ??

2010-09-10 Thread skan

Hello

Imagine I have a vector with ones and zeroes

I write it compactly:
011100101

I need to get a new vector in which the N ones that follow each zero are
replaced with new zeroes.

For example for N = 3
011100101  becomes
00010

I can do it with a for loop, but I've read that's not good practice. How
can I do it then?

cheers


My vector is a zoo series, indeed, but I guess it doesn't make any
difference.
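The garbled example above makes the exact rule slightly ambiguous, but assuming each zero should turn the next N ones into zeros, a loop-free sketch is (for a zoo series, apply it to coredata(x) and rebuild with the original index):

```r
extend_zeros <- function(x, N) {
  idx <- seq_along(x)
  # index of the most recent zero at or before each position (0 if none yet)
  last0 <- cummax(ifelse(x == 0, idx, 0L))
  # a one within N positions after a zero becomes a zero
  ifelse(last0 > 0 & idx - last0 <= N, 0, x)
}

extend_zeros(c(1, 1, 0, 1, 1, 1, 1, 1), N = 2)
# 1 1 0 0 0 1 1 1
```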
-- 
View this message in context: 
http://r.789695.n4.nabble.com/adding-zeroes-after-old-zeroes-in-a-vector-tp2534824p2534824.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] boxplot knowing Q1, Q3, median, upper and lower whisker value

2010-09-10 Thread Greg Snow
For base graphics the bxp function does the actual plotting given the 
statistics.
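A minimal sketch of that route (the two columns of statistics are invented for illustration, not the poster's file):

```r
# Five rows per box: lower whisker, Q1, median, Q3, upper whisker
stats <- matrix(c(1.0, 4.75, 8.5, 12.25, 16.0,   # dataset A
                  2.0, 5.00, 9.0, 13.00, 18.0),  # dataset B
                nrow = 5)
z <- list(stats = stats, n = c(16, 20), conf = NULL,
          out = numeric(0), group = numeric(0), names = c("A", "B"))
bxp(z)  # draws the two boxplots from the precomputed statistics
</imports>

```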

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Brian Diggs
 Sent: Thursday, September 09, 2010 1:41 PM
 To: David A.
 Cc: R-help
 Subject: Re: [R] boxplot knowing Q1, Q3, median, upper and lower
 whisker value
 
 On 9/6/2010 8:46 AM, David A. wrote:
 
  Dear list,
 
  I am using an external program that outputs Q1, Q3, median, upper and
  lower whisker values for various datasets simultaneously in a tab
  delimited format. After importing this text file into R, I would like
  to plot a boxplot using these given values and not the original
  series of data points, i.e. not using something like
  boxplot(mydata).
 
  Is there an easy way for doing this? If I am not wrong, boxplot()
  does not accept these values as parameters.
 
  Cheers,
 
  Dave  [[alternative HTML version deleted]]
 
 If you use ggplot2, you can specify the aesthetics lower, upper,
 middle,
 ymin, and ymax directly to variables in geom_boxplot.  Just be sure to
 set stat=identity so that it does not try to summarize your data
 again.
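A hedged sketch of that ggplot2 route (the summary numbers are invented for illustration):

```r
library(ggplot2)

# Precomputed box statistics, one row per group
df <- data.frame(grp = c("A", "B"),
                 ymin = c(1, 2), lower = c(4.75, 5),
                 middle = c(8.5, 9), upper = c(12.25, 13),
                 ymax = c(16, 18))

# stat = "identity" tells geom_boxplot to use the values as given,
# rather than recomputing them from raw data
p <- ggplot(df, aes(x = grp, ymin = ymin, lower = lower, middle = middle,
                    upper = upper, ymax = ymax)) +
  geom_boxplot(stat = "identity")
p
```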
 
 --
 Brian Diggs
 Senior Research Associate, Department of Surgery
 Oregon Health & Science University
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] modifying axis labels in lattice panels

2010-09-10 Thread Markus Loecher
Thanks a lot for this incredibly helpful and thorough reply.
I had actually meant to cut out the scales part before sending the email;
very sorry about the confusion. I was actually executing just

xyplot(Comp ~ time | PC ,data = Xlong, pane1l = WeekhourPanel)

The scales part was a later attempt to control the axis directly which I
eventually abandoned. (partly because I actually wanted the HOUR variables
to be local to the panel function)
and yes, in this simplified version I asked for labels only at 8am which
formats to 08. My intention was to add more hours and weekly labels once I
figure out this simple axis first.

I had hoped to define a panel function that draws only one PC at a time
since I envision that grouping variable to have many levels (two were just
an example).

Might you know how to disable the axis drawing in panel.xyplot ?

Thanks !

Markus


On Fri, Sep 10, 2010 at 12:45 PM, Dennis Murphy djmu...@gmail.com wrote:

 Hi:

 On Fri, Sep 10, 2010 at 7:16 AM, Markus Loecher 
 markus.loec...@gmail.comwrote:

 Dear all,
 I am struggling to modify the axis labels/ticks in a panel provided to
 xyplot.
 To begin with, I do not know the equivalent of the xaxt=n directive for
 panels that would set the stage for no default x axis being drawn.
 My goal is to draw ticks and custom formatted labels at certain hours of
 the
 week.
 When I execute the code below, I get an error message in the plot window
 that suggests a problem with some args parameter.

 A second problem concerns the shaded rectangles I try to draw. Clearly,
 the
 range returned by par('usr') does not correspond to the true y ranges.

 Any help would be greatly appreciated,

 Thanks,

 Markus

 PS I am using R version 2.10.0 on MACOS and the lattice package version
 0.18-3 (latest)

 
 library(lattice);

 #multivariate time series, one row for each hour of the week:
 Xwide = cbind.data.frame(time=as.POSIXct("2010-09-06 00:00:00 EDT") +
 (0:167)*3600, Comp.1= sin(2*pi*7*(0:167)/168), Comp.2 =
 cos(2*pi*7*(0:167)/168));
 #to pass this to xyplot, first need to reshape:
 Xlong <- reshape(Xwide, varying = c(2:3), idvar = "time",
 direction="long",
 timevar = "PC");
 #get descriptive shingle labels
 Xlong[,"PC"] <- factor(Xlong[,"PC"], labels = paste("PC",1:2));


 A less mentally taxing approach :)

 library(reshape)
 xlong - melt(Xwide, id = 'time')
 names(xlong)[2:3] - c('PC', 'Comp')


 xyplot(Comp ~ time | PC ,data = Xlong, pane1l = WeekhourPanel, scales =
 list(x=list(at = Hours24-4*3600,
 labels=as.character(format(Hours24-4*3600, "%H")))));


 When attempting to run this, I got
 Error in xyplot.formula(Comp ~ time | PC, data = Xlong, pane1l =
 WeekhourPanel,  :
   object 'Hours24' not found

 Attempting to pull Hours24 out of the function didn't work...

  Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600)
 Error in seq(r[1] + 12 * 3600, r[2], by = 24 * 3600) :
   object 'r' not found

 One problem is that to use Hours24 in scales, it has to be defined in the
 calling environment of xyplot() - in other words, it has to be defined
 outside the panel function and outside of xyplot() if your present code is
 to have any hope of working.

 I think I got that part figured out: in the console, type
  r <- range(Xwide$time)

  Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600)

 I at least get a plot now by running your xyplot() function with the panel
 function, but all the labels are 08 on the x-axis. Here's why:


  format(Hours24-4*3600, "%H")
  [1] "08" "08" "08" "08" "08" "08" "08"

 This comes from the labels = part of your panel function. I got the same
 plot with this code (apart from adding the lines):
  xyplot(Comp ~ time | PC ,data = Xlong, type = 'l',
  scales =list(x = list(at = Hours24-4*3600,

  labels=as.character(format(Hours24-4*3600, "%H")))))

 which indicates that something in your panel function is awry.

 I'd suggest starting out simply. Put both plots on the same panel  using PC
 as a grouping variable in the long data frame. It will automatically use
 different colors for groups, but you can control the line color with  the
 col.lines = argument; e.g., col.lines = c('red', 'blue'). Next, I'd work on
 getting the axis ticks and labels the way you want with scales. It also
 appears that you want to set a custom grid - my suggestion would be to do
 that last, after you've controlled the axis ticks and labels. Once you have
 that figured out, you have the kernel of your panel function. In most
 applications I've seen in lattice, the idea is to keep the panel function as
 simple as possible and pass the 'global' stuff to the function call. There's
 something broken in your panel function, but it's a run-time error rather
 than a compile-time error, so you can either debug it or try simplifying the
 problem (and the panel function) as much as possible.

 HTH,
 Dennis

 WeekhourPanel <- function(x,y,...){
  r <- range(x);
  #print(r)
  Hours8 <- seq(r[1], r[2], by=8*3600);
  Hours24 <- 

Re: [R] filter a tab delimited text file

2010-09-10 Thread Duke

 Hi Phil,

On 9/10/10 1:45 PM, Phil Spector wrote:

Duke -
   One possibility is to check the help files for the functions
involved to see if there are options to control this behaviour.
For example, the check.names= argument to read.table, or the quote= 
argument to write.table.  How about


Yes, I did before posting question to the list. But somehow I missed (or 
misunderstood) the check.names option. As about quote=FALSE option for 
write.table, it does not work as I want, since all the headers are 
unquoted too.




expFC <- read.table("test.txt", header=TRUE, sep="\t", check.names=FALSE)
expFC.TRUE <- expFC[expFC[dim(expFC)[2]]==TRUE,]
write.table(expFC.TRUE, file="test_TRUE.txt", row.names=FALSE, 
sep="\t", quote=FALSE )


This works perfectly and solves the first issue. Thanks so much Phil.

D.



- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Fri, 10 Sep 2010, Duke wrote:


Hi all,

I have to filter a tab-delimited text file like below:

GeneNames    value1    value2    log2(Fold_change)    log2(Fold_change) normalized    Signature(abs(log2(Fold_change) normalized) > 4)
ENSG0209350    4    35    -3.81131293562629    -4.14357714689656    TRUE
ENSG0177133    142    2    5.46771720082336    5.13545298955309    FALSE
ENSG0116285    115    1669    -4.54130810709955    -4.87357231836982    TRUE
ENSG0009724    10    162    -4.69995182667858    -5.03221603794886    FALSE
ENSG0162460    3    31    -4.05126372834704    -4.38352793961731    TRUE


based on the last column (TRUE), and then write to a new text file, 
meaning I should get something like below:


GeneNames    value1    value2    log2(Fold_change)    log2(Fold_change) normalized    Signature(abs(log2(Fold_change) normalized) > 4)
ENSG0209350    4    35    -3.81131293562629    -4.14357714689656    TRUE
ENSG0116285    115    1669    -4.54130810709955    -4.87357231836982    TRUE
ENSG0162460    3    31    -4.05126372834704    -4.38352793961731    TRUE


I used read.table and write.table but I am still not very satisfied 
with the results. Here is what I did:


expFC <- read.table("test.txt", header=T, sep="\t")
expFC.TRUE <- expFC[expFC[dim(expFC)[2]]==TRUE,]
write.table(expFC.TRUE, file="test_TRUE.txt", row.names=FALSE, 
sep="\t")


Result:

GeneNames    value1    value2    log2.Fold_change.    log2.Fold_change..normalized    Signature.abs.log2.Fold_change..normalized....4.
"ENSG0209350"    4    35    -3.81131293562629    -4.14357714689656    TRUE
"ENSG0116285"    115    1669    -4.54130810709955    -4.87357231836982    TRUE
"ENSG0162460"    3    31    -4.05126372834704    -4.38352793961731    TRUE


As you can see, there are two points:

1. The headers were altered. All the special characters were 
converted to dot (.).
2. The gene names (first column) were quoted (which were not in the 
original file).


The second point is not very annoying, but the first one is. How do I 
get the headers exactly as in the original file?


Thanks,

D.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] boxplot knowing Q1, Q3, median, upper and lower whisker value

2010-09-10 Thread William Dunlap
 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Brian Diggs
 Sent: Thursday, September 09, 2010 12:41 PM
 To: David A.
 Cc: R-help
 Subject: Re: [R] boxplot knowing Q1, Q3, median,upper and 
 lower whisker value
 
 On 9/6/2010 8:46 AM, David A. wrote:
 
  Dear list,
 
   I am using an external program that outputs Q1, Q3, median, upper and
  lower whisker values for various datasets simultaneously in a tab
  delimited format. After importing this text file into R, I 
 would like
  to plot a boxplot using these given values and not the original
  series of data points, i.e. not using something like
  boxplot(mydata).
 
  Is there an easy way for doing this? If I am not wrong, boxplot()
  does not accept these values as parameters.

I believe boxplot(x,y,z) computes the required statistics
and passes them to the bxp() function for plotting.  If
you have the statistics you can pass them to bxp() yourself.
You might call trace(bxp) followed by a call to boxplot()
to see how boxplot uses bxp.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

 
  Cheers,
 
  Dave  [[alternative HTML version deleted]]
 
 If you use ggplot2, you can specify the aesthetics lower, 
 upper, middle, 
 ymin, and ymax directly to variables in geom_boxplot.  Just 
 be sure to 
 set stat=identity so that it does not try to summarize your 
 data again.
 
 --
 Brian Diggs
 Senior Research Associate, Department of Surgery
 Oregon Health & Science University
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggplot bar geom: control the filling in the colour legend

2010-09-10 Thread Benoit Boulinguiez

Hi all,

Is it possible to change the filling of the squares used to represent 
the colour legend in a bar plot with ggplot?


in this example, fillings are raven black, I'd like them white.

ggplot(diamonds, aes(clarity, colour = cut)) + geom_bar()

Regards

--
-
Benoit Boulinguiez
Ph.D student
Ecole de Chimie de Rennes (ENSCR) Bureau 1.20
Equipe CIP UMR CNRS 6226 Sciences Chimiques de Rennes
Avenue du Général Leclerc
CS 50837
35708 Rennes CEDEX 7
Tel 33 (0)2 23 23 80 83
Fax 33 (0)2 23 23 81 20
http://www.ensc-rennes.fr/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] adding labels above bars in a barplot

2010-09-10 Thread Dennis Murphy
Hi:

To add to Greg's sound advice, if you want to put the numbers on top of the
bars, why bother with the numerical scale? The entire point of a scale is to
provide a reference for comparing different (sets of) values.

\begin{rant}
And when I see things like this:
 dat.bar
  VAR1
 Category 01 17.5
 Category 02 15.2
 Category 03 10.3
 Category 04  8.4
 Category 05 20.3

I get doubly annoyed, because it is yet another attempt to use a bar chart
to plot quantitative values by factor level. As I mentioned in a private
response today, one of the problems with a bar chart is that it forces the
numerical scale to have origin zero, and this is often neither necessary nor
desirable. A simple line plot that connects the quantitative values between
categories is sufficient, and takes *far* less ink to produce. The purpose
of a statistical graphic is to convey information in a simple, clean,
concise fashion - it is not meant to be a rococo art form. If you intend to
write a function to automate a graphic, please think carefully about what is
meant to be conveyed and the *visually* simplest means by which to convey
it.
\end{rant}

The purpose of a bar chart is to visualize a (joint) discrete distribution.
There are better ways to plot quantitative variables by group; in addition
to the line plot mentioned above, the Cleveland dot chart can be very
effective with many groups or multiple grouping factors. With two factors
and a quantitative response, another option is the interaction plot.
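For instance, a base-R sketch of the dot-chart alternative, using the five category values from the thread:

```r
vals <- c(17.5, 15.2, 10.3, 8.4, 20.3)
names(vals) <- paste("Category", 1:5)
# Cleveland dot chart: same information as the bar chart,
# but with no forced zero origin
dotchart(sort(vals), xlab = "VAR1")
```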

If this weren't the third such example/request I've seen today, I probably
wouldn't be so apoplectic...

Dennis

On Fri, Sep 10, 2010 at 10:44 AM, Greg Snow greg.s...@imail.org wrote:

 See this message and the replies to it (and the replies to the replies,
 etc.):

 http://tolstoy.newcastle.edu.au/R/e2/help/07/08/22858.html

 In there is a discussion of why you don't really want to do that along with
 better alternatives and examples of the improved plots.

 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111


  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
  project.org] On Behalf Of Antonio Olinto
  Sent: Friday, September 10, 2010 8:59 AM
  To: R-help
  Subject: [R] adding labels above bars in a barplot
 
  Hello,
 
  I want to make a general routine to draw barplots with numbers plotted
  above each bar. See the example below.
 
  I could not place the numbers on the middle of each bar because I
  could not calculate the right position of each x-axis tick. axTicks(1)
  indicated a unitary step, but it does not seem work.
 
  I appreciate any help or suggestions.
 
  Best regards,
 
  Antonio Olinto
 
  ==
 
  CAT   VAR1VAR2
  Category 01   17.59
  Category 02   15.220
  Category 03   10.3500
  Category 04   8.4 150
  Category 05   20.35000
 
  # Coping data from a spreadsheet
 
  dat.data <- read.delim("clipboard", header=T)
 
  summary(dat.data)
 CAT VAR1VAR2
Category 01:1   Min.   : 8.40   Min.   :   9
Category 02:1   1st Qu.:10.30   1st Qu.:  20
Category 03:1   Median :15.20   Median : 150
Category 04:1   Mean   :14.34   Mean   :1136
Category 05:1   3rd Qu.:17.50   3rd Qu.: 500
Max.   :20.30   Max.   :5000
 
  dat.bar <- data.frame(dat.data[,c(2)])
  row.names(dat.bar) <- dat.data[,1]
  names(dat.bar) <- c("VAR1")
  dat.bar
   VAR1
  Category 01 17.5
  Category 02 15.2
  Category 03 10.3
  Category 04  8.4
  Category 05 20.3
 
  par(mar=c(12,6,3,2),cex.axis=1.2,cex.lab=1.4)
  barplot(t(as.matrix(dat.bar)), ylim=c(0,max(dat.data[,2]*1.1)), las=2,
  ylab="Y label text", col="orange")
  box()
 
  up <- max(dat.data$VAR1)*0.1
 
  for (i in c(0:nrow(dat.data))) {
  legend(0.25+i, dat.bar[1+i,1]+up, dat.data[i+1,3], col="blue", bty="n")
  }
 
 
  
  Webmail - iBCMG Internet
  http://www.ibcmg.com.br
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-
  guide.html
  and provide commented, minimal, self-contained, reproducible code.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] adding labels above bars in a barplot

2010-09-10 Thread John Kane
Are you trying to say that you don't really like barplots?

At least the OP did not ask for error bars as well.  :)

--- On Fri, 9/10/10, Dennis Murphy djmu...@gmail.com wrote:

 From: Dennis Murphy djmu...@gmail.com
 Subject: Re: [R] adding labels above bars in a barplot
 To: Antonio Olinto aolint...@bignet.com.br
 Cc: R-help r-h...@stat.math.ethz.ch
 Received: Friday, September 10, 2010, 2:31 PM
 Hi:
 
 To add to Greg's sound advice, if you want to put the
 numbers on top of the
 bars, why bother with the numerical scale? The entire point
 of a scale is to
 provide a reference for comparing different (sets of)
 values.
 
 \begin{rant}
 And when I see things like this:
  dat.bar
               VAR1
  Category 01 17.5
  Category 02 15.2
  Category 03 10.3
  Category 04  8.4
  Category 05 20.3
 
 I get doubly annoyed, because it is yet another attempt to
 use a bar chart
 to plot quantitative values by factor level. As I mentioned
 in a private
 response today, one of the problems with a bar chart is
 that it forces the
 numerical scale to have origin zero, and this is often
 neither necessary nor
 desirable. A simple line plot that connects the
 quantitative values between
 categories is sufficient, and takes *far* less ink to
 produce. The purpose
 of a statistical graphic is to convey information in a
 simple, clean,
 concise fashion - it is not meant to be a rococo art form.
 If you intend to
 write a function to automate a graphic, please think
 carefully about what is
 meant to be conveyed and the *visually* simplest means by
 which to convey
 it.
 \end{rant}
 
 The purpose of a bar chart is to visualize a (joint)
 discrete distribution.
 There are better ways to plot quantitative variables by
 group; in addition
 to the line plot mentioned above, the Cleveland dot chart
 can be very
 effective with many groups or multiple grouping factors.
 With two factors
 and a quantitative response, another option is the
 interaction plot.
 
 If this weren't the third such example/request I've seen
 today, I probably
 wouldn't be so apoplectic...
 
 Dennis
 
 On Fri, Sep 10, 2010 at 10:44 AM, Greg Snow greg.s...@imail.org
 wrote:
 
  See this message and the replies to it (and the
 replies to the replies,
  etc.):
 
  http://tolstoy.newcastle.edu.au/R/e2/help/07/08/22858.html
 
  In there is a discussion of why you don't really want
 to do that along with
  better alternatives and examples of the improved
 plots.
 
  --
  Gregory (Greg) L. Snow Ph.D.
  Statistical Data Center
  Intermountain Healthcare
  greg.s...@imail.org
  801.408.8111
 
 
   -Original Message-
   From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-
   project.org] On Behalf Of Antonio Olinto
   Sent: Friday, September 10, 2010 8:59 AM
   To: R-help
   Subject: [R] adding labels above bars in a
 barplot
  
   Hello,
  
   I want to make a general routine to draw barplots
 with numbers plotted
   above each bar. See the example below.
  
   I could not place the numbers on the middle of
 each bar because I
   could not calculate the right position of each
 x-axis tick. axTicks(1)
   indicated a unitary step, but it does not seem
 work.
  
   I appreciate any help or suggestions.
  
   Best regards,
  
   Antonio Olinto
  
   ==
  
    CAT   VAR1    VAR2
    Category 01   17.5    9
    Category 02   15.2    20
    Category 03   10.3    500
    Category 04   8.4     150
    Category 05   20.3    5000
  
   # Coping data from a spreadsheet
  
    dat.data <- read.delim("clipboard", header=T)
  
   summary(dat.data)
             
              CAT         VAR1            VAR2
     Category 01:1   Min.   : 8.40   Min.   :   9
     Category 02:1   1st Qu.:10.30   1st Qu.:  20
     Category 03:1   Median :15.20   Median : 150
     Category 04:1   Mean   :14.34   Mean   :1136
     Category 05:1   3rd Qu.:17.50   3rd Qu.: 500
                     Max.   :20.30   Max.   :5000
  
    dat.bar <- data.frame(dat.data[,c(2)])
    row.names(dat.bar) <- dat.data[,1]
    names(dat.bar) <- c("VAR1")
    dat.bar
                VAR1
   Category 01 17.5
   Category 02 15.2
   Category 03 10.3
   Category 04  8.4
   Category 05 20.3
  
   par(mar=c(12,6,3,2),cex.axis=1.2,cex.lab=1.4)
  
  barplot(t(as.matrix(dat.bar)), ylim=c(0,max(dat.data[,2]*1.1)), las=2,
    ylab="Y label text", col="orange")
   box()
  
    up <- max(dat.data$VAR1)*0.1
  
   for (i in c(0:nrow(dat.data))) {
  
  legend(0.25+i, dat.bar[1+i,1]+up, dat.data[i+1,3], col="blue", bty="n")
   }
  
  
  
 
   Webmail - iBCMG Internet
   http://www.ibcmg.com.br
  
   __
   R-help@r-project.org
 mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide http://www.R-project.org/posting-
   guide.html
   and provide commented, minimal, self-contained,
 reproducible code.
 
  __
  R-help@r-project.org
 mailing list

Re: [R] ggplot bar geom: control the filling in the colour legend

2010-09-10 Thread Ista Zahn
Sure, just change the color of the fill.

ggplot(diamonds, aes(clarity, colour = cut)) + geom_bar(fill = "white")

-Ista

On Fri, Sep 10, 2010 at 2:24 PM, Benoit Boulinguiez
benoit.boulingu...@ensc-rennes.fr wrote:
 Hi all,

 Is it possible to change the filling of the squares used to represent the
 colour legend in a bar plot with ggplot?

 in this example, fillings are raven black, I'd like them white.

 ggplot(diamonds, aes(clarity, colour = cut)) + geom_bar()

 Regards

 --
 -
 Benoit Boulinguiez
 Ph.D student
 Ecole de Chimie de Rennes (ENSCR) Bureau 1.20
 Equipe CIP UMR CNRS 6226 Sciences Chimiques de Rennes
 Avenue du Général Leclerc
 CS 50837
 35708 Rennes CEDEX 7
 Tel 33 (0)2 23 23 80 83
 Fax 33 (0)2 23 23 81 20
 http://www.ensc-rennes.fr/

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Maximum log likelihood estimates of the parameters of a nonlinear model.

2010-09-10 Thread Odette Gaston
Dear all,

Is it possible to generate AIC, or something equivalent, for a nonlinear
model estimated by maximum log-likelihood in R?
I used nls, which estimates by least squares, and therefore I cannot
assess the quality of the models with AIC. nlme seems suited only to mixed
models, and mine is not a mixed model.

res <- nls(y ~ d*(x)^3+a*(x)^2+b*x+c, start=list(a=2, b=1, c=1, d=1), data=d)

If anybody knows an R function to estimate a nonlinear model by
maximum log-likelihood, please let me know.
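As it happens, nls() fits do carry a log-likelihood under the usual iid Gaussian error assumption, so logLik() and AIC() work on them directly; a small sketch with simulated data (the data are illustrative, not Odette's):

```r
set.seed(1)
dat <- data.frame(x = seq(-2, 2, length.out = 50))
dat$y <- 2*dat$x^3 + dat$x^2 + dat$x + 1 + rnorm(50, sd = 0.5)

res <- nls(y ~ d*x^3 + a*x^2 + b*x + c,
           start = list(a = 2, b = 1, c = 1, d = 1), data = dat)

logLik(res)  # log-likelihood under the Gaussian error assumption
AIC(res)     # works because nls objects have a logLik() method
```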

Thanks for your help in advance!
Odette

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] filter a tab delimited text file

2010-09-10 Thread Gabor Grothendieck
On Fri, Sep 10, 2010 at 1:24 PM, Duke duke.li...@gmx.com wrote:
  Hi all,

 I have to filter a tab-delimited text file like below:

 GeneNames    value1    value2    log2(Fold_change)
  log2(Fold_change) normalized    Signature(abs(log2(Fold_change)
 normalized) > 4)
 ENSG0209350    4    35    -3.81131293562629    -4.14357714689656    TRUE
 ENSG0177133    142    2    5.46771720082336    5.13545298955309    FALSE
 ENSG0116285    115    1669    -4.54130810709955    -4.87357231836982
  TRUE
 ENSG0009724    10    162    -4.69995182667858    -5.03221603794886
  FALSE
 ENSG0162460    3    31    -4.05126372834704    -4.38352793961731    TRUE

 based on the last column (TRUE), and then write to a new text file, meaning
 I should get something like below:

 GeneNames    value1    value2    log2(Fold_change)
  log2(Fold_change) normalized    Signature(abs(log2(Fold_change)
 normalized) > 4)
 ENSG0209350    4    35    -3.81131293562629    -4.14357714689656    TRUE
 ENSG0116285    115    1669    -4.54130810709955    -4.87357231836982
  TRUE
 ENSG0162460    3    31    -4.05126372834704    -4.38352793961731    TRUE

 I used read.table and write.table but I am still not very satisfied with the
 results. Here is what I did:

 expFC <- read.table("test.txt", header=T, sep="\t")
 expFC.TRUE <- expFC[expFC[dim(expFC)[2]]==TRUE,]
 write.table(expFC.TRUE, file="test_TRUE.txt", row.names=FALSE, sep="\t")

 Result:

 GeneNames    value1    value2    log2.Fold_change.
  log2.Fold_change..normalized
  Signature.abs.log2.Fold_change..normalized4.
 ENSG0209350    4    35    -3.81131293562629    -4.14357714689656
  TRUE
 ENSG0116285    115    1669    -4.54130810709955    -4.87357231836982
  TRUE
 ENSG0162460    3    31    -4.05126372834704    -4.38352793961731
  TRUE

 As you can see, there are two points:

 1. The headers were altered: all the special characters were converted to
 dots (.).
 2. The gene names (first column) were quoted (they were not in the original
 file).


This will copy input lines matching pattern as well as the header to
the output verbatim preserving all quotes, spacing, etc.

myFilter <- function(infile, outfile, pattern = "TRUE$") {
  L <- readLines(infile)
  cat(L[1], "\n", file = outfile)                 # header line, verbatim
  L2 <- grep(pattern, L[-1], value = TRUE)        # body lines matching pattern
  for(el in L2) cat(el, "\n", file = outfile, append = TRUE)
}

# e.g.
myFilter("infile.txt", "outfile.txt")
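For completeness, the two complaints (mangled headers, quoted gene names) can also be addressed inside read.table/write.table themselves. A hedged sketch, not from the original post; the file names and the miniature stand-in data are illustrative only:

```r
# Miniature stand-in for test.txt, so the sketch is self-contained:
writeLines(c("GeneNames\tvalue1\tSignature(x > 4)",
             "g1\t4\tTRUE", "g2\t142\tFALSE", "g3\t3\tTRUE"),
           "test.txt")
# check.names = FALSE keeps the header names exactly as written (no dots),
expFC <- read.table("test.txt", header = TRUE, sep = "\t",
                    check.names = FALSE)
# and quote = FALSE stops write.table from quoting the gene names.
write.table(expFC[expFC[[ncol(expFC)]], ], "test_TRUE.txt",
            row.names = FALSE, quote = FALSE, sep = "\t")
readLines("test_TRUE.txt")   # header plus the two TRUE rows
```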

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R] Simulation

2010-09-10 Thread Greg Snow
Do you need a table with an odds ratio exactly equal to 3 (or other value), or 
a realistic sample from a population with odds ratio 3 where the sample table 
will have a different OR (but the various tables will cluster around the true 
value)?
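To make the second option concrete, here is a hedged sketch (my own illustration, with assumed group probabilities) of sampling a 2 x 2 table from a population whose true odds ratio is 3; the sample OR varies around 3 rather than hitting it exactly:

```r
set.seed(1)                       # reproducibility of the sketch only
p1 <- 0.5                         # P(success) in group 1, so odds1 = 1
or <- 3
odds2 <- or * p1 / (1 - p1)       # odds in group 2 implied by OR = 3
p2 <- odds2 / (1 + odds2)         # = 0.75
n  <- 1000                        # per-group sample size (assumed)
tab <- rbind(g1 = table(factor(rbinom(n, 1, p1), levels = 0:1)),
             g2 = table(factor(rbinom(n, 1, p2), levels = 0:1)))
tab
# sample odds ratio; clusters around the true value 3
(tab["g1", "0"] * tab["g2", "1"]) / (tab["g1", "1"] * tab["g2", "0"])
```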

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Jim Silverton
 Sent: Friday, September 10, 2010 1:03 AM
 To: r-help@r-project.org
 Subject: Re: [R] Simulation
 
 I have two questions:
 (1) How do you 'create' a 2 x 2 table in R using, say, an odds ratio of
 3 or even 0.5?
 
 (2) If I have several 2 x 2 tables, how can I 'implement' dependence in
 the tables, with say 25 of the tables having an odds ratio of 1 and 75
 of the tables having an odds ratio of 4?
 
 Jim
 
 



Re: [R] adding zeroes after old zeroes in a vector ??

2010-09-10 Thread jim holtman
Not sure how you handle the ending sequence of '0101';  here is one approach:

> x <- "011100101"
> gsub("0111", "0001", x)
[1] "000100101"
> x
[1] "011100101"


For the final one, you could do:

> gsub("01..", "0001", x)
[1] "000100001"


On Fri, Sep 10, 2010 at 1:51 PM, skan juanp...@gmail.com wrote:

 Hello

 Imagine I have a vector with ones and zeroes

 I write it compactly:
 011100101

 I need to get a new vector replacing the N ones following the zeroes with
 new zeroes.

 For example for N = 3
 011100101  becomes
 00010

 I can do it with a for loop, but I've read that's not good practice. How can
 I do it then?

 cheers


 My vector is a zoo series, indeed, but I guess it doesn't make any
 difference.
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/adding-zeroes-after-old-zeroes-in-a-vector-tp2534824p2534824.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] Can't event start 'rattle' under Ubuntu 64bit

2010-09-10 Thread Johan Nyberg

 Error: attempt to apply non-function
 In addition: Warning message:
 In method(obj, ...) : Invalid object type `GtkFrame'
 Rattle timestamp (for the error above): 2010-09-10 17:24:14

Ditto in Debian testing i386 32 bit.

Johan Nyberg



Re: [R] OT: model diagnostics in the published literature

2010-09-10 Thread Greg Snow
My experience is that most medical journals (and probably others as well, but I 
am most familiar with the medical ones) have word or page limits on articles.  
Diagnostic plots and tests that merely show what you expected, and confirm that 
it is OK to use your model, are not exciting enough to include.  And some of 
those plots/tests tend to confuse non-statisticians rather than help: have you 
ever given a QQ-plot of the residuals to a client to show that the normal 
approximation is OK?  I made that mistake a few times and ended up having to 
explain it over and over again.

Most papers that I have been involved with end up including less than half of 
the analyses that I actually did; only the most interesting results make it in.  
Sometimes a reviewer will ask about the tests of the assumptions, and we will 
send them the results so they can see the model is reasonable, but they rarely 
make it into the paper itself.  Though I think it would be better if more 
reviewers asked.

The drawback is that when you read a paper it is difficult (or impossible) to 
tell whether the authors did all the tests and the results were as expected, or 
whether they did not do the tests and there could be major problems.

One bright spot for the future is that more journals are now allowing for 
online supplements where all the details that don't make it into the main paper 
can be provided for the few who are interested.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Christopher W. Ryan
 Sent: Thursday, September 09, 2010 8:34 PM
 To: r-help@r-project.org
 Subject: [R] OT: model diagnostics in the published literature
 
 This is a more general statistical question, not specific to R:
 
 As I move through my masters curriculum in statistics, I am becoming
 more and more attuned to issues of model fit and diagnostics (graphical
 methods, AIC, BIC, deviance, etc.). As my regression professor always
 likes to say, "only draw substantive conclusions from valid models."
 
 Yet in published articles in my field (medicine), I rarely see any
 explicit description of whether, and if so how, model fit was assessed
 and assumptions checked. Mostly the results sections are all about
 hypothesis testing on model coefficients.
 
 Is this common in other disciplines? Are there fields of study in which
 it is customary to provide a discussion of model adequacy, either in
 the
 text or perhaps in an online appendix?
 
 And if that discussion is not provided, what, if anything, can one
 conclude about whether, and how well, it was done? Is it sort of taken
 as a given that those diagnostic checks were carried out? Do journal
 editors often ask?
 
 Thanks for your thoughts.
 
 --Chris Ryan
 Clinical Associate Professor of Family Medicine
 SUNY Upstate Medical University Clinical Campus at Binghamton
 



Re: [R] adding zeroes after old zeroes in a vector ??

2010-09-10 Thread Dennis Murphy
Hi:

Try the following:
f <- function(x, n) {
   r <- rle(x)
   t1 <- cumsum(r$lengths)[r$values == 0L] + 1      # first position after each zero-run
   repl <- as.vector(mapply(seq, t1, t1 + n - 1))   # the n positions that follow
   repl <- repl[repl <= length(x)]                  # guard: drop indices past the end
   replace(x, repl, 0)
}

f(x, 3)
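An index-based variant of the same idea, without rle. This is a sketch under one reading of the question (every element within n positions after a 0 is forced to 0); for a zoo series it can be applied to coredata(x):

```r
zero_after <- function(x, n) {
  idx  <- which(x == 0)                    # positions of the zeroes
  repl <- outer(idx, seq_len(n), "+")      # the n positions after each zero
  repl <- repl[repl <= length(x)]          # drop indices past the end
  x[repl] <- 0
  x
}
zero_after(c(0, 1, 1, 1, 0, 0, 1, 0, 1), 3)  # every 1 here trails a 0, so all zeroes
```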

HTH,
Dennis

On Fri, Sep 10, 2010 at 10:51 AM, skan juanp...@gmail.com wrote:


 Hello

 Imagine I have a vector with ones and zeroes

 I write it compactly:
 011100101

 I need to get a new vector replacing the N ones following the zeroes with
 new zeroes.

 For example for N = 3
 011100101  becomes
 00010

 I can do it with a for loop, but I've read that's not good practice. How can
 I do it then?

 cheers


 My vector is a zoo series, indeed, but I guess it doesn't make any
 difference.
 --
 View this message in context:
 http://r.789695.n4.nabble.com/adding-zeroes-after-old-zeroes-in-a-vector-tp2534824p2534824.html
 Sent from the R help mailing list archive at Nabble.com.






Re: [R] Maximum log likelihood estimates of the parameters of a nonlinear model.

2010-09-10 Thread Ben Bolker
Odette Gaston odette.gaston at gmail.com writes:

 
 Dear all,
 
 Is it possible to generate AIC or something equivalent for a nonlinear
 model estimated based on maximum log-likelihood in R?
 I used nls, which is based on least squares, and therefore I cannot
 assess the quality of models with AIC. nlme seems suited only to mixed
 models, and mine is not a mixed model.
 
 res <- nls(y ~ d*x^3 + a*x^2 + b*x + c, start = list(a=2, b=1, c=1, d=1), data = d)
 
 If anybody knows an R function to estimate a nonlinear model based on
 maximum log-likelihood, please let me know.


  AIC(res) should work just fine.
  Ordinary least-squares fitting is equivalent to assuming that the residuals
are independent and normally distributed with a homogeneous variance.  If
you're willing to make those assumptions you're set.  If not, there are
various options for relaxing them:  gnls in the nlme package for 
correlation and heteroscedasticity, mle (stats4) or mle2 (bbmle) for
normality.
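A small self-contained illustration of the point (simulated data; the variable names here are mine, not the poster's):

```r
set.seed(42)
dd <- data.frame(x = seq(-2, 2, length.out = 50))
dd$y <- 2 * dd$x^3 + dd$x^2 + dd$x + 1 + rnorm(50, sd = 0.5)
res <- nls(y ~ d * x^3 + a * x^2 + b * x + c,
           start = list(a = 2, b = 1, c = 1, d = 1), data = dd)
AIC(res)      # defined because the least-squares fit is also the
logLik(res)   # Gaussian maximum-likelihood fit
```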



Re: [R] adding zeroes after old zeroes in a vector ??

2010-09-10 Thread skan

Hi

I'll study your answers.

I could also try
gsub("01", "00", x)  applied N times,
but it could be very slow if N is large.
In fact, when I wrote 10011 I meant a vector
1
1
1
1
1
0
0
1
1
not a string, but I wrote it more compactly.

I could also do it by shifting the elements of the vector one position and
ANDing the result with the original, and again shifting 2 positions and so on
up to N. But that's very slow.
-- 
View this message in context: 
http://r.789695.n4.nabble.com/adding-zeroes-after-old-zeroes-in-a-vector-tp2534824p2534982.html
Sent from the R help mailing list archive at Nabble.com.



[R] Counting occurances of a letter by a factor

2010-09-10 Thread Davis, Brian
I'm trying to find a more elegant way of doing this.  What I'm trying to
accomplish is to count the frequency of letters (major/minor alleles) in a
string, grouped by the factor levels in another column of my data frame.

Ex.
> DF <- data.frame(c("CC", "CC", NA, "CG", "GG", "GC"),
+                  c("L", "U", "L", "U", "L", NA))
> colnames(DF) <- c("X", "Y")
> DF
     X    Y
1   CC    L
2   CC    U
3 <NA>    L
4   CG    U
5   GG    L
6   GC <NA>

I have an ugly solution, which works if you know the factor levels of Y in 
advance.

> ans <- rbind(table(unlist(strsplit(as.character(DF[DF[ ,'Y'] == 'L', 1]), ""))),
+             table(unlist(strsplit(as.character(DF[DF[ ,'Y'] == 'U', 1]), ""))))
> rownames(ans) <- c("L", "U")
> ans
  C G
L 2 2
U 3 1


I've played with table, xtabs, tabulate, aggregate, tapply, etc. but haven't 
found a combination that gives a more general solution to this problem.

Any ideas?
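One more general sketch (my own, not a canonical answer): split each genotype into letters, repeat the grouping factor to match, and let table() do the cross-tabulation. NA genotypes and NA groups drop out automatically, and no factor levels need to be known in advance:

```r
DF <- data.frame(X = c("CC", "CC", NA, "CG", "GG", "GC"),
                 Y = c("L", "U", "L", "U", "L", NA))
s   <- strsplit(as.character(DF$X), "")        # letters of each genotype
ans <- table(Y      = rep(DF$Y, sapply(s, length)),
             allele = unlist(s))
ans   # same counts as above: L has C=2 G=2, U has C=3 G=1
```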

Brian



Re: [R] lmer output

2010-09-10 Thread Ben Bolker
 Denis.Aydin at unibas.ch writes:

 I have a question regarding the output of a binomial lmer model.
 The model is as follows:
 lmer(y~diet * day * female + (day|female),family=binomial)

  A reproducible example would always be nice.
 
 The corresponding output is:
 Generalized linear mixed model fit by the Laplace approximation 
 Formula: y ~ diet * day * female + (day | female) 
    AIC  BIC logLik deviance
   1084 1136 -531.1     1062

[ snip ]

 Fixed effects:
                     Estimate Std. Error z value Pr(>|z|)
 (Intercept)         0.996444   0.713703   1.396   0.1627
 dietNAA             1.194581   0.862294   1.385   0.1659
 day                 0.142026   0.074270   1.912   0.0558 .
 female              0.015629   0.019156   0.816   0.4146
 dietNAA:day        -0.124755   0.088684  -1.407   0.1595
 dietNAA:female     -0.024733   0.026947  -0.918   0.3587
 day:female         -0.001535   0.001966  -0.781   0.4348
 dietNAA:day:female  0.001543   0.002720   0.568   0.5704
 
 Now from my understanding, the estimates represent differences in slopes 
 and intercepts between different levels of diet and so on.
 
 My questions:
 
 1. Is there a way to display the coefficients for all levels of variables 
 (e.g., dietAA and dietNAA)? Because it is quite hard to calculate the 
 slopes and intercepts for all levels of each variable.

See if 
lmer(y~(diet-1) * (day-1) * (female-1) + (day|female),family=binomial)

helps, or see if you can use predict() with an appropriately
constructed prediction data frame -- although I'm not sure whether
predict() works with GLMMs in the current version of lme4.
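For question 1, the per-level intercepts and slopes can also be assembled by hand from the fixed effects. A hedged sketch (coefficient names are copied from the output above; `fe` stands in for `fixef(fit)` on the real fitted model, and these are the values at female = 0):

```r
fe <- c("(Intercept)" = 0.996444, "dietNAA"     =  1.194581,
        "day"         = 0.142026, "dietNAA:day" = -0.124755)
int_AA    <- fe[["(Intercept)"]]                     # intercept, baseline diet (AA)
int_NAA   <- fe[["(Intercept)"]] + fe[["dietNAA"]]   # intercept, diet NAA
slope_AA  <- fe[["day"]]                             # day slope, diet AA
slope_NAA <- fe[["day"]] + fe[["dietNAA:day"]]       # day slope, diet NAA
c(int_NAA = int_NAA, slope_NAA = slope_NAA)
```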

 
 2. Is there a way to get the degrees of freedom?

  Giant can of worms, I'm afraid. See http://glmm.wikidot.com/faq for
relevant links and alternatives.


