Re: [R] Interactive/Dynamic plots with R

2011-02-26 Thread Abhishek Pratap
Hi Greg

Ability to zoom in/out on x/y axis to begin with and then
adding/removing data sets.

Thanks!
-Abhi



On Fri, Feb 25, 2011 at 12:52 PM, Greg Snow greg.s...@imail.org wrote:
 What types of interaction do you want?

 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Abhishek Pratap
 Sent: Friday, February 25, 2011 12:37 AM
 To: r-help@r-project.org
 Subject: [R] Interactive/Dynamic plots with R

 Hi Guys

 In order to look at a dense plot I would like to have the capability
 to plot dynamic/interactive. Before I try rgobi which I heard can help
 me; I would like to take your opinion.

 Thanks!
 -Abhi

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Using uniroot() with output from deriv() or D()

2011-02-26 Thread jjheath
I need to find the root of the second derivative of many curves and do not
want
to cut and paste the expression results from the deriv() or D() functions
every time.  Below is an
example.  What I need to do is refer to fn2nd in the uniroot() function,
but when I
try something like uniroot(fn2nd,c(0,1)) I get an error, so I have resorted
to pasting
in the expression, but this is highly inefficient.

Thanks,  J

y - c(9.9,10,10,9.5,9,6,3,1,0,0,0)
b - seq(0,1,by=0.1)
dat - data.frame(y = y, b = b)
plot(y ~ b, xlim = c(0,1), ylim= c(-12,12))
fn - nls(y ~ asym/(1 + exp(p * b - q)),data = dat, list(asym=10,p=16,q=7),
trace=T)
curve(10.001/(1 + exp(14.094 * x - 7.551)), from = 0, to = 1, add = T)
fn2 - expression(10.001/(1 + exp(14.094 * x - 7.551)))
fn2nd - D(D(fn2, x), x)
ex - seq.int(from=0, to=1, length.out = 1000)
y1 - eval(fn2nd, envir = list(x = ex), enclos = parent.frame())
lines(ex, y1, type = l) 
r - uniroot(function(x) -(10.001 * (exp(14.094 * x - 7.551) * 14.094 *
14.094)/(1 + exp(14.094 * 
x - 7.551))^2 - 10.001 * (exp(14.094 * x - 7.551) * 14.094) * 
(2 * (exp(14.094 * x - 7.551) * 14.094 * (1 + exp(14.094 * 
x - 7.551/((1 + exp(14.094 * x -
7.551))^2)^2),interval=c(0,1),tol = 0.0001)
r$root
abline(h=0, col = red)
abline(v=r$root, col = green)
arrows(0.6, -2, r$root,0, length = 0.1, angle = 30, code = 2, col = red)
text(0.765,-2.3,paste(b = ,r$root,sep=))



-- 
View this message in context: 
http://r.789695.n4.nabble.com/Using-uniroot-with-output-from-deriv-or-D-tp3325635p3325635.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Boxplot for X Vs Y variable grouped by ID

2011-02-26 Thread Shankar Lanke
Dear All,

I am new to R. I amazed by this software. I have a question regarding
boxplot.

I have three columns say X, Y,ID (X is time from 0 to 12 hrs, Y is a
variable dependent on X) I can plot a simple boxplot if it is just one
group. But I have 10 groups and I want to plot all of them in one graphs.
Something like multiple boxplots in one graph.

I am not in R mailing list please reply to me directly.

Thank you very much for your help.

-- 
Regards,
Shankar Lanke Ph.D.
University at Buffalo
Office # 716-645-4853
Fax # 716-645-2886
Cell # 678-232-3567

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] A problem about realized garch model

2011-02-26 Thread 張宏豪
Hi, I am trying to write the Realized GARCH model with order (1,1)

The model can be describe bellow:

r_t = sqrt( h_t) * z_t

logh_t = w + b*logh_(t-1) + r*logx_(t-1)

logx_t = c + q*logh_t + t1*z_t +t2*(z_t ^2 -1) + u_t

and z follow N(0,1) , u follow N(0, sigma.u^2)

But I'm troubled with the simulation check for my code.

After I simulate data from the model and estimate the data,

I can't get precise estimation for my setting parameters.

This is my simulation code:
===
sim-function(theta)
{
 omega = theta[1]
 bet = theta[2]
 gam = theta[3]
 xi = theta[4]
 phi = theta[5]
 tau1 = theta[6]
 tau2 = theta[7]
 sigma.u = theta[8]
 n = theta[9]
 n.warm = 500
 n = n+n.warm
 z = rnorm(n+1)
 u = rnorm((n+1),0,sigma.u)
 logh.pre = 1
 logx.pre = xi+phi*logh.pre+tau1*z[1]+tau2*(z[1]^2-1)+u[1]
 logh = c(logh.pre,rep(0,n))
 r = c(rep(0,(n+1)))
 logx = c(logx.pre,rep(0,n))
 for(i in 2:(n+1))
 {
  logh[i] = omega+bet*logh[(i-1)]+gam*logx[(i-1)]
  logx[i] = xi+phi*logh[i]+tau1*z[i]+tau2*(z[i]^2-1)+u[i]
  r[i] = sqrt(exp(logh[i]))*z[i]
 }
 gdata = rbind(r,exp(logx))
 ans = gdata[,-(1:(n.warm+1))]
 ans
}
===

This is my estimation code:

 LLH-function(data,theta)
 {
  r = data[1,]
  x = data[2,]
  n = dim(data)[2]
  logx = log(x)
  logh = rep(0,n)
  u = rep(0,n)
  garchDist = function(r, hh) { dnorm(x = r/hh)/hh }
  measureDist = function(u,sigma){dnorm(x = u/sigma)/sigma}
  omega = theta[1]
  bet = theta[2]
  gam = theta[3]
  xi = theta[4]
  phi = theta[5]
  tau1 = theta[6]
  tau2 = theta[7]
  sigma.u = theta[8]
  logh.pre = 0.1
  logx.pre = 0.1
  logh[1]-omega+bet*logh.pre+gam*logx.pre

u[1]-logx[1]-xi-phi*logh[1]-tau1*(r[1]/sqrt(exp(logh[1])))-tau2*((r[1]/sqrt(exp(logh[1])))^2-1)
  for(i in 2:n)
  {
   logh[i] = omega+bet*logh[(i-1)]+gam*logx[(i-1)]
   u[i] =
logx[i]-xi-phi*logh[i]-tau1*(r[i]/sqrt(exp(logh[i])))-tau2*((r[i]/sqrt(exp(logh[i])))^2-1)
  }
  h = exp(logh)
  hh = sqrt(h)
  llh = -sum(log(garchDist(r, hh)))-sum(log(measureDist(u,sigma.u)))
  llh
 }

fitting-function(data)
{
 LLH1-function(theta) {LLH(data,theta)}
 ini = c(0.1,0.6,0.3,0.07,1,-0.03,0.1,0.4)
 for(i in 1:10)
 {
  fit = optim(ini,LLH1,control=list(trace=2),hessian=T)
  ini = fit$par
 }
 hessian = fit$hessian
 std = sqrt(diag(solve(hessian)))
 A = rbind(fit$par,std)
 A
}
=

Does anyone can point out that where I am wrong?  Thank you very much!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Autocorrelation Rho Greater Than One

2011-02-26 Thread Hock Ann Lim
Dear R Users,

Kindly advice me what's wrong in my programming.

I'm using the Cochrane-Orcutt two stage procedure with Prais Wisten 
transformation, below is my R programming :

Y-c(60.8,62.5,64.6,66.1,67.7,69.1,71.7,73.5,76.2,77.3,78.8,80.2,82.6,84.3,83.3,84.1,86.4,87.6,89.1,89.3,89.1,
,
+ 
89.3,90.4,90.3,90.7,92.0,94.9,95.2,96.5,95.0,96.2,97.4,100.0,99.7,99.0,98.7,99.4,100.5,105.2,108.0,112.0,113.5,

+ 115.7,117.7,119.0,120.2)
 
X-c(48.9,50.6,52.9,55.0,56.8,58.8,61.2,62.5,64.7,65.0,66.3,69.0,71.2,73.4,72.3,74.8,77.1,78.5,79.3,79.3,79.2,
,
+ 
80.8,80.1,83.0,85.2,87.1,89.7,90.1,91.5,92.4,94.4,95.9,100.0,100.4,101.3,101.5,104.5,106.5,109.5,112.8,116.1,

+ 119.1,124.0,128.7,132.7,135.7)
 model-lm(Y~X)
 e-resid(model)
 mylag-function(e,d=1) { 
+   n-length(e) 
+   c(rep(NA,d),e)[1:n] 
+ }
 n-length(e)
 e1-mylag(e)
 modele-lm(e~e1-1)
 rho-coef(modele)
 rho
   e1 
0.8875926 
 n-length(e)
 xstar-c(X[1]*(1-rho^2)^0.5,X[2:n]-rho*X[1:(n-1)])
 ystar-c(Y[1]*(1-rho^2)^0.5,Y[2:n]-rho*Y[1:(n-1)])
 modelb-lm(ystar~xstar)
 bstar-coef(modelb)
 a-(bstar[[1]][[1]])/(1-rho)
 a[1:n]-a
 b-bstar[[2]][[1]]
 u-Y-(a+X*b)
 u-u
 myu-function(u,d=1) { 
+   n-length(u) 
+   c(rep(NA,d),u)[1:n] 
+ }
 u1-myu(u)
 modelu-lm(u~u1-1)
 Rho-coef(modelu)
 Rho
  u1 
1.029970 

 The answer should be less than one but I got 1.029970. Any correction in the 
programming part? Any preventive action on this matter?

Thank you.

Regards,
Lim Hock Ann



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Boxplot for X Vs Y variable grouped by ID

2011-02-26 Thread David Winsemius


On Feb 26, 2011, at 1:07 AM, Shankar Lanke wrote:


Dear All,

I am new to R. I amazed by this software. I have a question regarding
boxplot.

I have three columns say X, Y,ID (X is time from 0 to 12 hrs, Y is a
variable dependent on X) I can plot a simple boxplot if it is just one
group. But I have 10 groups and I want to plot all of them in one  
graphs.

Something like multiple boxplots in one graph.

I am not in R mailing list please reply to me directly.


Work through the examples on the boxplot help page.

--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with RODBC sqlSave

2011-02-26 Thread Lars Bishop
 Hi,

I'm able to establish a successful odbc connection using RODBC 1.3-2 on Win
7 and R 2.12.0. But I'm getting the following error message when I try to
save a data frame into the debase as shown below.

library(RODBC)
bbdb - odbcConnect(bbdb)
odbcGetInfo(bbdb) # returns ok
sqlSave(bbdb, USArrests, rownames = state, addPK=TRUE) #  example from the
RODBC manual

Error in sqlSave(bbdb, USArrests, rownames = state, addPK = TRUE) :
  [RODBC] ERROR: Could not SQLExecDirect 'CREATE TABLE USArrests  (state
varchar(255) NOT NULL PRIMARY KEY, Murder double, Assault integer,
UrbanPop integer, Rape double)'

Thanks in advance,
Lars.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sourcing a linux file with system

2011-02-26 Thread james.foadi
Dear R community,
I would like to source a file in my linux system to set a few environment 
variables.
I have tried:

 system(source /home/james/build//ccp4-6.1.13/include/ccp4.setup)

and got:

sh: source: not found

In fact, using Sys.which(source) I get an empty string.
How can I source that ccp4.setup file from within an R session?

J

Dr James Foadi PhD
Membrane Protein Laboratory (MPL)
Diamond Light Source Ltd
Diamond House
Harewell Science and Innovation Campus
Chilton, Didcot
Oxfordshire OX11 0DE

Email:  james.fo...@diamond.ac.uk
Alt Email:  j.fo...@imperial.ac.uk

Personal web page: http://www.jfoadi.me.uk


-- 
This e-mail and any attachments may contain confidential...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R in different OS

2011-02-26 Thread David Scott
Not sure exactly what the  original poster was after, but for 
distinguishing when I am working on different machines with different 
OS, I use something like this:


### Set some state variables
opSys - Sys.info()[sysname]
if (opSys == Windows){
  linux - FALSE
} else {
  linux - TRUE
}

David Scott

On 26/02/2011 10:00 a.m., Ista Zahn wrote:

Hi,

see ?R.version

Something like
if(version$os == mingw32) {
path = /ABC} else {
path = /DEF
}

might do it, but I'm not sure exactly what possible values version$os
can take or what determines the value exactly.

Best,
Ista


On Fri, Feb 25, 2011 at 1:23 PM, Hui Duhui...@dataventures.com  wrote:

Hi All,

I have two Rs, one has been installed in Windows system and 
another one has been installed under UNIX system. Is there any environmental 
variable or function to tell me which R I am using? The reason that I need to 
know it is under different system, the data path could be different. I want to 
do something like

if it is R under Windows

path = /ABC
else if it is R under UNIX,
path = /DEF

Any idea? Thanks.

Best Regards,

HXD

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.








--
_
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sourcing a linux file with system

2011-02-26 Thread Prof Brian Ripley
And what do you think 'source' does?  Or system() does (check the help 
page)?


Hint: source is not an external command on Linux, but a builtin 
command of some shells (e.g. csh, bash), but not of sh -- in sh the 
equivalent is '.'.


On Sat, 26 Feb 2011, james.fo...@diamond.ac.uk wrote:


Dear R community,


I would like to source a file in my linux system to set a few 
environment variables.


Hmm, how is that going to work?  That will set environment variables 
in the shell launched by system(), not for the R process.  The only 
way to set environment variables for the running R process is to call 
Sys.setenv.



I have tried:


system(source /home/james/build//ccp4-6.1.13/include/ccp4.setup)


and got:

sh: source: not found

In fact, using Sys.which(source) I get an empty string.
How can I source that ccp4.setup file from within an R session?

J

Dr James Foadi PhD
Membrane Protein Laboratory (MPL)
Diamond Light Source Ltd
Diamond House
Harewell Science and Innovation Campus
Chilton, Didcot
Oxfordshire OX11 0DE

Email:  james.fo...@diamond.ac.uk
Alt Email:  j.fo...@imperial.ac.uk

Personal web page: http://www.jfoadi.me.uk


--
This e-mail and any attachments may contain confidential...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R in different OS

2011-02-26 Thread Prof Brian Ripley
It is less clear what you are after, but the canonical way to decide 
if your R session is on Windows is


.Platform$OS.type == windows

Unlike {R.}version$os and Sys.info()[sysname], the set of values 
here is known and documented.  As ?R.version does say:


 Do _not_ use ‘R.version$os’ to test the platform the code is
 running on: use ‘.Platform$OS.type’ instead.  Slightly different
 versions of the OS may report different values of ‘R.version$os’,
 as may different versions of R.


On Sun, 27 Feb 2011, David Scott wrote:

Not sure exactly what the  original poster was after, but for distinguishing 
when I am working on different machines with different OS, I use something 
like this:


### Set some state variables
opSys - Sys.info()[sysname]
if (opSys == Windows){
 linux - FALSE
} else {
 linux - TRUE
}

David Scott

On 26/02/2011 10:00 a.m., Ista Zahn wrote:

Hi,

see ?R.version

Something like
if(version$os == mingw32) {
path = /ABC} else {
path = /DEF
}

might do it, but I'm not sure exactly what possible values version$os
can take or what determines the value exactly.

Best,
Ista


On Fri, Feb 25, 2011 at 1:23 PM, Hui Duhui...@dataventures.com  wrote:

Hi All,

I have two Rs, one has been installed in Windows system 
and another one has been installed under UNIX system. Is there any 
environmental variable or function to tell me which R I am using? The 
reason that I need to know it is under different system, the data path 
could be different. I want to do something like


if it is R under Windows

path = /ABC
else if it is R under UNIX,
path = /DEF

Any idea? Thanks.

Best Regards,

HXD

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.








--
_
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding pairs with least magnitude difference from mean

2011-02-26 Thread Hans W Borchers
 I have what I think is some kind of linear programming question.
 Basically, what I want to figure out is if I have a vector of numbers,
 
  x - rnorm(10)
  x
  [1] -0.44305959 -0.26707077  0.07121266  0.44123714 -1.10323616
 -0.19712807  0.20679494 -0.98629992  0.97191659 -0.77561593
 
  mean(x)
 [1] -0.2081249
 
 Using each number only once, I want to find the set of five pairs
 where the magnitude of the differences between the mean(x) and each
 pairs sum is least.
 
  y - outer(x, x, +) - (2 * mean(x))
 
 With this matrix, if I put together a combination of pairs which uses
 each number only once, the sum of the corresponding numbers is 0.
 
 For example, compare the SD between this set of 5 pairs
  sd(c(y[10,1], y[9,2], y[8,3], y[7,4], y[6,5]))
 [1] 1.007960
 
 versus this hand-selected, possibly lowest SD combination of pairs
  sd(c(y[3,1], y[6,2], y[10,4], y[9,5], y[8,7]))
 [1] 0.2367030

Your selection is not bad, as only about 0.4% of all possible distinct
combinations have a smaller value -- the minimum is 0.1770076, for example
[10 7 9 5 8 4 6 2 3 1].

(1) combinat() from the 'combinations' package seems slow, try instead the
permutations() function from 'e1071'. 

(2) Yes, except your vector is getting much larger in which case brute force
is no longer feasible.

(3) This is not a linear programming, but a combinatorial optimization task.
You could try optim() with the SANN method, or some mixed-integer linear
program (e.g., lpSolve, Rglpk, Rsymphony) by intelligently using binary
variables to define the sets.

This does not mean that some specialized approach might not be more
appropriate.

--Hans Werner

 I believe that if I could test all the various five pair combinations,
 the combination with the lowest SD of values from the table would give
 me my answer.  I believe I have 3 questions regarding my problem.
 
 1) How can I find all the 5 pair combinations of my 10 numbers so that
 I can perform a brute force test of each set of combinations?  I
 believe there are 45 different pairs (i.e. choose(10,2)). I found
 combinations from the {Combinations} package but I can't figure out
 how to get it to provide pairs.
 
 2) Will my brute force strategy of testing the SD of each of these 5
 pair combinations actually give me the answer I'm searching for?
 
 3) Is there a better way of doing this?  Probably something to do with
 real linear programming, rather than this method I've concocted.
 
 Thanks for any help you can provide regarding my question.
 
 Best regards,
 
 James


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R in different OS

2011-02-26 Thread David Scott

Thanks Brian, I stand corrected.

David Scott

On 27/02/2011 12:32 a.m., Prof Brian Ripley wrote:

It is less clear what you are after, but the canonical way to decide
if your R session is on Windows is

.Platform$OS.type == windows

Unlike {R.}version$os and Sys.info()[sysname], the set of values
here is known and documented.  As ?R.version does say:

   Do _not_ use ‘R.version$os’ to test the platform the code is
   running on: use ‘.Platform$OS.type’ instead.  Slightly different
   versions of the OS may report different values of ‘R.version$os’,
   as may different versions of R.


On Sun, 27 Feb 2011, David Scott wrote:


Not sure exactly what the  original poster was after, but for distinguishing
when I am working on different machines with different OS, I use something
like this:

### Set some state variables
opSys- Sys.info()[sysname]
if (opSys == Windows){
  linux- FALSE
} else {
  linux- TRUE
}

David Scott

On 26/02/2011 10:00 a.m., Ista Zahn wrote:

Hi,

see ?R.version

Something like
if(version$os == mingw32) {
 path = /ABC} else {
 path = /DEF
}

might do it, but I'm not sure exactly what possible values version$os
can take or what determines the value exactly.

Best,
Ista


On Fri, Feb 25, 2011 at 1:23 PM, Hui Duhui...@dataventures.com   wrote:

Hi All,

 I have two Rs, one has been installed in Windows system
and another one has been installed under UNIX system. Is there any
environmental variable or function to tell me which R I am using? The
reason that I need to know it is under different system, the data path
could be different. I want to do something like

if it is R under Windows

 path = /ABC
else if it is R under UNIX,
 path = /DEF

Any idea? Thanks.

Best Regards,

HXD

 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.








--
_
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






--
_
David Scott Department of Statistics
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 923 5055, or +64 9 373 7599 ext 85055
Email:  d.sc...@auckland.ac.nz,  Fax: +64 9 373 7018

Director of Consulting, Department of Statistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] invalid type (list) for variable

2011-02-26 Thread Michael Hecker
Hi,
I have a problem with an invalid variable type.

My sample dataset is looking like this:
  str(factivaTDM.train)
'data.frame':   4 obs. of  14 variables:
  $ access: num  2 3 2 1
  $ billion   : num  2 1 2 2
  $ compani   : num  7 6 1 2
  $ data  : num  6 2 3 4
  $ inc   : num  5 3 3 2
  $ local : num  3 2 1 1
  $ mcleodusa : num  15 6 2 10
  $ network   : num  9 1 2 1
  $ provid: num  4 3 1 2
  $ servic: num  11 4 3 6
  $ share : num  9 5 3 5
  $ splitrock : num  18 6 4 5
  $ stock : num  3 3 1 2
  $ classifiedPositive:List of 4
   ..$ : chr yes
   ..$ : chr yes
   ..$ : chr yes
   ..$ : chr yes
   ..- attr(*, class)= chr AsIs

and the command:
NaiveBayes.model - NB(classifiedPositive ~ ., data = factivaTDM.train)
generates the error:
invalid type (list) for variable 'classifiedPositive'

Here is another dataset which causes no errors:
  str(weatherNominal.train)
'data.frame':   13 obs. of  5 variables:
  $ outlook: Factor w/ 3 levels overcast,rainy,..: 3 3 1 2 2 2 1 
3 3 2 ...
  $ temperature: Factor w/ 3 levels cool,hot,mild: 2 2 2 3 1 1 1 3 
1 3 ...
  $ humidity   : Factor w/ 2 levels high,normal: 1 1 1 1 2 2 2 1 2 2 ...
  $ windy  : logi  FALSE TRUE FALSE FALSE FALSE TRUE ...
  $ play   : Factor w/ 3 levels ?,no,yes: 2 2 3 3 3 2 3 2 3 3 ...

So I suppose, I have to convert classifiedPositive:List to 
classifiedPositive:Factor
How can I do that?

I will send you more information about my script if you need it.

Thank you very much,
Michael


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Pulling p values

2011-02-26 Thread Gallavan, Robert
Hi folks,

I'm doing ANOVA with multiple comparisons.  I've used both the TukeyHSD  
function and the multcomp procedure.  In both cases, I get some tantalizing 
results such as this...

e = aov(lm(d.all[,(n+4)] ~ d.all[,4])

TukeyHSD(e)

  Tukey multiple comparisons of means
95% family-wise confidence level

Fit: aov(formula = lm(d.all[, (n + 4)] ~ d.all[, 4], contrast = C))

$`d.all[, 4]`
 diff   lwr   upr   
 p adj
10 mpk-0 mpk-0.677375 -1.1779417 -0.1768083   
0.0071092
30 mpk-0 mpk   -1.074500 -1.5750667 -0.5739333
0.656
30 mpk-10 mpk -0.397125 -0.8976917  0.1034417
0.1369760

I ask for the pvalues and get...

i = h$pvalues
i
   10 mpk - 0 mpk  30 mpk - 0 mpk 30 mpk - 10 mpk
   2.630022e-03   2.288575e-05   
5.863517e-02

It looks like a matrix, but it is a vector of length 3 and the third element, 
for example, is

 length(i)
[1] 3
 q = i[3]
 q
30 mpk - 10 mpk
 0.05863517

The contrast label is bound to the p-value as a single element.

Does anyone know how to extract just the p-value from this procedure?  Or a 
simple way to generate these comparisons?

Thanks,

Bob Gallavan
An obvious newbie


--This message is the property of i3Global and may 
include confidential and/or proprietary information, and may be used only by 
the person or entity to which it is addressed. If the reader of this e-mail is 
not the intended recipient or his or her authorized agent, the reader is hereby 
notified that any dissemination, distribution or copying of this e-mail is 
prohibited. If you have received this e-mail in error, please notify the sender 
by replying to this message and delete this e-mail 
immediately.==

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sourcing a linux file with system

2011-02-26 Thread David Kirkby
On 26 February 2011 11:00,  james.fo...@diamond.ac.uk wrote:
 Dear R community,
 I would like to source a file in my linux system to set a few environment 
 variables.
 I have tried:

 system(source /home/james/build//ccp4-6.1.13/include/ccp4.setup)

 and got:

 sh: source: not found

Someone has been writing unportable code. Instead change it to

system(. /home/james/build//ccp4-6.1.13/include/ccp4.setup)

A dot will source a file, in a manner which avoids the need of a
command that is not defined by POSIX.

Dave

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Pulling p values

2011-02-26 Thread Peter Ehlers

On 2011-02-26 04:12, Gallavan, Robert wrote:

Hi folks,

I'm doing ANOVA with multiple comparisons.  I've used both the TukeyHSD  
function and the multcomp procedure.  In both cases, I get some tantalizing 
results such as this...

e = aov(lm(d.all[,(n+4)] ~ d.all[,4])

TukeyHSD(e)

   Tukey multiple comparisons of means
 95% family-wise confidence level

Fit: aov(formula = lm(d.all[, (n + 4)] ~ d.all[, 4], contrast = C))

$`d.all[, 4]`
  diff   lwr   upr  
  p adj
10 mpk-0 mpk-0.677375 -1.1779417 -0.1768083   
0.0071092
30 mpk-0 mpk   -1.074500 -1.5750667 -0.5739333
0.656
30 mpk-10 mpk -0.397125 -0.8976917  0.1034417
0.1369760

I ask for the pvalues and get...

i = h$pvalues
i
10 mpk - 0 mpk  30 mpk - 0 mpk 30 mpk - 10 mpk
2.630022e-03   2.288575e-05   
5.863517e-02

It looks like a matrix, but it is a vector of length 3 and the third element, 
for example, is


length(i)

[1] 3

q = i[3]
q

30 mpk - 10 mpk
  0.05863517

The contrast label is bound to the p-value as a single element.


Well, 'bound to' just means that you have a vector with names.
So you can just unname() it.

But do let us in on your secret: what's 'h'?



Does anyone know how to extract just the p-value from this procedure?  Or a 
simple way to generate these comparisons?

Thanks,

Bob Gallavan
An obvious newbie


Newbies need to peruse the posting guide;
In particular, note the comment about HTML mail.

Peter Ehlers




--This message is the property of i3Global and may 
include confidential and/or proprietary information, and may be used only by the person 
or entity to which it is addressed. If the reader of this e-mail is not the intended 
recipient or his or her authorized agent, the reader is hereby notified that any 
dissemination, distribution or copying of this e-mail is prohibited. If you have received 
this e-mail in error, please notify the sender by replying to this message and delete 
this e-mail immediately.==

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using uniroot() with output from deriv() or D()

2011-02-26 Thread Hans W Borchers
 I need to find the root of the second derivative of many curves and do not
 want to cut and paste the expression results from the deriv() or D()
 functions every time.  Below is an example.  What I need to do is refer to
 fn2nd in the uniroot() function, but when I try something like
 uniroot(fn2nd,c(0,1)) I get an error, so I have resorted to pasting in the
 expression, but this is highly inefficient.
 
 Thanks,  J
 
 [...]

What is so wrong with using

r - uniroot(function(x) eval(fn2nd, list(x=x)),
interval=c(0, 1), tol=0.0001)

(I thought you were almost there) or even

fn2nd_fun - function(x) eval(fn2nd, list(x=x))
ex - seq(from=0, to=1, length.out = 1000)
y1 - fn2nd_fun(ex)
...
r - uniroot(fn2nd_fun, interval=c(0, 1), tol=0.0001)

--Hans Werner

 r$root
 abline(h=0, col = red)
 abline(v=r$root, col = green)
 arrows(0.6, -2, r$root,0, length = 0.1, angle = 30, code = 2, col = red)
 text(0.765,-2.3,paste(b = ,r$root,sep=))


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] BFGS versus L-BFGS-B

2011-02-26 Thread Prof. John C Nash
Bert has put a finger on one of the key questions: Are we there yet?

Global optima are very difficult to find in many, possibly most problems.
We can check for local optima. In package optimx (currently resetting it on 
CRAN due to a
variable naming glitch with Rvmmin -- should be there in a couple of days) we 
carry out
the Kuhn Karush Tucker tests, but even these have scale dependencies. On 
R-forge there's a
new package under our optimization and solving packages project called kktc 
that separates
off the tests. However, that's only a start.

Constraints require us to do more work, too.

A couple of very large scale problems have been noted. The minimization of 
energy for
collections of atoms or molecules was suggested, but my experience with that 
type of
problem is that it has a huge number of local minima and is not amenable to the 
techniques
that the original poster had in mind, and certainly optim(BFGS) nor Rcgmin nor
optim(L-BFGS-B) are appropriate as they stand, though they may be helpful in 
some sort of
polyalgorithm.

ALso I did not post, but mentioned in private emails, that L-BFGS-B and Rcgmin 
should
offer much lower memory demands for problems in large numbers of parameters than
optim(BFGS) which has an explicit inverse Hessian approximation i.e., n*n 
doubles to
store. BFGS and L-BFGS-B have very different internals.

JN


On 02/25/2011 08:46 PM, Bert Gunter wrote:
 Thanks to all for clarifications. So I'm off base, but whether waaay
 off base depends on whether there is a reasonably well defined optimum
 to converge to. Which begs the question, I suppose: How does one know
 whether one has converged to such an optimum?  This is always an
 issue, of course, even for a few parameters; but maybe more so with so
 many.
 
 -- Bert
 
 On Fri, Feb 25, 2011 at 3:35 PM, Ravi Varadhan rvarad...@jhmi.edu wrote:
 I have worked on a 2D image reconstruction problem in PET (positron emission
 tomography) using a Poisson model.  Here, each pixel intensity is an unknown
 parameter.  I have solved problems of size 128 x 128 using an accelerated EM
 algorithm.  Ken Lange has shown that you can achieve term by term separation
 using a minorization inequality, and hence the problem simplifies greatly.

 Ravi.

 ---
 Ravi Varadhan, Ph.D.
 Assistant Professor,
 Division of Geriatric Medicine and Gerontology School of Medicine Johns
 Hopkins University

 Ph. (410) 502-2619
 email: rvarad...@jhmi.edu

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of Prof. John C Nash
 Sent: Friday, February 25, 2011 5:55 PM
 To: Bert Gunter
 Cc: r-help@r-project.org
 Subject: Re: [R] BFGS versus L-BFGS-B

 For functions that have a reasonable structure i.e., 1 or at most a few
 optima, it is
 certainly a sensible task. Separable functions are certainly nicer (10K 1D
 minimizations),
 but it is pretty easy to devise functions e.g., generalizations of
 Rosenbrock, Chebyquad
 and other functions that are high dimension but not separable.

 Admittedly, there are not a lot of real-world examples that are publicly
 available. More
 would be useful.

 JN


 On 02/25/2011 05:06 PM, Bert Gunter wrote:
 On Fri, Feb 25, 2011 at 12:00 PM, Brian Tsai btsa...@gmail.com wrote:
 Hi John,

 Thanks so much for the informative reply!  I'm currently trying to
 optimize
 ~10,000 parameters simultaneously - for some reason,

 -- Some expert (Ravi, John ?) please correct me, but: Is the above not
 complete nonsense? I can't imagine poking around usefully  in 10K
 dimensional space for an extremum unless maybe one can find the
 extremum by 10K separate 1-dim optimizations. And maybe not then
 either.

 Am I way offbase here, or has Brian merely described just another
 inefficient way to produce random numbers?

 -- Bert

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] tansformation of variables in cph from rms package

2011-02-26 Thread Frank Harrell
Yao,

I wonder how likely it is that log transformations fit the data.  More often
I find that the flexibility of restricted cubic splines is needed.

One of the things that people take for granted when using log() is that they
are assuming that k=0 in making the transformation log(x + k).  It is often
the case the the proper origin is not zero.  This also impacts the problem
you encountered.

Besides all that, you can specify limits to predictors to nomogram() in a
variety of ways as listed in the help file.  Try that.
Frank

-
Frank Harrell
Department of Biostatistics, Vanderbilt University
-- 
View this message in context: 
http://r.789695.n4.nabble.com/tansformation-of-variables-in-cph-from-rms-package-tp3325545p3325911.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using uniroot() with output from deriv() or D()

2011-02-26 Thread Gabor Grothendieck
On Sat, Feb 26, 2011 at 2:16 AM, jjheath heath_jer...@hotmail.com wrote:
 I need to find the root of the second derivative of many curves and do not
 want
 to cut and paste the expression results from the deriv() or D() functions
 every time.  Below is an
 example.  What I need to do is refer to fn2nd in the uniroot() function,
 but when I
 try something like uniroot(fn2nd,c(0,1)) I get an error, so I have resorted
 to pasting
 in the expression, but this is highly inefficient.

 Thanks,  J

 y - c(9.9,10,10,9.5,9,6,3,1,0,0,0)
 b - seq(0,1,by=0.1)
 dat - data.frame(y = y, b = b)
 plot(y ~ b, xlim = c(0,1), ylim= c(-12,12))
 fn - nls(y ~ asym/(1 + exp(p * b - q)),data = dat, list(asym=10,p=16,q=7),
 trace=T)
 curve(10.001/(1 + exp(14.094 * x - 7.551)), from = 0, to = 1, add = T)
 fn2 - expression(10.001/(1 + exp(14.094 * x - 7.551)))
 fn2nd - D(D(fn2, x), x)
 ex - seq.int(from=0, to=1, length.out = 1000)
 y1 - eval(fn2nd, envir = list(x = ex), enclos = parent.frame())
 lines(ex, y1, type = l)
 r - uniroot(function(x) -(10.001 * (exp(14.094 * x - 7.551) * 14.094 *
 14.094)/(1 + exp(14.094 *
    x - 7.551))^2 - 10.001 * (exp(14.094 * x - 7.551) * 14.094) *
    (2 * (exp(14.094 * x - 7.551) * 14.094 * (1 + exp(14.094 *
        x - 7.551/((1 + exp(14.094 * x -
 7.551))^2)^2),interval=c(0,1),tol = 0.0001)
 r$root
 abline(h=0, col = red)
 abline(v=r$root, col = green)
 arrows(0.6, -2, r$root,0, length = 0.1, angle = 30, code = 2, col = red)
 text(0.765,-2.3,paste(b = ,r$root,sep=))

Try this:

f - function(x) {}
body(f) - fn2nd

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] unable to access index for repository

2011-02-26 Thread Duncan Murdoch

On 11-02-25 6:15 PM, Kalicin, Sarah wrote:

Hi,
I have two questions:


1)  Since I switched to Windows 2007 and downloaded the current R version 
(2.12.1; 2010-12-16) for Windows 7 a month ago, I cannot down load packages 
through the GUI drop down menu. I get: Warning: unable to access index for 
repository http://cran.cnr.Berkeley.edu/bin/windows/contrib/2.12

Warning: unable to access index for repository 
http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.12

Warning messages:

1: In open.connection(con, r) :

   unable to connect to 'cran.r-project.org' on port 80.

In fact, just to assign the CRAN location in the drop down menu takes forever. From web 
searches, the first check is to see if you have internet access. I do and can download 
the packages directly to my hard drive and load them indirectly into R by using 
library(MASS, lib.loc=C:\\Users\\R\\R packages). This is starting to become a 
pain. Any suggestions on looking at potential security settings that I might need to 
deactiviate or any other issues?


You've probably got a firewall or proxy getting in the way.  You could 
re-install R with the internet2 option, or run setInternet2(TRUE) in a 
session to fix this.  (Or maybe you already did that, in which case 
setInternet2(FALSE) will turn it off.)





2)  Any suggestions how I can connect to a MySQL database through windows 
using R. I found this example on the web, but there is not a package for RMYSQL 
for windows. Bugger!!!

library(RMySQL)

drv = dbDriver(MySQL)

con = dbConnect(drv,host=mars,dbname=sat133,user=sat133,pass=T0pSecr3t)

album = dbGetQuery(con,statement=select * from tasks)


Install an ODBC driver for MySQL, and use RODBC.

Duncan Murdoch







Thanks for any input!!

Sarah

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Forced inclusion of varaibles in validate command as well as step

2011-02-26 Thread Frank Harrell
Jon,

Version 3.3-0 of rms will be released within 2-3 days.  It has a new option
force for fastbw, validate, calibrate.  force is an integer vector of the
parameter numbers to force into every model.  It is meant to work with
type='individual' and its performance with type='residual' needs to be
studied (I doubt if it works as you want).  Suppose you have a Cox model
(which does not have an intercept to include in the sequential numbering for
force) like this:

f - cph(S ~ sex + pol(age,2) + rcs(height,4))

and you want to force age into every model.  You would specify force=2:3. 
Typing coef(f) will show you the sequential parameters in the model. 
Someday I'll extend this to force='age' but that's not in the current
version.  Note that fastbw always pools age and age^2 effects when judging
significance.

As always use stepwise methods at your own risk.  They are dangerous to your
health.

If you are using Linux I can send the new version by e-mail.

Frank

-
Frank Harrell
Department of Biostatistics, Vanderbilt University
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Forced-inclusion-of-varaibles-in-validate-command-as-well-as-step-tp3324901p3325922.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Hierarchical Power Analysis

2011-02-26 Thread Ben Bolker
Downey, Patrick PDowney at urban.org writes:

 
 Hello,
 
 A colleague is trying to do a fairly complicated power analysis for a
 project. The project would be evaluating random assignment to one of three
 conditions within each of 8 sites. The dependent variable would be binary
 (we do not care at this point whether it would be analyzed with logit or
 probit). 
 
 We can simulate the data in a bunch of iterations and do analyses in each
 iteration, but it seems like there might be an easier way. Are there any
 packages in R with the capability to do such power analyses?

  I vaguely recall someone setting up a package to do power analysis
of hierarchical designs, but ... there are unlikely to be any closed-form
solutions for the binary GLMM case (since there are so few closed-form
solutions even for the null distributions of the test statistics!) ... so
I think if you have a working solution with simulations that's probably
the best you will do.  I would try library(sos); findFn(power hierarchical)
or findFn(power mixed) just in case ...

  Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] MCMCpack combining chains

2011-02-26 Thread Uwe Ligges



On 24.02.2011 17:48, Alan Kelly wrote:

Deal all, as MCMClogit does not allow for the specification of several chains, 
I have run my model 3 times with different random number seeds and differently 
dispersed multivariate normal priors.
For example:
res1 = MCMClogit(y~x,b0=0,B0=0.001,data=mydat, burnin=500, mcmc=5500, 
seed=1234, thin=5)
res2 = MCMClogit(y~x,b0=1,B0=0.01,data=mydat, burnin=500, mcmc=5500, seed=5678, 
thin=5)
res3 = MCMClogit(y~x,b0=5,B0=0.0001,data=mydat, burnin=500, mcmc=5500, 
seed=91011, thin=5)

Each result produces an object of class mcmc.
In order to use the Gelman-Rubin diagnostic test via coda, I need to combine 
these 3 mcmc objects appropriately.  I thought that this would be possible using 
as.mcmc.list() as the function description says:
The function ‘mcmc.list’ is used to represent parallel runs of the same chain, 
with different starting values and random seeds. The list must be balanced: 
each chain in the list must have the same iterations and the same variables.
So I try the following:
res123=as.mcmc.list(res1, res2, res3)



Use mcmc.list() rather than as.mcmc.list(). The latter is for 
conversion, not a constructor from separate mcmc objects.


Uwe Ligges





class(res123)
[1] mcmc.list



Then I try gelman.diag() from coda -
gelman.diag(temp123)
Error in gelman.diag(temp123) : You need at least two chains

So, how may I combine the separate mcmc objects so that I have multiple chains?
Many thanks,
Alan Kelly




[[alternative HTML version deleted]]




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotmath: how to create a vector of expressions?

2011-02-26 Thread Peter Ehlers

On 2011-02-25 15:57, Marius Hofert wrote:

Dear expeRts,

I would like to use LaTeX-like symbols in keys via plotmath. Below is a minimal
example. I get:
Error in fun(key = list(x = 0.5, y = 0.8, text = 
list(list(expression(paste((,  :
   first component of text must be vector of labels
What's wrong? How can I create the required vector (of expressions)?

Cheers,

Marius

library(lattice)
val- matrix(1, nrow = 10, ncol = 3)
nplot- 10

labs- sapply(1:3, function(i) substitute(expression(paste((,i,), 
,l==len,mm,sep=)), list(len=2*i)))
myplot- xyplot(0~0,
  panel=function(...){
  for(i in 1:3){
  panel.xyplot(1,val[,i],type=l)
  
panel.text(2,val[nplot,i],label=paste((,i,),sep=))
  }
  },
  key = list(x=0.5,y=0.8,text=list(labs),
  align=TRUE,transparent=TRUE))
myplot


Try this:

  labs - vector(expression, 3)
  for(i in 1:3) labs[i] -
 substitute(expression(
  group((, k, )) * , ~~ l == len ~ italic(mm)
   ),
   list(len = 2*i, k=i)
 )[2]

The idea is to create an expression vector and then
fill its components appropriately. We need the second
component of the substitute call.
(I put in the 'italic' just for fun.)

Maybe there's a better way, but I don't know it.

Peter Ehlers

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Boxplot not doing what I think it should

2011-02-26 Thread Uwe Ligges



On 24.02.2011 19:15, Greg Snow wrote:

Look at ?quantile, especially the detail section on type and the second 
reference (and the see also section).  There are many different definitions of 
quantiles/quartiles and different functions use different versions.




Particularly ?boxplots points to ?boxplot.stats  and that points to 
Tukey's original ?fivenum.


Uwe Ligges





-Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Lewis Berman
 Sent: Thursday, February 24, 2011 8:38 AM
 To: r-help@R-project.org
 Subject: [R] Boxplot not doing what I think it should

 My box plot below is drawing its upper whisker all the way to the last
 point, instead of showing the point as an outlier. Am I
 misunderstanding, or is it a bug?

 Help(boxplot) states for the parameter range that this determines
 how far the plot whiskers extend out from the box. If range is
 positive, the whiskers extend to the most extreme data point which is
 no more than range times the interquartile range from the box. A value
 of zero causes the whiskers to extend to the data extremes.

 To my understanding, the code below, which drew the boxplot, should
 have drawn the right-hand whisker only to 387 and shown 496 as an
 outlier point. But it didn't. Using a range of 1.3 does. Horizontal
 versus vertical does not matter.

 Also, the right edge of the box does not show the same 3rd Quartile as
 does summary. Explain, please.



  sort(t5g24times)

 [1] 111 135 201 234 283 283 284 285 300 370 387 496

  summary(t5g24times)

Min. 1st Qu.  MedianMean 3rd Qu.Max.
   111.0   225.8   283.5   280.8   317.5   496.0

  iqr - 317.5 - 225.8
  iqr

 [1] 91.7

  317.5 + 1.5 * iqr

 [1] 455.05

  pdf()
  boxplot(c(t5g24times), horizontal=1, range=1.5)



 R version info:
 platform   x86_64-apple-darwin9.8.0
 arch   x86_64
 os darwin9.8.0
 system x86_64, darwin9.8.0
 status
 major  2
 minor  12.0
 year   2010
 month  10
 day15
 svn rev53317
 language   R
 version.string R version 2.12.0 (2010-10-15)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Wired behavior of a 2-by-2 matrix indicies

2011-02-26 Thread Feng Li
Dear R,

I found a very wired behavior for a 2-by-2 matrix, see this example

 A - matrix(1:4, 2)
 idx4A - matrix(1:4, 2)
 A[idx4A]
Error in A[idx4A] : subscript out of bounds

But other matrices are fine,

 B - matrix(1:9, 3)
 idx4B - matrix(1:9, 3)
 B[idx4B]
[1] 1 2 3 4 5 6 7 8 9


I can reproduce this for both 32bit windows and 64bit linux with R 2.12.1
and some earlier versions.

Can someone tell me whether this a bug or a feature or something I don't
know?

Thanks!

Feng

-- 
Feng Li
Department of Statistics
Stockholm University
106 91 Stockholm, Sweden
http://feng.li/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interactive/Dynamic plots with R

2011-02-26 Thread Greg Snow
There is an rggobi package that makes transferring data between R and ggobi 
easy.  But I would probably start with the playwith package which I think will 
do most of what you want.  Some other options are the iplots package, rpanel 
package, the fgui package and the tkexamp function in the TeachingDemos package 
(the last requires the most work on your part, but gives the most control).

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: Abhishek Pratap [mailto:abhishek@gmail.com]
 Sent: Saturday, February 26, 2011 1:09 AM
 To: Greg Snow
 Cc: r-help@r-project.org
 Subject: Re: [R] Interactive/Dynamic plots with R
 
 Hi Greg
 
 Ability to zoom in/out on x/y axis to begin with and then
 adding/removing data sets.
 
 Thanks!
 -Abhi
 
 
 
 On Fri, Feb 25, 2011 at 12:52 PM, Greg Snow greg.s...@imail.org
 wrote:
  What types of interaction do you want?
 
  --
  Gregory (Greg) L. Snow Ph.D.
  Statistical Data Center
  Intermountain Healthcare
  greg.s...@imail.org
  801.408.8111
 
 
  -Original Message-
  From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
  project.org] On Behalf Of Abhishek Pratap
  Sent: Friday, February 25, 2011 12:37 AM
  To: r-help@r-project.org
  Subject: [R] Interactive/Dynamic plots with R
 
  Hi Guys
 
  In order to look at a dense plot I would like to have the capability
  to plot dynamic/interactive. Before I try rgobi which I heard can
 help
  me; I would like to take your opinion.
 
  Thanks!
  -Abhi
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-
  guide.html
  and provide commented, minimal, self-contained, reproducible code.
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Wired behavior of a 2-by-2 matrix indicies

2011-02-26 Thread Sarah Goslee
Hello,

On Sat, Feb 26, 2011 at 12:11 PM, Feng Li m...@feng.li wrote:
 Dear R,

 I found a very wired behavior for a 2-by-2 matrix, see this example

 A - matrix(1:4, 2)
 idx4A - matrix(1:4, 2)
 A[idx4A]
 Error in A[idx4A] : subscript out of bounds

This shouldn't work. From the help for [,

  When indexing arrays by ‘[’ a single argument ‘i’ can be a
  matrix with as many columns as there are dimensions of ‘x’;
  the result is then a vector with elements corresponding to
  the sets of indices in each row of ‘i’.

Since there are two dimensions to A, and two rows to idx4A, then this
form is used.
The first row of idx4A is c(1, 3) so the first value extracted is
A[1, 3]
which doesn't exist, thus the error message.

 But other matrices are fine,

 B - matrix(1:9, 3)
 idx4B - matrix(1:9, 3)
 B[idx4B]
 [1] 1 2 3 4 5 6 7 8 9

The part that surprises me is that this works.
Apparently there is an implicit conversion to vector here because
dim(B) != ncol(idx4B)

What you seem to want is
A[as.vector(idx4A)]
and
B[as.vector(idx4B)]

but you neglected to tell us what you expected the result to be.

Sarah
-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Weird behavior of a 2-by-2 matrix indicies

2011-02-26 Thread Feng Li
The real weird thing is I wrote weird and it comes out wired...

The background is that I did a lot of sparse matrices calculations.
Therefore need some indices tweaks, e.g subtract a block from a big matrix.
Then I just create an indicies matrix and use it directly as a vector since
a matrix is just a vector with dimensions attributed in R. I know I can
convert them to a vector but I did not do it anyway.

Thanks for this information!

Feng

On Sat, Feb 26, 2011 at 6:29 PM, Sarah Goslee sarah.gos...@gmail.comwrote:

 Hello,

 On Sat, Feb 26, 2011 at 12:11 PM, Feng Li m...@feng.li wrote:
  Dear R,
 
  I found a very wired behavior for a 2-by-2 matrix, see this example
 
  A - matrix(1:4, 2)
  idx4A - matrix(1:4, 2)
  A[idx4A]
  Error in A[idx4A] : subscript out of bounds

 This shouldn't work. From the help for [,

  When indexing arrays by ‘[’ a single argument ‘i’ can be a
  matrix with as many columns as there are dimensions of ‘x’;
  the result is then a vector with elements corresponding to
  the sets of indices in each row of ‘i’.

 Since there are two dimensions to A, and two rows to idx4A, then this
 form is used.
 The first row of idx4A is c(1, 3) so the first value extracted is
 A[1, 3]
 which doesn't exist, thus the error message.

  But other matrices are fine,
 
  B - matrix(1:9, 3)
  idx4B - matrix(1:9, 3)
  B[idx4B]
  [1] 1 2 3 4 5 6 7 8 9

 The part that surprises me is that this works.
 Apparently there is an implicit conversion to vector here because
 dim(B) != ncol(idx4B)

 What you seem to want is
 A[as.vector(idx4A)]
 and
 B[as.vector(idx4B)]

 but you neglected to tell us what you expected the result to be.

 Sarah
 --
 Sarah Goslee
 http://www.functionaldiversity.org




-- 
Feng Li
Department of Statistics
Stockholm University
106 91 Stockholm, Sweden
http://feng.li/

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] select element from vector

2011-02-26 Thread Uwe Ligges



On 25.02.2011 12:37, zem wrote:


Hi Jessica,

try this: Q[k:c(k+3)]



1. You have written to the R-help mailing list rather than to Jessica. 
Please answer also directly: not all posters are on the mailing list.

2. Please quote the original question.

Uwe Ligges

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] J48 - contrasts can be applied only to factors with 2 or more levels

2011-02-26 Thread Uwe Ligges



On 24.02.2011 15:29, Patrick Skorupka wrote:

Hi everybody,

I recently sent out a first message using this tool and was astonished about 
the quick and helpful replys. So, thank you in advance for your help, I really 
appreciate that!

As already recently mentioned I am new to using R as I have to implement some decision trees 
with R respectively RWeka for my master thesis. I have tried so built some J48 trees and 
could accomplish first, very basic results. But when I increased the number of independent 
variables the tree in fact is getting created, but when I try to plot it, R displays 
“Error in `contrasts-`(`*tmp*`, value = contr.treatment) :  contrasts can 
be applied only to factors with 2 or more levels€.

I found out that this was apparently due to IVs with only one value and so 
excluded all these variables from the dataset and double-checked it by using 
the summary function. Anyway the same error occurs. Anyone any idea? Is there a 
specific function to test my dataset in this way instead of doing it manually 
or with the summary function? FYI, the dataset contains 54 attributes, all kind 
of scales and about 5,000 rows.




You may want to supply a reproducible example of that behaviour as the 
posting guide asks you to do.


Best,
Uwe Ligges

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] kohonen: Argument data should be numeric

2011-02-26 Thread Uwe Ligges



On 25.02.2011 15:33, Jay wrote:

Hi,

I'm trying to utilize the kohonen package to build SOM's. However,
trying this on my data I get the error:

Argument data should be numeric

when running the som(data.train, grid = somgrid(6, 6, hexagonal))
function. As you see, there is a problem with the data type of
data.train which is a list. When I try to convert it to numeric I
get the error:

(list) object cannot be coerced to type 'double'

What should I do? I can convert the data.train if I take only one
column of the list: data.train[[1]], but that is naturally not what I
want. How did I end up with this data format?

What I did:
data1- read.csv(data1.txt, sep = ;)
training- sample(nrow(data1), 1000)
data.train- data1[training,2:20]

I tried to use scan as the import method (read about this somewhere)
and unlist, but I'm not really sure how I should get it to numeric/
working.



We do not know what data types are present in data1. If all columns are 
numeric, just convert to a matrix (using as.matrix()).


Uwe Ligges









Thanks,
Jay

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] aggregate with cumsum

2011-02-26 Thread Uwe Ligges



On 25.02.2011 16:16, stephenb wrote:


Bill,

what will be the fastest way to output not just single lines but small data
frames of about 60 rows?

I prefer writing to a text file because the final output is large 47k times
60 rows and since I do not know the size of it I have to use rbind to build
the object which creates the memory problems described here:

http://www.matthewckeller.com/html/memory.html

look at the swiss cheese paragraph.

kind regards
Stephen


Stephen Bond,

1. You have written to the R-help mailing list rather than to Bill. 
Please answer also directly: not all posters are on the mailing list.

2. Please quote the original question.

Uwe Ligges

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] tansformation of variables in cph from rms package

2011-02-26 Thread Uwe Ligges



On 26.02.2011 15:43, Frank Harrell wrote:

Yao,

I wonder how likely it is that log transformations fit the data.  More often
I find that the flexibility of restricted cubic splines is needed.

One of the things that people take for granted when using log() is that they
are assuming that k=0 in making the transformation log(x + k).  It is often
the case the the proper origin is not zero.  This also impacts the problem
you encountered.

Besides all that, you can specify limits to predictors to nomogram() in a
variety of ways as listed in the help file.  Try that.
Frank

-
Frank Harrell
Department of Biostatistics, Vanderbilt University


Frank,

looks like you are also using this Nabble interface now:

1. You have written to the R-help mailing list only rather than to 
Yao. Please answer also directly: not all posters are on the mailing list.


2. Please quote the original question, otherwise readers of the list 
cannot remember and do not learn too much without having the context 
apparent.


Best,
Uwe

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Transform a dataset from long to wide using reshape2

2011-02-26 Thread John Kane
I seem to be running into the same problem reported in 
https://stat.ethz.ch/pipermail/r-help/2010-November/258265.html

I cannot seem to transform a dataset from long to wide using reshape2.
Clearly I am missing something very simple but a look at the manual and the 
reshape paper in JSS does not suggest anything.

Any advice would be welcome
===load rehape2
library(reshape2)
Working example===
# Example from above URL
dat - read.table(textConnection('Subject   Item Score
Subject 1 Item 1 1
Subject 1 Item 2 0
Subject 1 Item 3 1
Subject 2 Item 1 1
Subject 2 Item 2 1
Subject 2 Item 3 0'), header=TRUE)
closeAllConnections()

acast(dat, Subject~Item)

# Seems fine
=My attemp===
# 
set.seed(1)
mydata - data.frame( x=sample(LETTERS[23:26],100, replace=TRUE),
y=rnorm(100, mean=2), id=sample(letters[1:4], 100, replace=TRUE))
   
acast(mydata , id ~ x)
# not so fine
=
 produces this : 
 Using id as value column: use value_var to override.
Aggregation function missing: defaulting to length
  W  X Y  Z
a 7  5 5  7
b 6  9 6 12
c 1  4 5  6
d 6 14 5  2

Why does reshape2 think it needs an aggregation function here

Thanks

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Transform a dataset from long to wide using reshape2

2011-02-26 Thread Phil Spector

John -
   In the dat data frame, there is exactly one observation
for each Subject/Item combination, so it's clear how to 
rearrange the values of Score.  In the mydata data frame there
are multiple values for most combinations of x and id, so 
reshape2 needs to aggregate y before it can rearrange the values.

   Hope this helps.
- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu



On Sat, 26 Feb 2011, John Kane wrote:


I seem to be running into the same problem reported in
https://stat.ethz.ch/pipermail/r-help/2010-November/258265.html

I cannot seem to transform a dataset from long to wide using reshape2.
Clearly I am missing something very simple but a look at the manual and the 
reshape paper in JSS does not suggest anything.

Any advice would be welcome
===load rehape2
library(reshape2)
Working example===
# Example from above URL
dat - read.table(textConnection('Subject   Item Score
Subject 1 Item 1 1
Subject 1 Item 2 0
Subject 1 Item 3 1
Subject 2 Item 1 1
Subject 2 Item 2 1
Subject 2 Item 3 0'), header=TRUE)
closeAllConnections()

acast(dat, Subject~Item)

# Seems fine
=My attemp===
#
set.seed(1)
mydata - data.frame( x=sample(LETTERS[23:26],100, replace=TRUE),
   y=rnorm(100, mean=2), id=sample(letters[1:4], 100, replace=TRUE))

acast(mydata , id ~ x)
# not so fine
=
produces this :
Using id as value column: use value_var to override.
Aggregation function missing: defaulting to length
 W  X Y  Z
a 7  5 5  7
b 6  9 6 12
c 1  4 5  6
d 6 14 5  2

Why does reshape2 think it needs an aggregation function here

Thanks

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] invalid type (list) for variable

2011-02-26 Thread Jannis

Michael,


i have no experience with the Bayesian models you use, but I could 
imagine that the problem is that this classifiedPositive variable is a 
list (as you suggest as well). With


unlist()
as.character()
as.factor()

you can convert the variable to anything you may need as an input.


HTH
Jannis
On 02/26/2011 12:40 PM, Michael Hecker wrote:

Hi,
I have a problem with an invalid variable type.

My sample dataset is looking like this:
str(factivaTDM.train)
'data.frame':   4 obs. of  14 variables:
   $ access: num  2 3 2 1
   $ billion   : num  2 1 2 2
   $ compani   : num  7 6 1 2
   $ data  : num  6 2 3 4
   $ inc   : num  5 3 3 2
   $ local : num  3 2 1 1
   $ mcleodusa : num  15 6 2 10
   $ network   : num  9 1 2 1
   $ provid: num  4 3 1 2
   $ servic: num  11 4 3 6
   $ share : num  9 5 3 5
   $ splitrock : num  18 6 4 5
   $ stock : num  3 3 1 2
   $ classifiedPositive:List of 4
..$ : chr yes
..$ : chr yes
..$ : chr yes
..$ : chr yes
..- attr(*, class)= chr AsIs

and the command:
NaiveBayes.model- NB(classifiedPositive ~ ., data = factivaTDM.train)
generates the error:
invalid type (list) for variable 'classifiedPositive'

Here is another dataset which causes no errors:
str(weatherNominal.train)
'data.frame':   13 obs. of  5 variables:
   $ outlook: Factor w/ 3 levels overcast,rainy,..: 3 3 1 2 2 2 1
3 3 2 ...
   $ temperature: Factor w/ 3 levels cool,hot,mild: 2 2 2 3 1 1 1 3
1 3 ...
   $ humidity   : Factor w/ 2 levels high,normal: 1 1 1 1 2 2 2 1 2 2 ...
   $ windy  : logi  FALSE TRUE FALSE FALSE FALSE TRUE ...
   $ play   : Factor w/ 3 levels ?,no,yes: 2 2 3 3 3 2 3 2 3 3 ...

So I suppose, I have to convert classifiedPositive:List to
classifiedPositive:Factor
How can I do that?

I will send you more information about my script if you need it.

Thank you very much,
Michael


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] preventing repeat in paste

2011-02-26 Thread Dimitri Liakhovitski
This is great, thanks a lot, Gabor!

On Fri, Feb 25, 2011 at 6:18 PM, Gabor Grothendieck
ggrothendi...@gmail.com wrote:
 On Fri, Feb 25, 2011 at 5:21 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:
 Hello!

 s-start; e-end
 middle-as.character(c(1,2,3))

 I would like to get the following result:
 start 123 end or start 1 2 3 end or start 1,2,3 end

 How can I avoide this (undesired) result:
 paste(s,middle,e,sep= )

 Try this:

 paste(s, toString(middle), e)
 [1] start 1, 2, 3 end



 --
 Statistics  Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com




-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 2D Convolution in R

2011-02-26 Thread Wonsang You
Dear R-Helpers,

I want to try the 2D (two-dimensional) convolution in R.
For example, let us we have the following kernel and data.

kernel - (1,2,3,2,1)
data - array(1:100, dim=c(10,10))

I know the function 'convolve' only for one-dimensional convolution, but it
is just for a 1D sequence.
Is there any function for 2D convolution?
For theory, please refer to the following link:
http://www.songho.ca/dsp/convolution/convolution2d_example.html

Thank you for your help in advance.

Best Regards,
Wonsang

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Interactive/Dynamic plots with R

2011-02-26 Thread Tal Galili
Hello Abhishek,
Also notice that if you are on windows, then rggobi doesn't work for the
latest R (with ggobi 2.1.8) because of GTK/dll issues.  In such a
case, Greg's other suggestions are probably the way to go for the time
being.

Best,
Tal


Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Sat, Feb 26, 2011 at 7:15 PM, Greg Snow greg.s...@imail.org wrote:

 n rggobi package that makes transferring data between R and ggobi easy.
  But I would probably start with the playwith pac

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to remove rows in which 2 or more observations are smaller than a given threshold?

2011-02-26 Thread hind lazrak
Hello

The data set I am examining has 7425 observations (rows with unique
identifiers) and 46 samples(columns).

I have been trying to generate a dataset that filters out observations
that are negligible
The definition of negligible is absolute value less or equal  to 1.58.

The rule that I would like to adopt to create a new data is: drop rows
in which 2 or more observations have absolute values = 1.58.

Since I have unique identifier per row, I have tried to reshape the
data so I could create a new variable using an ifelse statement that
would flag observations =1.58 but I am not getting anywhere with this
approach

I could not come up with an apply function that counts the number of
observations for which the absolute values are below the cutoff I've
specified.

All observations are numerical and  I don't have missing values.


Thank you in advance for the help,

Hind

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to remove rows in which 2 or more observations are smaller than a given threshold?

2011-02-26 Thread Phil Spector

If the matrix in question is named mymat, then

mymat[apply(mymat,1,function(x)sum(abs(x) = 1.58)  2),]

(untested due to a lack of a reproducible example) should give 
you a matrix without any rows containing two or more values 
with absolute value less than 1.58.


I'm not sure ifelse would be of much use here.
- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu




On Sat, 26 Feb 2011, hind lazrak wrote:


Hello

The data set I am examining has 7425 observations (rows with unique
identifiers) and 46 samples(columns).

I have been trying to generate a dataset that filters out observations
that are negligible
The definition of negligible is absolute value less or equal  to 1.58.

The rule that I would like to adopt to create a new data is: drop rows
in which 2 or more observations have absolute values = 1.58.

Since I have unique identifier per row, I have tried to reshape the
data so I could create a new variable using an ifelse statement that
would flag observations =1.58 but I am not getting anywhere with this
approach

I could not come up with an apply function that counts the number of
observations for which the absolute values are below the cutoff I've
specified.

All observations are numerical and  I don't have missing values.


Thank you in advance for the help,

Hind

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to remove rows in which 2 or more observations are smaller than a given threshold?

2011-02-26 Thread William Dunlap
You didn't say if your data set was a matrix or data.frame.
Here are 2 functions that do the job on either and one that
only works with data.frames, but is faster (a similar speedup
is available for matrices as well).  They all compute the
number of small values in each row, nSmall, and extract the
rows for which nSmall is less than 2.

f0 - function (x) { 
   nSmall - apply(x, 1, function(row) sum(abs(row) = 1.58)
   x[nSmall2, , drop = FALSE]
}
f1 - function (x) {
   nSmall- rowSums(abs(x)  1.58)
   x[nSmall2, , drop = FALSE]
}
f2 - function (x) {
stopifnot(is.data.frame(x))
nSmall - 0
for (column in x) {
nSmall - nSmall + (abs(column)  1.58)
}
x[nSmall  2, , drop = FALSE]
}

For a 10^5 row by 50 column data.frame I got the
following times:
   system.time(r0 - f0(z))
 user  system elapsed 
 2.390.042.51 
   system.time(r1 - f1(z))
 user  system elapsed 
 0.420.080.51 
   system.time(r2 - f2(z))
 user  system elapsed 
 0.210.050.24 
   identical(r0, r1)  identical(r0, r2)
  [1] TRUE

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of hind lazrak
 Sent: Saturday, February 26, 2011 3:37 PM
 To: r-help@r-project.org
 Subject: [R] how to remove rows in which 2 or more 
 observations are smaller than a given threshold?
 
 Hello
 
 The data set I am examining has 7425 observations (rows with unique
 identifiers) and 46 samples(columns).
 
 I have been trying to generate a dataset that filters out observations
 that are negligible
 The definition of negligible is absolute value less or 
 equal  to 1.58.
 
 The rule that I would like to adopt to create a new data is: drop rows
 in which 2 or more observations have absolute values = 1.58.
 
 Since I have unique identifier per row, I have tried to reshape the
 data so I could create a new variable using an ifelse statement that
 would flag observations =1.58 but I am not getting anywhere with this
 approach
 
 I could not come up with an apply function that counts the number of
 observations for which the absolute values are below the cutoff I've
 specified.
 
 All observations are numerical and  I don't have missing values.
 
 
 Thank you in advance for the help,
 
 Hind
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to remove rows in which 2 or more observations are smaller than a given threshold?

2011-02-26 Thread hind lazrak
Dear Bill and Phil

Many thanks for your help, your solutions worked perfectly.
Bill: I did not specify whether the data was a matrix or dataframe
because it is in fact the Expression file in an eset object (bioBase).

Thank you so much again!

Hind
On Sat, Feb 26, 2011 at 4:34 PM, William Dunlap wdun...@tibco.com wrote:
 You didn't say if your data set was a matrix or data.frame.
 Here are 2 functions that do the job on either and one that
 only works with data.frames, but is faster (a similar speedup
 is available for matrices as well).  They all compute the
 number of small values in each row, nSmall, and extract the
 rows for which nSmall is less than 2.

 f0 - function (x) {
   nSmall - apply(x, 1, function(row) sum(abs(row) = 1.58)
   x[nSmall2, , drop = FALSE]
 }
 f1 - function (x) {
   nSmall- rowSums(abs(x)  1.58)
   x[nSmall2, , drop = FALSE]
 }
 f2 - function (x) {
    stopifnot(is.data.frame(x))
    nSmall - 0
    for (column in x) {
        nSmall - nSmall + (abs(column)  1.58)
    }
    x[nSmall  2, , drop = FALSE]
 }

 For a 10^5 row by 50 column data.frame I got the
 following times:
   system.time(r0 - f0(z))
     user  system elapsed
     2.39    0.04    2.51
   system.time(r1 - f1(z))
     user  system elapsed
     0.42    0.08    0.51
   system.time(r2 - f2(z))
     user  system elapsed
     0.21    0.05    0.24
   identical(r0, r1)  identical(r0, r2)
  [1] TRUE

 Bill Dunlap
 Spotfire, TIBCO Software
 wdunlap tibco.com

 -Original Message-
 From: r-help-boun...@r-project.org
 [mailto:r-help-boun...@r-project.org] On Behalf Of hind lazrak
 Sent: Saturday, February 26, 2011 3:37 PM
 To: r-help@r-project.org
 Subject: [R] how to remove rows in which 2 or more
 observations are smaller than a given threshold?

 Hello

 The data set I am examining has 7425 observations (rows with unique
 identifiers) and 46 samples(columns).

 I have been trying to generate a dataset that filters out observations
 that are negligible
 The definition of negligible is absolute value less or
 equal  to 1.58.

 The rule that I would like to adopt to create a new data is: drop rows
 in which 2 or more observations have absolute values = 1.58.

 Since I have unique identifier per row, I have tried to reshape the
 data so I could create a new variable using an ifelse statement that
 would flag observations =1.58 but I am not getting anywhere with this
 approach

 I could not come up with an apply function that counts the number of
 observations for which the absolute values are below the cutoff I've
 specified.

 All observations are numerical and  I don't have missing values.


 Thank you in advance for the help,

 Hind

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reproducibility issue in gbm (32 vs 64 bit)

2011-02-26 Thread Ridgeway, Greg
I have heard about this before happening on other platforms. Frankly I'm not 
positive how this happens. My best guess is that there's a tiny bit of numeric 
instability in the 9+ decimal place so that on a given iteration a one variable 
choice at random looks better than the other. Any other ideas?
Greg

- Original Message -
From: Joshua Wiley jwiley.ps...@gmail.com
To: Axel Urbiz axel.ur...@gmail.com
Cc: R-help@r-project.org R-help@r-project.org; Ridgeway, Greg
Sent: Fri Feb 25 22:16:02 2011
Subject: Re: [R] Reproducibility issue in gbm (32 vs 64 bit)

Hi Axel,

I do not have a nice explanation why the results differ off the top of
my head.  I can say I can replicate what you get on 32/64 (both
Windows 7) bit with the development version of R and gbm_1.6-3.1.

Here is an even simpler example that shows the difference:

gbmfit - gbm(1:50 ~ I(50:1) + I(60:11), distribution = gaussian)
summary(gbmfit)

I copied that package maintainer.

Cheers,

Josh

On Fri, Feb 25, 2011 at 7:29 PM, Axel Urbiz axel.ur...@gmail.com wrote:
 Dear List,

 The gbm package on Win 7 produces different results for the
 relative importance of input variables in R 32-bit relative to R 64-bit. Any
 idea why? Any idea which one is correct?

 Based on this example, it looks like the relative importance of 2 perfectly
 correlated predictors is diluted by half in 32-bit, whereas in 64-bit, one
 of these predictors gets all the importance and the other gets none. I found
 this interesting.

 ### Sample code

 library(gbm)
 set.seed(12345)
 xc=matrix(rnorm(100*20),100,20)
 y=sample(1:2,100,replace=TRUE)
 xc[,2] - xc[,1]
 gbmfit - gbm(y~xc[,1]+xc[,2] +xc[,3], distribution=gaussian)
 summary(gbmfit)

 ### Results on R 2.12.0 (32-bit)

      var  rel.inf
 1 xc[, 3] 49.76143
 2 xc[, 1] 27.27432
 3 xc[, 2] 22.96425

 ### Results on R 2.12.0 (64-bit)
 summary(gbmfit)
      var  rel.inf
 1 xc[, 1] 50.23857
 2 xc[, 3] 49.76143
 3 xc[, 2]  0.0

 Thanks,
 Axel.

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

__

This email message is for the sole use of the intended recipient(s) and
may contain confidential information. Any unauthorized review, use,
disclosure or distribution is prohibited. If you are not the intended
recipient, please contact the sender by reply email and destroy all copies
of the original message.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] Fwd: Re: ANOVA and Pseudoreplication in R

2011-02-26 Thread Ben Ward

On 25/02/2011 21:22, Ben Ward wrote:


 Original Message 
Subject:Re: [R] ANOVA and Pseudoreplication in R
Date:   Fri, 25 Feb 2011 12:10:14 -0800
From:   Bert Guntergunter.ber...@gene.com
To: Ben Wardbenjamin.w...@bathspa.org
CC: r-helpr-help@r-project.org



I can hopefully save bandwidth here by suggesting that this belongs on
the R-sig-mixed-models list.

-- Bert

As an aside, shouldn't you be figuring this out yourself or seeking local 
consulting expertise?


I did consult with the lecturer at university that knows most about 
stats, and he advised me:


Pseudo replication is really about a lack of independence between measurements, So 
you need to work backwards and see where you are building in a known lack of 
independence.  And where that is the case you need to use means of all the values.


And I have done this and came to the conclusion I mentioned as to where 
I thought Pseudoreplicaton was comming from, however, I do not know 
about the one other 'potential' source as it really is for me at least, 
a grey area.
I've consulted a few forums that deal with the theory more and await any 
response. Until then I'll have to try and get as many opinions on it as 
possible.


-Ben W.


On Fri, Feb 25, 2011 at 9:08 AM, Ben Wardbenjamin.w...@bathspa.org   wrote:

  Hi, As part of my dissertation, I'm going to be doing an Anova, comparing
  the dead zone diameters on plates of microbial growth with little paper
  disks loaded with antimicrobial, a clear zone appears where death occurs,
  the size depending on the strength and succeptibility. So it's basically 4
  different treatments, and I'm comparing the diameters (in mm) of circles.
  I'm concerned however, about Pseudoreplication and how to deal with it in R,
  (I thought of using the Error() term.

  I have four levels of one factor(called Treatment): NE.Dettol, EV.Dettol,
  NE.Garlic, EV.Garlic.   (NE.Dettol is E.coli not evolved to dettol,
  exposed to dettol to get dead zones. And the same for NE.Garlic, but with
  garlic, not dettol. EV.Dettol is E.coli that has been evolved against
  dettol, and then tested afterwards against dettol to get the dead zones.
  Same applies for EV.Garlic but with garlic).  You see from the four levels
  (or treatments) there are two chemicals involved. So my first concern is
  whether they should be analysed using two seperate ANOVA's.

  NE.Dettol and NE.Garlic are both the same organism - a lab stock E.coli,
  just exposed to two different chemicals.
  EV.Dettol and EV.Garlic, are in principle, likely to be two different forms
  of the organism after the many experimental doses of their respective
  chemical.

  For NE.Garlic and NE.Dettol I have 5, what I've called Lineages, basically
  seperate bottles of them (10 in total).
  Then I have 5 Bottles (Lineages) of EV.Dettol, and 5 of EV.Garlic. - This
  was done because there was the possiblity that, whilst I'm expecting them
  all to respond in a similar manner, there are many evolutionary paths to the
  same result, and previous research and reading shows that occasionally one
  or two react differently to the rest through random chance.
  The point I observed above (NE.Dettol and NE.Garlic are both the same
  organism...) is also applicable to the 5 bottles: The 5 bottles each of
  NE.Garlic and NE.Dettol are supposed to be all the same organism - from a
  stock one kept in store in the lab.
  There is potential though for the 5 of EV.Garlic, to be different from one
  another, and potential for the 5 EV.Dettol to be different from one another.

  The Lineage (bottle) is also a factor then, with 5 levels (1,2,3,4,5).
  Because they may be different.

  To get the measurements of the diamter of the zones. I take out a small
  amount from a tube and spread it on a plate, then take three paper disks,
  soaked in their respective chemical, either Dettol or Garlic. and press them
  and and incubate them.
  Then when the zones have appeared after a day or 2. I take 4 diameter
  measurements from each zone, across the zone at different angles, to take
  account for the fact, that there may be a weird shape, or not quite
  circular.

  I'm concerned about pseudoreplication, such as the multiple readings from
  one disk, and the 5 lineages - which might be different from one another in
  each of the Two EV. treatments, but not with NE. treatments.

  I read that I can remove pseudoreplication from  the multiple readings from
  each disk, by using the 4 readings on each disk, to produce a mean for the
  disks, and analyse those means - Exerciseing caution where there are extreme
  values. I think the 3 disks for each lineage themselves are not
  pseudoreplication, because they are genuinley 3 disks on a plate: the Disk
  Diffusion Test replicated 3 times - but the multiple readings from one disk
  if eel, is pseudoreplication. I've also read about including Error() terms
  in a formula.

  I'm unsure of the two NE. Treatments comming 

Re: [R] Interpreting the example given by Prof Frank Harrell in {Design} validate.cph

2011-02-26 Thread vikkiyft
Thank you very very very much Prof Harrell!! You've made my day!!

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Interpreting-the-example-given-by-Prof-Frank-Harrell-in-Design-validate-cph-tp3316820p3325844.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] (no subject)

2011-02-26 Thread Hamaad Shah

Greetings,
I have been trying to use the RInside package with C++ however I am running 
into an error.
I use Ubuntu Linux 10.04 (64-bit) and my R version is 2.12.2. I compiled R from 
source to give me a shared R library, shared BLAS (ATLAS). When I try to 
compile one of the examples given in the RInside package it compiles perfectly 
fine however when I run the program it gives me an error:
error while loading shared libraries: libR.so: cannot open shared object file: 
No such file or directory
This is strange because libR.so does indeed exist and I have all the correct 
paths leading to R. I compile the program using:
g++ -I/usr/local/lib64/R/include -I/usr/local/lib64/R/library/Rcpp/include 
-I/usr/local/lib64/R/library/RInside/include -g -O3 -Wall -I/usr/local/include 
rinside.cpp -L/usr/local/lib64/R/lib -lR -L/usr/local/lib64/R/lib -lRblas 
-L/usr/local/lib64/R/lib -lRlapack -L/usr/local/lib64/R/library/Rcpp/lib -lRcpp 
-Wl,-rpath,/usr/local/lib64/R/library/Rcpp/lib 
-L/usr/local/lib64/R/library/RInside/lib -lRInside 
-Wl,-rpath,/usr/local/lib64/R/library/RInside/lib -o rinside
I know RInside works on my system as I have used it before successfully. If 
somebody can help me out here I would appreciate it.
Thanks,
Hamaad Musharaf Shah. 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help: Error en model.frame.defaul

2011-02-26 Thread Toni López Mayol
Hi and thanks to all R developers and helpers,

I wanrt to take  the 1 and 3 dimensions of this object to create a new one 
without the years and with the next functions to do some specific calculations:

 str(PMpa)
 num [1:499105, 1:60, 1:12] 29.8 55.8 29.7 25.1 25 ...
 - attr(*, dimnames)=List of 3
  ..$ punt: NULL
  ..$ any : chr [1:60] 1950 1951 1952 1953 ...
  ..$ mes : chr [1:12] Gener Febrer Març Abril ...


I'm writing a function where appears the next error message:

Error en model.frame.default(formula = x ~ c(1950:2009), drop.unused.levels = 
TRUE) : 
variable lengths differ (found for 'c (1950:2009)')


Script:

library(robustbase)
load(PMpa.Rdat)

R2 - function(x) var(predict(x))/var(predict(x)+resid(x))

TPMEN-t(apply(PMpa,c(1,3),function(x){
kk-lmrob(x~c(1950:2009)); 
c(kk$coefficients[2],R2(kk), (1-summary(kk)$coefficients[2,4])*100, 
(confint(kk,level=.95)*10)[2,])}))

dimnames(TPMEN)[[3]][1:3]-c(Pendent,R2,Significancia)

save(TPMEN,file=TPMEN.Rdat)

working:

 TPMEN-t(apply(PMpa,c(1,3),function(x){
+ kk-lmrob(x~c(1950:2009)); 
+ 
c(kk$coefficients[2],R2(kk),(1-summary(kk)$coefficients[2,4])*100,(confint(kk,level=.95)*10)[2,])}))
Error en model.frame.default(formula = x ~ c(1950:2009), drop.unused.levels = 
TRUE) : 
  las longitudes de las variables son diferentes (encontrada para 
'c(1950:2009)')


Thanks to all for your time and help.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help please ..simple question regarding output the p-value inside a function and lm

2011-02-26 Thread Umesh Rosyara
Hi Jorge and R users 
 
Thank you so much for the responses. You input helped me alot and
potentially can help me to solve one more problem, but I got error message.
I am sorry to ask you again but if you can find my problem in quick look
that will be great. I hope this will not cost alot of your time as this is
based on your idea. 
 
# Just data 
X1 - c(1,3,4,2,2)
X2 - c(2,1,3,1,2)
X3 - c(4,3,2,1,1)
X4- c(1,1,1,2,3)
X5 - c(3,2,1,1,2)
X6 - c(1,1,2,2,3)
odataframe - data.frame(X1,X2,X3,X4,X5,X6)
 
My objective here is sort the value of the pair of variables (X1 and X2, X3
and X4, X5 and X6 and so on.)  in such way that the second column in
pair is always higher than the first one (X2  X1, X4  X3, X6 X5 and so
on...). 
 
Here is my attempt: 
nmrk - 3
nvar - 2*nmrk 
lapply(1:nvar, function(ind){
# indices for the variables we need
 a - seq(1, nvar, by = 2)
 b - seq(2, nvar, by = 2)
# shorting column
tx[, a[ind]] = ifelse(odataframe[, a[ind]]  odataframe[,b[ind]],
odataframe[, a[ind]], odataframe[, b[ind]])
tx[, b[ind]] = ifelse(odataframe[, b[ind]]  dataframe[,a[ind]],
odataframe[,b[ind]], odataframe[,a[ind]])
df1 - transform( odataframe, odataframe[, a[ind]]= tx[, a[ind]],
odataframe[, b[ind]]= tx[, b[ind]]))
}
 
I got the following error: 
Error:
Error: unexpected '=' in:
tx[, b[ind]] = ifelse(odataframe[, b[ind]]  dataframe[,a[ind]],
odataframe[,b[ind]], odataframe[,a[ind]])
df1 - transform( odataframe, odataframe[, a[ind]]=
 
Thanks;
Umesh R 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error

2011-02-26 Thread mathijsdevaan
Mean doesn't work either... I understand that the message replacement has 0
items, need 37597770 implies that the function is not returning any values,
but I don't understand why then this is not the case in the example.

DF = data.frame(read.table(textConnection(A  B  C  D  E 
1 1  a  1999  1  0 
2 1  b  1999  0  1 
3 1  c  1999  0  1 
4 1  d  1999  1  0 
5 2  c  2001  1  0 
6 2  d  2001  0  1 
7 3  a  2004  0  1 
8 3  b  2004  0  1 
9 3  d  2004  0  1 
10 4  b  2001  1  0 
11 4  c  2001  1  0 
12 4  d  2001  0  1),head=TRUE,stringsAsFactors=FALSE)) 

DF = DF[order(DF$B,DF$C),]

#first option - works fine in example and my target data frame
DF$F = ave(DF$D,DF$B, FUN = function(x) cumsum(x)-x) 
DF$G = ave(DF$E,DF$B, FUN = function(x) cumsum(x)-x)

#second option - works fine in example but not in my target data frame
foo - function(x) 
 { 
 unlist(lapply(x, FUN = function(z) cumsum(z) - z)) 
 }
n-ave(DF[,c(4:5)],DF$B,FUN = foo)

Why is this second option not working in my target data frame (which is much
bigger than the example)?
Thanks



-- 
View this message in context: 
http://r.789695.n4.nabble.com/Error-tp3324531p3325857.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 2D Convolution Function

2011-02-26 Thread Wonsang You
Dear R-Helpers,

I want to try the 2D (two-dimensional) convolution in R.
For example, let us we have the following kernel and data.

kernel - (1,2,3,2,1)
data - array(1:100, dim=c(10,10))

I know the function 'convolve' only for one-dimensional convolution, but it
is just for a 1D sequence.
Is there any function for 2D convolution?
For theory, please refer to the following link:
http://www.songho.ca/dsp/convolution/convolution2d_example.html

Thank you for your help in advance.

Best Regards,
Wonsang


-
Wonsang You
Leibniz Institute for Neurobiology
-- 
View this message in context: 
http://r.789695.n4.nabble.com/2D-Convolution-Function-tp3326198p3326198.html
Sent from the R help mailing list archive at Nabble.com.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping and counting in dataframe

2011-02-26 Thread zem
sry, 
new try: 

tm-c(12345,42352,12435,67546,24234,76543,31243,13334,64562,64123) 
gr-c(1,3,1,2,2,4,2,3,3,3) 
d-data.frame(cbind(time,gr))

where tm are unix times and gr the factor grouping by
i have a skalar for example k=500
now i need to calculate in for every row how much examples in the group are
in the interval [i-500;i+500] and i is the active tm-element, like this: 

d
time gr ct
1  12345  1  2
2  42352  3  0
3  12435  1  2
4  67546  2  0
5  24234  2  0
6  76543  4  0
7  31243  2  0
8  13334  3  0
9  64562  3  2
10 64123  3  2

i hope that was a better illustration of my problem

-- 
View this message in context: 
http://r.789695.n4.nabble.com/grouping-and-counting-in-dataframe-tp3325476p3326338.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Hello!

2011-02-26 Thread tsvi sabo
Hi,i would like to know what is the best way to write a procedure in R,it seems 
that when i run a script it doesn't wait for previouse code lines to be 
excuted,i would like to know how to force R to hold on a code line until it is 
processed,for example:i want to choose from a list the name of the sheet in 
which to do an analisys,then i want to run a goodness of fit test for the data 
on that chosen sheet,how can i do that?and if it is not too much i have couple 
more questions,1. how to retrieve the names of excel's file sheets to an array 
?2. how to retrieve the names of excel's Sheet columns names to an array ? 
thanks a lot,Tsvi Sabo


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 2D Convolution Function

2011-02-26 Thread Wonsang You
Dear R-Helpers,

I want to try the 2D (two-dimensional) convolution in R.
For example, let us we have the following kernel and data.

kernel - (1,2,3,2,1)
data - array(1:100, dim=c(10,10))

I know the function 'convolve' only for one-dimensional convolution, but it
is just for a 1D sequence.
Is there any function for 2D convolution?
For theory, please refer to the following link:
http://www.songho.ca/dsp/convolution/convolution2d_example.html

Thank you for your help in advance.

Best Regards,
Wonsang


-
Wonsang You
Leibniz Institute for Neurobiology
-- 
View this message in context: 
http://r.789695.n4.nabble.com/2D-Convolution-Function-tp3326158p3326158.html
Sent from the R help mailing list archive at Nabble.com.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] wireframe() display a graph with two colors, not a gradient.

2011-02-26 Thread James Platt
Hi all,

I'm quite new to wireframe, essentially what I want to do is display a graph, 
and z-values  1 would be yellow and those  1 would be blue.
This is a bit of my data.

0.334643563 0.350913807 0.383652307
0.370325283 0.38779016  0.42387392
0.39861579  0.418389687 0.460692165
0.43888516  0.468015843 0.520560489
0.499544084 0.535099422 0.60982153
0.569888047 0.634351734 0.717646392
0.717202578 0.810887467 0.935152498
0.901982916 1.044895388 1.228306176
1.12856184  1.314210456 1.462055626
1.377314404 1.540372345 1.6206216
1.494044604 1.618244219 1.631295797





data.m = as.matrix(read.table(/Users/James/Desktop/c.txt, sep='\t'))
library(lattice)
wireframe(data.m,aspect = c(0.3), shade=TRUE, screen = list(z = 0, x = -45), 
light.source = c(0,0,10), distance = 0.2)

i know I probably need to use the col.regions or level.colors argument, but I'm 
not sure what I actually need to supply to the arguments to get it to work. All 
the examples I've seen also have a gradient of color between two or more 
colors, I want to color half the graph with z values  1 yellow, and 1 blue.


Thanks for any help. James
 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using uniroot() with output from deriv() or D()

2011-02-26 Thread jjheath
Hans,

Both your methods worked great!  They was exactly what I was looking for.
I ended up using the second method as it is a little more efficient:

fn2nd_fun - function(x) eval(fn2nd, list(x=x))
ex - seq(from=0, to=1, length.out = 1000)
y1 - fn2nd_fun(ex)
...
r - uniroot(fn2nd_fun, interval=c(0, 1), tol=0.0001) 

Thanks,
J

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Using-uniroot-with-output-from-deriv-or-D-tp3325635p3326296.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] Fwd: Re: ANOVA and Pseudoreplication in R

2011-02-26 Thread Bert Gunter
Hi Ben:

1) Confession: I did not and have not read your post in detail.

2) IMHO, the following advice:

 Pseudo replication is really about a lack of independence between
 measurements, So you need to work backwards and see where you are building
 in a known lack of independence.  And where that is the case you need to use
 means of all the values.

is baloney. The lack of independence part is correct. The baloney
part is take the mean. That is exactly where mixed models -- or
hierarchical modeling of some sort -- is required. That's why I
referred you to R-sig-mixed-models.

Again, please note: IMHO. Maybe I'm the one full of baloney. It's free
advice, after all, so beware of what you get.

Cheers,
Bert


On Sat, Feb 26, 2011 at 7:44 AM, Ben Ward benjamin.w...@bathspa.org wrote:
 On 25/02/2011 21:22, Ben Ward wrote:

  Original Message 
 Subject:        Re: [R] ANOVA and Pseudoreplication in R
 Date:   Fri, 25 Feb 2011 12:10:14 -0800
 From:   Bert Guntergunter.ber...@gene.com
 To:     Ben Wardbenjamin.w...@bathspa.org
 CC:     r-helpr-help@r-project.org



 I can hopefully save bandwidth here by suggesting that this belongs on
 the R-sig-mixed-models list.

 -- Bert

 As an aside, shouldn't you be figuring this out yourself or seeking local
 consulting expertise?

 I did consult with the lecturer at university that knows most about stats,
 and he advised me:

 Pseudo replication is really about a lack of independence between
 measurements, So you need to work backwards and see where you are building
 in a known lack of independence.  And where that is the case you need to use
 means of all the values.


 And I have done this and came to the conclusion I mentioned as to where I
 thought Pseudoreplicaton was comming from, however, I do not know about the
 one other 'potential' source as it really is for me at least, a grey area.
 I've consulted a few forums that deal with the theory more and await any
 response. Until then I'll have to try and get as many opinions on it as
 possible.

 -Ben W.

 On Fri, Feb 25, 2011 at 9:08 AM, Ben Wardbenjamin.w...@bathspa.org
 wrote:

  Hi, As part of my dissertation, I'm going to be doing an Anova,
 comparing
  the dead zone diameters on plates of microbial growth with little
 paper
  disks loaded with antimicrobial, a clear zone appears where death
 occurs,
  the size depending on the strength and succeptibility. So it's basically
 4
  different treatments, and I'm comparing the diameters (in mm) of
 circles.
  I'm concerned however, about Pseudoreplication and how to deal with it
 in R,
  (I thought of using the Error() term.

  I have four levels of one factor(called Treatment): NE.Dettol,
 EV.Dettol,
  NE.Garlic, EV.Garlic.   (NE.Dettol is E.coli not evolved to dettol,
  exposed to dettol to get dead zones. And the same for NE.Garlic, but
 with
  garlic, not dettol. EV.Dettol is E.coli that has been evolved against
  dettol, and then tested afterwards against dettol to get the dead
 zones.
  Same applies for EV.Garlic but with garlic).  You see from the four
 levels
  (or treatments) there are two chemicals involved. So my first concern is
  whether they should be analysed using two seperate ANOVA's.

  NE.Dettol and NE.Garlic are both the same organism - a lab stock E.coli,
  just exposed to two different chemicals.
  EV.Dettol and EV.Garlic, are in principle, likely to be two different
 forms
  of the organism after the many experimental doses of their respective
  chemical.

  For NE.Garlic and NE.Dettol I have 5, what I've called Lineages,
 basically
  seperate bottles of them (10 in total).
  Then I have 5 Bottles (Lineages) of EV.Dettol, and 5 of EV.Garlic. -
 This
  was done because there was the possiblity that, whilst I'm expecting
 them
  all to respond in a similar manner, there are many evolutionary paths to
 the
  same result, and previous research and reading shows that occasionally
 one
  or two react differently to the rest through random chance.
  The point I observed above (NE.Dettol and NE.Garlic are both the same
  organism...) is also applicable to the 5 bottles: The 5 bottles each of
  NE.Garlic and NE.Dettol are supposed to be all the same organism - from
 a
  stock one kept in store in the lab.
  There is potential though for the 5 of EV.Garlic, to be different from
 one
  another, and potential for the 5 EV.Dettol to be different from one
 another.

  The Lineage (bottle) is also a factor then, with 5 levels (1,2,3,4,5).
  Because they may be different.

  To get the measurements of the diamter of the zones. I take out a small
  amount from a tube and spread it on a plate, then take three paper
 disks,
  soaked in their respective chemical, either Dettol or Garlic. and press
 them
  and and incubate them.
  Then when the zones have appeared after a day or 2. I take 4 diameter
  measurements from each zone, across the zone at different angles, to
 take
  account for the fact, that there may be a weird shape, or not quite
  

[R] Weighting of variables‏

2011-02-26 Thread Christopher S

Dear All,

I have run PCA using prcomp() and generated a loading plot using biplot(). In 
addition, I would like to extract the
weighting for each of the variables in the loading plot so I can see which of 
the variables have the most influence 
on the distribution of the points in a 2D plot of PC1 and PC2.

Is there a simple function for this?

Thanks!

-Chris

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Hello!

2011-02-26 Thread Bert Gunter
Are you a fan of James Joyce? Is the Caps key on your keyboard broken?

-- Bert

On Sat, Feb 26, 2011 at 10:09 AM, tsvi sabo tsvis...@yahoo.com wrote:
 Hi,i would like to know what is the best way to write a procedure in R,it 
 seems that when i run a script it doesn't wait for previouse code lines to be 
 excuted,i would like to know how to force R to hold on a code line until it 
 is processed,for example:i want to choose from a list the name of the sheet 
 in which to do an analisys,then i want to run a goodness of fit test for the 
 data on that chosen sheet,how can i do that?and if it is not too much i have 
 couple more questions,1. how to retrieve the names of excel's file sheets to 
 an array ?2. how to retrieve the names of excel's Sheet columns names to an 
 array ?
 thanks a lot,Tsvi Sabo



        [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Transform a dataset from long to wide using reshape2

2011-02-26 Thread John Kane
Thanks. I suddenly realised this as I was cycling down to the local grocery 
store, about 10 minutes after posting the problem.  I felt like an idiot.  
Thanks for confirming my stupidity.

john

--- On Sat, 2/26/11, Phil Spector spec...@stat.berkeley.edu wrote:

 From: Phil Spector spec...@stat.berkeley.edu
 Subject: Re: [R] Transform a dataset from long to wide using reshape2
 To: John Kane jrkrid...@yahoo.ca
 Cc: R R-help r-h...@stat.math.ethz.ch
 Received: Saturday, February 26, 2011, 2:24 PM
 John -
     In the dat data frame, there is exactly one
 observation
 for each Subject/Item combination, so it's clear how to 
 rearrange the values of Score.  In the mydata data
 frame there
 are multiple values for most combinations of x and id, so 
 reshape2 needs to aggregate y before it can rearrange the
 values.
     Hope this helps.
             
         - Phil Spector
             
      Statistical
 Computing Facility
             
      Department
 of Statistics
             
      UC
 Berkeley
             
      spec...@stat.berkeley.edu
 
 
 
 On Sat, 26 Feb 2011, John Kane wrote:
 
  I seem to be running into the same problem reported
 in
  https://stat.ethz.ch/pipermail/r-help/2010-November/258265.html
 
  I cannot seem to transform a dataset from long to wide
 using reshape2.
  Clearly I am missing something very simple but a look
 at the manual and the reshape paper in JSS does not suggest
 anything.
 
  Any advice would be welcome
  ===load
 rehape2
  library(reshape2)
  Working
 example===
  # Example from above URL
  dat -
 read.table(textConnection('Subject   Item
 Score
  Subject 1 Item 1     1
  Subject 1 Item 2     0
  Subject 1 Item 3     1
  Subject 2 Item 1     1
  Subject 2 Item 2     1
  Subject 2 Item 3     0'),
 header=TRUE)
  closeAllConnections()
 
  acast(dat, Subject~Item)
 
  # Seems fine
  =My
 attemp===
  #
  set.seed(1)
  mydata - data.frame( x=sample(LETTERS[23:26],100,
 replace=TRUE),
         y=rnorm(100, mean=2),
 id=sample(letters[1:4], 100, replace=TRUE))
 
  acast(mydata , id ~ x)
  # not so fine
 
 =
  produces this :
  Using id as value column: use value_var to override.
  Aggregation function missing: defaulting to length
   W  X Y  Z
  a 7  5 5  7
  b 6  9 6 12
  c 1  4 5  6
  d 6 14 5  2
 
  Why does reshape2 think it needs an aggregation
 function here
 
  Thanks
 
  __
  R-help@r-project.org
 mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
 reproducible code.
 
 



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] nested case-control study

2011-02-26 Thread array chip
Hi, I am wondering if there is a package for doing conditional logistic 
regression for nested case-control study as described in Estimation of 
absolute 
risk from nested case-control data by Langholz and Borgan (1997) where 
Horvitz-Thompson sampling weight (log of (number in the risk set divided by the 
number sampled)) is used with regression. In SAS Proc Phreg, this is 
implemented 
as an offset (offset=logweight). I checked clogistic() in Epi package and 
clogit() in survival package, but couldn't figure out how to incorporate this 
weighting with either. 


Also when considering nested case-control sampling for Cox proportional hazards 
model, the above method can estimate absolute risk of developing disease over a 
specified time interval. Appreciate if anyone has any suggestion on how to do 
this in R.

Thanks very much!

John


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-sig-ME] Fwd: Re: ANOVA and Pseudoreplication in R

2011-02-26 Thread Robert A LaBudde
I think you are over-concerned with the term pseudo-replication. 
All this means is that the error source is nested, and not a full replicate.


What you haven't done, and need to do, is to describe your experiment 
in terms of the real variables and error sources. Then code them as 
fixed or random and write the model form.


You description is very complex and confusing. You need to identify:

1. Exactly what culture is used in each trial.
2. All subcultures: NE vs EV, which chemical, etc.
3. Which plate.
4. Which circle on which plate.
5. Which measurement on which circle.

After you have identified all descriptors that identify each 
individual measurement, put them in a file with columns for factors 
and the outcome.


At this point it should be clear that you have a number of random 
factors: plate, circle, measurement, plus any culture replicates, if 
you have them.


One you successfully do this, you can write the model equation with 
the proper nesting.


The time to do all this is BEFORE you conduct the experiment.


At 10:44 AM 2/26/2011, Ben Ward wrote:

On 25/02/2011 21:22, Ben Ward wrote:


 Original Message 
Subject:Re: [R] ANOVA and Pseudoreplication in R
Date:   Fri, 25 Feb 2011 12:10:14 -0800
From:   Bert Guntergunter.ber...@gene.com
To: Ben Wardbenjamin.w...@bathspa.org
CC: r-helpr-help@r-project.org



I can hopefully save bandwidth here by suggesting that this belongs on
the R-sig-mixed-models list.

-- Bert

As an aside, shouldn't you be figuring this out yourself or seeking 
local consulting expertise?


I did consult with the lecturer at university that knows most about 
stats, and he advised me:


Pseudo replication is really about a lack of independence between 
measurements, So you need to work backwards and see where you are 
building in a known lack of independence.  And where that is the 
case you need to use means of all the values.



And I have done this and came to the conclusion I mentioned as to 
where I thought Pseudoreplicaton was comming from, however, I do not 
know about the one other 'potential' source as it really is for me 
at least, a grey area.
I've consulted a few forums that deal with the theory more and await 
any response. Until then I'll have to try and get as many opinions 
on it as possible.


-Ben W.


On Fri, Feb 25, 2011 at 9:08 AM, Ben Wardbenjamin.w...@bathspa.org   wrote:

  Hi, As part of my dissertation, I'm going to be doing an Anova, comparing
  the dead zone diameters on plates of microbial growth with little paper
  disks loaded with antimicrobial, a clear zone appears where 
death occurs,
  the size depending on the strength and succeptibility. So it's 
basically 4

  different treatments, and I'm comparing the diameters (in mm) of circles.
  I'm concerned however, about Pseudoreplication and how to deal 
with it in R,

  (I thought of using the Error() term.

  I have four levels of one factor(called Treatment): 
NE.Dettol, EV.Dettol,

  NE.Garlic, EV.Garlic.   (NE.Dettol is E.coli not evolved to dettol,
  exposed to dettol to get dead zones. And the same for 
NE.Garlic, but with

  garlic, not dettol. EV.Dettol is E.coli that has been evolved against
  dettol, and then tested afterwards against dettol to get the 
dead zones.
  Same applies for EV.Garlic but with garlic).  You see from 
the four levels

  (or treatments) there are two chemicals involved. So my first concern is
  whether they should be analysed using two seperate ANOVA's.

  NE.Dettol and NE.Garlic are both the same organism - a lab stock E.coli,
  just exposed to two different chemicals.
  EV.Dettol and EV.Garlic, are in principle, likely to be two 
different forms

  of the organism after the many experimental doses of their respective
  chemical.

  For NE.Garlic and NE.Dettol I have 5, what I've called 
Lineages, basically

  seperate bottles of them (10 in total).
  Then I have 5 Bottles (Lineages) of EV.Dettol, and 5 of EV.Garlic. - This
  was done because there was the possiblity that, whilst I'm expecting them
  all to respond in a similar manner, there are many evolutionary 
paths to the
  same result, and previous research and reading shows that 
occasionally one

  or two react differently to the rest through random chance.
  The point I observed above (NE.Dettol and NE.Garlic are both the same
  organism...) is also applicable to the 5 bottles: The 5 bottles each of
  NE.Garlic and NE.Dettol are supposed to be all the same organism - from a
  stock one kept in store in the lab.
  There is potential though for the 5 of EV.Garlic, to be 
different from one
  another, and potential for the 5 EV.Dettol to be different from 
one another.


  The Lineage (bottle) is also a factor then, with 5 levels (1,2,3,4,5).
  Because they may be different.

  To get the measurements of the diamter of the zones. I take out a small
  amount from a tube and spread it on a plate, then take three paper disks,
  soaked in their respective chemical,