[R] hgu133plus2hsentrezgprobe library

2012-03-19 Thread Eleni Christodoulou
Hello R community,

I am processing raw Affymetrix CEL files using the Michigan custom CDF
library hgu133plus2hsentrezgprobe. I have been looking for documentation on
the functions it contains. I am specifically interested in converting probe
names to gene symbols. Does anybody know where I can find it?

Thanks a lot!
Eleni


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Transfer R workspace on another PC

2010-03-10 Thread Eleni Christodoulou
Hi all and thanks for your answers.

It is my first attempt at this kind of transfer, and both machines are
32-bit. The size of my data is 92.3 MB, and I did not try restarting.
However, Steve, you are right: I have not installed the same packages on
both computers. Moreover, I have not used the 'session' package. I will try
both and let you know.

Once again,
Thanks a lot for your help!
Eleni


On Wed, Mar 10, 2010 at 5:34 AM, Khanh Nguyen kngu...@cs.umb.edu wrote:

 I don't have an answer, but I suggest 'session' package.. I use it to
 move my workspace around. Never had any problem before.

 -k

 On Tue, Mar 9, 2010 at 4:44 PM, Eleni Christodoulou elenic...@gmail.com wrote:
  Hi list!

  I have recently tried to take my office work home, meaning that I tried
  to transfer my ... .RData workspace from my office PC to my laptop. The
  office PC runs Windows XP and my laptop runs Windows Vista. I saved the
  workspace on the office PC and copied it to a USB drive. When I tried to
  open it on my laptop I got an error: "Fatal Error: Unable to restore
  saved data in .RData". On both computers I have R version 2.9.0. Could
  anybody give me an explanation why this happens and how I can solve this?

  Thanks a lot!
  Eleni
 
 




Re: [R] Transfer R workspace on another PC

2010-03-10 Thread Eleni Christodoulou
It worked with the installation of the proper packages!!!
Thanks a lot!

Eleni

On Wed, Mar 10, 2010 at 10:48 AM, Petr PIKAL petr.pi...@precheza.cz wrote:

 Hi


 r-help-boun...@r-project.org wrote on 09.03.2010 22:44:31:

  Hi list!

  I have recently tried to take my office work home, meaning that I tried
  to transfer my ... .RData workspace from my office PC to my laptop. The
  office PC runs Windows XP and my laptop runs Windows Vista. I saved the
  workspace on the office PC and copied it to a USB drive. When I tried to
  open it on my laptop I got an error: "Fatal Error: Unable to restore
  saved data in .RData". On both computers I have R version 2.9.0. Could
  anybody give

 I suppose the error message went on to name some package that is installed
 on your office computer but not on your home one. Try installing the
 necessary packages and then opening the workspace again.

 Regards
 Petr


  me an explanation why this happens and how I can solve this?
 
  Thanks a lot!
  Eleni
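
A minimal sketch of the fix that worked in this thread, assuming the restore failed because the second machine lacked packages that the saved objects depend on. The file name and the `required_pkgs` vector are hypothetical; only base R calls (`save()`, `load()`, `requireNamespace()`) are used.

```r
# Save selected objects to an explicit file rather than relying on the
# default .RData that R tries to restore at startup.
x <- 1:10
fit <- lm(dist ~ speed, data = cars)   # 'cars' ships with base R
ws_file <- file.path(tempdir(), "office_workspace.RData")  # hypothetical name
save(x, fit, file = ws_file)

# Packages the saved objects are assumed to depend on (hypothetical list).
required_pkgs <- c("stats", "MASS")

# On the other machine: check that the packages are available before loading.
missing_pkgs <- required_pkgs[!vapply(required_pkgs, requireNamespace,
                                      logical(1), quietly = TRUE)]
if (length(missing_pkgs) > 0) {
  message("Install first: ", paste(missing_pkgs, collapse = ", "))
}

# Load into a fresh environment so nothing in the session is overwritten.
restored <- new.env()
loaded_names <- load(ws_file, envir = restored)
print(loaded_names)
```

Saving to a named file and loading it explicitly also means a failed restore surfaces as an ordinary error message instead of a fatal error at startup.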
 





[R] Transfer R workspace on another PC

2010-03-09 Thread Eleni Christodoulou
Hi list!

I have recently tried to take my office work home, meaning that I tried to
transfer my ... .RData workspace from my office PC to my laptop. The office
PC runs Windows XP and my laptop runs Windows Vista. I saved the workspace
on the office PC and copied it to a USB drive. When I tried to open it on my
laptop I got an error: "Fatal Error: Unable to restore saved data in
.RData". On both computers I have R version 2.9.0. Could anybody give me an
explanation why this happens and how I can solve this?

Thanks a lot!
Eleni



[R] sum of list elements

2010-03-04 Thread Eleni Christodoulou
Dear list,

I am having some difficulty manipulating list elements. More specifically, I
am performing SVM regression and have a list of lists called pred.svm. The
elements of the inner lists are 3D arrays. Thus I have pred.svm[[i]][[j]],
with 1 <= i <= 5 and 1 <= j <= 20.
I want to sum one specific array element across all j, for a given i.
Mathematically speaking, I want to calculate W as:

  W = pred.svm[[i]][[1]][1,2,5] + pred.svm[[i]][[2]][1,2,5] +
      pred.svm[[i]][[3]][1,2,5] + ... + pred.svm[[i]][[20]][1,2,5]

I have tried to apply the lapply() function, but it seems that its
arguments can only be vector elements of a list. Do I need to convert the
array data to vector data?

Any advice would be very welcome!

Thanks a lot,
Eleni



Re: [R] sum of list elements

2010-03-04 Thread Eleni Christodoulou
Thank you Dimitris!
I have 3D arrays of the same dimensions, so Reduce worked...

Best,
Eleni


On Thu, Mar 4, 2010 at 5:13 PM, Dimitris Rizopoulos 
d.rizopou...@erasmusmc.nl wrote:

 Do these lists contain 3D arrays of the same dimensions? If yes, then you
 could use

 Reduce("+", pred.svm[[i]])[1,2,5]

 Otherwise a for-loop will also be clear and efficient, e.g.,


 W <- pred.svm[[i]][[1]][1,2,5]
 for (j in 2:20) {
   W <- W + pred.svm[[i]][[j]][1,2,5]
 }


 I hope it helps.

 Best,
 Dimitris
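
The two suggestions above can be compared on made-up data. This is only a sketch: the `pred.svm` built here is a hypothetical stand-in (random arrays of dimension 2 x 3 x 5), and the `sapply()` variant is an extra alternative that was not part of the reply.

```r
# Hypothetical stand-in for pred.svm: 5 outer elements, each a list of
# 20 three-dimensional arrays with identical dimensions.
set.seed(1)
pred.svm <- lapply(1:5, function(i) {
  lapply(1:20, function(j) array(rnorm(2 * 3 * 5), dim = c(2, 3, 5)))
})
i <- 1

# Element-wise sum of all 20 arrays, then pick one cell.
W_reduce <- Reduce(`+`, pred.svm[[i]])[1, 2, 5]

# Extract the cell from each array first, then sum the 20 numbers.
W_sapply <- sum(sapply(pred.svm[[i]], function(a) a[1, 2, 5]))

# Explicit loop over j.
W_loop <- 0
for (j in seq_along(pred.svm[[i]])) {
  W_loop <- W_loop + pred.svm[[i]][[j]][1, 2, 5]
}

print(c(W_reduce, W_sapply, W_loop))
```

`Reduce()` adds whole arrays and therefore needs matching dimensions. The `sapply()` and loop versions only need the cell `[1, 2, 5]` to exist in every array.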



 On 3/4/2010 4:02 PM, Eleni Christodoulou wrote:

 Dear list,

 I am having some difficulty manipulating list elements. More specifically, I
 am performing SVM regression and have a list of lists called pred.svm. The
 elements of the inner lists are 3D arrays. Thus I have pred.svm[[i]][[j]],
 with 1 <= i <= 5 and 1 <= j <= 20.
 I want to sum one specific array element across all j, for a given i.
 Mathematically speaking, I want to calculate W as:

   W = pred.svm[[i]][[1]][1,2,5] + pred.svm[[i]][[2]][1,2,5] +
       pred.svm[[i]][[3]][1,2,5] + ... + pred.svm[[i]][[20]][1,2,5]

 I have tried to apply the lapply() function, but it seems that its
 arguments can only be vector elements of a list. Do I need to convert the
 array data to vector data?

 Any advice would be very welcome!

 Thanks a lot,
 Eleni



 --
 Dimitris Rizopoulos
 Assistant Professor
 Department of Biostatistics
 Erasmus University Medical Center

 Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
 Tel: +31/(0)10/7043478
 Fax: +31/(0)10/7043014




Re: [R] Ridge regression

2010-01-08 Thread Eleni Christodoulou
Hello again and Happy 2010!
I was looking back at this email because I need to do some additional
processing now. If I take coef(ans) I get n+1 coefficients. I guess that
coef(ans)[1] is the constant term. Do I need to add it when I calculate the
estimated value for the outcome?
For example, let's say that I have divided my data into training data and
test data and I have the corresponding observed try_values and tey_values
(the real values for the samples that belong to the training set and the
test set respectively).
Here is my code:

library(MASS)
# lambda is passed by name; the second positional argument of lm.ridge is data
ridge.test <- lm.ridge(tey_values ~ tedata, lambda = lambda)
est <- list()
yest <- numeric()
for (i in seq_along(tey_values)) {
  est[[i]] <- coef(ridge.test)[-1] * tedata[i, ]
  yest[i] <- sum(est[[i]]) + coef(ridge.test)[1]
}


On Wed, Dec 2, 2009 at 8:22 PM, Ravi Varadhan rvarad...@jhmi.edu wrote:

 The help page clearly states that ans$coef is not on the original scale and
 is for use by the coef method. You can also see that ans$scales gives you
 the scales used in the computation of ans$coef.

 So, to get coefficients on the original scale, you can either use coef(ans)
 or you can divide ans$coef by ans$scales.

 library(MASS)
 X1 <- runif(20)
 X2 <- runif(20)
 Y <- 2 * X1 - 2 * X2 + rnorm(20, sd=0.1)

 lam <- 10
 ans1 <- lm.ridge(Y ~ X1 + X2, lambda = lam)

 all.equal(ans1$coef / ans1$scales, coef(ans1)[2:3] )

 Hope this helps,
 Ravi.


 
 ---
 Ravi Varadhan, Ph.D.
 Assistant Professor, The Center on Aging and Health
 Division of Geriatric Medicine and Gerontology
 Johns Hopkins University
 Ph: (410) 502-2619
 Fax: (410) 614-9625
 Email: rvarad...@jhmi.edu
 Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html




 
 


 -----Original Message-----
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Ravi Varadhan
 Sent: Wednesday, December 02, 2009 12:25 PM
 To: 'David Winsemius'; 'Eleni Christodoulou'
 Cc: r-help@r-project.org
 Subject: Re: [R] Ridge regression

 You are right that ans$coef and coef(ans) are different in ridge
 regression, where `ans' is the object from lm.ridge. It is coef(ans)
 that yields the coefficients on the original scale. ans$coef holds the
 coefficients for the X-scaled and Y-centered version of the problem.

 Here is an example that illustrates the workings of ridge regression.

 First let us create some data:

 require(MASS)

 X1 <- runif(20)
 X2 <- runif(20)
 Y <- 2 * X1 - 2 * X2 + rnorm(20, sd=0.1)

 lam <- 10
 ans1 <- lm.ridge(Y ~ X1 + X2, lambda = lam)
 ans1$coef
 coef(ans1)
 # Note that these two are different

 # Now let us scale the variables X1 and X2 and center Y
 #
 cY <- scale(Y, scale=FALSE)
 n <- length(Y)
 sX1 <- scale(X1) * sqrt(n/(n-1))
 sX2 <- scale(X2) * sqrt(n/(n-1))

 lam <- 10
 ans2 <- lm.ridge(cY ~ sX1 + sX2, lambda = lam)

 ans2$coef
 coef(ans2)
 # Now, see that the coefficients of sX1 and sX2 are the same
 # This is the connection!

 # Armed with this insight, we now compare ans1$coef with the scaled
 # coefficients
 #
 ans1$coef
 c(coef(ans1)[2] * sd(X1), coef(ans1)[3] * sd(X2)) * sqrt((n-1)/n)

 # Now they are the same!

 I hope this is clear.

 Best,
 Ravi.
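
The relationship shown in the example above can be written out as follows. This is a sketch based on the scaling used in that example, where the factor sqrt((n-1)/n) converts the sample standard deviation into its divisor-n version. The symbols are notation introduced here, not names from the package: \tilde\beta_j stands for ans$coef, s_j for ans$scales, and \hat\beta for coef(ans).

```latex
s_j = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(x_{ij}-\bar{x}_j\right)^2}
    = \mathrm{sd}(X_j)\,\sqrt{\frac{n-1}{n}},
\qquad
\hat\beta_j = \frac{\tilde\beta_j}{s_j},
\qquad
\hat\beta_0 = \bar{y} - \sum_{j}\hat\beta_j\,\bar{x}_j
```

Read from left to right: the scales are divisor-n standard deviations, the original-scale slopes are the scaled coefficients divided by those scales, and the intercept is what remains after centering.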


 




 
 

 -----Original Message-----
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius
 Sent: Wednesday, December 02, 2009 11:04 AM
 To: Eleni Christodoulou
 Cc: r-help@r-project.org
 Subject: Re: [R] Ridge regression


 On Dec 2, 2009, at 10:42 AM, Eleni Christodoulou wrote:

  Dear list,

  I have a couple of questions concerning ridge regression. I am using the
  lm.ridge(...) function in order to fit a model to my microarray data.
  Thus model=lm.ridge(...)
  I retrieve some coefficients and some scales for each gene. First of all,
  I would like to ask: the real coefficients of the model are not included
  in the first element of the output but in the result of coef(model), am
  I right?

 Not exactly. coef(model) extracts the coefficients from the model, but in
 the example I created following the help page, the coefficients do happen
 to be in the first element of the model.

 eg:
   long.rr

Re: [R] Ridge regression

2010-01-08 Thread Eleni Christodoulou
I am sorry, I pressed the send button by accident before completing my
e-mail. The yest are the estimated values according to the ridge model.
Is the way that I calculate them correct? Or should I drop the
+coef(ridge.test)[1] term?

Thanks a lot!
Eleni


Re: [R] Ridge regression

2010-01-08 Thread Eleni Christodoulou
Thanks a lot!
Eleni

On Fri, Jan 8, 2010 at 6:35 PM, Ravi Varadhan rvarad...@jhmi.edu wrote:

 Yes, you need to include the intercept term when you compute the
 model-based predicted response.

 This is what you need:

 # lambda must be passed by name; the second positional argument is data
 ridge.test <- lm.ridge(tey_values ~ tedata, lambda = lambda)

 yest <- drop(cbind(1, tedata) %*% coef(ridge.test))

 Hope this helps,

 Ravi.
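
To check that the row-by-row calculation from the question and the matrix expression in the reply give the same predictions, here is a small sketch on simulated data. The data, the seed, and the value lambda = 10 are assumptions made for illustration; `lm.ridge()` and `coef()` are used as in the thread.

```r
library(MASS)  # provides lm.ridge

set.seed(42)
# Hypothetical data standing in for tedata / tey_values.
tedata <- matrix(runif(40), nrow = 20, ncol = 2)
tey_values <- 2 * tedata[, 1] - 2 * tedata[, 2] + rnorm(20, sd = 0.1)

# lambda is passed by name; the second positional argument is `data`.
ridge.test <- lm.ridge(tey_values ~ tedata, lambda = 10)
b <- coef(ridge.test)   # intercept first, then one slope per column

# Row-by-row version: slopes times predictors, plus the intercept.
yest_loop <- numeric(nrow(tedata))
for (i in seq_len(nrow(tedata))) {
  yest_loop[i] <- sum(b[-1] * tedata[i, ]) + b[1]
}

# Matrix version: a column of ones picks up the intercept.
yest_mat <- drop(cbind(1, tedata) %*% b)

print(all.equal(yest_loop, yest_mat, check.attributes = FALSE))
```

Both versions include the intercept. Dropping `b[1]` would shift every prediction by the same constant.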



[R] lasso regression coefficients

2009-12-16 Thread Eleni Christodoulou
Dear list,

I have been trying to apply a simple lasso regression to a 10-element
vector, just to see how this method works so that I can later apply it to
larger datasets. I thus create an input vector x:
x <- rnorm(10)
I add some noise
noise <- runif(n=10, min=-0.1, max=0.1)
and I create a simple linear model which calculates my output vector y
y <- 2*x + 1 + noise

I then do
library(lars)
my_data <- data.matrix(x)
model <- lars(my_data, y, type = 'lasso')

I then calculate the coefficients (type=coefficients) based on the created
model
preds <- predict.lars(model, type = "coefficients")
est <- numeric(10)
for (i in 1:10) {
  est[i] <- preds$coef[2] * x[i]
}

y.estimated <- est + 1 + noise

Then, I apply the same function, predict.lars, but this time with
type=fit.
preds2 <- predict.lars(model, my_data)

When I compare y.estimated to preds2$fit[,2] I see that they are not
equal...
I provide you with the returned results:

*y.estimated:*
[2.855597  1.259374  1.673388  1.625999  0.337993 -1.672998
-1.055416 2.423278  4.092116 -1.595545]

*preds2$fit[,2]:*
[2.9120115  1.1790466  1.7452670  1.7239429  0.2893512 -1.6682459
-1.1500982  2.4364527  4.1511509 -1.6098748]

I think they should be equal...Does anyone have an explanation about that?

Thanks a lot for your time!
Eleni C.



Re: [R] SVM regression

2009-12-12 Thread Eleni Christodoulou
Thank you very much!

Eleni


On Fri, Dec 11, 2009 at 7:19 PM, Steve Lianoglou 
mailinglist.honey...@gmail.com wrote:

 Hi Eleni,

 On Dec 11, 2009, at 12:04 PM, Eleni Christodoulou wrote:

  Dear R users,

  I am trying to apply SVM regression to a set of microarray data. I am
  using the function svm() from the package {e1071}. Can anyone tell me
  what the residuals value represents? I have some observed values y_obs
  for the parameter that I want to estimate and I would expect that
  svm$residuals = y_obs - svm$fitted.
  However, this does not happen... Does anyone have any idea on that?

 This actually is what's happening. The $residuals that are reported in the
 model are against your *scaled* y-vector.

 So, with your data:

 R> m <- svm(x,y)
 R> all(scale(y) - predict(m,x) == m$residuals)
 [1] TRUE

 -steve

 --
 Steve Lianoglou
 Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
 Contact Info: http://cbio.mskcc.org/~lianos/contact





[R] SVM regression

2009-12-11 Thread Eleni Christodoulou
Dear R users,

I am trying to apply SVM regression to a set of microarray data. I am using
the function svm() from the package {e1071}. Can anyone tell me what the
residuals value represents? I have some observed values y_obs for the
parameter that I want to estimate and I would expect that
svm$residuals = y_obs - svm$fitted.
However, this does not happen... Does anyone have any idea on that?

Thanks a lot!
Eleni C.



[R] Ridge regression

2009-12-02 Thread Eleni Christodoulou
Dear list,

I have a couple of questions concerning ridge regression. I am using the
lm.ridge(...) function in order to fit a model to my microarray data.
Thus model=lm.ridge(...)
I retrieve some coefficients and some scales for each gene. First of all, I
would like to ask: the real coefficients of the model are not included in
the first element of the output but in the result of coef(model), am I
right? Moreover, what does the scales component represent? What is its
connection with the coefficients? The R help file is not very informative
for me...

Thank you very much in advance,
Eleni Christodoulou



[R] help with linear model

2009-10-26 Thread Eleni Christodoulou
Dear list,

I have been trying for a week to fit a simple linear model to my data. I
have looked through previous posts but I haven't found anything relevant to
my problem. I guess it is something simple... I just cannot see it.
I have the following data frame, named data, which is a subset of a
microarray experiment. The columns are the samples and the rows are the
probes. I added a first row, called norm, which represents the
estimated output. I want to create a linear model which shows the
relationship between the gene expressions (rows) and the output (norm).

data
            GSM276723.CEL GSM276724.CEL GSM276725.CEL GSM276726.CEL
norm             0.897000          0.59      0.683000      0.949000
206427_s_at      5.387205      6.036506      8.824783     10.864122
205338_s_at      6.454779     13.143095      6.123212     12.726562
209848_s_at      6.703062      7.783330     12.175654      9.339651
205694_at        5.894131      5.794516     12.876555     11.534664
201909_at       12.616538     12.913255     12.275182     12.767743
208894_at       13.049286      9.317874     12.873516     13.527182
216512_s_at      6.324789     12.783791      6.216932     12.013404
205337_at        6.175940     12.158796      6.117519     12.041078
201850_at        6.633013      6.465900      6.535434      7.749985
210982_s_at     12.444791      8.597388     12.197696     12.963449
            GSM276727.CEL GSM276728.CEL GSM276729.CEL GSM276731.CEL
norm             0.302000      0.597000          0.27          0.53
206427_s_at      5.690357      8.014055     13.034753      5.493977
205338_s_at      5.757048      7.706341     13.258410      5.562588
209848_s_at      6.461028      7.036515     13.633649      5.874098
205694_at        5.519552      5.297107      6.498811      5.146150
201909_at       12.814454     11.592632      6.594229      6.650796
208894_at       13.835359     13.028096      5.839909      6.045578
216512_s_at      6.033096      7.273650     12.669054      5.946932
205337_at        5.879028      7.381713     12.633829      5.379559
201850_at        9.684397      6.560014      8.523229      6.573052
210982_s_at     13.342729     12.470517      5.903681      5.658115
            GSM276732.CEL GSM276735.CEL GSM276736.CEL GSM276737.CEL
norm              0.43400      0.647000      0.113000          1.00
206427_s_at      12.80257      5.645002      6.519554     13.572480
205338_s_at      13.38057      5.804107     11.090690     14.024922
209848_s_at      13.27718      6.490851      9.784199     14.101162
205694_at        11.37717      5.802105      7.944963     14.060492
201909_at        13.24126     12.263899     12.578315      6.443491
208894_at        12.29916      7.563361      9.971493      7.094214
216512_s_at      13.00303      5.905789     10.512761     13.647573
205337_at        12.63560      5.430138     10.707242     13.020312
201850_at        12.71874      6.275480      6.987962     12.354580
210982_s_at      11.53559      7.225199      9.322706      6.617615
            GSM276738.CEL GSM276739.CEL GSM276740.CEL GSM276742.CEL
norm              0.35700      0.967000      0.823000          1.00
206427_s_at      13.33764     13.607918     13.190551     12.387189
205338_s_at      13.65492     12.812950     12.237476     12.912605
209848_s_at      13.48525     13.435389     13.851347     12.540495
205694_at         7.70928     10.045331     13.391456     11.103841
201909_at        12.47093     11.937344      6.631023      7.160071
208894_at        12.20508      8.892181      6.478889      5.927860
216512_s_at      13.42313     12.151691     11.620552     12.341763
205337_at        12.67544     12.036528     11.641203     12.275845
201850_at        11.85481     13.172666     12.964316     12.156142
210982_s_at      11.49940      8.380404      6.121762      5.921634
            GSM276743.CEL GSM276744.CEL GSM276745.CEL GSM276747.CEL
norm             0.899000      0.927000      0.754000      0.437000
206427_s_at     12.665097     12.604673     11.446630     13.000295
205338_s_at     13.261141     12.448096     13.185698     12.510952
209848_s_at     13.396711     13.882529     13.040600     12.984137
205694_at       10.888474      7.094063      8.630120     12.321685
201909_at       12.100560      6.666787     12.330600      6.572282
208894_at        7.741437      8.348155     10.106442      6.009902
216512_s_at     12.830373     11.504074     12.300163     11.525958
205337_at       12.264569     11.676281     11.940917     11.618351
201850_at       11.055564     12.202366      7.327056     12.853055
210982_s_at      7.285289      8.129298      9.577032      5.924993
            GSM276748.CEL GSM276752.CEL GSM276754.CEL GSM276756.CEL
norm             0.321000          0.62      0.155000      0.946000
206427_s_at      9.081283     11.446978      8.191261     13.192507
205338_s_at     13.737773     13.698520     12.983830     10.948681
209848_s_at     13.234025     12.956672     10.644642

Re: [R] help with linear model

2009-10-26 Thread Eleni Christodoulou
Thank you all for your replies. I had tried transposing my data before,
but I did not mention it because I was getting the same error. In the
present case, though, it worked because I used
lm1=lm(norm~.,data=t(data))
instead of
lm1=lm(fm1, data=t(data))
where fm1=norm~cols...
I actually didn't know that there is such a difference between norm~cols
and norm~.
I wonder why...

Thank you all again!
Best,
Eleni
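
One way to see what `norm ~ .` does is to fit it next to a formula with the predictors written out. This is a sketch on made-up data, not the data from the thread: the matrix `m`, the probe names `probe_a` and `probe_b`, and the sample names are hypothetical, and the transposed matrix is converted with `as.data.frame()` so that each sample is one row.

```r
# Hypothetical probes-by-samples matrix: the first row is the outcome 'norm',
# the remaining rows are probes; columns are samples.
set.seed(7)
n_samples <- 12
m <- rbind(
  norm    = runif(n_samples),
  probe_a = rnorm(n_samples, mean = 10),
  probe_b = rnorm(n_samples, mean = 8)
)
colnames(m) <- paste0("sample", seq_len(n_samples))

# lm() wants one row per observation, so transpose and convert to a data frame.
d <- as.data.frame(t(m))

# '.' expands to every column of d other than the response.
fit_dot <- lm(norm ~ ., data = d)

# The same model with the predictors spelled out.
fit_explicit <- lm(norm ~ probe_a + probe_b, data = d)

print(coef(fit_dot))
```

With `.` the predictor names are taken from the columns of the data frame, so the formula never has to refer to objects that exist only in the workspace.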

On Mon, Oct 26, 2009 at 12:24 PM, Petr PIKAL petr.pi...@precheza.cz wrote:

 Hi


 r-help-boun...@r-project.org napsal dne 26.10.2009 10:48:51:

  Dear list,
 
  I have been searching for a week to fit a simple linear model to my
 data. I
  have looked into the previous posts but I haven't found anything
 relevant to
  my problem. I guess it is something simple...I just cannot see it.
  I have the following data frame, named data, which is a subset of a
  microarray experiment. The columns are the samples and the rows are the
  probes. I binded the first line, called norm, which represents the
  estimated output. I want to create a linear model which shows the
  relationship between the gene expressions (rows) and the output (norm).
 
  data
              GSM276723.CEL GSM276724.CEL GSM276725.CEL GSM276726.CEL
  norm             0.897000          0.59      0.683000      0.949000
  206427_s_at      5.387205      6.036506      8.824783     10.864122
  205338_s_at      6.454779     13.143095      6.123212     12.726562
  209848_s_at      6.703062      7.783330     12.175654      9.339651
  205694_at        5.894131      5.794516     12.876555     11.534664
  201909_at       12.616538     12.913255     12.275182     12.767743
  208894_at       13.049286      9.317874     12.873516     13.527182
  216512_s_at      6.324789     12.783791      6.216932     12.013404
  205337_at        6.175940     12.158796      6.117519     12.041078
  201850_at        6.633013      6.465900      6.535434      7.749985
  210982_s_at     12.444791      8.597388     12.197696     12.963449
              GSM276727.CEL GSM276728.CEL GSM276729.CEL GSM276731.CEL
  norm             0.302000      0.597000          0.27          0.53
  206427_s_at      5.690357      8.014055     13.034753      5.493977
  205338_s_at      5.757048      7.706341     13.258410      5.562588
  209848_s_at      6.461028      7.036515     13.633649      5.874098
  205694_at        5.519552      5.297107      6.498811      5.146150
  201909_at       12.814454     11.592632      6.594229      6.650796
  208894_at       13.835359     13.028096      5.839909      6.045578
  216512_s_at      6.033096      7.273650     12.669054      5.946932
  205337_at        5.879028      7.381713     12.633829      5.379559
  201850_at        9.684397      6.560014      8.523229      6.573052
  210982_s_at     13.342729     12.470517      5.903681      5.658115
              GSM276732.CEL GSM276735.CEL GSM276736.CEL GSM276737.CEL
  norm              0.43400      0.647000      0.113000          1.00
  206427_s_at      12.80257      5.645002      6.519554     13.572480
  205338_s_at      13.38057      5.804107     11.090690     14.024922
  209848_s_at      13.27718      6.490851      9.784199     14.101162
  205694_at        11.37717      5.802105      7.944963     14.060492
  201909_at        13.24126     12.263899     12.578315      6.443491
  208894_at        12.29916      7.563361      9.971493      7.094214
  216512_s_at      13.00303      5.905789     10.512761     13.647573
  205337_at        12.63560      5.430138     10.707242     13.020312
  201850_at        12.71874      6.275480      6.987962     12.354580
  210982_s_at      11.53559      7.225199      9.322706      6.617615
              GSM276738.CEL GSM276739.CEL GSM276740.CEL GSM276742.CEL
  norm              0.35700      0.967000      0.823000          1.00
  206427_s_at      13.33764     13.607918     13.190551     12.387189
  205338_s_at      13.65492     12.812950     12.237476     12.912605
  209848_s_at      13.48525     13.435389     13.851347     12.540495
  205694_at         7.70928     10.045331     13.391456     11.103841
  201909_at        12.47093     11.937344      6.631023      7.160071
  208894_at        12.20508      8.892181      6.478889      5.927860
  216512_s_at      13.42313     12.151691     11.620552     12.341763
  205337_at        12.67544     12.036528     11.641203     12.275845
  201850_at        11.85481     13.172666     12.964316     12.156142
  210982_s_at      11.49940      8.380404      6.121762      5.921634
              GSM276743.CEL GSM276744.CEL GSM276745.CEL GSM276747.CEL
  norm             0.899000      0.927000      0.754000      0.437000
  206427_s_at     12.665097     12.604673     11.446630     13.000295
  205338_s_at     13.261141     12.448096     13.185698     12.510952
  209848_s_at     13.396711     13.882529     13.040600     12.984137
  205694_at       10.888474      7.094063      8.630120     12.321685
  201909_at   12.100560  

Re: [R] samr result

2008-06-11 Thread Eleni Christodoulou
Yes, here it is:
samr.obj <- samr(data, resp.type="Two class unpaired", nperms=100,
                 center.arrays=TRUE)

where data is a matrix of microarray gene expressions with genes as rows
and tissues as columns. Setting center.arrays=TRUE normalizes the data
matrix so that each column has median 0. I would like to retrieve the
new normalized matrix, but it seems that it is not returned by samr. If
you have any idea how I can obtain this transformed matrix, I would be
glad to hear it!
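
If the centering is simply "subtract each column's median", as described above, it can be reproduced outside samr. The following is a sketch under that assumption only; it does not use samr and may differ from what samr does internally. The matrix x and its dimnames are made up.

```r
# Toy expression matrix: genes as rows, arrays (samples) as columns.
set.seed(1)
x <- matrix(rnorm(5 * 4, mean = 8), nrow = 5,
            dimnames = list(paste0("gene", 1:5), paste0("array", 1:4)))

# Subtract each column's median from that column.
col_medians <- apply(x, 2, median)
x_centered <- sweep(x, 2, col_medians, FUN = "-")

# The column medians of the centered matrix are now zero.
apply(x_centered, 2, median)
```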

Thanks again,
E.



On Tue, Jun 10, 2008 at 11:30 PM, Richardson, Patrick 
[EMAIL PROTECTED] wrote:

 Could you post your code so we can see what you are trying to do?

 Thanks,

 Patrick

 
 From: [EMAIL PROTECTED] [EMAIL PROTECTED] On
 Behalf Of Eleni Christodoulou [EMAIL PROTECTED]
 Sent: Tuesday, June 10, 2008 11:20 AM
 To: r-help@r-project.org
 Subject: [R] samr result

 Hello list!

 I have a problem trying to perform a SAM analysis using the function samr
 from the samr package. I have set the option center.arrays=TRUE in order
 to scale all the experiments to median=0. I would like to retrieve the
 scaled data, but it seems that samr does not return it... Does anyone have
 any idea on this?

 Thanks a lot!!!
 E.



[R] samr result

2008-06-10 Thread Eleni Christodoulou
Hello list!

I have a problem trying to perform a SAM analysis using the function samr
from the samr package. I have set the option center.arrays=TRUE in order
to scale all the experiments to median=0. I would like to retrieve the
scaled data, but it seems that samr does not return it... Does anyone have
any idea on this?

Thanks a lot!!!
E.



[R] which question

2008-06-06 Thread Eleni Christodoulou
Hello list,

I was trying to select a column of a data frame using the which command. I
was actually selecting the rows of the data frame using which, and then
displaying a certain column of the result. The command that I was using is:
sequence = mydata[which(human[,3] %in% genes.sam.names), 9]
In the above command, mydata is my data frame and 9 is the column that I
want to display. The rest are just other variables that I use. The which
command is supposed to retrieve the rows of interest. The rows are
retrieved correctly; however, if column 9 is NA for a given row, the
respective element of column 10 is displayed instead. How can I fix that?

Thank you very much,
Eleni



Re: [R] which question

2008-06-06 Thread Eleni Christodoulou
An example is:

symbol=human[which(human[,3] %in% genes.sam.names),8]

The data human and genes.sam.names are attached. The result of the above
command is:
 symbol
 [1] CCL18  MARCO  SYT13
 [4] FOXC1  CDH3
 [7] CA12   CELSR1 NM_018440
[10] MICROTUBULE-ASSOCIATED NM_015529  ESR1
[13] PHGDH  GABRP  LGMN
[16] MMP9   BMP7   KLF5
[19] RIPK2  GATA3  NM_032023
[22] TRIM2  CCND1  MMP12
[25] LDHB   AF493978   SOD2
[28] SOD2   SOD2   NME5
[31] STC2   RBP1   ROPN1
[34] RDH10  KRTHB1 SLPI
[37] BBOX1  FOXA1  NM_005669
[40] MCCC2  CHI3L1 GSTM3
[43] LPIN1  DSC2   FADS2
[46] ELF5   CYP1B1 LMO4
[49] AL035297   NM_152398  AB018342
[52] PIK3R1 NFKBIE MLZE
[55] NFIB   NM_052997  NM_006023
[58] CPB1   CXCL13 CBR3
[61] NM_017527  FABP7  DACH
[64] IFI27  ACOX2  CXCL11
[67] UGP2   CLDN4  M12740
[70] IGKC   IGKC   CLECSF12
[73] AY069977   HOXB2  SOX11
[76]NM_017422  TLR2
[79] CKS1B  BC017946   APOBEC3B
[82]HLA-DRB1   HLA-DQB1
[85]CCL13  C4orf7
[88]NM_173552
21345 Levels:  (2 (32 (55.11 (AIB-1) (ALU (CAK1) (CAP4) (CASPASE ... ZYX

As you can see, apart from gene symbols, which are what I want, RefSeq
IDs are also retrieved...
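
One possible cause of values appearing one column to the left is the import step rather than which(); this is an assumption, since the thread does not show how the file was read. If a tab-delimited file with empty fields is read with whitespace splitting, adjacent tabs collapse into one separator and later fields shift left. A small sketch with a made-up three-column table:

```r
# Hypothetical tab-delimited annotation table; the second data row has an
# empty symbol field. IDs and accessions are placeholders.
txt <- "id\tsymbol\trefseq\nH1\tCCL18\tNM_0001\nH2\t\tNM_0002\n"

# Default whitespace splitting treats the two adjacent tabs as one separator,
# so the RefSeq ID slides into the symbol column.
shifted <- read.table(text = txt, header = TRUE, fill = TRUE,
                      stringsAsFactors = FALSE)

# An explicit tab separator keeps the empty field in place.
aligned <- read.table(text = txt, header = TRUE, sep = "\t",
                      stringsAsFactors = FALSE)

shifted$symbol
aligned$symbol
```

If the data were imported this way, re-reading the file with sep = "\t" would be worth trying before changing the indexing code.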

Thanks a lot,
Eleni






On Fri, Jun 6, 2008 at 1:23 PM, Dieter Menne [EMAIL PROTECTED]
wrote:

 Eleni Christodoulou elenichri at gmail.com writes:

  I was trying to select a column of a data frame using the *which*
 command. I
  was actually selecting the rows of the data frame using *which, *and then
  displayed a certain column of it. The command that I was using is:
  sequence=*mydata*[*which*(human[,3] %in% genes.sam.names),*9*]
 
 Please provide a running example. The *mydata* are difficult to read.


 Dieter



[R] oligo ids

2008-05-19 Thread Eleni Christodoulou
Dear list,

I have a set of human oligo ids (H26022 H22025 H34703
H20442 H25719 H300018350) which I want to map to Ensembl or RefSeq.
I am sure R has a function to do that. I downloaded the {oligo} package and
tried to use the probeNames function. Although the factor of oligo ids is an
object (as the argument to probeNames should be), I get the following
error:
 probeNames(significant_genes[,1])
Error in function (classes, fdef, mtable)  :
  unable to find an inherited method for function "probeNames", for
signature "factor"

where significant_genes is the factor with the oligo ids. Could anyone help
me with the format I should use in order to apply probeNames? Or if someone
has any other function in mind which can do the mapping, I would be really
grateful to hear about it.

Thank you all,
Eleni



Re: [R] [BioC] oligo ids

2008-05-19 Thread Eleni Christodoulou
Thanks Sean!

Your reply was very helpful. I have already obtained almost everything I
wanted. I have some NA values, but I will see whether I can resolve them
through the literature or an external tool.

Best Regards,
Eleni

On Mon, May 19, 2008 at 6:07 PM, Sean Davis [EMAIL PROTECTED] wrote:

 On Mon, May 19, 2008 at 10:47 AM, Eleni Christodoulou
 [EMAIL PROTECTED] wrote:
  Dear list,
 
  I am having a set of human oligo ids (H26022 H22025 H34703
  H20442 H25719 H300018350) which I want to map to Ensembl or
 RefSeq.
  I am sure R has a function to do that. I downloaded the {oligo} package
 and
  tried to use the probeNames function. Although the factor of oligo ids is
 an
  object (as the argument to probeNames should be) I retrieve the following
  error:
   probeNames(significant_genes[,1])
  Error in function (classes, fdef, mtable)  :
   unable to find an inherited method for function "probeNames", for
  signature "factor"
 
  where significant_genes is the factor with the oligo ids. Could anyone
 help
  me with the format I should use in order to apply probeNames? Or if
 someone
  has any other function in mind which can do the mapping I would be really
  grateful to hear that.

 Hi, Eleni.  The probeNames() function is not applicable here,
 unfortunately.  You are asking a question related to annotating your
 array.  Therefore, you need an annotation package.  I think the IDs
 that you specified are Qiagen (Operon) IDs, so the place to look is in
 the annotation package associated with the Qiagen arrays:

 Assuming that you are using R 2.7.0 (you are, correct?), then you can do:

 source('http://bioconductor.org/biocLite.R')
 biocLite('hguqiagenv3.db')
 library(hguqiagenv3.db)
 mget(c('H26022','H22025'),hguqiagenv3REFSEQ)

 The last command will return a list of mappings between those two
 oligo ids and RefSeq.  Typing:

 hguqiagenv3()

 will tell you the other annotation sources available for your qiagen
 chip.  Ensembl mappings are available, as are a bunch of other
 mappings.
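
The lookup pattern can be tried without installing the annotation package. The sketch below uses a plain environment as a stand-in for the annotation map; the RefSeq accessions are made-up placeholders, and oligo_to_refseq is a hypothetical name. It also shows one way to get NA for unmapped ids instead of an error.

```r
# Stand-in for an annotation map: a plain environment with made-up entries.
# (The RefSeq accessions below are placeholders, not real annotations.)
oligo_to_refseq <- new.env()
assign("H26022", "NM_000001", envir = oligo_to_refseq)
assign("H22025", c("NM_000002", "NM_000003"), envir = oligo_to_refseq)

ids <- c("H26022", "H22025", "H300018350")

# ifnotfound = NA returns NA for unknown ids instead of raising an error.
hits <- mget(ids, envir = oligo_to_refseq, ifnotfound = NA)

# Flatten to one row per (oligo id, RefSeq) pair.
mapping <- data.frame(oligo  = rep(names(hits), lengths(hits)),
                      refseq = unlist(hits, use.names = FALSE))
mapping
```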

 Let us know if you have more questions.

 Sean




Re: [R] Significance analysis of Microarrays (SAM)

2008-05-07 Thread Eleni Christodoulou
Thanks Martin,

I also posted the question on the Bioconductor list, but I have had no reply
yet. In the meantime I found out that instead of writing
d=list(data.matrix2,y,censored)

I should name the list elements:
d=list(x=data.matrix2,y=y,censoring.status=censored)

Strange, huh? Anyway, it solves the problem.
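
A short sketch of why the names matter, using toy stand-ins for the objects in this thread. It assumes that samr looks up the list components by name, which is inferred from the fix above and from the "is.na() applied to ... type 'NULL'" warning rather than from the samr source.

```r
# Toy stand-ins for the objects in this thread.
data.matrix2 <- matrix(rnorm(15), nrow = 3)
y <- c(101, 118, 9, 106, 37)
censored <- c(0, 0, 1, 0, 1)

d_unnamed <- list(data.matrix2, y, censored)
d_named   <- list(x = data.matrix2, y = y, censoring.status = censored)

# Components of an unnamed list cannot be found by name.
is.null(d_unnamed$censoring.status)   # TRUE
is.null(d_named$censoring.status)     # FALSE
names(d_unnamed)                      # NULL
```

Any function that reads d$censoring.status therefore receives NULL from the unnamed list, which is consistent with the error about the censoring indicator.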

Thanks once again,
Eleni



On Tue, May 6, 2008 at 6:43 PM, Martin Morgan [EMAIL PROTECTED] wrote:

 Hi Eleni --

 Although samr is not a Bioconductor package, you might have more luck
 asking on the Bioconductor mailing list, http://bioconductor.org. The
 obvious place to start, and probably you have already done this, is to
 ensure that the class of the objects passed to the function agree with
 the classes described on the function help page.

 Martin

 Eleni Christodoulou [EMAIL PROTECTED] writes:

  Dear list,
 
  I am trying to perform a significance analysis of a microarray
 experiment
  with survival data using the {samr} package. I have a matrix containing
 my
  data which has 17816 rows corresponding to genes, and 286 columns
  corresponding to samples. The name of this matrix is data.matrix2. Some
 of
  the first values of this matrix are:
  data.matrix2[1:3,1:5]
   GSM36777  GSM36778 GSM36779 GSM36780 GSM36781
  [1,] 1.009274 1.0740659 1.048540 1.015946 1.022650
  [2,] 1.007992 0.8768410 0.962442 1.111742 1.121150
  [3,] 0.981853 0.9606492 1.024987 1.053302 1.063408
 
   I also have the time in which each patient-sample is examined for
 relapse.
  This information is in vector y, which has length 286, and is declared
 in
  months. Indicatively:
  y[1:5]
  [1] 101 118   9 106  37
 
  Finally, I have a variable censored, which is 1 if the patient has
 relapsed
  when examined at the examined time and 0 if not. Indicatively:
  censored[1:5]
  [1] 0 0 1 0 1
 
 
  I am trying to perform the following sam analysis:
  d=list(data.matrix2,y,censored)
  samr.obj=samr(d,resp.type="Survival", nperms=20)
 
  When I am running the above commands I  get the error:
  Error in check.format(y, resp.type = resp.type, censoring.status =
  censoring.status) :
Error in input response data: response type  Survival  specified;
 error in
  censoring indicator
  In addition: Warning message:
  In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
 
 
  I really cannot understand what  is wrong with my code.  Could anyone
 please
  help me with this?
 
  Thank you all,
  Eleni
 

 --
 Martin Morgan
 Computational Biology / Fred Hutchinson Cancer Research Center
 1100 Fairview Ave. N.
 PO Box 19024 Seattle, WA 98109

 Location: Arnold Building M2 B169
 Phone: (206) 667-2793




[R] Significance analysis of Microarrays (SAM)

2008-05-06 Thread Eleni Christodoulou
Dear list,

I am trying to perform a significance analysis of a microarray experiment
with survival data using the {samr} package. I have a matrix containing my
data which has 17816 rows corresponding to genes, and 286 columns
corresponding to samples. The name of this matrix is data.matrix2. Some of
the first values of this matrix are:
data.matrix2[1:3,1:5]
 GSM36777  GSM36778 GSM36779 GSM36780 GSM36781
[1,] 1.009274 1.0740659 1.048540 1.015946 1.022650
[2,] 1.007992 0.8768410 0.962442 1.111742 1.121150
[3,] 0.981853 0.9606492 1.024987 1.053302 1.063408

I also have the time at which each patient was examined for relapse.
This information is in the vector y, which has length 286 and is given in
months. For example:
y[1:5]
[1] 101 118   9 106  37

Finally, I have a variable censored, which is 1 if the patient had relapsed
by the time of examination and 0 if not. For example:
censored[1:5]
[1] 0 0 1 0 1


I am trying to perform the following sam analysis:
d=list(data.matrix2,y,censored)
samr.obj=samr(d,resp.type="Survival", nperms=20)

When I am running the above commands I  get the error:
Error in check.format(y, resp.type = resp.type, censoring.status =
censoring.status) :
  Error in input response data: response type  Survival  specified; error in
censoring indicator
In addition: Warning message:
In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'


I really cannot understand what  is wrong with my code.  Could anyone please
help me with this?

Thank you all,
Eleni



[R] sensitivity analysis

2008-04-16 Thread Eleni Christodoulou
Hello list,

I am performing a sensitivity analysis using the package ROCR. I am using
the prediction class for this purpose. My question is, could anyone tell me
what the cutoffs vector represents in the result?

Thank you all,
Eleni



[R] ROC analysis

2008-03-19 Thread Eleni Christodoulou
Hello list,

I am trying to perform ROC analysis and compute the AUC in order to validate
my results. I use the package ROCR. I would like to compute the AUC not up to
the cutoff found by performance but up to another cutoff that I calculate.
How could I change the following command in order to get what I want?
perform=performance(pred,measure="auc",x.measure="cutoff"), where pred is a
prediction object.

Thank you very much,
Eleni



Re: [R] ROC analysis

2008-03-19 Thread Eleni Christodoulou
Richard, thanks, I think it will work. I will calculate the cutoff value
and then, from the prediction object, find the fpr that corresponds to it and
pass it as an argument to performance. I will keep you informed.

Eleni

On Wed, Mar 19, 2008 at 11:51 AM, Richard Pearson 
[EMAIL PROTECTED] wrote:

 Eleni

 Does the fpr.stop argument do what you want? This is described in
 ?performance under the details of the auc measure. Try, e.g.

 perform=performance(pred,measure="auc",fpr.stop=0.5)
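
 For intuition, the idea of an area "up to a false positive rate" can be
 sketched in base R. This is an illustration only, not ROCR's implementation:
 partial_auc is a hypothetical helper, it assumes 0/1 labels and untied
 scores, and it does not rescale the area.

```r
# Partial area under the ROC curve up to a false positive rate of fpr_stop.
# Assumes labels are 0/1 and that the scores contain no ties.
partial_auc <- function(scores, labels, fpr_stop = 1) {
  ord <- order(scores, decreasing = TRUE)
  labels <- labels[ord]
  tpr <- c(0, cumsum(labels == 1) / sum(labels == 1))
  fpr <- c(0, cumsum(labels == 0) / sum(labels == 0))

  # Keep the ROC points up to fpr_stop and close the curve at fpr_stop.
  keep <- fpr <= fpr_stop
  x <- c(fpr[keep], fpr_stop)
  y <- c(tpr[keep], max(tpr[keep]))

  # Trapezoid rule over the retained points.
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Perfectly separated toy scores.
scores <- c(0.9, 0.8, 0.7, 0.3, 0.2, 0.1)
labels <- c(1, 1, 1, 0, 0, 0)
full_area    <- partial_auc(scores, labels)
partial_area <- partial_auc(scores, labels, fpr_stop = 0.5)
```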


 Richard.


 Eleni Christodoulou wrote:
  Hello list,
 
  I am trying to perform ROC analysis and count the AUC in order to
 validate
  my results. I use package ROCR. I would like to count the AUC not under
 the
  cutoff found by performance but to use another cutoff that I
 calculate.
  How could I change the following  command in order to get what I want?
  perform=performance(pred,measure="auc",x.measure="cutoff"), where pred
 is a
  prediction object.
 
  Thank you very much,
  Eleni
 


[R] t.test p-Value

2008-03-05 Thread Eleni Christodoulou
Hello list,

I am trying to apply the paired t.test between diseased and not diseased
patients to identify genes that are more expressed in one condition
than in the other. In order to retrieve the genes that are more expressed in
the positive disease state I do:
p.values <- c()
for(i in 1:length(Significant[,1])){
p.values[i] <- try(t.test(positive[i,],negative[i,],alternative="greater")$p.value)
}

which(p.values<0.01)


where Significant is my matrix of genes and their expression in tumors, and
positive and negative are subsets of this matrix.
When p<0.01, I reject the null hypothesis and accept the alternative one,
that gene expression is greater in positive than in negative.
I assume I must be doing something wrong, because the heatmap that I get
with the genes that pass the p-value filter is wrong.

Could anyone help me with this?

thanks a lot,
Eleni



Re: [R] t.test p-Value

2008-03-05 Thread Eleni Christodoulou
I am sorry, the test is unpaired...But my question remains

Thanks,
Eleni

On Wed, Mar 5, 2008 at 2:33 PM, Eleni Christodoulou [EMAIL PROTECTED]
wrote:

 Hello list,

 I am trying to apply the paired t.test between diseased and not diseased
 patients to identify genes that are more expressed in one condition
 than in the other. In order to retrieve the genes that are more expressed in
 the positive disease state I do:
 p.values <- c()
 for(i in 1:length(Significant[,1])){
 p.values[i] <- try(t.test(positive[i,],negative[i,],alternative="greater")$p.value)
 }

 which(p.values<0.01)


 where Significant is my matrix of genes and their expression in tumors,
 and positive and negative are subsets of this matrix.
 When p<0.01, I reject the null hypothesis and accept the alternative one,
 that gene expression is greater in positive than in negative.
 I assume I must be doing something wrong, because the heatmap that I get
 with the genes that pass the p-value filter is wrong.

 Could anyone help me with this?

 thanks a lot,
 Eleni





Re: [R] t.test p-Value

2008-03-05 Thread Eleni Christodoulou
On Wed, Mar 5, 2008 at 2:05 PM, ian white [EMAIL PROTECTED] wrote:

 Don't you need to make some allowance for multiple testing? E.g. to get
 a experiment-wise significance level of 0.01 you need

 which(p.values < very small number)

 where the very small number is approximately 0.01/(total number of
 genes).
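
 The correction described above (dividing the threshold by the number of
 genes) is the Bonferroni adjustment, which base R also offers through
 p.adjust(). A sketch on simulated data follows; the matrices positive and
 negative here are random placeholders with the same layout as in the thread
 (genes as rows), and the sizes are arbitrary.

```r
# Simulated expression data: genes as rows, samples as columns.
set.seed(42)
n_genes <- 200
positive <- matrix(rnorm(n_genes * 10), nrow = n_genes)
negative <- matrix(rnorm(n_genes * 12), nrow = n_genes)
positive[1:5, ] <- positive[1:5, ] + 3   # five genes truly higher in "positive"

# One-sided, unpaired t-test per gene.
p.values <- vapply(seq_len(n_genes), function(i) {
  t.test(positive[i, ], negative[i, ], alternative = "greater")$p.value
}, numeric(1))

# Bonferroni by hand: compare against 0.01 / (number of genes) ...
bonf <- which(p.values < 0.01 / n_genes)
# ... or let p.adjust() scale the p-values and keep the usual threshold.
adj <- which(p.adjust(p.values, method = "bonferroni") < 0.01)
```

 Other methods accepted by p.adjust(), such as "BH", are less conservative
 and may be worth considering when thousands of genes are tested.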

 On Wed, 2008-03-05 at 14:38 +0200, Eleni Christodoulou wrote:
  I am sorry, the test is unpaired...But my question remains
 
  Thanks,
  Eleni
 
  On Wed, Mar 5, 2008 at 2:33 PM, Eleni Christodoulou [EMAIL PROTECTED]
 
  wrote:
 
   Hello list,
  
   I am trying to apply the paired t.test between diseased and not
 diseased
   patients to identify genes that are more expressed in the one
 situation
   under the other. In order to retrieve the genes that are more
 expressed in
   the positive disease state I do:
   p.values <- c()
   for(i in 1:length(Significant[,1])){
   p.values[i] <- try(t.test(positive[i,],negative[i,],alternative="greater")$p.value)
   }
  
   which(p.values<0.01)
  
  
   where Significant is my matrix of genes and their expression in tumors,
   and positive and negative are subsets of this matrix.
   When p<0.01, I reject the null hypothesis and accept the alternative one,
   that gene expression is greater in positive than in negative.
   I assume I must be doing something wrong, because the heatmap that I get
   with the genes that pass the p-value filter is wrong.
  
   Could anyone help me with this?
  
   thanks a lot,
   Eleni
  
  
 


[R] Cox model+ROCR

2008-03-03 Thread Eleni Christodoulou
Dear list,

I am trying to build a Cox model and then perform ROC analysis in order to
retrieve some genes that are correlated with breast cancer. When I calculate
the hazard score taking into account different numbers of genes and their
coefficients (I am trying to find the number of genes that predicts best), I
get values from around 1 (when few genes are included) up to the order of
e+80 (when many genes are included).
I am using the prediction function from the ROCR package, which takes as
arguments the calculated scores and the true class labels. I really don't
know what to compare my values with, because the only data that I have
available are the time to relapse or last follow-up (months) and the relapse
status (1=TRUE, 0=FALSE) of the patients. I have never performed ROC analysis
before and I am a bit lost...
Any help with this is very welcome!

Thank you all,
Eleni



[R] Van't Veer paper on breast cancer

2008-02-19 Thread Eleni Christodoulou
Hello all,

I am working at the FORTH institute in Crete, and for a long time now I
have been trying to reproduce the results of the paper
"Gene expression profiling predicts clinical outcome of breast cancer", by
Van't Veer et al. It was published in Nature, vol. 415,
31 January 2002.
http://www.nature.com/nature/journal/v415/n6871/full/415530a.html
I am facing some difficulties in building the classifier and I was wondering
if someone else has worked on it and could give me some help.

Thank you all,
Eleni



[R] Kaplan Meier function

2008-02-14 Thread Eleni Christodoulou
Hi all,

I am trying to draw a Kaplan-Meier curve, and I found online that
Kaplan-Meier estimates are computed with a function called km in the event
package. Is there an update for that? When I choose to download packages in
R, there is no package called event, even though I have selected all the
repositories.

Thanks in advance,
Eleni



Re: [R] Kaplan Meier function

2008-02-14 Thread Eleni Christodoulou
Thank you all for the replies!

Eleni

On Thu, Feb 14, 2008 at 4:10 PM, Dimitris Rizopoulos 
[EMAIL PROTECTED] wrote:

 check function survfit() in package survival.
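
 A minimal sketch of that suggestion, with made-up follow-up data (the
 vectors time and event are hypothetical; event is 1 for relapse and 0 for a
 censored observation):

```r
library(survival)

# Hypothetical follow-up data: time in months, event = 1 for relapse,
# 0 for censored.
time  <- c(5, 8, 12, 12, 20, 25)
event <- c(1, 0, 1, 1, 0, 1)

# Kaplan-Meier estimate for a single group.
fit <- survfit(Surv(time, event) ~ 1)
summary(fit)

# Step curve with the default confidence band.
plot(fit, xlab = "Months", ylab = "Relapse-free proportion")
```

 Replacing ~ 1 with a grouping variable on the right-hand side gives one
 curve per group.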

 Best,
 Dimitris

 
 Dimitris Rizopoulos
 Ph.D. Student
 Biostatistical Centre
 School of Public Health
 Catholic University of Leuven

 Address: Kapucijnenvoer 35, Leuven, Belgium
 Tel: +32/(0)16/336899
 Fax: +32/(0)16/337015
 Web: http://med.kuleuven.be/biostat/
 
 http://www.student.kuleuven.be/~m0390867/dimitris.htm


 - Original Message -
 From: Eleni Christodoulou [EMAIL PROTECTED]
 To: r-help@r-project.org
 Sent: Thursday, February 14, 2008 2:50 PM
 Subject: [R] Kaplan Meier function


  Hi all,
 
  I am trying to draw a Kaplan-Meier curve and I found online that
  Kaplan -
  Meier estimates are computed with a function called km in the event
  package.
  Is there an update for that because when I choose to download
  packages in
  R,. there is no package called event, even though I have selected
  all the
  repositories.
 
  Thanks in advance,
  Eleni
 


Re: [R] Cox model

2008-02-13 Thread Eleni Christodoulou
Hmm... I see. I think I will try the univariate analysis nonetheless.
I intend to collect the p-value for each gene and select the most
significant ones; I have seen this done in several papers.
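
A sketch of that plan on simulated data, assuming one Cox model per gene and the Wald p-value from summary(). The sizes and values are placeholders (50 genes instead of about 18000), and the adjustment step is an addition in view of the caution about testing many genes.

```r
library(survival)

# Simulated stand-ins: 80 patients, 50 genes (columns of `genes`).
set.seed(1)
n_patients <- 80
n_genes <- 50
time <- rexp(n_patients, rate = 0.05)
relapse <- rbinom(n_patients, 1, 0.6)
genes <- matrix(rnorm(n_patients * n_genes), nrow = n_patients)

# One univariate Cox model per gene; keep the Wald p-value of the coefficient.
p_values <- vapply(seq_len(n_genes), function(i) {
  fit <- coxph(Surv(time, relapse) ~ genes[, i])
  summary(fit)$coefficients[1, "Pr(>|z|)"]
}, numeric(1))

# Adjust for the number of genes tested before ranking.
p_adjusted <- p.adjust(p_values, method = "BH")
head(order(p_adjusted))
```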

Best Regards,
Eleni

On Feb 13, 2008 2:59 PM, Terry Therneau [EMAIL PROTECTED] wrote:

  What you appear to want are all of the univariate models.  You can get
 this
 with a loop (and patience - it won't be fast).

 ngene <- ncol(genes)
 coefmat <- matrix(0., nrow=ngene, ncol=2)
 for (i in 1:ngene) {
    tempfit <- coxph(Surv(time, relapse) ~ genes[,i])
    coefmat[i,] <- c(tempfit$coef, sqrt(tempfit$var))
    }


  However, the fact that R can do this for you does not mean it is a good
 idea.
 In fact, doing all of the univariate tests for a microarray has been shown
 by
 many people to be a very bad idea.  There are several approaches to deal
 with
 the key issues, which you should research before going forward.

  Terry Therneau





Re: [R] dimnames

2008-02-13 Thread Eleni Christodoulou
What if you just removed the first column from your matrix:
XX <- XX[,2:length(XX[1,])]

so that you have a new matrix without the first column, and then save this
second one to a file?
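
Another option, in case the unwanted leading column (1, 2, 3, ...) is the row names that write.table adds rather than a column of the data: the row.names argument of write.table switches them off. A sketch with a made-up two-row version of the table and a temporary file:

```r
# Made-up excerpt of the table from the original question.
XX <- data.frame(V1 = c("YAL005C", "YAL007C"),
                 V2 = c(21, 2), V3 = c(14, 1), V4 = c(11, 4))

out <- tempfile(fileext = ".txt")

# row.names = FALSE leaves out the leading 1, 2, 3, ... column.
write.table(XX, out, quote = FALSE, sep = "\t", row.names = FALSE)

readLines(out)
```

Adding col.names = FALSE as well would also drop the V1 ... V4 header line.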

Regards,
Eleni

On Feb 13, 2008 3:06 PM, Roberto Olivares Hernandez [EMAIL PROTECTED] wrote:

 Hi,

 I used the write.table function to save data in txt file, and this is the
 output:


 V1  V2  V3  V4
 1   YAL005C  21  14  11
 2   YAL007C   2   1   4
 3   YAL012W   8  16   3
 4   YAL016W  24  23  23
 5   YAL019W   3   3   2
 6   YAL020C   2   4   2
 7   YAL021C   7   5   5
 8   YAL022C   3   1   2


 but I  need to remove the dimnames  (first column)

 I tried to use dimnames function to remove it and then save it, but still,
 the output is the same

 These are the command lines,

 XX #matrix
 dimnames(XX) <- NULL
 write.table(XX, "XX.txt", quote=FALSE, sep="\t")



 Thanks in advance
 Roberto





Re: [R] Cox model

2008-02-12 Thread Eleni Christodoulou
Hi David,

The problem is that I need all these regressors. I need a coefficient for
every one of them and then rank them according to that coefficient.

Thanks,
Eleni

On Feb 12, 2008 4:54 PM, [EMAIL PROTECTED] wrote:

 Hi Eleni,

 I am not an expert in R or statistics but in my opinion you have too
 many regressors compared to the number of observations and that might
 be the reason why you get the error. Others might say better but as
 far as I know, having only 80 observations, it is a good idea to first
 filter your list of variables down to a few tens.


 HTH

 David

  Hello R-community,
 
  It's been a week now that I am struggling with the implementation of a
  Cox model in R. I have 80 cancer patients, so 80 time measurements and 80
  relapse indicators (the censoring status: 1 if the patient relapsed over
  the examined period, 0 if not). My microarray data contain around 18000
  genes, so I have the expressions of 18000 genes in each of the 80 tumors
  (an 80*18000 matrix). I would like to build a Cox model in order to
  retrieve the most significant genes (according to the p-value). The
  command that I am using is:
  
  test1 <- list(time, relapse, genes)
  coxph( Surv(time, relapse) ~ genes, test1)
  
  where time is a vector of size 80 containing the times, relapse is a
  vector of size 80 containing the relapse values and genes is an 80*18000
  matrix. When I give the coxph command I get an error saying that R cannot
  allocate a vector of size 2.7Mb (in Windows). I also tried Linux, and there
  I receive an error that maximum memory is reached. I increased the memory
  by initializing R with the command:
  R --min-vsize=10M --max-vsize=250M --min-nsize=1M --max-nsize=200M
  
  I think it cannot get better than that because if I try, for example,
  max-vsize=300 the memory capacity is stored as NA.
 
  Does anyone have any idea why this happens and how I can overcome it?
 
  I would be really grateful if you could help!
  It has been bothering me a lot!
 
  Thank you all,
  Eleni
 


[R] Cox model

2008-02-12 Thread Eleni Christodoulou
Hello R-community,

It's been a week now that I am struggling with the implementation of a Cox
model in R. I have 80 cancer patients, so 80 time measurements and 80
relapse indicators (the censoring status: 1 if the patient relapsed over the
examined period, 0 if not). My microarray data contain around 18000 genes,
so I have the expressions of 18000 genes in each of the 80 tumors (an
80*18000 matrix). I would like to build a Cox model in order to retrieve the
most significant genes (according to the p-value). The command that I am
using is:

test1 <- list(time, relapse, genes)
coxph( Surv(time, relapse) ~ genes, test1)

where time is a vector of size 80 containing the times, relapse is a vector
of size 80 containing the relapse values and genes is an 80*18000 matrix.
When I give the coxph command I get an error saying that R cannot allocate
a vector of size 2.7Mb (in Windows). I also tried Linux, and there I
receive an error that maximum memory is reached. I increased the memory by
initializing R with the command:
R --min-vsize=10M --max-vsize=250M --min-nsize=1M --max-nsize=200M

I think it cannot get better than that because if I try, for example,
max-vsize=300 the memory capacity is stored as NA.

Does anyone have any idea why this happens and how I can overcome it?

I would be really grateful if you could help!
It has been bothering me a lot!

Thank you all,
Eleni



[R] Memory problem?

2008-01-30 Thread Eleni Christodoulou
Hello R users,

I am trying to run a Cox model for the prediction of relapse of 80 cancer
tumors, taking into account the expression of 17000 genes. The data are
large and I get an error:
Cannot allocate vector of 2.4 Mb. I increased memory.limit to 4000
(which is the largest supported by my computer) but I still get the
error because of other big variables that I have in the workspace. Does
anyone know how to overcome this problem?

Many thanks in advance,
Eleni
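
One way to deal with the "other big variables" mentioned above is to list objects by size and remove the ones no longer needed. The sketch below is only an illustration: it uses a separate environment (ws) with two made-up objects so that it is self-contained. In a real session the same calls would be applied to the global workspace with plain ls() and rm().

```r
# Hypothetical workspace kept in its own environment for the example
ws <- new.env()
assign("big_matrix", matrix(0, nrow = 2000, ncol = 500), envir = ws)
assign("small_vector", 1:10, envir = ws)

# Size in bytes of every object, largest first
sizes <- sort(sapply(ls(ws), function(nm) object.size(get(nm, envir = ws))),
              decreasing = TRUE)
print(sizes)

# Remove what is no longer needed and trigger garbage collection
rm(list = "big_matrix", envir = ws)
invisible(gc())
```

Saving intermediate results with save() before removing them keeps them recoverable later with load().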



[R] select repositories under linux

2008-01-22 Thread Eleni Christodoulou
Hi all,

I am trying to install the package GEOquery in unix. I have downloaded the
standard version of R and this package is not contained in the default. I
know that I can select repositories under windows but I don't know how to do
it in unix. Does anyone have any idea on this?

Thank you in advance,
Eleni
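
For readers with the same question: GEOquery is distributed through Bioconductor rather than CRAN. On Unix, setRepositories() can be called from the R console and shows a text menu of repositories. The sketch below uses the BiocManager package instead, which is newer than this thread, so treat it as an assumption about a current setup. The wrapper name install_geoquery is hypothetical, and nothing is downloaded until the function is called.

```r
# Wrapped in a function so that nothing is downloaded until it is called
install_geoquery <- function() {
  # BiocManager is on CRAN and knows the Bioconductor repositories
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install("GEOquery")
}

# install_geoquery()   # uncomment to run; needs network access
```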



Re: [R] Clustering

2007-11-29 Thread Eleni Christodoulou
Thank you very much! I had misunderstood, it's true...

On Nov 28, 2007 6:28 PM, Birgit Lemcke [EMAIL PROTECTED] wrote:

 Hello Eleni,

 as far as I understood and used agnes() the method argument
 determines only the clustering method.
 If you use diss=TRUE the distances should be taken from the distance
 matrix.

 Birgit
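
 A small sketch of the point above, on simulated data. The correlation-based dissimilarity is only an example of a custom measure and is not the one used in the original question. With diss=TRUE, agnes() works directly on the supplied dissimilarities, so the lowest merge height equals the smallest entry of D.

```r
library(cluster)  # provides agnes()

set.seed(42)
x <- matrix(rnorm(20 * 5), nrow = 20)  # 20 items, 5 measurements each

# Custom dissimilarity: one minus the correlation between items
D <- 1 - cor(t(x))

# Complete-linkage clustering on the supplied dissimilarities
fit <- agnes(as.dist(D), diss = TRUE, method = "complete")

# The first merge joins the closest pair, at exactly their dissimilarity in D
first_merge <- min(fit$height)
smallest_d <- min(as.dist(D))
print(c(first_merge = first_merge, smallest_d = smallest_d))

# Cut the tree into three groups
groups <- cutree(as.hclust(fit), k = 3)
table(groups)
```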

 On 28.11.2007 at 12:18, Eleni Christodoulou wrote:

  Hello all!
 
  I am performing some clustering analysis on microarray data using
  agnes{cluster} and I have created my own dissimilarity matrix according
  to a distance measure different from euclidean or manhattan etc. My
  question is, if I choose for example method="complete", how are the
  distances between the elements calculated? Are they taken from the
  dissimilarity matrix I have provided as the first argument?
  clust.complete.agnes <- agnes(as.dist(D), diss=TRUE, method="complete")
 
 
  Thank you very much,
  Eleni
 

 Birgit Lemcke
 Institut für Systematische Botanik
 Zollikerstrasse 107
 CH-8008 Zürich
 Switzerland
 Ph: +41 (0)44 634 8351
 [EMAIL PROTECTED]









[R] Clustering

2007-11-28 Thread Eleni Christodoulou
Hello all!

I am performing some clustering analysis on microarray data using
agnes{cluster} and I have created my own dissimilarity matrix according to a
distance measure different from euclidean or manhattan etc. My question
is, if I choose for example method="complete", how are the distances
between the elements calculated? Are they taken from the dissimilarity
matrix I have provided as the first argument?
clust.complete.agnes <- agnes(as.dist(D), diss=TRUE, method="complete")


Thank you very much,
Eleni



Re: [R] NA values

2007-11-21 Thread Eleni Christodoulou
Yes, thanks a lot! It works fine!

Eleni

On Nov 21, 2007 2:03 PM, Ted Harding [EMAIL PROTECTED] wrote:

 On 21-Nov-07 11:15:32, Eleni Christodoulou wrote:
  Hi all!
  I am new to R and I would like to ask you the following question:
  How can I substitute the NA values with 0 in a data frame?
  I cannot find a command to check if a value is NA...
 
  Thank you very much!
  Eleni

 As has been said, is.na() is the function which determines
 whether something has value NA (result=TRUE) or not (result=FALSE).

 is.na() will work nicely with dataframes (also, of course, with
 structures such as vectors, matrices and arrays). Example:

   dummy <- data.frame(X1=c(101,102,103,104,NA,106),
                       X2=c(201,202,203,NA,205,206))

   dummy
 #   X1  X2
 #1 101 201
 #2 102 202
 #3 103 203
 #4 104  NA
 #5  NA 205
 #6 106 206

   dummy[is.na(dummy)] <- 0

   dummy
 #   X1  X2
 #1 101 201
 #2 102 202
 #3 103 203
 #4 104   0
 #5   0 205
 #6 106 206

 Hoping this makes it clear!
 Ted.

 
 E-Mail: (Ted Harding) [EMAIL PROTECTED]
 Fax-to-email: +44 (0)870 094 0861
 Date: 21-Nov-07   Time: 12:03:33
 -- XFMail --

