[R] repeating a loop

2011-11-07 Thread Amit Patel
Hi

I have implemented box plots in my script:

BoxplotsCheck <- readline(prompt = "Would you like to create boxplots for any Feature? (y/n): ")
if (BoxplotsCheck == "y") {
    BoxplotsFeature <- readline(prompt = "Which Feature would you like to create a Boxplot for?: ")
    BoxplotsFeature <- as.numeric(BoxplotsFeature)
    BoxplotsData <- as.numeric(which(PCIList == BoxplotsFeature))
    BoxplotsData <- TotalIntensityList[BoxplotsData, ]
    BoxplotsHeading <- paste("Tukey boxplot (including outliers) for PCI ",
                             BoxplotsFeature, sep = "")
    bplot(as.numeric(BoxplotsData), GroupingList, style = "tukey", outlier = TRUE,
          col = "red", main = BoxplotsHeading,
          xlab = "Groups", ylab = "Normalised Intensity", plot = TRUE)
    BoxplotsFilename <- paste(BoxplotsFeature, "_Boxplot", sep = "")
    savePlot(filename = BoxplotsFilename, type = "jpeg", device = dev.cur(),
             restoreConsole = TRUE)

    RepeatPlot <- readline(prompt = "Would you like to create another boxplot for any Feature? (y/n): ")
}


If the user enters y for BoxplotsCheck, a boxplot is created and saved for the 
feature the user chooses.

I want to include the option to do another boxplot if needed (i.e. another user 
prompt after savePlot that returns y or n in the variable RepeatPlot). How can 
I do this? I guess I need some kind of loop that continues while RepeatPlot == "y".

Can anyone help?
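
One way to get the repeat behaviour is a while loop that keeps going as long as the last answer was y. The sketch below is only an illustration: repeat_boxplots, ask and make_plot are hypothetical names that are not in the original script. The prompt function is passed in as an argument so the loop can be tried without a keyboard, and the actual bplot/savePlot calls would go inside make_plot.

```r
# Minimal sketch of a "keep asking until the user says no" loop.
# ask:       function that shows a prompt and returns the answer (readline by default)
# make_plot: hypothetical placeholder for the bplot()/savePlot() code
repeat_boxplots <- function(ask = readline, make_plot = function(feature) NULL) {
  plotted <- character(0)
  answer <- ask("Would you like to create boxplots for any Feature? (y/n): ")
  while (tolower(answer) == "y") {
    feature <- ask("Which Feature would you like to create a Boxplot for?: ")
    make_plot(feature)                 # draw and save the plot for this feature
    plotted <- c(plotted, feature)     # remember what has been plotted
    answer <- ask("Would you like to create another boxplot for any Feature? (y/n): ")
  }
  plotted
}
```

In an interactive session, calling repeat_boxplots() with a suitable make_plot should prompt after every saved plot and stop at the first answer other than y.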


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with plotting plsr loadings

2011-06-08 Thread Amit Patel
Hi

I am attempting to do a loadings plot from a plsr object. I have managed to do 
this using the gasoline data that comes with the pls package. However, when I 
try this on my dataset I get the following error message. 


plot(BHPLS1, "loadings", comps = 1:2, legendpos = "topleft", labels = "numbers",
     xlab = "nm")

Error in loadingplot.default(x, ...) : 
  Could not convert variable names to numbers.


 str(BHPLS1_Loadings)
 loadings [1:8892, 1:60] -0.00717 0.00414 0.02611 0.00468 -0.00676 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:8892] "PCIList1" "PCIList2" "PCIList3" "PCIList4" ...
  ..$ : chr [1:60] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...
 - attr(*, "explvar")= Named num [1:60] 2.67 4.14 4.41 3.55 2.59 ...
  ..- attr(*, "names")= chr [1:60] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...

Can anyone see the problem??
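
A possible explanation, as far as I understand the pls documentation: labels = "numbers" asks the plot method to turn the variable names into numbers, which works for names that start with a number (wavelengths such as those in the gasoline data) but presumably not for names like PCIList1. The base R sketch below only illustrates that conversion problem and one way to recover the numeric part. The names are taken from the str() output; everything else is illustrative.

```r
# Variable names as they appear in the loadings (from the str() output)
var_names <- c("PCIList1", "PCIList2", "PCIList3", "PCIList4")

# A direct conversion fails because the names start with text
direct <- suppressWarnings(as.numeric(var_names))

# Stripping the "PCIList" prefix leaves something that does convert
var_numbers <- as.numeric(sub("^PCIList", "", var_names))
var_numbers
```

If this is indeed the cause, dropping the labels = "numbers" argument, or giving the predictor columns purely numeric names before fitting, may be enough to get the plot.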



[R] Help with Memory Problems (cannot allocate vector of size)

2011-05-18 Thread Amit Patel
While doing pls I found the following problem

BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata,
               jackknife = FALSE, validation = "LOO")

When jackknife is not enabled the command works fine, but when I try to enable 
jackknife I get the following error. 



BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata,
               jackknife = TRUE, validation = "LOO")
Error: cannot allocate vector of size 289.1 Mb

I am dealing with a very large dataset

str(PLSdata)
'data.frame':   40 obs. of  2 variables:
 $ GroupingList: int  1 1 1 1 1 1 1 1 1 1 ...
 $ PCIList : AsIs [1:40, 1:94727] 0 0 0 0 0 0 0 0 0 0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr  "X" "X.1" "X.12" "X.13" ...
  .. ..$ : NULL

object.size(PLSdata)/1048600
28.9113560938394 bytes

How can I get around this memory shortage?
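
A rough size calculation may explain where the 289.1 Mb comes from. The sketch assumes (this is an assumption, not something confirmed in the thread) that jackknifing keeps one set of regression coefficients per cross-validation segment, for every predictor and every component, stored as 8-byte doubles.

```r
n_predictors     <- 94727  # columns of PCIList, from str(PLSdata)
n_components     <- 10     # ncomp in the plsr call
n_segments       <- 40     # leave-one-out on 40 samples gives 40 segments
bytes_per_double <- 8

# predictors x components x segments, one double each
jack_bytes <- n_predictors * n_components * n_segments * bytes_per_double
jack_mb    <- jack_bytes / 1024^2
round(jack_mb, 1)   # 289.1, the size reported in the error message
```

If that reading is right, the array grows linearly with ncomp and with the number of segments, so fewer components or a coarser cross-validation (fewer segments than leave-one-out) should shrink the allocation. Reducing the number of predictors beforehand would help in the same way.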


[R] Help with PLSR with jack knife

2011-05-17 Thread Amit Patel
Hi

I am analysing a dataset of 40 samples, each with 90,000 intensity measures for 
various peptides. I am trying to identify the biomarkers (i.e. the most 
significant peptides). I believe that PLS with jackknifing, or alternatively 
CMV (cross-model validation), are multivariate methods suited to this. The 40 
samples belong to four different groups. 


I have managed to conduct the plsr using the commands:

BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata,
               validation = "LOO")

and

BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = 10, data = PLSdata,
               validation = "CV")

I have also used the following command to obtain the loadings

BHPLS1_Loadings <- loadings(BHPLS1)

Now I am unsure of how to utilise these to identify the significant variables. 
Do I need to use any loops?

str(BHPLS1_Loadings) 
 loadings [1:94727, 1:10] -0.00113 -0.03001 -0.00059 -0.00734 -0.02969 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:94727] "PCIList1" "PCIList2" "PCIList3" "PCIList4" ...
  ..$ : chr [1:10] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...
 - attr(*, "explvar")= Named num [1:10] 14.57 6.62 7.59 5.91 3.26 ...
  ..- attr(*, "names")= chr [1:10] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...
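
On the question of loops: ranking variables by the size of their loadings does not need one, because the loadings behave like an ordinary matrix. The sketch below uses a small made-up matrix L in place of loadings(BHPLS1). Note that a large absolute loading is only a rough heuristic for importance and is not a significance test.

```r
set.seed(42)
# Stand-in for loadings(BHPLS1): 6 variables by 2 components (made-up numbers)
L <- matrix(rnorm(12), nrow = 6,
            dimnames = list(paste0("PCIList", 1:6), c("Comp 1", "Comp 2")))

# Absolute loadings on the first component, largest first
comp1  <- abs(L[, "Comp 1"])
ranked <- names(sort(comp1, decreasing = TRUE))
head(ranked, 3)   # the three variables with the largest loadings on Comp 1
```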


Many thanks in advance
AK



[R] help with PLSR Loadings

2011-05-17 Thread Amit Patel
Hi

When I call for the loadings of my plsr using the command,

x <- loadings(BHPLS1)

my loadings contain variable names rather than numbers.

str(x) 
 loadings [1:94727, 1:10] -0.00113 -0.03001 -0.00059 -0.00734 -0.02969 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:94727] "PCIList1" "PCIList2" "PCIList3" "PCIList4" ...
  ..$ : chr [1:10] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...
 - attr(*, "explvar")= Named num [1:10] 14.57 6.62 7.59 5.91 3.26 ...
  ..- attr(*, "names")= chr [1:10] "Comp 1" "Comp 2" "Comp 3" "Comp 4" ...


Here is the structure of the data used to conduct plsr

str(PLSdata)
'data.frame':   40 obs. of  2 variables:
 $ GroupingList: int  1 1 1 1 1 1 1 1 1 1 ...
 $ PCIList : AsIs [1:40, 1:94727] 42.01749 40.85915 65.01948 
55.98204 61.71673 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr  "X" "X.1" "X.12" "X.13" ...
  .. ..$ : NULL

Because of this I am not able to do a loadings plot

plot(BHPLS1, "loadings", comps = 1:2, legendpos = "topleft", labels = "numbers",
     xlab = "nm")
Error in loadingplot.default(x, ...) : 
  Could not convert variable names to numbers.

What am I doing wrong

Thanks in advance


[R] Fw: Help with PLSR

2011-05-12 Thread Amit Patel


Hi 
I am attempting to use plsr, which is part of the pls package in R. I am 
conducting analysis on datasets to identify which proteins/peptides are 
responsible for the variance between sample groups (biomarker spotting) in a 
multivariate fashion. 


I have a dataset in R called FullDataListTrans. As you can see below, the 
structure of the data is 40 rows, each representing a sample, and 94,727 
columns, each representing a peptide.

str(FullDataListTrans)
 num [1:40, 1:94727] 42 40.9 65 56 61.7 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:40] "X" "X.1" "X.12" "X.13" ...
  ..$ : NULL

I have also created a vector GroupingList which gives the groupnames for each 
respective sample(row).

 GroupingList
 [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4
[39] 4 4
 str(GroupingList)
 int [1:40] 1 1 1 1 1 1 1 1 1 1 ...

I am now stuck while conducting the plsr. I have tried various methods of 
creating structured lists etc. and have got nowhere. I have also tried many 
incarnations of 


BHPLS1 <- plsr(GroupingList ~ PCIList, ncomp = FeaturePresenceExpected[1],
               data = FullDataListTrans, validation = "LOO")

Where am I going wrong? Also, what is the easiest method to identify which of 
the 94,000 peptides are most important to the variance between groups?
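
One likely problem is that the data argument is a plain matrix, so the formula cannot find PCIList in it. The later posts in this archive show a PLSdata data frame whose PCIList column is an AsIs matrix, and the sketch below builds such a frame. The tiny 40 x 5 matrix is a made-up stand-in for the real 40 x 94,727 data.

```r
set.seed(1)
# Made-up stand-in for the real intensity matrix (40 samples, 5 "peptides")
FullDataListTrans <- matrix(rnorm(40 * 5), nrow = 40)
GroupingList <- rep(1:4, each = 10)

# I() keeps the whole matrix as a single column named PCIList
PLSdata <- data.frame(GroupingList = GroupingList,
                      PCIList = I(FullDataListTrans))
str(PLSdata)

# With the pls package installed, the model call would then look like:
# BHPLS1 <- pls::plsr(GroupingList ~ PCIList, ncomp = 2,
#                     data = PLSdata, validation = "LOO")
```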

Thanks in advance for any help

Amit Patel


[R] Help with Amelia

2010-12-22 Thread Amit Patel
Hi
I have used the amelia command from the Amelia R package. This gives me a 
number of imputed datasets. 

This may be a silly question, as I am not a statistician, but I am not sure how 
to combine these results to obtain the imputed dataset to use for further 
statistical analysis. I have looked through the Amelia and Zelig manuals but 
still cannot find the answer. This may be because I don't understand the science 
behind it. Can anyone help?
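
The usual approach with multiple imputation is not to merge the m datasets into one, but to run the same analysis on each imputed dataset and then pool the m results with Rubin's rules. The sketch below implements those rules in base R for a single estimate. pool_rubin is a hypothetical helper name, and the five estimates and standard errors are made-up numbers. Amelia, as far as I recall, ships a helper called mi.meld for the same job.

```r
# Pool one quantity estimated on each of m imputed datasets (Rubin's rules)
pool_rubin <- function(estimates, std_errors) {
  m       <- length(estimates)
  q_bar   <- mean(estimates)              # pooled point estimate
  within  <- mean(std_errors^2)           # average within-imputation variance
  between <- var(estimates)               # between-imputation variance
  total   <- within + (1 + 1 / m) * between
  c(estimate = q_bar, std_error = sqrt(total))
}

# Made-up results from m = 5 imputed datasets
pooled <- pool_rubin(estimates  = c(1.9, 2.1, 2.0, 2.2, 1.8),
                     std_errors = c(0.5, 0.5, 0.5, 0.5, 0.5))
pooled
```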

Thanks in advance







[R] help with knn.impute

2010-12-22 Thread Amit Patel
Hi 
I have a dataset of biological data with forty samples which relate to four 
different treatments. Each sample has thousands of values but, as usual, 
contains missing values.
I want to use knn to impute these missing values. I am doing this using 
knn.impute. Do I need to specify the various groups, or can I just use the 
knn.impute command on the whole dataset together?

Also, I am setting the maxp argument to the total number of values for each 
sample. Is this the correct thing to do?

Thanks in advance







[R] Help with amelia

2010-12-22 Thread Amit Patel
Hi 
I have a dataset of biological data with forty samples which relate to four 
different treatments. Each sample has thousands of values but, as usual, 
contains missing values.

I want to use EM to impute these missing values. I am doing this using 
amelia. Do I need to specify the various groups, or can I just use the 
amelia command on the whole dataset (zz) together?

zzAmelia <- amelia(zz[,-1], m=5)

Also, I'm not sure which value of m should be used.



Thanks in advance







[R] Help with decimal points

2010-09-07 Thread Amit Patel
Hi


I have found a little problem with an R script. I am trying to merge some data 
and am finding something unusual going on. As shown below I am trying to 
assign (MatchedValues[Value2,Value]) to  (ClusteredData[k,Value]) which are two 
separate dataframes.

1) The following command shows that the value I'm transferring 
is 481844.03

 MatchedValues[Value2,Value]
[1] 481844.03
6618 Levels: 1.00E+07 1.01E+07 1.02E+07 1.04E+07 1.05E+07 1.06E+07 ... Raw


2) But when I try to replace the values using the command below, I get a value 
of 4420


ClusteredData[k,Value] <- MatchedValues[Value2,Value]

 ClusteredData[k,Value]
[1] 4420


3) So what am I doing wrong? How can I keep the same value of 481844.03?
I have tried


 as.double(MatchedValues[Value2,Value])
[1] 4420


 as.numeric(MatchedValues[Value2,Value])
[1] 4420
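
The "6618 Levels: ... Raw" line in the output suggests the column is a factor, probably because a non-numeric entry such as "Raw" was read in with the numbers. as.numeric() on a factor returns the internal level code (here apparently 4420), not the printed label. A minimal sketch with a made-up three-level factor:

```r
# Made-up factor mimicking a column that was read in as text
values <- factor(c("481844.03", "1.00E+07", "Raw"))

# as.numeric() on a factor gives the integer level code, not the label
code <- as.numeric(values[1])

# Going through as.character() first recovers the printed number
number <- as.numeric(as.character(values[1]))

# Non-numeric labels such as "Raw" become NA (with a warning)
all_numbers <- suppressWarnings(as.numeric(as.character(values)))
```

It may be cleaner to fix this when the data are read, for example by removing the "Raw" row or reading the column as character, so that the column never becomes a factor in the first place.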






[R] Help with unexpected symbol errors

2010-09-06 Thread Amit Patel
Hi

I have a long script which will not run for me, as I keep getting errors:

 source("clusterfixV1_4.r")
Error in source("clusterfixV1_4.r") : 
clusterfixV1_4.r: unexpected symbol at
158: eck[k,2] <- as.numeric(1)
159:   #ClusterInfo[k,2] <- "Clustered"

I have sorted out all the ones I can, but I am having a problem here. 
Can anyone tell me the cause of these problems? It's not a very short or 
straightforward script, so I don't expect you to go through the whole thing, but 
it would be great if you could give me an indication as to what I may be doing 
wrong. I haven't attached the data that the script uses because I'm pretty sure 
the principles I have used are right. 
When running the script I get the above error on lines 158 and 159. 
I am more than happy to provide further information (e.g. the dataset) if it helps.
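
"unexpected symbol" is a parse error, so it is raised before any data are touched. It usually means two names or values sit next to each other with no operator, comma or bracket between them, and the real mistake is often on or just before the reported line. The sketch below shows how parse() can be used to find such errors without running the script. find_syntax_error and the two-line snippet are made up for illustration.

```r
# Try to parse code without running it; return the parser's message on failure
find_syntax_error <- function(code_lines) {
  tryCatch({
    parse(text = code_lines)
    "no syntax error"
  }, error = function(e) conditionMessage(e))
}

# Made-up snippet with a missing operator between "k" and "k" on line 2
bad_code  <- c("k <- 1", "value <- k k")
good_code <- c("k <- 1", "value <- k + k")

msg_bad  <- find_syntax_error(bad_code)
msg_good <- find_syntax_error(good_code)
cat(msg_bad, "\n")
```

For a script on disk, parse(file = "clusterfixV1_4.r") performs the same check on the whole file.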

Many thanks in advance

Amit Patel




[R] Fw: Error in rowSums REPOST

2010-08-13 Thread Amit Patel
For the query below I have also included the following information. Thanks for 
your replies.

 str(FeaturePresenceMatrix)
 chr [1:65530, 1:40] "0" "0" "0" "0" "1" "0" "0" "0" "0" ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:65530] "4" "5" "6" "7" ...
  ..$ : chr [1:40] "X1" "X2" "X3" "X4" ...
 ?class
 class(FeaturePresenceMatrix)
[1] "matrix"

Amit Patel wrote:

 Hi 
 I am trying to calculate the row sums of a matrix I have created.
 The matrix (FeaturePresenceMatrix) has been created by
 
 1) Reading a csv
 2) Removing unnecessary data using the [-1:4,] command
 3) Replacing all the NA values with as.numeric(0) and all others with 
 as.numeric(1)
 
 When I carry out the command
 
 TotalFeature <- rowSums(FeaturePresenceMatrix, na.rm = TRUE)
 
 I get the following error. 
 Error in rowSums(FeaturePresenceMatrix, na.rm = TRUE) :   'x' must be numeric
 
 Any tips on how I can get round this?

Yes, follow the posting guide and give the list a reproducible
example. We don't know a critical piece of information,
the class of your data. We know it's *not* numeric though,
which is what it needs to be.  Use ?class, ?str, and
possibly give us a small sample with ?dput. That way, we can
reproduce the error.
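
The str() output above shows a character matrix ("0", "1", ...), which is why rowSums() refuses it. Assigning as.numeric(0) into a character matrix does not change its type, since the whole matrix stays character. A minimal sketch with a made-up 2 x 3 matrix shows one way to convert it while keeping the dimensions and names:

```r
# Made-up character matrix shaped like FeaturePresenceMatrix
FeaturePresenceMatrix <- matrix(c("0", "1", "0", "1", "1", "1"), nrow = 2,
                                dimnames = list(c("4", "5"), c("X1", "X2", "X3")))

# Change the storage type in place; dim and dimnames are kept
mode(FeaturePresenceMatrix) <- "numeric"

TotalFeature <- rowSums(FeaturePresenceMatrix, na.rm = TRUE)
TotalFeature
```

If the matrix is built from the raw data, something like (!is.na(raw_matrix)) * 1 (raw_matrix being a hypothetical name for the data before recoding) would give a numeric presence matrix directly.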







[R] Error in rowSums

2010-08-12 Thread Amit Patel
Hi 

I am trying to calculate the row sums of a matrix I have created.
The matrix (FeaturePresenceMatrix) has been created by

1) Reading a csv
2) Removing unnecessary data using the [-1:4,] command
3) Replacing all the NA values with as.numeric(0) and all others with 
as.numeric(1)

When I carry out the command

TotalFeature <- rowSums(FeaturePresenceMatrix, na.rm = TRUE)

I get the following error. 

Error in rowSums(FeaturePresenceMatrix, na.rm = TRUE) : 
  'x' must be numeric

Any tips on how I can get round this?

Thanks in Advance
Amit







[R] Help With ANOVA

2010-07-06 Thread Amit Patel
Hi, I need some help with ANOVA.

I have a problem with my ANOVA analysis. I have a dataset with a known ANOVA 
p-value; however, I cannot seem to re-create it in R.

I have created a list (zzzanova) which contains
1) Intensity values
2) Group number (6 different groups)
3) Sample number (54 different samples)
This is created by the script in Appendix 1.

I then conduct ANOVA with the command
 zzz.aov <- aov(Intensity ~ Group, data = zzzanova)

I get a p-value of
Pr(>F)1 
0.9483218 

The expected p-value is 0.00490, so I feel I may be using ANOVA incorrectly 
or have put in a wrong formula. I am trying to do an ANOVA analysis 
across all 6 groups. Is there something wrong with my formula? I think I 
have made a mistake in the formula rather than anything else.




APPENDIX 1

datalist <- c(-4.60517, -4.60517, -4.60517, -4.60517, -4.60517, -4.60517, 
-4.60517, 3.003749, -4.60517, 
2.045314, 2.482557, -4.60517, -4.60517, -4.60517, -4.60517, 1.592743, 
-4.60517,
-4.60517, 0.91328, -4.60517, -4.60517, 1.827744, 2.457795, 0.355075, 
-4.60517, 2.39127,
2.016987, 2.319903, 1.146683, -4.60517, -4.60517, -4.60517, 1.846162, 
-4.60517, 2.121427, 1.973118,
-4.60517, 2.251568, -4.60517, 2.270724, 0.70338, 0.963816, -4.60517,  
0.023703, -4.60517,
2.043382, 1.070586, 2.768289, 1.085169, 0.959334, -0.02428, -4.60517, 
1.371895, 1.533227)

zzzanova <-
structure(list(Intensity = c(t(Samp1), t(Samp2), t(Samp3), t(Samp4)), 
Group = structure(c(1,1,1,1,1,1,1,1,1,
 2,2,2,2,2,2,2,2,
 3,3,3,3,3,3,3,3,3,
 4,4,4,4,4,4,4,4,4,4,
 5,5,5,5,5,5,5,5,5,
 6,6,6,6,6,6,6,6,6), .Label = c("Group1", "Group2", "Group3", "Group4", 
"Group5", "Group6"), class = "factor"), 
Sample = structure(c( 1, 2, 3, 4, 5, 6, 7, 8, 9, 
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 
20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 
30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 
40,41,42,43,44,45,46,47,48,49,50,51,52,53,54)
))
, .Names = c("Intensity", 
"Group", "Sample"), row.names = 
c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54),class = "data.frame")






[R] Help With ANOVA (corrected please ignore last email)

2010-07-06 Thread Amit Patel
Sorry, I had a misprint in the appendix code in the last email.


Hi, I need some help with ANOVA.

I have a problem with my ANOVA analysis. I have a dataset with a known ANOVA 
p-value; however, I cannot seem to re-create it in R.

I have created a list (zzzanova) which contains
1) Intensity values
2) Group number (6 different groups)
3) Sample number (54 different samples)
This is created by the script in Appendix 1.

I then conduct ANOVA with the command
 zzz.aov <- aov(Intensity ~ Group, data = zzzanova)

I get a p-value of
Pr(>F)1 
0.9483218 

The expected p-value is 0.00490, so I feel I may be using ANOVA incorrectly 
or have put in a wrong formula. I am trying to do an ANOVA analysis 
across all 6 groups. Is there something wrong with my formula? I think I 
have made a mistake in the formula rather than anything else.




APPENDIX 1

datalist <- c(-4.60517, -4.60517, -4.60517, -4.60517, -4.60517, -4.60517, 
-4.60517, 3.003749, -4.60517, 
2.045314, 2.482557, -4.60517, -4.60517, -4.60517, -4.60517, 1.592743, 
-4.60517,
-4.60517, 0.91328, -4.60517, -4.60517, 1.827744, 2.457795, 0.355075, 
-4.60517, 2.39127,
2.016987, 2.319903, 1.146683, -4.60517, -4.60517, -4.60517, 1.846162, 
-4.60517, 2.121427, 1.973118,
-4.60517, 2.251568, -4.60517, 2.270724, 0.70338, 0.963816, -4.60517,  
0.023703, -4.60517,
2.043382, 1.070586, 2.768289, 1.085169, 0.959334, -0.02428, -4.60517, 
1.371895, 1.533227)

zzzanova <-
structure(list(Intensity = datalist, 
Group = structure(c(1,1,1,1,1,1,1,1,1,
 2,2,2,2,2,2,2,2,
 3,3,3,3,3,3,3,3,3,
 4,4,4,4,4,4,4,4,4,4,
 5,5,5,5,5,5,5,5,5,
 6,6,6,6,6,6,6,6,6), .Label = c("Group1", "Group2", "Group3", "Group4", 
"Group5", "Group6"), class = "factor"), 
Sample = structure(c( 1, 2, 3, 4, 5, 6, 7, 8, 9, 
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 
20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 
30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 
40,41,42,43,44,45,46,47,48,49,50,51,52,53,54)
))
, .Names = c("Intensity", 
"Group", "Sample"), row.names = 
c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54),class = "data.frame")
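
The formula Intensity ~ Group looks fine for a one-way ANOVA across six groups, so the data may be the more likely culprit. One possibility (an assumption, not confirmed anywhere in the thread) is that the reference p-value was computed with the -4.60517 entries treated as missing. Another post in this archive describes -4.60517 as ln(0.01) standing in for NA. The sketch below rebuilds the data frame with factor(rep(...)) instead of structure() and runs the ANOVA both ways, so the two p-values can be compared with the expected one.

```r
datalist <- c(-4.60517, -4.60517, -4.60517, -4.60517, -4.60517, -4.60517,
  -4.60517, 3.003749, -4.60517,
  2.045314, 2.482557, -4.60517, -4.60517, -4.60517, -4.60517, 1.592743,
  -4.60517,
  -4.60517, 0.91328, -4.60517, -4.60517, 1.827744, 2.457795, 0.355075,
  -4.60517, 2.39127,
  2.016987, 2.319903, 1.146683, -4.60517, -4.60517, -4.60517, 1.846162,
  -4.60517, 2.121427, 1.973118,
  -4.60517, 2.251568, -4.60517, 2.270724, 0.70338, 0.963816, -4.60517,
  0.023703, -4.60517,
  2.043382, 1.070586, 2.768289, 1.085169, 0.959334, -0.02428, -4.60517,
  1.371895, 1.533227)

# Same group sizes as in the appendix: 9, 8, 9, 10, 9, 9
Group <- factor(rep(paste0("Group", 1:6), times = c(9, 8, 9, 10, 9, 9)))
zzzanova <- data.frame(Intensity = datalist, Group = Group)

# Helper (hypothetical name): overall p-value of the one-way ANOVA
anova_p <- function(data) {
  summary(aov(Intensity ~ Group, data = data))[[1]][["Pr(>F)"]][1]
}

# 1) placeholders kept as real values
p_with_placeholder <- anova_p(zzzanova)

# 2) placeholders treated as missing (aov drops NA rows by default)
observed <- zzzanova
observed$Intensity[observed$Intensity == -4.60517] <- NA
p_observed_only <- anova_p(observed)

c(placeholder = p_with_placeholder, observed_only = p_observed_only)
```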






[R] Help with Loops

2010-05-13 Thread Amit Patel
Hi

I have made many attempts but can't get the loop right, as I am not a strong 
programmer. What I am basically trying to do is compare 2 spreadsheets. The 
problem is that one of them only contains a portion of the overall data 
(TESTSAMP), whereas the other has the full dataset (FULLSAMP). From the complete 
set I would like to remove the rows of data which are not in TESTSAMP. Column 1 
contains the sample numbers, which can be used to identify samples. Does anyone 
have any suggestions? 

I have tried various things like double loops and so on, but I am sure there is 
an easier way or function to do this.

I tried this method, but I'm not sure how to keep looping only until a match is 
found. I don't understand how repeat loops work in R.

for (i in 1:length(FULLSAMP[,1])) {

if (FULLSAMP[i,1] != TESTSAMP[i,1]) {
FULLSAMP <- FULLSAMP[-i,]
}
}
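
A loop is probably not needed here. The %in% operator tests every sample number in FULLSAMP against all of those in TESTSAMP at once, and the resulting TRUE/FALSE vector can be used to pick rows. The two small data frames below are made up for illustration, and the column names are assumptions.

```r
# Made-up stand-ins for the two spreadsheets; column 1 holds the sample numbers
FULLSAMP <- data.frame(Sample = c(101, 102, 103, 104, 105),
                       Intensity = c(1.2, 3.4, 5.6, 7.8, 9.0))
TESTSAMP <- data.frame(Sample = c(102, 105),
                       Intensity = c(3.4, 9.0))

# TRUE for every row of FULLSAMP whose sample number also appears in TESTSAMP
keep <- FULLSAMP[, 1] %in% TESTSAMP[, 1]

# Keep only the matching rows
FULLSAMP_matched <- FULLSAMP[keep, ]
FULLSAMP_matched
```

Besides being shorter, this avoids a pitfall of the loop above: removing rows from FULLSAMP while iterating over its original length shifts the remaining rows, so the index i no longer points at the intended row.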


Thanks in advance







[R] Help with TukeyHSD

2010-04-15 Thread Amit Patel
 Hi

I am conducting ANOVA using the aov function.
I am also conducting TukeyHSD to find out which of the groups differ.
How can I obtain the first three p-values from the list below?

zzz.aov <- aov(Intensity ~ Group, data = zzzanova)

 TukeyHSD(zzz.aov) 
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Intensity ~ Group, data = zzzanova)

$Group
                      diff       lwr      upr     p adj
Group2-Group1  0.778354812 -3.414233 4.970943 0.9607836
Group3-Group1 -0.734044848 -5.073786 3.605696 0.9698685
Group4-Group1 -0.000158625 -4.192747 4.192429 1.000
Group3-Group2 -1.512399661 -5.852140 2.827341 0.7933015
Group4-Group2 -0.778513438 -4.971101 3.414075 0.9607611
Group4-Group3  0.733886223 -3.605855 5.073627 0.9698870
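
The object returned by TukeyHSD() is a list with one matrix per factor, so the p-values can be picked out by row and column like any other matrix. A minimal sketch with simulated data (the numbers are random, only the names follow the post):

```r
set.seed(1)
# Simulated stand-in for zzzanova: 4 groups of 10 values
zzzanova <- data.frame(
  Intensity = rnorm(40),
  Group = factor(rep(paste0("Group", 1:4), each = 10))
)

zzz.aov <- aov(Intensity ~ Group, data = zzzanova)
tukey   <- TukeyHSD(zzz.aov)

# tukey$Group is a matrix with columns diff, lwr, upr and "p adj"
first_three <- tukey$Group[1:3, "p adj"]
first_three
```

The result is a named vector, so a single comparison can also be picked by name, for example tukey$Group["Group2-Group1", "p adj"].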







[R] Help with ANOVA in R

2010-03-09 Thread Amit Patel
Hi

I am attempting ANOVA analysis to compare results from four groups 
(Samp1-4), which are lists of intensities from the experiment. I am 
doing this by first creating a structured list of the data and then 
conducting the ANOVA (script provided below). I'm an R beginner so am 
not sure if I am using this correctly. Two major questions I have are:

1) Is using the code (zzz.aov <- aov(Intensity ~ Group + Error(Sample), 
data = zzzanova)) the correct method to calculate the variances between 
the four groups (Samp1-4)? I am unsure of the inclusion of the error 
portion.

2) I believe this method (aov) assumes equal variances. How can I adjust this 
to do an ANOVA with unequal variances?
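
For question 2, base R has oneway.test(), which performs Welch's one-way ANOVA when var.equal = FALSE and so does not assume equal variances. The sketch below uses simulated data with the same group sizes as the script (16, 16, 14, 16). The intensities are random numbers, not the real measurements.

```r
set.seed(123)
# Simulated stand-in for zzzanova with deliberately unequal spreads
zzzanova <- data.frame(
  Intensity = c(rnorm(16, 0, 1), rnorm(16, 0.5, 2),
                rnorm(14, 1, 0.5), rnorm(16, 0, 3)),
  Group = factor(rep(paste0("Group", 1:4), times = c(16, 16, 14, 16)))
)

# Welch's ANOVA: no equal-variance assumption
welch <- oneway.test(Intensity ~ Group, data = zzzanova, var.equal = FALSE)

# Classical F test for comparison (same test as aov(Intensity ~ Group))
classic <- oneway.test(Intensity ~ Group, data = zzzanova, var.equal = TRUE)

c(welch = welch$p.value, classic = classic$p.value)
```

On question 1, and only as a general remark: with a single measurement per sample, an Error(Sample) term has nothing to separate, so the plain Intensity ~ Group formula is probably what is wanted.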




#SCRIPT STARTS
#Creates a structured list suitable for ANOVA analysis
# Intensity Group (1,2,3,4) Sample(1:62)
zzzanova <-
structure(list(Intensity = c(t(Samp1), t(Samp2), t(Samp3), t(Samp4)), 
Group = structure(c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
 3,3,3,3,3,3,3,3,3,3,3,3,3,3,
 4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4), .Label = c("Group1", "Group2", 
"Group3", "Group4"), class = "factor"), 
Sample = structure(c( 1, 2, 3, 4, 5, 6, 7, 8, 9, 
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 
20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 
30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 
40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62)
))
, .Names = c("Intensity", 
"Group", "Sample"), row.names = 
c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61,62),class = "data.frame")

#Conducts the ANOVA for that PCI
zzz.aov <- aov(Intensity ~ Group + Error(Sample), data = zzzanova)

#SCRIPT ENDS


THANKS IN ADVANCE






[R] ANOVA questions

2010-03-03 Thread Amit Patel
I am attempting ANOVA analysis to compare results from four groups (Samp1-4), 
which are lists of intensities from the experiment. I am doing this by first 
creating a structured list of the data and then conducting the ANOVA (script 
provided below). I'm an R beginner so am not sure if I am using this correctly. 
Two major questions I have are:

1) Is using the code (zzz.aov <- aov(Intensity ~ Group + Error(Sample), data = 
zzzanova)) the correct method to calculate the variances between the four 
groups (Samp1-4)? I am unsure of the inclusion of the error portion.

2) I believe this method (aov) assumes equal variances. How can I adjust this 
to do an ANOVA with unequal variances?




#SCRIPT STARTS
#Creates a structured list suitable for ANOVA analysis
# Intensity Group (1,2,3,4) Sample(1:62)
zzzanova <-
structure(list(Intensity = c(t(Samp1), t(Samp2), t(Samp3), t(Samp4)), 
Group = structure(c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
 2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
 3,3,3,3,3,3,3,3,3,3,3,3,3,3,
 4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4), .Label = c("Group1", "Group2", 
"Group3", "Group4"), class = "factor"), 
Sample = structure(c( 1, 2, 3, 4, 5, 6, 7, 8, 9, 
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 
20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 
30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 
40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62)
))
, .Names = c("Intensity", 
"Group", "Sample"), row.names = 
c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61,62),class = "data.frame")

#Conducts the ANOVA for that PCI
zzz.aov <- aov(Intensity ~ Group + Error(Sample), data = zzzanova)

#SCRIPT ENDS


THANKS IN ADVANCE



  


[R] Bartlett Test

2010-03-01 Thread Amit Patel
Hi 

I am trying to conduct a Bartlett test between two groups, Samp1 and Samp2, 
both of which are vectors of equal length. I can't find any information on how 
to do this. Does the data need to be in a structured list?

Thanks in advance







Re: [R] Bartlett Test

2010-03-01 Thread Amit Patel
Apparently the vectors that I have are lists, and when I try to conduct the 
Bartlett test I get errors. I may be using it wrong. I am not an R 
expert.

xbartlett2 <- c(Samp1,Samp2)
#Samp1 & 2 are lists
gbartlett24 <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)

Attempt 1
 gazant <- bartlett.test(xbartlett2~gbartlett24)$p.value
Error in model.frame.default(formula = xbartlett2 ~ gbartlett24) : 
  invalid type (list) for variable 'xbartlett2'

xbartlett2 <- c(Samp1,Samp2)



 gazant <- bartlett.test(xbartlett2, gbartlett24)$p.value
Error in bartlett.test.default(xbartlett2, gbartlett24) : 
  there must be at least 2 observations in each group

(There are at least 2 unique observations in each group)

Thanks in advance





----- Original Message -----
From: Richardson, Patrick patrick.richard...@vai.org
To: Amit Patel amitrh...@yahoo.co.uk; r-help@r-project.org 
r-help@r-project.org
Sent: Mon, 1 March, 2010 14:31:15
Subject: RE: [R] Bartlett Test

?bartlett.test

From the help page: "If x is a list, its elements are taken as the samples or 
fitted linear models to be compared for homogeneity of variances. In this 
case, the elements must either all be numeric data vectors or fitted linear 
model objects, g is ignored, and one can simply use bartlett.test(x) to 
perform the test. If the samples are not yet contained in a list, use 
bartlett.test(list(x, ...)). 

Otherwise, x must be a numeric data vector, and g must be a vector or factor 
object of the same length as x giving the group for the corresponding elements 
of x."

Patrick
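
Reading the help text against the two errors: c(Samp1, Samp2) on two lists gives a list of 32 one-number elements, so bartlett.test() presumably treats each element as its own sample with a single observation. Flattening the lists with unlist() first should avoid both errors. The sketch below assumes Samp1 and Samp2 are lists of single numbers (for example one row of a data frame), and the values are simulated.

```r
set.seed(7)
# Simulated stand-ins: two lists of 16 single numbers each
Samp1 <- as.list(rnorm(16, sd = 1))
Samp2 <- as.list(rnorm(16, sd = 2))

# Option A: one numeric vector plus a grouping factor
x <- unlist(c(Samp1, Samp2))
g <- factor(rep(c(1, 2), each = 16))
p_grouped <- bartlett.test(x, g)$p.value

# Option B: a list with one numeric vector per sample, as the help page describes
p_listed <- bartlett.test(list(unlist(Samp1), unlist(Samp2)))$p.value

c(grouped = p_grouped, listed = p_listed)
```

Both calls describe the same two samples, so they should return the same p-value.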



[R] Help with averaging

2009-07-15 Thread Amit Patel

Hi
I am using the following script to average a set of data of 62 columns into 31 
columns. The data contains values of ln(0.01), or -4.60517, instead of NAs. 
These need to be averaged for each row (i.e. 2 values being averaged). What 
would I need to change to meet these conditions:


1. If each run of the sample has a value, the average is given.
2. If only one run of the sample has a value, that value is given (i.e. no 
averaging).
3. If both runs have missing values, NA is given.

I have tried changing (na.strings = "NA") to (na.strings = "-4.60517"), but this 
makes every pair with even one NA return an NA value. I would prefer not to 
use for loops, as this would slow the script down considerably.

#SCRIPT STARTS
rm(list=ls())
setwd("C:/Documents and Settings/Amit Patel")

#na.strings makes na's readable
zz <- read.csv("Pr9549_LabelFreeData_ByExperimentAK.csv", strip.white = TRUE, 
na.strings = "NA")

ix <- seq(from=2,to=62, by=2)
averagedResults <- (zz[,ix] + zz[,ix+1])/2
averagedResults <-  cbind(zz[,1],averagedResults )
colnames(averagedResults)  <- 
c("PCI","G1-C1","G1-C2","G1-C3","G1-C4","G1-C5","G1-C6","G1-C7","G1-C8",
"G2-C9","G2-C10","G2-C11","G2-C12","G2-C13","G2-C14","G2-C15","G2-C16",
"G3-C17","G3-C18","G3-C19","G3-C20","G3-C21","G3-C22","G3-C23",
"G4-C24","G4-C25","G4-C26","G4-C27","G4-C28","G4-C29","G4-C30","G4-C31")

write.csv(averagedResults, file = "Pr9549_averagedreplicates.csv", row.names = 
FALSE)
#SCRIPT ENDS
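
The three conditions can be met without a loop by counting, for every cell, how many of the two runs are present and dividing the sum of the present values by that count. The sketch below assumes the placeholders have already been read in as NA (for instance with na.strings = "-4.60517"). average_pairs is a hypothetical helper name and the two small matrices are made up.

```r
# Average two runs element by element:
#  both present -> mean, one present -> that value, none present -> NA
average_pairs <- function(run1, run2) {
  run1 <- as.matrix(run1)
  run2 <- as.matrix(run2)
  n_present <- (!is.na(run1)) + (!is.na(run2))            # 0, 1 or 2 per cell
  total <- ifelse(is.na(run1), 0, run1) + ifelse(is.na(run2), 0, run2)
  avg <- total / n_present                                 # 0/0 gives NaN
  avg[n_present == 0] <- NA                                # both runs missing
  avg
}

# Made-up example covering all three cases
run1 <- matrix(c(1, NA, NA, 4), ncol = 1)
run2 <- matrix(c(3, 2, NA, NA), ncol = 1)
average_pairs(run1, run2)
```

In the script above this would replace the (zz[,ix] + zz[,ix+1])/2 line with something like average_pairs(zz[, ix], zz[, ix + 1]).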







[R] T.test error help

2009-07-09 Thread Amit Patel

Hi, I am running t.test in a loop, and where the data are the same I get 
an error message. 

Error in t.test.default(Samp3, Samp1, na.rm = TRUE, var.equal = FALSE,  : 
  data are essentially constant

The script i am using is 

for (i in 1:length(zz[,1])) {

Samp1 - zz[i,2:17]
Samp2 - zz[i,18:33]
Samp3 - zz[i,34:47]
Samp4 - zz[i,48:63] 

TTestResult[i,2] - t.test(Samp2, Samp1, na.rm=TRUE, var.equal = FALSE, 
paired=FALSE, conf.level=0.95)$p.value


TTestResult[i,3] - t.test(Samp3, Samp1, na.rm=TRUE, var.equal = FALSE, 
paired=FALSE, conf.level=0.95)$p.value

TTestResult[i,4] - t.test(Samp4, Samp1, na.rm=TRUE, var.equal = FALSE, 
paired=FALSE, conf.level=0.95)$p.value

}

Is there a way to make my loop ignore this problem and go onto the next 
iteration
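
A minimal sketch of one way to do this: wrap the call in tryCatch so a failing row yields NA instead of stopping the loop. The toy data and the helper name safe_p are assumptions for illustration, not part of the original script.

```r
set.seed(1)
vals <- matrix(rnorm(3 * 8), nrow = 3)
vals[2, ] <- 5                       # constant row, triggers the error
zz <- data.frame(id = 1:3, vals)

# Hypothetical helper: return the p-value, or NA if t.test stops with an error.
safe_p <- function(x, y) {
  tryCatch(t.test(x, y, var.equal = FALSE, paired = FALSE)$p.value,
           error = function(e) NA_real_)
}

p <- numeric(nrow(zz))
for (i in seq_len(nrow(zz))) {
  grp1 <- unlist(zz[i, 2:5], use.names = FALSE)
  grp2 <- unlist(zz[i, 6:9], use.names = FALSE)
  p[i] <- safe_p(grp1, grp2)
}
```

Rows where the test cannot be computed then show up as NA in p and can be filtered afterwards.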







[R] Help with if statements

2009-06-09 Thread Amit Patel

Hi 
I am trying to create a column in a data frame which gives a significance score 
from 0-7. It should read values from 7 different columns and add 1 to the 
counter if the value is <= 0.05. I get an error message saying 

Error in if (ALLRESULTS[i, 16] <= 0.05) significance_count = significance_count 
+  : 
  missing value where TRUE/FALSE needed

The script is included below.

It works if I convert the NA values to zero, but this is not appropriate as it 
counts the zero as significant. 

ANY SUGGESTIONS?



#SCRIPT STARTS
for (i in 1:length(ALLRESULTS[,1])) {
significance_count = 0

if (ALLRESULTS[i,16] <= 0.05 )  significance_count = significance_count +1 else 
significance_count = significance_count
if (ALLRESULTS[i,17] <= 0.05 )  significance_count = significance_count +1 else 
significance_count = significance_count
if (ALLRESULTS[i,18] <= 0.05 )  significance_count = significance_count +1 else 
significance_count = significance_count
if (ALLRESULTS[i,19] <= 0.05 )  significance_count = significance_count +1 else 
significance_count = significance_count
if (ALLRESULTS[i,20] <= 0.05 )  significance_count = significance_count +1 else 
significance_count = significance_count
if (ALLRESULTS[i,21] <= 0.05 )  significance_count = significance_count +1 else 
significance_count = significance_count
if (ALLRESULTS[i,22] <= 0.05 )  significance_count = significance_count +1 else 
significance_count = significance_count

ALLRESULTS[i,23] <- significance_count}
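
One possible alternative, sketched on made-up p-values: compare all seven columns at once and let rowSums skip the NAs, so a missing value simply does not count. The toy matrix is an assumption for illustration.

```r
# Toy stand-in: 22 columns, with p-values in columns 16 to 22.
pv <- matrix(NA_real_, nrow = 3, ncol = 22)
pv[1, 16:22] <- c(0.01, 0.20, NA, 0.04, 0.05, 0.90, NA)
pv[3, 16:22] <- c(0.001, 0.002, 0.003, 0.004, 0.005, 0.006, 0.007)
ALLRESULTS <- as.data.frame(pv)

# The comparison gives TRUE/FALSE/NA; na.rm = TRUE drops the NAs from the count.
ALLRESULTS[, 23] <- rowSums(ALLRESULTS[, 16:22] <= 0.05, na.rm = TRUE)
```

This also removes the explicit loop over rows.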






[R] help to speed up loops in r

2009-06-08 Thread Amit Patel

Hi
I am using a script which involves the following loop. It attempts to reduce a 
data frame (zz) of 95000 * 41 down to a data frame (averagedreplicates) of 95000 
* 21 by averaging the replicate values, as you can see in the script below. This 
script however is very slow (2 days). Any suggestions to speed it up?

NB I have also tried using rowMeans rather than adding the 2 values and 
dividing by 2 (same problem).




#SCRIPT STARTS
for (i in 1:length(averagedreplicates[,1]))
#for (i in 1:dim(averagedreplicates)[1])
{
cat(i,'\n')


#calculates means
#Sample A
averagedreplicates[i,2] <- (zz[i,2] + zz[i,3])/2
averagedreplicates[i,3] <- (zz[i,4] + zz[i,5])/2
averagedreplicates[i,4] <- (zz[i,6] + zz[i,7])/2
averagedreplicates[i,5] <- (zz[i,8] + zz[i,9])/2
averagedreplicates[i,6] <- (zz[i,10] + zz[i,11])/2

#Sample B
averagedreplicates[i,7] <- (zz[i,12] + zz[i,13])/2
averagedreplicates[i,8] <- (zz[i,14] + zz[i,15])/2
averagedreplicates[i,9] <- (zz[i,16] + zz[i,17])/2
averagedreplicates[i,10] <- (zz[i,18] + zz[i,19])/2
averagedreplicates[i,11] <- (zz[i,20] + zz[i,21])/2

#Sample C
averagedreplicates[i,12] <- (zz[i,22] + zz[i,23])/2
averagedreplicates[i,13] <- (zz[i,24] + zz[i,25])/2
averagedreplicates[i,14] <- (zz[i,26] + zz[i,27])/2
averagedreplicates[i,15] <- (zz[i,28] + zz[i,29])/2
averagedreplicates[i,16] <- (zz[i,30] + zz[i,31])/2

#Sample D
averagedreplicates[i,17] <- (zz[i,32] + zz[i,33])/2
averagedreplicates[i,18] <- (zz[i,34] + zz[i,35])/2
averagedreplicates[i,19] <- (zz[i,36] + zz[i,37])/2
averagedreplicates[i,20] <- (zz[i,38] + zz[i,39])/2
averagedreplicates[i,21] <- (zz[i,40] + zz[i,41])/2
  }
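
A sketch of a vectorised version, assuming the 40 value columns are numeric and laid out as consecutive replicate pairs. The toy data and the names m, odd and even are assumptions for illustration. Element-by-element assignment into a data frame is usually the slow part, so the whole table is averaged in one matrix operation.

```r
set.seed(42)
zz <- data.frame(id = 1:5, matrix(runif(5 * 40), nrow = 5))

m    <- as.matrix(zz[, -1])            # the 40 value columns as a numeric matrix
odd  <- seq(1, ncol(m) - 1, by = 2)    # first replicate of each pair
even <- odd + 1                        # second replicate of each pair

averagedreplicates <- data.frame(id = zz$id, (m[, odd] + m[, even]) / 2)
```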


   


[R] help wth boxplots

2009-05-26 Thread Amit Patel

Hi

I have a vector of data, let's call it zz (40 values from 4 samples).
The data is already in groups; I can even split up the samples using

SampA <- zz[,2:11]
SampB <- zz[,12:21]
SampC <- zz[,22:31]
SampV <- zz[,32:41]

I would like an output that gives me 4 boxplots on one plot,
one boxplot for each set of 10 values.

How can I do this in R?
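
A minimal sketch, assuming zz is a one-row data frame with an ID in column 1 (the random toy data is an assumption). Passing a named list to boxplot draws one box per list element on the same axes.

```r
set.seed(1)
zz <- data.frame(id = "feature1", matrix(rnorm(40), nrow = 1))

# One numeric vector of 10 values per sample.
groups <- list(
  SampA = unlist(zz[1, 2:11],  use.names = FALSE),
  SampB = unlist(zz[1, 12:21], use.names = FALSE),
  SampC = unlist(zz[1, 22:31], use.names = FALSE),
  SampV = unlist(zz[1, 32:41], use.names = FALSE)
)

boxplot(groups, xlab = "Sample", ylab = "Value")
```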


  


[R] Help with loops

2009-05-15 Thread Amit Patel

Hi
I am trying to create a loop which averages replicates in my data.
The original data has many rows and consists of 40 columns, zz[,2:41], plus row 
headings in zz[,1].
I am trying to average each pair of values (i.e. zz[1,2:3] averaged and placed 
in average_value[1,2], and so on).
Below is my script, but it seems to be stuck in an endless loop.
Any suggestions??

for (i in 1:length(average_value[,1])) {
average_value[i] <- i^100; print(average_value[i])

#calculates means
#Sample A
average_value[i,2] <- rowMeans(zz[i,2:3])
average_value[i,3] <- rowMeans(zz[i,4:5])
average_value[i,4] <- rowMeans(zz[i,6:7])
average_value[i,5] <- rowMeans(zz[i,8:9])
average_value[i,6] <- rowMeans(zz[i,10:11])

#Sample B
average_value[i,7] <- rowMeans(zz[i,12:13])
average_value[i,8] <- rowMeans(zz[i,14:15])
average_value[i,9] <- rowMeans(zz[i,16:17])
average_value[i,10] <- rowMeans(zz[i,18:19])
average_value[i,11] <- rowMeans(zz[i,20:21])

#Sample C
average_value[i,12] <- rowMeans(zz[i,22:23])
average_value[i,13] <- rowMeans(zz[i,24:25])
average_value[i,14] <- rowMeans(zz[i,26:27])
average_value[i,15] <- rowMeans(zz[i,28:29])
average_value[i,16] <- rowMeans(zz[i,30:31])

#Sample D
average_value[i,17] <- rowMeans(zz[i,32:33])
average_value[i,18] <- rowMeans(zz[i,34:35])
average_value[i,19] <- rowMeans(zz[i,36:37])
average_value[i,20] <- rowMeans(zz[i,38:39])
average_value[i,21] <- rowMeans(zz[i,40:41])
  }


thanks






[R] Fw: Help with loops(corrected question)

2009-05-15 Thread Amit Patel



--- On Fri, 15/5/09, Amit Patel amitrh...@yahoo.co.uk wrote:

> From: Amit Patel amitrh...@yahoo.co.uk
> Subject: Help with loops
> To: r-help@r-project.org
> Date: Friday, 15 May, 2009, 12:17 PM
> Hi
> I am trying to create a loop which averages replicates in
> my data.
> The original data has many rows. and consists of 40 column
> zz[,2:41] plus row headings in zz[,1]
> I am trying to average each set of values (i.e. zz[1,2:3]
> averaged and placed in average_value[1,2] and so on.
> below is my script but it seems to be stuck in an endless
> loop
> Any suggestions??
> 
> for (i in 1:length(zz[,1])) {
> 
> #calculates Meanss
> #Sample A
> average_value[i,2] <- rowMeans(zz[i,2:3])
> average_value[i,3] <- rowMeans(zz[i,4:5])
> average_value[i,4] <- rowMeans(zz[i,6:7])
> average_value[i,5] <- rowMeans(zz[i,8:9])
> average_value[i,6] <- rowMeans(zz[i,10:11])
> 
> #Sample B
> average_value[i,7] <- rowMeans(zz[i,12:13])
> average_value[i,8] <- rowMeans(zz[i,14:15])
> average_value[i,9] <- rowMeans(zz[i,16:17])
> average_value[i,10] <- rowMeans(zz[i,18:19])
> average_value[i,11] <- rowMeans(zz[i,20:21])
> 
> #Sample C
> average_value[i,12] <- rowMeans(zz[i,22:23])
> average_value[i,13] <- rowMeans(zz[i,24:25])
> average_value[i,14] <- rowMeans(zz[i,26:27])
> average_value[i,15] <- rowMeans(zz[i,28:29])
> average_value[i,16] <- rowMeans(zz[i,30:31])
> 
> #Sample D
> average_value[i,17] <- rowMeans(zz[i,32:33])
> average_value[i,18] <- rowMeans(zz[i,34:35])
> average_value[i,19] <- rowMeans(zz[i,36:37])
> average_value[i,20] <- rowMeans(zz[i,38:39])
> average_value[i,21] <- rowMeans(zz[i,40:41])
>   }
> 
> 
> thanks
 
 
 
 






[R] Help- extracting values

2009-04-16 Thread Amit Patel

I have csv files imported into R, each with 2 columns and many, many rows. I have 
sorted the data in them but want to extract some values.

The first column is an ID.
The second is a p-value (now sorted in increasing order with NAs last).
I want to extract the rows with a p-value of less than 0.05.

What commands would help?
The table is called AnovaSort, with column headings MCI and p-value.

Many thanks in advance
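
A minimal sketch on made-up values. It assumes the second column ends up named p.value, which is what read.csv would normally make of a "p-value" heading; check names(AnovaSort) on the real table.

```r
AnovaSort <- data.frame(MCI = c(11, 12, 13, 14),
                        p.value = c(0.001, 0.03, 0.2, NA))

# which() drops the NA comparisons, so rows with a missing p-value are left out.
significant <- AnovaSort[which(AnovaSort$p.value < 0.05), ]
```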







[R] Help with ANOVA p-values

2009-04-14 Thread Amit Patel

Hi
I have done ANOVA on a dataset (see below) but am having problems retrieving 
the p-value. I am assuming that Pr(>F) is the p-value but cannot get this value, 
or in fact any other value (e.g. Df), from the summary. Any suggestions??

I have tried

> sum <- summary(zzz.aov)
> sum$Pr(>F)
Error: unexpected '>' in "sum$Pr(>"
> sum$"Pr(>F)"
NULL
> sum$Df
NULL




> zzz.aov <- aov(Intensity ~ Group + Error(Sample), data = zzzanova)
> summary(zzz.aov)

Error: Sample
      Df     Sum Sq    Mean Sq
Group  1 6.0313e+10 6.0313e+10

Error: Within
          Df     Sum Sq    Mean Sq F value Pr(>F)
Group      3 2.6012e+10 8.6707e+09  0.2934 0.8299
Residuals 34 1.0049e+12 2.9556e+10 
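
A sketch of how the values can be pulled out, using made-up data with the same model formula. With an Error() term the summary is a list with one entry per error stratum, and each entry holds the printed table as a data frame, so the p-value sits two levels down. The toy data frame d is an assumption; the stratum name "Error: Within" matches the printout above.

```r
set.seed(1)
d <- data.frame(Intensity = rnorm(40),
                Group  = factor(rep(c("A", "B", "C", "V"), each = 10)),
                Sample = factor(rep(1:10, times = 4)))

fit <- aov(Intensity ~ Group + Error(Sample), data = d)
s   <- summary(fit)

within_tab <- s[["Error: Within"]][[1]]   # the table printed under "Error: Within"
p_group    <- within_tab[["Pr(>F)"]][1]   # first row is the Group term
dfs        <- within_tab[["Df"]]          # degrees of freedom, Group then Residuals
```

Running names(s) and str(s) on the real object shows which strata are available.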






[R] Help with Wilcoxon Test

2009-03-02 Thread Amit Patel
Hi
I have 2 sets of data that I want to do a Wilcoxon test on. They are of the 
same dimension. One has 4 zero values and the other has 5.
> dim(SampA)
[1]  1 10
> dim(SampV)
[1]  1 10
 
I get the following error:

Error in wilcox.test.default(SampA, SampV, na.rm = TRUE, paired = FALSE,  : 
  'x' must be numeric



I am using the function
wilcox.test(SampA, SampV, na.rm=TRUE, paired=FALSE, conf.level=0.95)
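
A possible fix, assuming SampA and SampV are 1 x 10 data frames (as the dim output suggests): a data frame is a list, not a numeric vector, so it is flattened before the test. The values below are made up.

```r
SampA <- data.frame(matrix(c(0, 0, 0, 0, 3, 5, 2, 8, 6, 4), nrow = 1))
SampV <- data.frame(matrix(c(0, 0, 0, 0, 0, 7, 9, 1, 10, 11), nrow = 1))

a <- as.numeric(unlist(SampA))   # plain numeric vector of 10 values
v <- as.numeric(unlist(SampV))

# exact = FALSE because the repeated zeros are ties.
res <- wilcox.test(a, v, paired = FALSE, exact = FALSE)
res$p.value
```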







[R] Help with t.test

2009-02-24 Thread Amit Patel

Hi, I have managed to do a paired t-test with a data set.

I have 5 columns of data I'm dealing with:


GENE   SampA  SampB  SampC  SampVehicle
ctcc   859    na     145    24
gtcg   45     5      54     69

and so on, but they are much larger columns.

Each column has been split and I can do a t.test on e.g. SampA by doing
t.test(sampA, SampVehicle, na.rm=TRUE, paired=TRUE, conf.level=0.95)

What can I do to identify which of the genes are responsible for the 
most variance or are the most significant?

THANKS IN ADVANCE







[R] Help with t.test

2009-02-23 Thread Amit Patel
Hi, I have managed to do a paired t-test with a data set.

I have 5 columns of data I'm dealing with:


GENE   SampA  SampB  SampC  SampVehicle
ctcc   859    na     145    24
gtcg   45     5      54     69

and so on, but they are much larger columns.

Each column has been split and I can do a t.test on e.g. SampA by doing
t.test(sampA, SampVehicle, na.rm=TRUE, paired=TRUE, conf.level=0.95)

What can I do to identify which of the genes are responsible for the 
most variance or are the most significant?

THANKS IN ADVANCE






  


[R] help with reshaping

2009-02-13 Thread Amit Patel
Hi, I'm having some problems reshaping.
I've managed to apply it but have some problems.
The attached document will explain.
Any help is appreciated.





[R] help with reshaping (no file attached)

2009-02-13 Thread Amit Patel





 
  
MCI    A1        A2       A13      A14      A23      A24       A33      A34  Grouped together
56766  N/A       N/A      N/A      N/A      N/A      N/A       N/A      N/A  N/A
6459   N/A       N/A      N/A      N/A      N/A      N/A       N/A      N/A  N/A
31233  N/A       N/A      N/A      N/A      N/A      N/A       71280.7  N/A  N/A
16790  N/A       N/A      N/A      N/A      N/A      N/A       N/A      N/A  N/A
13392  284699.6  N/A      N/A      N/A      N/A      N/A       N/A      N/A  N/A
1575   N/A       1196152  1236735  1322735  1100289  887130.2  N/A      N/A  N/A
  
 


Figure 1 - Takeda2_nas.csv

Trying to get data in suitable format for Genstat



#clear workspace
rm(list=ls())

install.packages("reshape")
library(reshape)

#Read in data
#Takeda <- read.table("Takeda2_nas.csv", sep = ",", header = TRUE, row.names = 1)
#Takedastack <- cbind(Takeda[gl(nrow(Takeda), 1, 40*nrow(Takeda)), 1], stack(Takeda[, 1:41]))

zz <- read.csv("Takeda2_nas.csv", strip.white = TRUE)
#zzz <- cbind(zz[gl(nrow(zz), 1, 40*nrow(zz)), 1], stack(zz[, 2:41]))

#Use reshape function to change data
zzz <- reshape(zz,varying=list(c("A1","A2","A13","A14","A23","A24","A33","A34","A39","A40","B9","B10","B5","B6","B15","B16","B27","B28","B31","B32","C3","C4","C7","C8","C11","C12","C17","C18","C21","C22","V19","V20","V25","V26","V29","V30","V35","V36","V37","V38")),direction="long")
#not the result I wanted

#write a table to excel
write.table(zzz, "Takedashift.csv", sep=",")



Script 1 - datashift.r               

The result with above commands


 
  
MCI (NONSENSE)  Time (Actually MCI)  A1 (1 = A1 and so on)  Id (Intensity)  (COUNT)
1.1             56766                1                      N/A             1
2.1             6459                 1                      N/A             2
3.1             31233                1                      N/A             3
  
 


 

 

Want the data to be


 
  
MCI    ID(sample)  Intensity
56766  A1          N/A
6459   A1          N/A
31233  A1          N/A
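
One way to get that long layout with base R only, sketched on a cut-down toy table (three MCI rows and two sample columns are assumptions; the real file has 40 sample columns). It assumes the N/A entries were read in as NA so the sample columns are numeric.

```r
zz <- data.frame(MCI = c(56766, 6459, 31233),
                 A1  = c(NA_real_, NA_real_, NA_real_),
                 A33 = c(NA, NA, 71280.7))

stacked <- stack(zz[, -1])        # two columns: values, ind (the source column name)

long <- data.frame(MCI       = rep(zz$MCI, times = ncol(zz) - 1),
                   ID        = stacked$ind,
                   Intensity = stacked$values)
```

stack() goes through the columns one after another, which is why repeating the MCI column once per sample column lines up with it.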
  
 


 




  


[R] ANOVA in R

2009-01-29 Thread Amit Patel
Hi

I have a very large dataset that I would like to conduct ANOVA tests on. I'm not 
a very strong programmer so any help would be appreciated.
The format is 

Identifier  A1  A2  B1  B2  C1  C2  Norm1  Norm2
1234        1   1   NA  NA  4   3   NA     NA
4567        2   2   4   4   8   8   9      9

and so on.
I have 10 runs for 3 different doses plus the normal state. Any help greatly 
appreciated.



[R] t-test

2009-01-29 Thread Amit Patel
When doing the t-test in the manner below, will R compare each element of the 
array with the corresponding one? I.e. if I was comparing x and y, would 
(1 and 0.9) and (1.1 and 1) be treated as separate pairs, or does it just treat 
each vector as one variable?


# test data

x <- c(1,1.1,1.15,1.2,1.21,1.23)

y <- c(0.9,1,1.16,1.18,1.19,1.2)

z <- c(1.4,1.42,1.43,1.44,1.45,1.46)


###  Student's t-test

# for help in R type ?t.test

# defaults are:
# alternative = "two.sided" i.e. two-sided t-test
# var.equal = FALSE i.e. unequal variance

# note:
# na.rm = TRUE removes missing values
# $p.value gives the p-value for the test

t.test(x, y, na.rm=TRUE, paired=FALSE)$p.value

# gives 0.5026467

t.test(x, z, na.rm=TRUE, paired=FALSE)$p.value

# gives 0.0003166352
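
A short check of the difference, using the x and y from the post. With paired=FALSE each vector is treated as one sample and element order does not matter; with paired=TRUE the elements are matched up one by one, which is the same as a one-sample test on the differences.

```r
x <- c(1, 1.1, 1.15, 1.2, 1.21, 1.23)
y <- c(0.9, 1, 1.16, 1.18, 1.19, 1.2)

p_unpaired <- t.test(x, y, paired = FALSE)$p.value        # two independent samples
p_shuffled <- t.test(x, rev(y), paired = FALSE)$p.value   # order of y is irrelevant here
p_paired   <- t.test(x, y, paired = TRUE)$p.value         # element i of x vs element i of y
p_diffs    <- t.test(x - y)$p.value                       # same test, written on the differences
```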



  


[R] Stacking data

2009-01-29 Thread Amit Patel
Hi

I have data in the format below

     Age            V1       V2      V3       V4
   23646         45190 50333 55166 56271
   26174         35535 38227 37911 41184
   27723          25691 25712 26144 26398

and would like to sort it as follows

                 Age      values     ind
                23646    45190    V1
               26174   35535      V1
               27723    25691     V2
              27193    30949      V2

But i have 41 columns (age column  +  40 individuals) 
I have the following script but an error is thrown up
can anyone help, where am i going wrong

zz <- read.csv("Filename.csv", strip.white = TRUE)
zzz <- cbind(zz[gl(nrow(zz), 1, 40*nrow(zz)), 1], stack(zz[, 2:41]))

ERROR MESSAGE
Error in data.frame(..., check.names = FALSE) : 
  arguments imply differing number of rows: 3789080, 0




  


[R] structure Arrays

2008-12-16 Thread Amit Patel
Hi

Does anyone know how I can use structured arrays in r
similar to a dataframe in matlab
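
A rough sketch of the nearest base R equivalents, assuming "structured array" means records with named fields of mixed type. All names and values below are made up.

```r
# A single record: a named list whose fields can have different types and lengths.
sample1 <- list(name = "A1", dose = 10, intensity = c(1.2, 3.4, 5.6))
sample1$dose                      # field access with $

# Several records: a list of such lists.
samples <- list(sample1,
                list(name = "B9", dose = 20, intensity = c(2.1, 0.7)))
doses <- sapply(samples, function(s) s$dose)

# When every field has exactly one value per record, a data frame is usually handier.
records <- data.frame(name = c("A1", "B9"), dose = c(10, 20))
```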



  


[R] PCA HCA

2008-10-19 Thread Amit Patel

Hi
I am attempting PCA and HCA on a dataset
The head of the table looks like this


VariableSamp1Samp2Samp3
109232
276 352
222 244


I can't stop R from treating the 1st column as a sample.
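
One possible approach: read the first column in as row names, so only the sample columns remain as data. The CSV text below is made up (the values in the post did not survive), and putting samples in rows via t() is an assumption about the intended analysis.

```r
csv_text <- "Variable,Samp1,Samp2,Samp3
m1,10,9,232
m2,27,6,352
m3,22,2,244"

# row.names = 1 turns the Variable column into row labels instead of data.
dat <- read.csv(text = csv_text, row.names = 1)

pca <- prcomp(t(dat))          # samples as rows, variables as columns
hca <- hclust(dist(t(dat)))    # hierarchical clustering of the samples
```

With a real file, the same row.names = 1 argument goes into the read.csv call that loads it.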



[R] HI

2008-09-16 Thread Amit Patel
Does anyone know an easy way to convert all the zero values in an imported csv 
table into NAs?
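
A minimal sketch, assuming the table is a data frame of numeric columns (the toy table is made up):

```r
tab <- data.frame(a = c(0, 1, 2), b = c(3, 0, 0))

# Logical indexing: every cell equal to zero is replaced by NA.
tab[tab == 0] <- NA
```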



  


[R] Dealing with NA's in a data matrix

2007-12-05 Thread Amit Patel
Hi, I have a matrix with NA values that I would like to convert to a value 
of 0.
Any suggestions?
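
A minimal sketch on a made-up matrix:

```r
m <- matrix(c(1, NA, 3, NA, 5, 6), nrow = 2)

# is.na() marks the missing cells; assigning through that mask replaces only them.
m[is.na(m)] <- 0
```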
 
Kind Regards
Amit Patel






[R] Doing PCA

2007-11-29 Thread Amit Patel
Hi Fellow R enthusiasts
 I have managed to reshape my data using a much shorter script than before. 
Woohoo
However now I have new problems. The code is below. There are no problems with 
the create matrix section.
The problem code is highlighted in bold. I am trying to do PCA on the data.  
Here are the errors.

Error1 
code :  OGSscaled = rangescale(OGS)
error message : Error in dim(newX) <- c(prod(d.call), d2) : 
  dims [product 47960] do not match the length of object [43600]

Error2
I tried to do PCA without range scaling
code : OGSpca <- prcomp(OGS, center=FALSE)
error message : Error in svd(x, nu = 0) : infinite or missing values in 'x'
 

CODE
##
###  Create matrix ###
##

# load packages
require(reshape)
source("rangescale.r")

#Open the csv file
OGSdata <- read.table("MG3199.csv", sep=",", header=TRUE)



#create matrix
x.m <- melt(OGSdata, measure.var="pct")
OGS <- cast(x.m, mci ~ sample)

##
###  PCA  
##

#scale profiles
OGSscaled = rangescale(OGS)

#do PCA
result = prcomp(OGS, center=FALSE)

#obtain scores matrix
scores=result$rotation

#PC1 vs PC2 plot
plot(scores[,1], scores[,2], xlab="PC1", ylab="PC2")

#add labels (0.005 and 0.003 offset to avoid obscuring points)
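
A sketch of one way round the second error, on a made-up matrix: prcomp stops on missing values, so incomplete rows are dropped first. Whether dropping rows (rather than filling them in) suits the real data is an assumption. Note also that in a prcomp result the scores are in $x, while $rotation holds the loadings.

```r
set.seed(1)
OGS <- matrix(rnorm(50), nrow = 10, dimnames = list(NULL, paste0("s", 1:5)))
OGS[2, 3] <- NA                               # one missing value, as in the error

OGS_complete <- OGS[complete.cases(OGS), ]    # keep only rows without NA

result   <- prcomp(OGS_complete, center = FALSE)
scores   <- result$x                          # one row per observation
loadings <- result$rotation                   # one row per original column

plot(scores[, 1], scores[, 2], xlab = "PC1", ylab = "PC2")
```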

Kind Regards
Amit Patel



