Hi,
I am a Ph.D. student from Québec, Canada. Im a beginner with R and
Bioconductor. Until now the only experience I have is in analyzing
microarray data using affy and limma packages. Now I am trying to analyze
Rat Gene 10 st arrays and I would like to run RMA analysis and Smyth
moderated t test on those arrays. Since no cdf official package is available
for those arrays, after reading many of the questions and responses on this
mailing list, I decided to use pdInfoBuilder, oligo and limma packages to
run analysis. The problem is, at the end, I get expression and differential
expression measured for all probe separately but not the calculated
expression representing all probe of each gene. When I run RMA, I got only
two steps, Background correcting and Normalizing but not Calculating
expression. Do you know how I can get differential expression calculated for
each gene? I dont know if the problem is in the package I built or if I can
use some code to answer this question. I list all codes used to build and
install the package pd.ragene.1.0.st.v1 and used to analyze expression
arrays below.
Many thanks for your help,
Anne-Marie Madore
## building the package
> library(Biobase)
Loading required package: tools
Welcome to Bioconductor
Vignettes contain introductory material. To view, type
'openVignette()'. To cite Bioconductor, see
'citation("Biobase")' and for packages 'citation(pkgname)'.
> library(pdInfoBuilder)
Loading required package: RSQLite
Loading required package: DBI
Loading required package: affxparser
Loading required package: oligo
Loading required package: splines
Loading required package: preprocessCore
Loading required package: AnnotationDbi
Loading required package: oligoClasses
oligo Package - Series 1.5.x
> setwd("D:/Anne-Marie/Doctorat/puces ADN macrophages/puces rat/Annie
Dube/Analyse")
> transFile <-
"RaGene-1_0-st-v1.na27.rn4.transcript.csv1/RaGene-1_0-st-v1.na27.rn4.transcr
ipt.csv"
> probeFile <- "RaGene-1_0-st-v1.probe.tab/RaGene-1_0-st-v1.probe.tab"
> clfFile <- "RaGene-1_0-st-v1.r4.clf/RaGene-1_0-st-v1.r4.clf"
> pgfFile <- "RaGene-1_0-st-v1.r4.pgf/RaGene-1_0-st-v1.r4.pgf"
> pkg <- new("AffyGenePDInfoPkgSeed", author="Anne-Marie Madore",
email="[email protected]", version="0.0.1",
+ genomebuild="RefSeq April 3, 2007, GenBank® January 25, 2007, Rat Ensembl
transcripts April 3, 2007 ",
+ biocViews="AnnotationData", pgfFile=pgfFile, clfFile=clfFile,
transFile=transFile, probeFile=probeFile)
> makePdInfoPackage(pkg, destDir=".")
Creating package in ./pd.ragene.1.0.st.v1
loadUnitsByBatch took 50.51 sec
loadAffyCsv took 12.73 sec
loadAffySeqCsv took 57.62 sec
DB sort, index creation took 24.75 sec
[1] TRUE
Warning messages:
1: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
2: In is.na(x) : is.na() applied to non-(list or vector) of type 'NULL'
## installing the package in cmd command shell
Microsoft Windows [version 6.0.6001]
Copyright (c) 2006 Microsoft Corporation. Tous droits réservés.
C:\Users\Anne-Marie Madore>cd c:\Program Files\R\R-2.8.1\bin
c:\Program Files\R\R-2.8.1\bin>R CMD INSTALL pd.ragene.1.0.st.v1
installing to 'c:/PROGRA~1/R/R-28~1.1/library'
---------- Making package pd.ragene.1.0.st.v1 ------------
adding build stamp to DESCRIPTION
installing NAMESPACE file and metadata
installing R files
installing inst files
preparing package pd.ragene.1.0.st.v1 for lazy loading
Loading required package: RSQLite
Loading required package: DBI
Loading required package: oligoClasses
Loading required package: Biobase
Loading required package: tools
Welcome to Bioconductor
Vignettes contain introductory material. To view, type
'openVignette()'. To cite Bioconductor, see
'citation("Biobase")' and for packages 'citation(pkgname)'.
no man files in this package
installing indices
installing help
adding MD5 sums
* DONE (pd.ragene.1.0.st.v1)
## If I run a check (R CMD check pd.ragene.st.v1) I get three warning
messages and one note:
1. * checking R files for non-ASCII characters ... WARNING
Found the following files with non-ASCII characters: all.R Portable packages
must use only ASCII characters in their R code, except perhaps in comments.
2. * checking whether the name space can be loaded with stated
dependencies ... WARNING
Error in initDbConnection() : could not find function "dbConnect" Error:
.onLoad failed in 'loadNamespace' for 'pd.ragene.1.0.st.v1' Execution halted
A namespace must be able to be loaded with just the base namespace loaded:
otherwise if the namespace gets loaded by a saved object, the session will
be unable to start.
Probably some imports need to be declared in the NAMESPACE file.
3. * checking R code for possible problems ... NOTE
closeDb: no visible binding for global variable 'dbCon'
4. * checking for missing documentation entries ... WARNING
Undocumented code objects:
pd.ragene.1.0.st.v1
All user-level objects in a package should have documentation entries. See
the chapter 'Writing R documentation files' in manual 'Writing R
Extensions'.
## analyzing the package
> library("pd.ragene.1.0.st.v1")
> library(oligo)
> library(limma)
> library(genefilter)
Loading required package: survival
> library(geneplotter)
Loading required package: lattice
Loading required package: annotate
Loading required package: xtable
KernSmooth 2.22 installed
Copyright M. P. Wand 1997
> cel.files <- list.celfiles(".", full.names = TRUE)
> basename(cel.files)
[1] "AD_Ctrl_1.CEL" "AD_Ctrl_2.CEL" "AD_Ctrl_3.CEL"
"AD_Ctrl_5.CEL"
[5] "AD_Ctrl_6.CEL" "AD_Traite_10.CEL" "AD_Traite_11.CEL"
"AD_Traite_7.CEL"
[9] "AD_Traite_8.CEL" "AD_Traite_9.CEL"
> test <- read.celfiles(cel.files)
Platform design info loaded.
> phenoData(test) <- read.AnnotatedDataFrame("phenoData.txt", header = TRUE,
row.name=1)
> class(test)
[1] "GeneFeatureSet"
attr(,"package")
[1] "oligoClasses"
> class(phenoData)
[1] "standardGeneric"
attr(,"package")
[1] "methods"
> eset <- rma(test)
Background correcting
Normalizing
> e <- exprs(eset)
> index1 <- 1:5
> index2 <- 6:10
> d <- rowMeans(e[, index1]) - rowMeans(e[, index2])
> design <- model.matrix(~factor(eset$Key))
> fit <- lmFit(eset, design)
> ebayes <- eBayes(fit)
> sample <- row.names(ebayes)
> Pvalue <- ebayes$p.value[,2]
> Mean1 <- rowMeans(e[,index1])
> Mean2 <- rowMeans(e[,index2])
> sd1 <- apply(e[,index1], 1, "sd")
> sd2 <- apply(e[,index2], 1, "sd")
> csv <- read.csv(file="D:/Anne-Marie/Doctorat/puces ADN macrophages/puces
rat/Annie
Dube/Analyse/RaGene-1_0-st-v1.na27.rn4.transcript.header.DOS.csv",head=TRUE,
sep=",")
> all <- merge(csv, Pvalue, by.x="sample", by.y=0, all=T)
> all2 <- merge(all, Mean1, by.x="sample", by.y=0, all=T)
> all3 <- merge(all2, Mean2, by.x="sample", by.y=0, all=T)
> all4 <- merge(all3, sd1, by.x="sample", by.y=0, all=T)
> all5 <- merge(all4, sd2, by.x="sample", by.y=0, all=T)
> all6 <- merge(all5, d, by.x="sample", by.y=0, all=T)
> write.table(data.frame(all6), file="ULTIMETESTtout.xls", sep="\t",
col.names = NA)
> dim(all6)
[1] 240410 24
>
> # I also try using slightly the same way as I used with affy package
>
> pd <- read.AnnotatedDataFrame("pheno.txt", header = TRUE, row.names = 1)
> pData(pd)
Phenotype
AD_Ctrl_1.CEL Control
AD_Ctrl_2.CEL Control
AD_Ctrl_3.CEL Control
AD_Ctrl_5.CEL Control
AD_Ctrl_6.CEL Control
AD_Traite_7.CEL Traited
AD_Traite_8.CEL Traited
AD_Traite_9.CEL Traited
AD_Traite_10.CEL Traited
AD_Traite_11.CEL Traited
> a <- read.celfiles(filenames = rownames(pData(pd)), phenoData = pd,
verbose = TRUE)
Platform design info loaded.
> eset <- rma(a)
Background correcting
Normalizing
> exprs.eset <- exprs(eset)
> index1 <- 1:5
> index2 <- 6:10
> d <- (rowMeans(exprs.eset[,index1]) - rowMeans(exprs.eset[,index2]))
> population.groups <- factor (c(rep("Control",5), rep ("Traited",5)))
> design <- model.matrix (~population.groups)
> fit <- lmFit (eset, design)
> fit.eBayes <- eBayes (fit)
> sample <- row.names(fit.eBayes)
> Pvalue <- fit.eBayes$p.value[,2]
> Mean1 <- rowMeans(e[,index1])
> Mean2 <- rowMeans(e[,index2])
> sd1 <- apply(e[,index1], 1, "sd")
> sd2 <- apply(e[,index2], 1, "sd")
> csv <- read.csv(file="D:/Anne-Marie/Doctorat/puces ADN macrophages/puces
rat/Annie
Dube/Analyse/RaGene-1_0-st-v1.na27.rn4.transcript.header.DOS.csv",head=TRUE,
sep=",")
> all <- merge(csv, Pvalue, by.x="sample", by.y=0, all=T)
> all2 <- merge(all, Mean1, by.x="sample", by.y=0, all=T)
> all3 <- merge(all2, Mean2, by.x="sample", by.y=0, all=T)
> all4 <- merge(all3, sd1, by.x="sample", by.y=0, all=T)
> all5 <- merge(all4, sd2, by.x="sample", by.y=0, all=T)
> all6 <- merge(all5, d, by.x="sample", by.y=0, all=T)
> write.table(data.frame(all6), file="TESTtout.xls", sep="\t", col.names =
NA)
> dim(all6)
[1] 240410 24
> sessionInfo()
R version 2.8.1 (2008-12-22)
i386-pc-mingw32
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages:
[1] splines tools stats graphics grDevices utils datasets
methods base
other attached packages:
[1] geneplotter_1.20.0 annotate_1.20.1 xtable_1.5-4
[4] lattice_0.17-17 genefilter_1.22.0 survival_2.34-1
[7] limma_2.16.3 pd.ragene.1.0.st.v1_0.0.1 pdInfoBuilder_1.6.0
[10] oligo_1.6.0 oligoClasses_1.4.0 AnnotationDbi_1.4.2
[13] preprocessCore_1.4.0 affxparser_1.14.2 RSQLite_0.7-1
[16] DBI_0.2-4 Biobase_2.2.1
loaded via a namespace (and not attached):
[1] grid_2.8.1 KernSmooth_2.22-22 RColorBrewer_1.0-2
[[alternative HTML version deleted]]
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.