Dear Nicolas,
Thanks very much for the bug report. I have committed a fix to edgeR
versions 1.4.6 (release) and 1.5.6 (devel).
Best wishes
Gordon
On Thu, 21 Jan 2010, Nicolas Delhomme wrote:
Good afternoon,
I've had the following bug in edgeR version 1.4.3 (bioC 2.5 stable).
I created my object like this:
dge<-DGEList(
counts=counts,
group=sub("\\.\\d","",colnames(counts)),
lib.size=colSums(counts),
genes=transcripts)
Then I estimated the common dispersion:
dge <- estimateCommonDisp(dge)
And did a contrast between two conditions and got the information back in a
data.frame:
tags <- topTags(exactTest(dge,pair=c("gal4","twi")),n=.Machine$integer.max)
There, I realized that the genes were not ordered accordingly to the
colnames. The colnames are correct. See the results below.
tags$table[1:10,]
genes logConc logFC PValue FDR
FBtr0084431 FBtr0087887 -33.16413 33.703845 9.050816e-50 1.834510e-45
FBtr0084439 FBtr0087893 -17.89889 5.696000 2.391551e-45 2.423717e-41
FBtr0071928 FBtr0074873 -19.97871 7.066663 1.814922e-43 1.226222e-39
FBtr0077809 FBtr0070400 -33.99876 32.034581 1.677380e-29 8.499706e-26
FBtr0300181 FBtr0074211 -21.99303 7.636927 1.945823e-26 7.887978e-23
FBtr0301322 FBtr0084735 -34.32748 31.377138 8.141553e-24 2.750352e-20
FBtr0299839 FBtr0084424 -19.89508 4.299143 9.575473e-23 2.426066e-19
FBtr0299841 FBtr0084426 -19.89508 4.299143 9.575473e-23 2.426066e-19
FBtr0299840 FBtr0084425 -19.91022 4.268863 2.075009e-22 4.673152e-19
FBtr0079322 FBtr0071074 -19.40663 3.933231 1.308765e-21 2.652735e-18
So I downloaded the latest version (1.5.5, bioC 2.6 devel) and I could
reproduce the same bug. Actually before I could do that I got yet another
error message saying:
"Number of rows in the annotation object 'genes' is not equal to the number
of rows in the matrix of counts.
Must have an annotation for each gene/tag, if annotation is to be used."
whereas I didn't change anything to the above command lines. I tracked it
down and what happens is that rows with empty counts are removed from the
counts object but not from the genes one. And this obviously has the sanity
check failing.
Changing the line 76 in classes.R (today's svn checkout) from
else if(nrow(as.data.frame(genes))==nrow(x$counts)) x$genes <-
as.data.frame(genes, stringsAsFactors=FALSE)
to
else
if(nrow(as.data.frame(genes)[!allZeros,,drop=FALSE])==nrow(x$counts)) x$genes
<- as.data.frame(genes, stringsAsFactors=FALSE)[!allZeros,,drop=FALSE]
corrects that problem. I haven't investigated the afore mentioned one, as I
expect it to be in some deeper functions and that I could work around it by
selecting on the colnames.
Regards,
Nicolas Delhomme
PS: sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-unknown-linux-gnu
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] doMC_1.2.0 multicore_0.1-3 foreach_1.3.0 codetools_0.2-2
[5] iterators_1.0.3 edgeR_1.5.5 DESeq_0.7.4 locfit_1.5-5
[9] lattice_0.18-3 akima_0.5-4 Biobase_2.6.1
loaded via a namespace (and not attached):
[1] annotate_1.24.1 AnnotationDbi_1.8.1 DBI_0.2-5
[4] dgenb_0.4.0 genefilter_1.28.2 geneplotter_1.24.0
[7] grid_2.10.0 limma_3.2.1 MASS_7.3-4
[10] RColorBrewer_1.0-2 RSQLite_0.8-1 splines_2.10.0
[13] survival_2.35-8 xtable_1.5-6
---------------------------------------------------------------
Nicolas Delhomme
High Throughput Functional Genomics Center
European Molecular Biology Laboratory
Tel: +49 6221 387 8426
Email: [email protected]
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------
______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}
_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing