I have only spend 5s reading your email. But it seems you have the incredible common problem of chromosome name mismatch. For a given organism there is no standard for how the chr are named. Depending on the source (genome, what source, what annotation etc), I have -- for the same genome -- seen at least chr1 chr01 chrI (<- roman 1) scchr1 I would not find it strange at all if whoever aligned using ELAND used a different chr. name. This is pretty hard to make work automatically. You usually have to modify the objects yourself.
Kasper On Jan 7, 2010, at 15:59 PM, [email protected] wrote: > Dear bioc-sig-sequencing, > > I am trying to analyze Eland aligned files (Arabidopsis) for differential > expression, using as a guide the 'A ChIP-Seq Data Analysis' handout from a > 11/19/09 session at the 'High throughput sequence analysis tools and > approaches with Bioconductor' workshop in Seattle. > > I generated the error message in the following output. Can you comment? > > Notes: > > i. This is my second email on this problem. My first email omitted some info > which appears to me may have affected the response by persons monitoring the > mailing list. > > ii. One misunderstanding would appear to be that I used the mouse 'reads' > data supplied during the lab. Instead, I used what I'm told is Eland-aligned > Arabidopsis data (s_8_export_chr1.txt file derived from s_8_export.txt using > grep). > > iii. The error message suggests a mismatch between: >> names(arab.chromlens) > [1] "chr1" "chr2" "chr3" "chr4" "chr5" "chrC" "chrM" > and >> levels(chromosome(alns_8)) > [1] "chr1.fas" > > I don't know what to do about this 'mismatch'? Perhaps I need to arrange so: >> names(arab.chromlens) > gives output of only: > "chr1"? > > iv. I note using 'available.genomes()', there are two BSgenome data packages > for Arabidopsis. Could my arbitrary choice be a problem? Would one have to > coordinate the choice with the Arabidopsis genome Eland must have used during > alignment? > > > ... > >> cerudataDir <- "/home/mterry/data09/wang_892spr09/rob_hw6/ceru_data" >> cerudataDir > [1] "/home/mterry/data09/wang_892spr09/rob_hw6/ceru_data" > >> pattern <- "s_8_export_chr1.txt" >> list.files(cerudataDir, pattern) > [1] "s_8_export_chr1.txt" >> filt1 <- alignDataFilter(expression(filtering=="Y")) >> filt2 <- chromosomeFilter("chr[0-9XYM]+.fa") >> filt <- compose(filt1, filt2) >> alns_8 <- readAligned(cerudataDir, pattern, type="SolexaExport", > + filter=filt) >> alns_8 > class: AlignedRead > length: 1022848 reads; width: 35 cycles > chromosome: chr1.fas chr1.fas ... chr1.fas chr1.fas > position: 7568294 167488 ... 4687256 5376960 > strand: + + ... + + > alignQuality: NumericQuality > alignData varLabels: run lane ... filtering contig > >> levels(aln...@chromosome) <- sub(".fa$", "", levels(chromosome(alns_8))) > >> head(levels(aln...@chromosome)) > [1] "chr1.fas" >> levels(chromosome(alns_8)) > [1] "chr1.fas" >> library(BSgenome.Athaliana.TAIR.04232008) >> arab.chromlens <- seqlengths(Athaliana) >> head(arab.chromlens) > chr1 chr2 chr3 chr4 chr5 chrC > 30432563 19705359 23470805 18585042 26992728 154478 >> names(arab.chromlens) > [1] "chr1" "chr2" "chr3" "chr4" "chr5" "chrC" "chrM" >> cov.arab8 <- coverage(alns_8, width = arab.chromlens, extend = 126L) > Error: UserArgumentMismatch > 'names(width)' (or 'names(end)') mismatch with 'levels(chromosome(x))' > see ?"AlignedRead-class" > >> sessionInfo() > R version 2.10.1 (2009-12-14) > x86_64-pc-linux-gnu > > locale: > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 > [7] LC_PAPER=en_US.UTF-8 LC_NAME=C > [9] LC_ADDRESS=C LC_TELEPHONE=C > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] BSgenome.Athaliana.TAIR.04232008_1.3.16 > [2] chipseq_0.2.1 > [3] ShortRead_1.4.0 > [4] lattice_0.17-26 > [5] BSgenome_1.14.2 > [6] Biostrings_2.14.10 > [7] IRanges_1.4.9 > > loaded via a namespace (and not attached): > [1] Biobase_2.6.1 grid_2.10.1 hwriter_1.1 >> > > > Thanks, > P. Terry > [email protected] > > _______________________________________________ > Bioc-sig-sequencing mailing list > [email protected] > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing _______________________________________________ Bioc-sig-sequencing mailing list [email protected] https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
