Dear bioc-sig-sequencing,
I am trying to analyze Eland aligned files (Arabidopsis) for differential
expression, using as a guide the 'A ChIP-Seq Data Analysis' handout from a
11/19/09 session at the 'High throughput sequence analysis tools and approaches
with Bioconductor' workshop in Seattle.
I generated the error message in the following output. Can you comment?
Notes:
i. This is my second email on this problem. My first email omitted some info
which appears to me may have affected the response by persons monitoring the
mailing list.
ii. One misunderstanding would appear to be that I used the mouse 'reads' data
supplied during the lab. Instead, I used what I'm told is Eland-aligned
Arabidopsis data (s_8_export_chr1.txt file derived from s_8_export.txt using
grep).
iii. The error message suggests a mismatch between:
> names(arab.chromlens)
[1] "chr1" "chr2" "chr3" "chr4" "chr5" "chrC" "chrM"
and
> levels(chromosome(alns_8))
[1] "chr1.fas"
I don't know what to do about this 'mismatch'? Perhaps I need to arrange so:
> names(arab.chromlens)
gives output of only:
"chr1"?
iv. I note using 'available.genomes()', there are two BSgenome data packages
for Arabidopsis. Could my arbitrary choice be a problem? Would one have to
coordinate the choice with the Arabidopsis genome Eland must have used during
alignment?
...
> cerudataDir <- "/home/mterry/data09/wang_892spr09/rob_hw6/ceru_data"
> cerudataDir
[1] "/home/mterry/data09/wang_892spr09/rob_hw6/ceru_data"
> pattern <- "s_8_export_chr1.txt"
> list.files(cerudataDir, pattern)
[1] "s_8_export_chr1.txt"
> filt1 <- alignDataFilter(expression(filtering=="Y"))
> filt2 <- chromosomeFilter("chr[0-9XYM]+.fa")
> filt <- compose(filt1, filt2)
> alns_8 <- readAligned(cerudataDir, pattern, type="SolexaExport",
+ filter=filt)
> alns_8
class: AlignedRead
length: 1022848 reads; width: 35 cycles
chromosome: chr1.fas chr1.fas ... chr1.fas chr1.fas
position: 7568294 167488 ... 4687256 5376960
strand: + + ... + +
alignQuality: NumericQuality
alignData varLabels: run lane ... filtering contig
> levels(aln...@chromosome) <- sub(".fa$", "", levels(chromosome(alns_8)))
> head(levels(aln...@chromosome))
[1] "chr1.fas"
> levels(chromosome(alns_8))
[1] "chr1.fas"
> library(BSgenome.Athaliana.TAIR.04232008)
> arab.chromlens <- seqlengths(Athaliana)
> head(arab.chromlens)
chr1 chr2 chr3 chr4 chr5 chrC
30432563 19705359 23470805 18585042 26992728 154478
> names(arab.chromlens)
[1] "chr1" "chr2" "chr3" "chr4" "chr5" "chrC" "chrM"
> cov.arab8 <- coverage(alns_8, width = arab.chromlens, extend = 126L)
Error: UserArgumentMismatch
'names(width)' (or 'names(end)') mismatch with 'levels(chromosome(x))'
see ?"AlignedRead-class"
> sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] BSgenome.Athaliana.TAIR.04232008_1.3.16
[2] chipseq_0.2.1
[3] ShortRead_1.4.0
[4] lattice_0.17-26
[5] BSgenome_1.14.2
[6] Biostrings_2.14.10
[7] IRanges_1.4.9
loaded via a namespace (and not attached):
[1] Biobase_2.6.1 grid_2.10.1 hwriter_1.1
>
Thanks,
P. Terry
[email protected]
_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing